Digital health technologies have been hailed as holding great potential to augment traditional mental health care and help deliver treatment to those who need it (Torous et al 2021). Our smartphones, for example, now provide access to cognitive behavioural therapy and ‘just-in-time’ therapies, and virtual reality can help in the assessment and treatment of anxiety and related disorders (Meyerbröker and Morina 2021). Chatbots – that is, “systems that are able to converse and interact with human users using spoken, written, and visual languages” (Abd-Alrazaq et al 2020:1) – are now being incorporated into many of these mental health support tools.
According to some experts, these chatbots look set to play an increasing role in future mental healthcare (Ward 2021). Whilst considerable research has been carried out on the efficacy, feasibility and ease of use of mental health chatbots (see, for example, Vaidyam et al 2019 for a review), less attention has been paid to users’ everyday experiences of using them.
While these innovative digital technologies have great promise, we clearly need careful examination of how they are received by users. This is why the recent paper by Alaa A Abd-Alrazaq and colleagues (Abd-Alrazaq et al 2021) caught our eye. Their study aimed to conduct a scoping review of the perceptions and opinions of patients about chatbots for mental health, and to highlight implications for research and practice.
Methods
The authors used PRISMA guidelines to carry out a systematic and transparent scoping review. Search sources included – inter alia – Medline, Embase, PsycINFO, Scopus and Google Scholar (with backward and forward reference checking). Search terms were arrived at through checking previous literature reviews and discussions with informatics experts. The authors excluded chatbots that were integrated into robotics, serious games, short message service (SMS) or telephone systems, or that depended on a human generating the text. Reviews, proposals, editorials and conference abstracts were also excluded. Two reviewers independently screened titles and abstracts for eligibility. The authors state that a narrative approach was used to synthesise the findings, and that thematic analysis was conducted following the method detailed in Braun and Clarke (2006).
Results
Initial searches returned 1,072 citations. Of these, 429 were duplicates, 514 were excluded after scanning titles and abstracts, and a further 98 were excluded following a scan of the full text. Across the different stages, reasons for exclusion included the paper being deemed irrelevant (n=389) and the type of publication (n=100). Of the 37 studies ultimately included in the analysis, two-thirds were journal articles, 62% were published between 2015 and 2019, and the most common design was cross-sectional. Whilst the majority of studies involved rule-based chatbots, there was considerable heterogeneity in what the chatbots were being used for: self-management (n=6), therapeutic purposes (n=12), training (n=9), counselling (n=5), screening (n=4) and diagnosis (n=1).
Thematic analysis of the 37 papers generated 10 themes: (i) usefulness; (ii) ease of use; (iii) responsiveness; (iv) understandability; (v) acceptability; (vi) attractiveness; (vii) trustworthiness; (viii) enjoyability; (ix) content; and (x) comparisons. Usefulness and ease of use appeared most comprehensively across the analysed papers. Components considered useful by patients included real-time feedback, diaries and weekly summaries. Whilst participants across studies tended to rate the usability and ease of use of chatbots highly, there were also reports of difficulties in navigating, technical glitches and not having enough reply options.
The ‘responsiveness’ theme brings together findings on speed, friendliness, realism, repetitiveness and empathy. These findings appear equivocal, with mixed results on the perceived realism of responses, and response speed considered variously as appropriate, too fast and too slow. Whilst patients generally believed that chatbots are able to provide friendly and emotional responses, there were mixed perceptions about whether chatbots could in turn elicit friendly and emotional responses from their users. Conversations could also be perceived as shallow, confusing or too short.
Across the other themes, key findings include how interactional enjoyment and perceived trust are significant mediators of chatbot interaction. ‘Comparisons’ also appear important. According to Brown and Halpern (2021), from an ethical standpoint, humans and not chatbots should be available as front-line providers. Clinical empathy needs to be maximised in every medical caregiving interaction, and the principle of beneficence (which involves tailoring care to an individual’s circumstances) remains a key challenge for chatbots. Findings in the review are again equivocal. Whilst one of the papers reviewed suggested that patients preferred interacting with a chatbot rather than a human for their health care, another found that participants reported greater rapport with a real expert than with a rule-based chatbot. Far more attention also appears to have been paid to comparisons across chatbots than to whether chatbots are in themselves a desirable solution. Empathetic, embodied virtual agents are preferred, but studies also suggest that participants are more willing to disclose information to a text-based chatbot than to a (non)empathetic chatbot or human counsellor.
Conclusions
The authors concluded that:
the results demonstrated that there are overall positive perceptions and opinions of patients about mental health chatbots, although there is some skepticism toward trustworthiness and usefulness.
It is also claimed that there:
are features of chatbots that health care providers cannot deliver over a long period.
However, most attention is given to issues to be addressed and implications for research and practice. We discuss these further below.
Strengths and limitations
In discussing strengths and limitations, the authors note that following PRISMA guidelines ensured that the review was of high quality. Selection bias was also reduced through two reviewers working independently of each other. Limitations noted by the authors include focusing only on particular types of chatbots, restricting studies to English, and choosing not to search interdisciplinary databases, such as Web of Science and ProQuest. This is a fair assessment. Using Google Scholar and carrying out backward and forward reference checks would have helped identify considerable amounts of grey literature. However, coverage would equally have been affected by not conducting ‘interdisciplinary searches’ or manual searches, and by not contacting experts (page 11).
There are also some challenges beyond those noted by the authors. The authors do not really tackle the issue of differing disciplinary perspectives on mental health and on chatbots. On the one hand, they do recognise differences across studies; for example, when they write that “scoping reviews are generally accepted as more appropriate when diversity of study designs is expected.” At the same time, this heterogeneity seems to become lost in the analysis and discussion, and we do not really hear about the methods used to define and capture participant experience in each case. It would be helpful to know whether we are looking at short-term use or longer-term studies, and whether users freely selected chatbots or were pre-assigned to specific tools. It is also not clear why the authors decided against ‘interdisciplinary searches’. Disciplinary heterogeneity, and stepping outside of our traditional silos, would seem important if we are to understand patient experiences of everyday chatbot use.
Whilst these points should not detract from the robustness of extraction, selection and summary, we do need to be mindful of how the analysis was conducted. Thematic analysis quite rightly privileges the analyst’s interpretation. This does mean that the authors could have done more to show how they “followed the highly recommended guidelines” and, in particular, how the overall story was arrived at (step 5 of Braun and Clarke). The authors could also have reflected a little more on the subjective nature of thematic analysis. What we consider to be key themes, findings and implications for practice, for example, will likely be influenced by the disciplines we work within and across. To us, a core conclusion would seem to be that findings across the papers are equivocal.
Implications for practice
The authors identify a range of implications for practice and research. For example, it is suggested that we need to create high-quality chatbots that are able to respond to the user in multiple ways, and that a mental health chatbot must be empathetic. Recommendations also include making sure that the chatbot-patient relationship is trustworthy. Abd-Alrazaq et al (2021) suggest that blended therapy and physician recommendation will help. Others are more cautious, with Brown and Halpern (2021) stating that a chatbot cannot be programmed to account for the reciprocal human experience of trust, social understanding and validation, and Torous et al (2021) suggesting that trust from both patients and clinicians remains low for digital health technologies generally. With respect to research implications, the authors suggest that there is still a need to improve the linguistic capabilities of mental health chatbots. Methods also need to be developed to deal with unexpected user input, and benchmarks created to enable evaluation.
Whilst these are solid points, two further recommendations arise from what is left unsaid.
From a practical point of view, when making recommendations that touch on implementation, it is useful to make clear which implementation science framework is being used. For example, Torous et al (2021) draw on the Integrated Promoting Action on Research Implementation in Health Services framework, which focuses on the innovation, the recipients of the innovation and the context surrounding the innovation.
From a research point of view, it is notable that none of the implications put forward by the authors involve learning more about patients’ perceptions and opinions of chatbots. Yet this gap in current research would appear to be one of the clearest outcomes of the review. It remains unclear who is using chatbots, why they are being downloaded and how they are being incorporated into everyday life. It would also appear that research needs to consider the technology-human dynamic in more complex ways. For one thing, values and politics are incorporated into the design of technologies, and chatbots are likely coded with ideologies which reflect and refract contemporary knowledge claims about mental health. Technologies are also performative (Halford et al 2016). Understanding perceptions and opinions, then, involves understanding how users make sense of technologies in context and how effects emerge as these technologies are brought into use.
This is the starting point for our own study, funded by the British Academy. This study seeks to explore the everyday use of mental health chatbots, paying particular attention to why people use these technologies, how the technology gets incorporated into everyday life, and how the technology works with and against meanings of mental health/recovery. Anyone interested in taking part can find more details via our Qualtrics survey (see below).
Statement of interests
As noted above, Robert Meadows and Christine Hine are currently carrying out a British Academy funded project titled: “Chatbots and the shaping of mental health recovery”.
The everyday use of mental health chatbots: A sociological study
- Are you currently using a chatbot to help with your mental health and wellbeing?
- Have you been using this for at least three weeks?
- Are you aged 18 or over?
- Are you currently living in the United Kingdom?
If you would be interested in taking part in a research project exploring the everyday use of chatbots for mental health, please read on.
Chatbots are machines which can interact like a human – engaging people in daily conversations, offering self-help, mood tracking, meditation and mental health exercises.
They are increasingly being developed to assist with mental health and wellbeing. Examples include Woebot, Wysa, Tess, Joyable and Youper.
As social researchers, we are interested in how and why people use these and what they think of them.
How to take part
Join an interview conversation:
- We are looking for about 30 people (aged 18 and over) who are currently using a chatbot for mental health and wellbeing.
- You will be asked to talk about how you came to use the chatbot and your experiences of using it.
- We will also discuss what you think the future of these technologies will look like.
- All participants will receive a gift card for taking part.
Find out more and register to take part.
Links
Primary paper
Abd-Alrazaq A, Alajlani M, Ali N, Denecke K, Bewick B, Househ M. (2021) Perceptions and Opinions of Patients About Mental Health Chatbots: Scoping Review. J Med Internet Res 2021;23(1):e17828. DOI: 10.2196/17828
Other references
Abd-Alrazaq, A.A., Alajlani, M., Alalwan, A.A., Bewick, B.M., Gardner, P. and Househ, M., 2020. An overview of the features of chatbots in mental health: A scoping review. International Journal of Medical Informatics, 132, 103978. https://doi.org/10.1016/j.ijmedinf.2019.103978
Braun, V. and Clarke, V., 2006. Using thematic analysis in psychology. Qualitative research in psychology, 3(2), pp.77-101.
Brown, J.E. and Halpern, J., 2021. AI chatbots cannot replace human interactions in the pursuit of more inclusive mental healthcare. SSM-Mental Health, 1, p.100017.
Halford, S., Weal, M., Tinati, R., Carr, L. and Pope, C., 2016. Digital Data Infrastructures: interrogating the social media data pipeline. AoIR Selected Papers of Internet Research.
Meyerbröker, K. and Morina, N., 2021. The use of virtual reality in assessment and treatment of anxiety and related disorders. Clinical Psychology & Psychotherapy, 28(3), pp.466-476.
Torous, J., Bucci, S., Bell, I.H., Kessing, L.V., Faurholt‐Jepsen, M., Whelan, P., Carvalho, A.F., Keshavan, M., Linardon, J. and Firth, J., 2021. The growing field of digital psychiatry: current evidence and the future of apps, social media, chatbots, and virtual reality. World Psychiatry, 20(3), pp.318-335.
Vaidyam, A.N., Wisniewski, H., Halamka, J.D., Kashavan, M.S. and Torous, J.B., 2019. Chatbots and conversational agents in mental health: a review of the psychiatric landscape. The Canadian Journal of Psychiatry, p.0706743719828977
Ward, L., 2021. Can artificial intelligence replace human therapists? The Wall Street Journal, March 27, 2021.
Hi Robert, Christine and Andre
We would love to have a chat and see how we could learn from the findings. We also have a lot of RCT data on the use of the Wysa platform and would be happy to share. Please let me know if there is a good time to discuss.
Hi Ross
Thanks for engaging with the blog. Always really happy to chat – do please feel free to drop me an email.
All best wishes
Rob
Personally, I find the use of chatbots as a way of managing mental health problems horrific. The reviewers identify the gaps in the research which need to be filled. The reported lack of service user feedback hints that they don’t know what is good for them. The chatbot can adapt to what is being said, but we are told they could do with being more empathic. A person can be empathic, but I do not believe a chatbot could develop feelings and emotions. I am relieved that the reviewers are keen to produce their own research, and the information available indicates they will be interacting with people.
Hi Professors, I am Hina Bashir, an Erasmus Mundus student at VUB in Belgium. As I was researching the topic of my interest I came across this website. I am really interested in participating and getting involved in this study, as it also relates to my thesis. Please let me know what the possibilities could be for the future. Many thanks!