Large AI-based language processing models such as OpenAI’s GPT-3 have been criticized for conjuring up racist or offensive text, for making up false information and for requiring enormous amounts of computing power to build and deploy. Still, countless companies have used these foundational models to form the basis of tech products such as customer service chatbots.
The Trevor Project, a nonprofit that provides counseling to LGBTQ+ youth at risk of suicide, is also an adopter of these models. However, while the group has found value in customizing these deep-learning neural networks to expand its ability to support kids in crisis, the organization recognizes where to draw the line.
In fact, while The Trevor Project has used open-source AI models including OpenAI’s GPT-2 and Google’s ALBERT, it does not use tools built with them to carry on conversations directly with troubled kids. Instead, the group has deployed those models to build tools it has used internally to train more than 1,000 volunteer crisis counselors, and to help triage web chats and texts from people in order to prioritize higher-risk contacts and connect them faster to real-life counselors.
The Trevor Project fine-tuned GPT-2 to create a crisis contact simulator featuring two AI-based personas, Riley and Drew, which communicate internally with counselor trainees, helping them prepare for the sorts of conversations they will have with actual kids and teens.
Each persona represents a different life situation, background, sexual orientation, gender identity and suicide risk level. Riley mimics a teen in North Carolina who feels depressed and anxious, while Drew is in their early 20s, lives in California and deals with bullying and harassment.
Launched in 2021, Riley was the first of the two personas. Rather than simply using GPT-2 models out of the box, the organization tailored the deep-learning model for its specific purpose by training it using hundreds of role-playing discussions between actual staff counselors and an initial set of data reflecting what someone like Riley might say.
“We trained Riley on many hundreds of past Riley role-plays,” said Dan Fichter, head of AI and Engineering at The Trevor Project, which developed the Riley persona through a partnership with Google’s grant program, Google.org. “The model needs to remember everything that is said and you have asked so far. When we trained GPT on those conversations, we got something that is very reliably responsive in a way our trainers would respond [to],” he said.
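Fichter did not share the training pipeline itself, but fine-tuning GPT-2 on a corpus of conversation transcripts is a well-documented pattern in the open-source Hugging Face Transformers library. The sketch below shows roughly how such a fine-tune can be set up; the file name, hyperparameters and other details are illustrative assumptions, not The Trevor Project's actual code.

```python
# A minimal sketch of fine-tuning GPT-2 on dialogue transcripts with
# Hugging Face Transformers. "riley_roleplays.txt" is a hypothetical file;
# the hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token          # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Assumed format: one full role-play conversation per line, with speaker tags.
dataset = load_dataset("text", data_files={"train": "riley_roleplays.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="riley-gpt2",
                           num_train_epochs=3,
                           per_device_train_batch_size=2),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

In a setup like this, each training example would contain a complete role-play exchange, so the model learns to respond in the context of everything said earlier in the conversation.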
The Trevor Project, which has a tech team of 30 people — including some dedicated to machine-learning-related work — later developed the Drew persona on its own.
“When youth reach out, they are always served by a trained and caring human being who is ready to listen and support them no matter what they’re going through,” said Fichter.
Retraining AI models for code-switching, and the Texas effect
The persona models are relatively stable, Fichter said, but the organization may need to retrain them with new data as the casual language used by kids and teens evolves to incorporate new acronyms, and as current events, such as a new law in Texas defining gender-affirming medical care as “child abuse,” become topics of conversation.
“There’s a lot of code-switching that happens because they know that they are reaching out to an adult [so] it could mean that there’s a benefit from regular re-training,” Fichter said.
The Trevor Project released data from a 2021 national survey that found that more than 52% of transgender and nonbinary youth “seriously considered suicide” in the past year, and of those, one in five attempted it.
“Health care is a people-focused industry, and when machine learning intersects with people, I think we have to be careful,” said Evan Peterson, a machine-learning engineer at health and wellness tech company LifeOmic who has used open-source language models such as RoBERTa, a version of BERT developed at Facebook and distributed through Hugging Face, to build chatbots.
To gauge performance, fairness and equity when it came to certain identity groups, The Trevor Project evaluated a variety of large natural-language-processing and linguistic deep-learning models before deciding which best suited particular tasks. It turned out that when it came to holding a simulated conversation and generating the sort of long, coherent sentences required for a 60- to 90-minute counselor training session, GPT-2 performed best.
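The article does not spell out the metrics behind that comparison. One common, if partial, way to compare generative models for long-form dialogue is perplexity on held-out transcripts: the lower the perplexity, the less a model is "surprised" by realistic counselor-trainee exchanges. The following is a hypothetical sketch of that kind of check, with placeholder model names and sample text, not a reconstruction of the organization's evaluation.

```python
# Illustrative only: comparing candidate generative models by perplexity on a
# held-out snippet of role-play text. Model names and text are placeholders.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def perplexity(model_name: str, text: str) -> float:
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Supplying labels makes the model return its cross-entropy loss.
        loss = model(**enc, labels=enc["input_ids"]).loss
    return math.exp(loss.item())

held_out = "Counselor: How are you feeling tonight? Riley: Honestly, pretty anxious."
for candidate in ["gpt2", "gpt2-medium"]:
    print(candidate, round(perplexity(candidate, held_out), 1))
```

Fairness checks across identity groups would sit on top of comparisons like this one, for example by measuring how generation quality and classifier error rates vary across subgroups in the evaluation data.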
AI for hotline triage and prioritizing risk
But ALBERT performed better than others when testing and validating models for a separate machine-learning system The Trevor Project built to help assess the risk level of people texting or chatting with its suicide prevention hotline. The risk assessment model is deployed when people in crisis contact the hotline. Based on responses to basic intake questions about someone’s state of mind and history with suicidality, the model assesses their level of risk for suicide, classifying it with a numerical score.
The model bases its evaluations on statements that vary widely in their level of detail. While it may be difficult for humans — and deep-learning models — to gauge suicide risk if someone simply says, “I’m not feeling great,” the ALBERT-based model is “pretty good” at learning emotional terms that correlate with suicide risk, such as language describing ideation or the details of a plan, Fichter said. When configuring the model to categorize risk, the group erred on the side of caution, scoring someone as higher risk when it wasn’t entirely clear, he said.
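Neither the scoring scale nor the model code is public. As a rough illustration of the general approach, the sketch below treats risk assessment as sequence classification with an ALBERT checkpoint and deliberately routes ambiguous cases to the higher-risk queue; the labels, threshold and checkpoint here are assumptions made for the example, not the deployed system.

```python
# Hypothetical sketch of risk scoring with an ALBERT sequence classifier.
# The base checkpoint is public, but the classification head shown here is
# untrained; in practice it would be fine-tuned on labeled intake responses.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("albert-base-v2")
model = AutoModelForSequenceClassification.from_pretrained(
    "albert-base-v2", num_labels=3
)

RISK_LABELS = ["lower", "moderate", "higher"]   # assumed labels, for illustration

def assess(intake_text: str) -> str:
    enc = tokenizer(intake_text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = torch.softmax(model(**enc).logits, dim=-1).squeeze()
    # Err on the side of caution: if no class is clearly dominant,
    # bump the contact into the higher-risk queue.
    if probs.max().item() < 0.6:
        return "higher"
    return RISK_LABELS[int(probs.argmax())]

print(assess("I've been thinking about a plan and I don't see a way out."))
```

In practice, the labels and the caution threshold would be tuned against the real-world clinical assessments described below, rather than chosen by hand.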
To prepare data to train its risk assessment model, the organization looked to real-world assessments performed during full crisis conversations. Relying on people’s subjective opinions out of context can introduce bias when labeling training data, but using real-world clinical risk assessments that can be mapped back to whether a young person should or should not have been placed in the priority queue helped reduce that potential bias.
In the past, human counselors used a heuristic rules-based system to triage callers, said Fichter, who believes the AI-based process provides “a much more accurate prediction.”
Mocking TV shows (but evading worse problems)
The Trevor Project balances benefits of large language models against potential problems by limiting how they are used, Fichter said. He pointed to the strictly internal use of the GPT-2-based persona models for generating language for counselor training purposes, and use of the ALBERT-based risk assessment model only to prioritize how soon a counselor should speak to a contact.
Still, open-source, large natural-language processing models including various iterations of OpenAI’s GPT — generative pre-trained transformer — have generated a reputation as toxic language factories. They have been criticized for producing text that perpetuates stereotypes and spews nasty language, in part because they were trained using data gleaned from an internet where such language is commonplace. Groups including OpenAI are continuously working to address the toxicity and accuracy problems associated with large language models.
“There is ongoing research to ground them to ‘be good citizen models,’” said Peterson. However, he said that machine-learning systems “can make mistakes [and] there are situations in which that is not acceptable.”
Meanwhile, new large language models regularly burst onto the scene. Microsoft on Tuesday introduced new AI models it said it has deployed to improve common language-understanding tasks such as named entity recognition, text summarization, custom text classification and key phrase extraction.
Tailoring those models for particular purposes with highly specific training data sets is one way users such as The Trevor Project have worked to take advantage of their benefits while taking care to ensure they do not facilitate more troubling digital conversations.
“Because we were able to fine-tune it to perform very specific work, and purely for our internal [Riley and Drew personas], our model has not generated any offensive output,” Fichter said.
When developing both its crisis contact simulator and its risk assessment model, the organization removed names and other personally identifiable information from the data it used for training.
But privacy protection wasn’t the only reason, said Fichter. His team did not want the machine-learning models to draw conclusions about people with certain names, which could result in model bias. For example, they didn’t want them to conclude that someone with the name “Jane” was always a bully just because a teen in crisis in a role-playing scenario complained about someone with that name.
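The group’s exact de-identification tooling isn’t described in the story. One common way to scrub names before training is to run a named-entity recognizer over the transcripts and replace detected person names with a neutral placeholder; a minimal sketch, assuming spaCy and its small English model are installed, might look like this.

```python
# Minimal, illustrative name-scrubbing pass using spaCy's NER.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def scrub_names(text: str) -> str:
    doc = nlp(text)
    out = text
    # Replace detected person names from right to left so character offsets stay valid.
    for ent in reversed(doc.ents):
        if ent.label_ == "PERSON":
            out = out[:ent.start_char] + "[NAME]" + out[ent.end_char:]
    return out

print(scrub_names("Jane keeps making fun of me at school."))
# -> "[NAME] keeps making fun of me at school."
```

Replacing every detected name with the same placeholder token keeps a model from associating any particular name with a particular role, such as the bully, in the training conversations.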
So far, Fichter said the crisis contact simulator personas have not used any inappropriate or odd words. In general, they might simply respond, “I don’t know,” if they cannot generate relevant language.
Still, he said that Drew — the 20-something Californian — has mocked Netflix’s social-media competition show “The Circle.” “Drew has made fun of some TV shows he’s been watching,” Fichter said.
This story was updated to clarify how the Trevor Project prepared data for its training models and how people in crisis use its services.