Preprint

Gender and Racial Fairness in Depression Research using Social Media


Abstract

Multiple studies have demonstrated that behavior on internet-based social media platforms can be indicative of an individual's mental health status. The widespread availability of such data has spurred interest in mental health research from a computational lens. While previous research has raised concerns about possible biases in models produced from this data, no study has quantified how these biases actually manifest themselves with respect to different demographic groups, such as gender and racial/ethnic groups. Here, we analyze the fairness of depression classifiers trained on Twitter data with respect to gender and racial demographic groups. We find that model performance systematically differs for underrepresented groups and that these discrepancies cannot be fully explained by trivial data representation issues. Our study concludes with recommendations on how to avoid these biases in future research.
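The abstract does not say which fairness metrics the authors used. As an illustration only, one common way to quantify "model performance systematically differs" across groups is to compare the true positive rate (recall) of the classifier per demographic group and report the largest gap, in the spirit of the equal-opportunity criterion. The sketch below is a minimal example under that assumption; the function names, group labels, and toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def group_recall(y_true, y_pred, groups):
    """Return the true positive rate (recall) for each demographic group.

    y_true, y_pred: sequences of 0/1 labels (1 = depression class).
    groups: sequence of group identifiers, aligned with the labels.
    Groups with no positive examples are omitted from the result.
    """
    true_positives = defaultdict(int)
    positives = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            if pred == 1:
                true_positives[group] += 1
    return {g: true_positives[g] / positives[g] for g in positives}


def recall_gap(recalls):
    """Largest difference in recall between any two groups."""
    return max(recalls.values()) - min(recalls.values())


# Hypothetical toy data: five users in group "a", five in group "b".
y_true = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0]
groups = ["a"] * 5 + ["b"] * 5

recalls = group_recall(y_true, y_pred, groups)
gap = recall_gap(recalls)
print(recalls, gap)
```

A gap near zero would suggest that depressed users are detected at similar rates in every group. A large gap is the kind of discrepancy the abstract describes, although recall is only one of several metrics that could be compared this way.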

Article
On March 11, 2020, the World Health Organization declared the coronavirus disease 2019 (COVID-19) outbreak as a pandemic, with over 720,000 cases reported in more than 203 countries as of 31 March. The response strategy included early diagnosis, patient isolation, symptomatic monitoring of contacts as well as suspected and confirmed cases, and public health quarantine. In this context, telemedicine, particularly video consultations, has been promoted and scaled up to reduce the risk of transmission, especially in the United Kingdom and the United States of America. Based on a literature review, the first conceptual framework for telemedicine implementation during outbreaks was published in 2015. An updated framework for telemedicine in the COVID-19 pandemic has been defined. This framework could be applied at a large scale to improve the national public health response. Most countries, however, lack a regulatory framework to authorize, integrate, and reimburse telemedicine services, including in emergency and outbreak situations. In this context, Italy does not include telemedicine in the essential levels of care granted to all citizens within the National Health Service, while France authorized, reimbursed, and actively promoted the use of telemedicine. Several challenges remain for the global use and integration of telemedicine into the public health response to COVID-19 and future outbreaks. All stakeholders are encouraged to address the challenges and collaborate to promote the safe and evidence-based use of telemedicine during the current pandemic and future outbreaks. For countries without integrated telemedicine in their national health care system, the COVID-19 pandemic is a call to adopt the necessary regulatory frameworks for supporting wide adoption of telemedicine.
Article
Importance: No US national data are available on the prevalence and correlates of DSM-5–defined major depressive disorder (MDD) or on MDD specifiers as defined in DSM-5. Objective: To present current nationally representative findings on the prevalence, correlates, psychiatric comorbidity, functioning, and treatment of DSM-5 MDD and initial information on the prevalence, severity, and treatment of DSM-5 MDD severity, anxious/distressed specifier, and mixed-features specifier, as well as cases that would have been characterized as bereavement in DSM-IV. Design, Setting, and Participants: In-person interviews with a representative sample of US noninstitutionalized civilian adults (≥18 years) (n = 36 309) who participated in the 2012-2013 National Epidemiologic Survey on Alcohol and Related Conditions III (NESARC-III). Data were collected from April 2012 to June 2013 and were analyzed in 2016-2017. Main Outcomes and Measures: Prevalence of DSM-5 MDD and the DSM-5 specifiers. Odds ratios (ORs), adjusted ORs (aORs), and 95% CIs indicated associations with demographic characteristics and other psychiatric disorders. Results: Of the 36 309 adult participants in NESARC-III, 12-month and lifetime prevalences of MDD were 10.4% and 20.6%, respectively. Odds of 12-month MDD were significantly lower in men (OR, 0.5; 95% CI, 0.46-0.55) and in African American (OR, 0.6; 95% CI, 0.54-0.68), Asian/Pacific Islander (OR, 0.6; 95% CI, 0.45-0.67), and Hispanic (OR, 0.7; 95% CI, 0.62-0.78) adults than in white adults and were higher in younger adults (age range, 18-29 years; OR, 3.0; 95% CI, 2.48-3.55) and those with low incomes ($19 999 or less; OR, 1.7; 95% CI, 1.49-2.04). Associations of MDD with psychiatric disorders ranged from an aOR of 2.1 (95% CI, 1.84-2.35) for specific phobia to an aOR of 5.7 (95% CI, 4.98-6.50) for generalized anxiety disorder. Associations of MDD with substance use disorders ranged from an aOR of 1.8 (95% CI, 1.63-2.01) for alcohol to an aOR of 3.0 (95% CI, 2.57-3.55) for any drug. Most lifetime MDD cases were moderate (39.7%) or severe (49.5%). Almost 70% with lifetime MDD had some type of treatment. Functioning among those with severe MDD was approximately 1 SD below the national mean. Among 12.9% of those with lifetime MDD, all episodes occurred just after the death of someone close and lasted less than 2 months. The anxious/distressed specifier characterized 74.6% of MDD cases, and the mixed-features specifier characterized 15.5%. Controlling for severity, both specifiers were associated with early onset, poor course and functioning, and suicidality. Conclusions and Relevance: Among US adults, DSM-5 MDD is highly prevalent, comorbid, and disabling. While most cases received some treatment, a substantial minority did not. Much remains to be learned about the DSM-5 MDD specifiers in the general population.
Article
LGBTQ adolescents experience higher rates of mental health disorders than their heterosexual peers. The purpose of this systematic review of the literature was to examine studies evaluating social support and its effects on mental health in the LGBTQ adolescent population. Higher levels of social support were associated with positive self-esteem. Lack of social support (or low social support) was associated with higher levels of depression, anxiety, alcohol or drug misuse, risky sexual behaviors, shame, and low self-esteem. Interdisciplinary research teams from multiple and diverse professions could provide valuable insight supporting the development of inclusive and comprehensive intervention programs for this population.
Article
Mental illnesses adversely affect a significant proportion of the population worldwide. However, the methods traditionally used for estimating and characterizing the prevalence of mental health conditions are time-consuming and expensive. Consequently, best-available estimates concerning the prevalence of mental health conditions are often years out of date. Automated approaches to supplement these survey methods with broad, aggregated information derived from social media content provide a potential means for near real-time estimates at scale. These may, in turn, provide grist for supporting, evaluating and iteratively improving upon public health programs and interventions. We propose a novel model for automated mental health status quantification that incorporates user embeddings. This builds upon recent work exploring representation learning methods that induce embeddings by leveraging social media post histories. Such embeddings capture latent characteristics of individuals (e.g., political leanings) and encode a soft notion of homophily. In this paper, we investigate whether user embeddings learned from Twitter post histories encode information that correlates with mental health statuses. To this end, we estimated user embeddings for a set of users known to be affected by depression and post-traumatic stress disorder (PTSD), and for a set of demographically matched 'control' users. We then evaluated these embeddings with respect to: (i) their ability to capture homophilic relations with respect to mental health status; and (ii) the performance of downstream mental health prediction models based on these features. Our experimental results demonstrate that the user embeddings capture similarities between users with respect to mental conditions, and are predictive of mental health.
Article
There is growing interest in using social networking sites such as Twitter to gather real-time data on the reactions and opinions of a region's population, including locations in the developing world where social media has played an important role in recent events, such as the 2011 Arab Spring. However, many interesting and important opinions and reactions may differ significantly within a given region depending on the demographics of the subpopulation, including such categories as gender and ethnicity. This information may not be explicitly available in user content or metadata, however, and automated methods are required to infer such hidden attributes. In this paper we describe a method to infer the gender of Twitter users from only the content of their tweets. Looking at Twitter users from the West African nation of Nigeria, we applied supervised machine learning using features derived from the content of user tweets to train a classifier. Using unigram features alone, we obtained an accuracy of 80% for predicting gender, suggesting that content alone can be a good predictor of gender. An analysis of the highest weighted features shows some interesting distinctions between men and women both topically and emotionally. We argue that approaches such as the one described here can give us a clearer picture of who is utilizing social media when certain user attributes are unreliable or not available.
Conference Paper
We present TweetMotif, an exploratory search application for Twitter. Unlike traditional approaches to information retrieval, which present a simple list of messages, TweetMotif groups messages by frequent significant terms (a result set's subtopics), which facilitate navigation and drilldown through a faceted search interface. The topic extraction system is based on syntactic filtering, language modeling, near-duplicate detection, and set cover heuristics. We have used TweetMotif to deflate rumors, uncover scams, summarize sentiment, and track political protests in real-time. A demo of TweetMotif, plus its source code, is available at http://tweetmotif.com.
Article
Behavioral scientists routinely publish broad claims about human psychology and behavior in the world's top journals based on samples drawn entirely from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) societies. Researchers - often implicitly - assume that either there is little variation across human populations, or that these "standard subjects" are as representative of the species as any other population. Are these assumptions justified? Here, our review of the comparative database from across the behavioral sciences suggests both that there is substantial variability in experimental results across populations and that WEIRD subjects are particularly unusual compared with the rest of the species - frequent outliers. The domains reviewed include visual perception, fairness, cooperation, spatial reasoning, categorization and inferential induction, moral reasoning, reasoning styles, self-concepts and related motivations, and the heritability of IQ. The findings suggest that members of WEIRD societies, including young children, are among the least representative populations one could find for generalizing about humans. Many of these findings involve domains that are associated with fundamental aspects of psychology, motivation, and behavior - hence, there are no obvious a priori grounds for claiming that a particular behavioral phenomenon is universal based on sampling from a single subpopulation. Overall, these empirical patterns suggest that we need to be less cavalier in addressing questions of human nature on the basis of data drawn from this particularly thin, and rather unusual, slice of humanity. We close by proposing ways to structurally re-organize the behavioral sciences to best tackle these challenges.