Psychological wellbeing concerns and mental health challenges are known to be characterized by latent processes that include negative perspectives, self-focused attention, loss of self-worth and self-esteem, and social disengagement. Today, with social media, many of these latent processes, including the context and content of one’s affective, behavioral, and cognitive reactions, are recorded in the moment and can be observed longitudinally and unobtrusively. At the same time, individuals are increasingly appropriating social media platforms to engage in candid disclosures of various wellbeing challenges.
De Choudhury and colleagues found that social media-based linguistic and psychological measures from the prenatal period can characterize and predict postpartum depression (PPD) in new mothers using a Support Vector Machine classifier.1Munmun De Choudhury, Scott Counts, and Eric Horvitz, “Predicting Postpartum Changes in Emotion and Behavior via Social Media,” in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’13) (New York, NY: Association for Computing Machinery, 2013), 3267–3276, https://doi.org/10.1145/2470654.2466447; Munmun De Choudhury et al., “Characterizing and Predicting Postpartum Depression from Shared Facebook Data,” in Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work & Social Computing, CSCW ’14 (New York, NY: Association for Computing Machinery, 2014), 626–638, https://doi.org/10.1145/2531602.2531675. These correlates include lowered positive affect and raised negativity, and greater use of first-person pronouns indicating higher self-attentional focus: a prominent symptom in PPD. Other work by De Choudhury has also used machine learning methods to predict, with 71 percent accuracy, the risk of depression prior to reported onset from individuals’ Twitter data.2Munmun De Choudhury et al., “Predicting Depression via Social Media,” in Proceedings of the Seventh International Association for the Advancement of Artificial Intelligence (AAAI) Conference on Weblogs and Social Media (Association for the Advancement of Artificial Intelligence, 2013), 1431–42; Munmun De Choudhury, Scott Counts, and Eric Horvitz, “Social Media as a Measurement Tool of Depression in Populations,” in Proceedings of the 5th Annual ACM Web Science Conference, WebSci ’13 (New York, NY: Association for Computing Machinery, 2013), 47–56, https://doi.org/10.1145/2464464.2464480.
Various measures spanning language, social media engagement, and social media activity were significant predictors of: decreases in social activity, increases in negative affect, and greater religiosity. Additional notable efforts include: (a) employing social media (Twitter) data as a way to computationally detect which individuals are likely to have schizophrenia, by combining the use of machine learning techniques and clinical appraisals with 88 percent accuracy3Michael L. Birnbaum et al., “A Collaborative Approach to Identifying Social Media Markers of Schizophrenia by Employing Machine Learning and Clinical Appraisals,” Journal of Medical Internet Research 19, no. 8 (2017): e289, https://doi.org/10.2196/jmir.7956. and the risk of relapse in patients with early psychosis;4Michael L. Birnbaum et al., “Detecting Relapse in Youth with Psychotic Disorders Utilizing Patient-Generated and Patient-Contributed Digital Data from Facebook,” Npj Schizophrenia 5, no. 1 (October 7, 2019): 1–9, https://doi.org/10.1038/s41537-019-0085-9. and (b) leveraging Facebook and ecological momentary assessments in a semi-supervised learning framework to predict with 96 percent accuracy mood instability in individuals with bipolar or borderline personality disorders.5Koustuv Saha et al., “Inferring Mood Instability on Social Media by Leveraging Ecological Momentary Assessments,” Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 1, no. 3 (September 11, 2017): 95:1–95:27, https://doi.org/10.1145/3130960. This body of work situates social media as a viable, unobtrusive source to help assess forthcoming risk factors and risk likelihood relating to psychological health and wellbeing.
Most relevant to this study, De Choudhury et al. examined the notion of desensitization in response to persistent violence during the Mexican drug war using affective expression on Twitter,6Munmun De Choudhury, Andrés Monroy-Hernández, and Gloria Mark, “‘Narco’ Emotions: Affect and Desensitization in Social Media during the Mexican Drug War,” Proceedings of the 32nd Annual ACM Conference on Human Factors in Computing Systems, CHI ’14, 2014, 3563–72, https://doi.org/10.1145/2556288.2557197. and with Delgado Valdes and Eisenstein, she analyzed the psychological effects of crime and violence by analyzing psycholinguistic language on Twitter shared in different geographical neighborhoods.7José Manuel Delgado Valdes, Jacob Eisenstein, and Munmun De Choudhury, “Psychological Effects of Urban Crime Gleaned from Social Media,” Proceedings of the 9th International AAAI Conference on Weblogs and Social Media. International AAAI Conference on Weblogs and Social Media 2015 (May 2015): 598–601. In more recent research, Saha and De Choudhury,8Koustuv Saha and Munmun De Choudhury, “Modeling Stress with Social Media Around Incidents of Gun Violence on College Campuses,” Proceedings of the ACM on Human-Computer Interaction 1, no. 2 (December 2017): 92:1–92:27, https://doi.org/10.1145/3134727. leveraging social media as a passive sensor of stress, proposed novel computational and time series techniques to quantify and examine stress responses after gun violence on college campuses. This research found that psycholinguistic characterization of the high-stress posts indicates that campus populations exhibit reduced cognitive processing and greater self-attention and social orientation, and that they participate more in death-related conversations. 
Additionally, a lexical analysis of high-stress posts showed distinctive temporal trends in the use of incident-specific words on each campus, providing further evidence of the impact of the incidents on the stress responses of campus populations.
This prior research and the study, broadly speaking, are situated in research on how crisis informatics has utilized language as a tool and lens to understand how major crisis events unfold in affected populations, and how they are covered on traditional media,9Kirsten A. Foot and Steven M. Schneider, “Online Structure for Civic Engagement in the September 11 Web Sphere.” Electronic Journal of Communication 14 (2004): 3–4. online media such as blogs,10Gloria Mark et al., “Blogs as a Collective War Diary,” in Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work, CSCW ’12 (New York, NY: Association for Computing Machinery, 2012), 37–46, https://doi.org/10.1145/2145204.2145215. and social media.11Leysia Palen, “Online Social Media in Crisis Events,” EDUCAUSE Quarterly 31, no. 3 (2008): 76–78; Kate Starbird et al., “Chatter on the Red: What Hazards Threat Reveals about the Social Life of Microblogged Information,” in Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work, CSCW ’10 (New York, NY: Association for Computing Machinery, 2010), 241–250, https://doi.org/10.1145/1718918.1718965; Michael Salter, “Justice and Revenge in Online Counter-Publics: Emerging Responses to Sexual Violence in the Age of Social Media,” Crime, Media, Culture 9, no. 3 (2013): 225–42, https://doi.org/10.1177/1741659013493918. Essentially, this work has consistently shown that online platforms emerge as a safe haven for people, enabling them to interact and express themselves during times of upheavals in their environment. This study builds upon this rich empirical evidence in crisis informatics research.
K–12 schools in the United States conduct lockdown and active shooter drills with various frequencies and using a variety of techniques and simulations. In this study, researchers have collaborated with Everytown, Moms Demand Action for Gun Sense in America, and Students Demand Action for Gun Sense in America to collect data on the names and locations of 138 schools that conducted a lockdown or active shooter drill between April 2018 and November 2019. Researchers obtained information regarding these drills in the past year, including their exact dates, the school grade (elementary, middle, or high school), and if simulations mimicking an active shooter were involved in the drill. Out of 138 total schools, 130 schools are public, public-charter or pre-K, and 8 are private. Thirty-three states were included in the sample overall, with 13 percent of schools geographically located on the West Coast, 40 percent in the central US, and 47 percent on the East Coast.12ElSherief, M., Saha, K., Gupta, P. et al. Impacts of school shooter drills on the psychological well-being of American K-12 school communities: a social media study. Humanit Soc Sci Commun 8, 315 (2021). https://doi.org/10.1057/s41599-021-00993-6
Social Media Datasets
Researchers then assemble a diverse set of public social media data posts belonging to the school communities in the dataset. This linguistic dataset spans a 90-day period (approximately 3 months) before and a 90-day period (approximately 3 months) after each drill to investigate and computationally quantify the psychological effects of these drills over a longer time period. Researchers selected three distinct social media data collection procedures to assess the psychological outcomes of the drills on populations belonging to these schools at large.
Official Twitter School Accounts Dataset
For this dataset, researchers adopt the Homophily principle that structures network ties of every nature including work, support, and other relationships.13Miller McPherson, Lynn Smith-Lovin, and James M Cook, “Birds of a Feather: Homophily in Social Networks,” Annual Review of Sociology 27, no. 1 (2001): 415–44, https://doi.org/10.1146/annurev.soc.27.1.415. Based on cursory observations, usually, parents, teachers, students, and other members of a school community follow the Twitter account of a school. This can be due to multiple reasons, such as to stay updated with the content posted by the school or engage in discussions with other members of the community. School followers are also more likely to share posts that might be relevant to the school, the events related to the school, or the neighborhood.
Researchers manually identify and visit the official websites of the schools in the sample and pull information related to the official Twitter account of the school. Researchers then use the Twitter API to collect a set of followers for these official Twitter school accounts. After that, researchers obtain the public Twitter posts of these followers spanning the 90 days before and after their corresponding drill date. Researchers associate a post with a school if it is posted by one of the followers of the given school. Researchers were able to retrieve approximately 266K followers for the schools that had a Twitter account and a total of approximately 18.2M posts for those followers.
District-Based Twitter Dataset
For this dataset, researchers leverage the fact that, in the United States, public schools belong to specific school districts. So it is highly likely that the communities at stake in the study (those likely to be directly impacted by the drills) reside within the bounds of the associated school districts for a given school. Researchers utilize this observation and manually curate the corresponding school districts for each school in the dataset using a combination of lookup through Google listings, the official school websites, and school information databases such as schooldigger.com. Researchers then use the Google Geocoding API14Google Geocoding API, https://developers.google.com/maps/documentation/geocoding/start to obtain bounding boxes for these official school districts. Through these bounding boxes’ coordinates and the geo-tagging information provided in a Twitter post, researchers were able to filter Twitter posts based on their coordinates being inside one of the school district bounding boxes. This resulted in the curation of a dataset consisting of 9.5M posts. Combining posts from this dataset and the “Official Twitter School Accounts Dataset” (see above), researchers were able to obtain coverage of 87 percent of the schools in the dataset, that is, 114 schools out of the total 138.
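The bounding-box filter described above can be illustrated with a minimal sketch. The `BoundingBox` type, the `lat`/`lon` field names, and the sample coordinates are hypothetical and not taken from the study; the sketch also assumes boxes that do not cross the antimeridian.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    """Axis-aligned box in decimal degrees (hypothetical representation)."""
    south: float
    west: float
    north: float
    east: float


def in_box(lat: float, lon: float, box: BoundingBox) -> bool:
    """Return True if the point lies inside the box (edges included)."""
    return box.south <= lat <= box.north and box.west <= lon <= box.east


def posts_in_districts(posts, district_boxes):
    """Pair each geotagged post with the first district box that contains it.

    posts: iterable of dicts with 'lat' and 'lon' keys (assumed field names).
    district_boxes: dict mapping a district name to its BoundingBox.
    Posts outside every box are dropped.
    """
    matched = []
    for post in posts:
        for district, box in district_boxes.items():
            if in_box(post["lat"], post["lon"], box):
                matched.append((district, post))
                break
    return matched


# Illustrative data only; these are not districts from the study.
boxes = {"District A": BoundingBox(33.6, -84.6, 33.9, -84.2)}
sample_posts = [
    {"id": 1, "lat": 33.75, "lon": -84.39},  # inside District A
    {"id": 2, "lat": 40.71, "lon": -74.00},  # outside every box
]
kept = posts_in_districts(sample_posts, boxes)
```

A bounding box is only a coarse approximation of a district boundary, so a filter like this can admit posts from just outside the actual district.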
Researchers queried for online presence for the schools in the dataset on Reddit. However, researchers found that the majority of the schools do not have independent standalone subreddits. Researchers also observed that school-related topics exist in subreddits representing a neighborhood, a town, or a city. Researchers then identify the corresponding subreddits for each school representing the neighborhood or the city in which the school resides.
To accurately detect if a post is discussing a topic related to a school community in the dataset, researchers curate a set of names/acronyms that correspond to each school from their social media presence. Researchers extract these informal names and acronyms from digital usernames and handles from different social media school accounts on platforms such as Instagram, Facebook, Twitter, and YouTube (usually displayed on official school websites). For example, a school system officially named X Public Schools could be referred to as XPS or XHigh (for high school) on social media for brevity. Researchers then leverage the set of official names of the schools in the dataset and the augmented acronyms and search the posts in a given neighborhood subreddit for expressions that are lexically close to the names and acronyms curated. Researchers implement this by computing the Levenshtein distance (a measure of edit distance).15V. I. Levenshtein, “Binary Codes Capable of Correcting Deletions, Insertions, and Reversals,” Doklady Physics 10, no. 8 (1966): 707–10. Researchers compare unigrams and bigrams in a given post to the school name using the Levenshtein distance. If researchers find a lexical unit in a post that, when compared to a school name, does not exceed a certain distance threshold, they classify that post as relevant to the school dataset. Upon applying this process, researchers ended up identifying 2,517 posts from 23 subreddits corresponding to 23 school communities.
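As a rough illustration of the matching step, the sketch below computes the Levenshtein distance with the standard dynamic-programming recurrence and compares each unigram and bigram of a post against a list of school names. The source does not state the distance threshold, so `max_distance=1` is an assumption, as are the lowercasing, the whitespace tokenization, and the function names.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def ngrams(tokens, n):
    """Contiguous n-token windows, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def mentions_school(post: str, names, max_distance: int = 1) -> bool:
    """True if any unigram or bigram is within max_distance edits of a name.

    max_distance is an assumed value; the source only says "a certain
    distance threshold".
    """
    tokens = post.lower().split()
    candidates = ngrams(tokens, 1) + ngrams(tokens, 2)
    return any(levenshtein(candidate, name.lower()) <= max_distance
               for candidate in candidates
               for name in names)


school_names = ["XPS", "XHigh", "X Public"]  # hypothetical names from the example above
```

With very short acronyms a fixed threshold can match unrelated words, so a threshold scaled to the name length may be worth considering.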
In all of the analyses that follow, unless otherwise specified, researchers combine the posts curated from all three previously discussed datasets.
Methods and Analysis
Mental Health Symptomatic Expressions (Stress, Anxiety, and Depression)
Crises are known to psychologically affect individuals in several ways,16Mark et al., “Blogs as a Collective War Diary.” and prior work has shown that social media expressions may provide insights into attributes such as affect17De Choudhury, Monroy-Hernandez, and Mark, “‘Narco’ Emotions.” and stress18Saha and De Choudhury, “Modeling Stress with Social Media.” following crisis events. The work examines the effects in terms of psychological expressions in social media language. In particular, researchers study symptomatic expressions of stress, anxiety, and depression using machine learning classifiers that were built in prior work. These are essentially n-gram based binary SVM models trained using transfer learning methodologies. The positive class of the training datasets stems from appropriate Reddit communities (r/depression for depression, r/anxiety for anxiety, r/stress for stress), and the negative class of training datasets comes from non-mental-health-related content on Reddit. These classifiers show a high-performance accuracy of approximately 0.90 on average and transfer well on Twitter with an 87 percent agreement between machine-predicted labels and expert appraisal.19Koustuv Saha et al., “A Social Media Study on the Effects of Psychiatric Medication Use,” Proceedings of the International AAAI Conference on Web and Social Media 13 (July 6, 2019): 440–51.
Researchers use the above classifiers to obtain the pervasiveness of these expressions in the dataset. Researchers next obtain the relative differences in the proportion of high stress or anxiety and depression posts three months before and three months after the active shooter drill event. This allows researchers to estimate the trend of psychological expressions in terms of increase or decrease following the shooting drill. In fact, given the high likelihood of comorbidity of stress and anxiety in these contexts, researchers conduct analyses that combine these expressions for ease of exposition; that is, any post that expresses either or both of stress and anxiety (according to the classifiers) is labeled as high mental health symptomatic expression.
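The combination rule and the before/after comparison can be sketched as follows. The boolean `stress` and `anxiety` fields stand in for classifier outputs and are hypothetical, and expressing the difference relative to the before-period proportion is an assumption about how the relative difference is defined.

```python
def is_high_symptomatic(post) -> bool:
    """A post counts if the stress classifier, the anxiety classifier, or both fire."""
    return bool(post["stress"] or post["anxiety"])


def proportion_high(posts) -> float:
    """Share of posts labeled as high symptomatic expression."""
    return sum(is_high_symptomatic(p) for p in posts) / len(posts)


def relative_difference(before_posts, after_posts) -> float:
    """Change in the proportion, relative to the before period (assumed definition)."""
    p_before = proportion_high(before_posts)
    p_after = proportion_high(after_posts)
    return (p_after - p_before) / p_before


# Toy classifier outputs, for illustration only.
before = [{"stress": True, "anxiety": False},
          {"stress": False, "anxiety": False},
          {"stress": False, "anxiety": False},
          {"stress": False, "anxiety": False}]
after = [{"stress": True, "anxiety": True},
         {"stress": False, "anxiety": True},
         {"stress": False, "anxiety": False},
         {"stress": False, "anxiety": False}]
change = relative_difference(before, after)
```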
Psycholinguistics literature reveals that people’s affective, cognitive, perceptual, and other psychological processes can be reflected in their language and changes in language over a period of time or around specific events (active shooter drills in this case).20James W. Pennebaker, Matthias R. Mehl, and Kate G. Niederhoffer, “Psychological Aspects of Natural Language Use: Our Words, Our Selves,” Annual Review of Psychology 54, no. 1 (2003): 547–77, https://doi.org/10.1146/annurev.psych.54.101601.145041. As researchers seek to understand psycholinguistic expressions, they employ the well-validated psycholinguistic lexicon, Linguistic Inquiry and Word Count (LIWC),21Yla R. Tausczik and James W. Pennebaker, “The Psychological Meaning of Words: LIWC and Computerized Text Analysis Methods,” Journal of Language and Social Psychology 29, no. 1 (2010): 24–54, https://doi.org/10.1177/0261927X09351676. which is known to work well with short text and social media data, as revealed in a large body of social computing and crisis informatics literature.22Yu-Ru Lin and Drew Margolin, “The Ripple of Fear, Sympathy and Solidarity during the Boston Bombings,” EPJ Data Science 3, no. 1 (2014): 31, https://doi.org/10.1140/epjds/s13688-014-0031-z; Mark et al., “Blogs as a Collective War Diary”; Saha and De Choudhury, “Modeling Stress with Social Media.” In this particular case, LIWC allows us to characterize a Twitter post into 50 psycholinguistic attributes spanning across (1) affect (anger, anxiety, negative and positive affect, sadness, swear); (2) cognition (causation, inhibition, cognitive mechanics, discrepancies, negation, tentativeness); (3) perception (feel, hear, insight, see); (4) interpersonal focus (first person singular, second person plural, third person plural, indefinite pronoun); (5) temporal references (future tense, past tense, present tense); (6) lexical density and awareness (adverbs, verbs, article, exclusive, inclusive, preposition, quantifier); (7) biological concerns (bio, body, death, health, sexual); (8) personal concerns (achievement, home, money, religion); and (9) social concerns (family, friends, humans, social).
Normalization and Temporal Analysis
Quantifying Change and Statistical Analysis
In this work, researchers investigate two types of change in symptomatic outcomes: immediate change and a longer-term change. The immediate change (IC) in outcomes is calculated by measuring the difference in daily symptomatic outcomes immediately after and before the drill. Researchers first normalize the time series spanning the whole time duration under examination (180 days), using the statistical standard score (z-score). This allows us to standardize the raw values of the distribution and understand the values relative to the mean of said distribution. Then, researchers calculate the immediate change by performing an interrupted time series analysis for which they fit a linear function (y1 = a1x + b1) for the 90 days before the drill represented by offsets [-90,0) and a linear function (y2 = a2x + b2) for the 90 days after the drill, represented by offsets (0,90], where the zero-th day denotes the drill date corresponding to each school. Hence, the immediate change in outcomes would be calculated as the interrupted change at time zero (b2–b1).
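A minimal sketch of the immediate-change computation, under stated assumptions: the series is z-scored with the population standard deviation (the source does not say which variant was used), each segment is fit by ordinary least squares, and the drill day itself (offset 0) is left out of both fits, matching the intervals [-90,0) and (0,90]. Function names are illustrative.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standard score over the whole series (population standard deviation assumed)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def immediate_change(offsets, values):
    """IC = b2 - b1: the gap between the two fitted lines at offset zero."""
    z = zscore(values)
    before = [(x, y) for x, y in zip(offsets, z) if x < 0]
    after = [(x, y) for x, y in zip(offsets, z) if x > 0]
    _, b1 = fit_line(*zip(*before))
    _, b2 = fit_line(*zip(*after))
    return b2 - b1


# Toy series: a level shift at the drill date, and a smooth trend with no shift.
offsets = list(range(-90, 91))
step_series = [0.0 if x < 0 else 1.0 for x in offsets]
trend_series = [0.1 * x for x in offsets]
ic_step = immediate_change(offsets, step_series)    # positive jump
ic_trend = immediate_change(offsets, trend_series)  # approximately zero
```

Because the intercepts are evaluated at offset zero, a steady trend that continues through the drill date yields an IC near zero, while a level shift shows up as a nonzero difference.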
Conversely, the longer-term change (LC) is scrutinized and measured as the difference between the raw hourly averages across the 90-day period before and after the drill (Before and After Datasets respectively), yielding an average measure over 2,160 observations for each time period. Researchers then proceed to compute the relative change of the outcome of the hourly average for the after period relative to the hourly average for the before period.
To identify if any changes are statistically significant, researchers perform the Mann-Whitney U test, a non-parametric statistical test across two samples of 2,160 observations corresponding to the computed hourly averages.
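The longer-term change and the accompanying significance test can be sketched as below. This is an illustrative implementation, not the study's code: the relative change is taken with respect to the before-period mean, and the Mann-Whitney U test uses the large-sample normal approximation without tie or continuity corrections, which is an assumption that is only reasonable for samples as large as the 2,160 hourly averages described above.

```python
from math import sqrt
from statistics import NormalDist, mean


def long_term_change(before, after):
    """Relative change of the after-period mean with respect to the before-period mean."""
    return (mean(after) - mean(before)) / mean(before)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U via the normal approximation; returns (U, p)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction (assumption)
    p = 2 * NormalDist().cdf((u - mu) / sigma)
    return u, min(p, 1.0)


# Toy samples for illustration; the study compares 2,160 hourly averages per period.
u_stat, p_value = mann_whitney_u([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
```

For small samples or heavily tied data, an exact or tie-corrected test from a statistics library would be the safer choice.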
Researchers measure the immediate changes in psychological outcomes and psycholinguistic traits for a variety of settings as follows:
- Aggregate: taking into account all Twitter and Reddit posts belonging to all school communities in the school dataset.
- Comparative analysis for schools broken down by school grades: elementary, middle, and high schools. These levels were provided by the parents who volunteered to provide the school drill information to Everytown.
- Comparative analysis for professions broken down into three stakeholder strata: teachers and school administrators, parents, and students. The profession associated with each Twitter profile was extracted by analyzing the Twitter bios in the dataset. Researchers then associate a Twitter profile with one of the three categories: teachers and school administrators, parents, and students if the associated profile contains one of the keywords outlined in Table 1. Researchers exclude Reddit data from this analysis since bios were found to be a rarity, given the affordances of the Reddit platform.
- Comparative analysis depending on the type of drill: lockdown vs. active shooter drills. This distinction is made based on the question “Did the drill involve simulations that mimic an active shooter incident?” which was asked by Everytown to the parents who volunteered school drill information. Parents were given three options as replies: “Yes,” “No,” and “Unsure.” Researchers only include the “Yes” and “No” responses in the analysis.
|User Type|Keyword Filter|
|Teachers and School Administrators|teacher(s), educator(s), teaching, principal, evaluator(s), educating, school administrator, coach, school staff|
|Parents|parent(s), mother, father, mom, dad, mommy,|
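The bio-based stakeholder assignment can be sketched as a simple keyword lookup. Only the keywords visible in Table 1 above are used, so the student stratum is omitted here; whole-word, case-insensitive matching and the first-match-wins ordering are assumptions, not details from the study.

```python
import re

# Keywords as they appear in Table 1; "(s)" variants are expanded by hand.
KEYWORDS = {
    "teachers_and_school_administrators": [
        "teacher", "teachers", "educator", "educators", "teaching",
        "principal", "evaluator", "evaluators", "educating",
        "school administrator", "coach", "school staff",
    ],
    "parents": ["parent", "parents", "mother", "father", "mom", "dad", "mommy"],
}


def classify_bio(bio: str):
    """Return the first stratum whose keyword appears as a whole word, else None."""
    text = bio.lower()
    for stratum, keywords in KEYWORDS.items():
        for keyword in keywords:
            if re.search(r"\b" + re.escape(keyword) + r"\b", text):
                return stratum
    return None


labels = [classify_bio(b) for b in (
    "HS math teacher and soccer coach",
    "Proud mom of two",
    "Fan of dadaism and jazz",
)]
```

Whole-word matching keeps "dad" from firing on words such as "dadaism"; a bio that mentions both "teacher" and "mom" is assigned to the first stratum checked.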
To determine if the change researchers see spanning from 90 days before the drill until 90 days after the drill is really a causal effect of the lockdown/active shooter drill, they employ two methods. These approaches are motivated from prior work, and they essentially minimize confounding effects of social media expressions such as changes due to seasonal, local, and other coincidental factors.23Saha and De Choudhury, “Modeling Stress with Social Media.”
First, when researchers examine the changes for the psychological outcomes, they aggregate all the posts belonging to all the communities in the school dataset. Since the drills are spread across two calendar years in this case and since different schools have drilled at different times of the school year, researchers can isolate the effect of the lockdown/active shooter drill by minimizing seasonal and local effects or any other stressors the students might be experiencing.
Second, for each symptomatic outcome, adopting the permutation test framework established in the literature, researchers record the LC (actual LC) and generate 1,000 different permutations of the 181 values of the outcome (90 days before, the day of the drill, and 90 days after).24Aris Anagnostopoulos, Ravi Kumar, and Mohammad Mahdian, “Influence and Correlation in Social Networks,” in Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’08 (New York, NY: Association for Computing Machinery, 2008), 7–15, https://doi.org/10.1145/1401890.1401897. For each of these permutations, researchers measure the LC (synthetic LC) and record if it was larger than the actual LC. They then compute the probability of the number of records exceeding the actual LC, taking into account all permutations. If this probability is found to be zero, then researchers can deduce that the actual LC is indeed attributed to the effects of the lockdown/active shooter drill. This essentially rejects the null hypothesis that the observed changes are random or caused by chance, providing statistical significance to the analyses.
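The permutation procedure can be sketched as follows. The sketch assumes the 181 daily values are ordered by offset, that the LC is the relative change between the first 90 and the last 90 values (so the drill day is excluded), and that a fixed random seed is acceptable for reproducibility; none of these details is spelled out in the source.

```python
import random
from statistics import mean


def long_term_change(series, n_days=90):
    """Relative change between the last and first n_days values of the series."""
    before, after = series[:n_days], series[-n_days:]
    return (mean(after) - mean(before)) / mean(before)


def permutation_probability(series, n_permutations=1000, seed=0):
    """Fraction of shuffled series whose synthetic LC exceeds the actual LC."""
    rng = random.Random(seed)
    actual = long_term_change(series)
    shuffled = list(series)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if long_term_change(shuffled) > actual:
            exceed += 1
    return exceed / n_permutations


# Toy series: a clear increase after the drill day, and a clear decrease.
increase = [1.0] * 91 + [2.0] * 90
decrease = [2.0] * 91 + [1.0] * 90
p_increase = permutation_probability(increase)
p_decrease = permutation_probability(decrease)
```

As written, the comparison is one-sided: it only counts synthetic changes larger than the actual one, so a large decrease would need the comparison reversed or absolute values compared.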
To examine the lexical differences in the psychological expressions around lockdown and active shooting drills, researchers employ an unsupervised language modeling technique, the Sparse Additive Generative Model (SAGE).25Jacob Eisenstein, Amr Ahmed, and Eric P. Xing, “Sparse Additive Generative Models of Text,” in Proceedings of the 28th International Conference on International Conference on Machine Learning, ICML’11 (Madison, WI: Omnipress, 2011), 1041–1048. Given any two datasets, SAGE selects salient keywords by comparing the parameters of two logistically parameterized-multinomial models using a self-tuned regularization parameter to control the tradeoff between frequent and rare keywords.26Eisenstein, Ahmed, and Xing, “Sparse Additive Generative Models of Text.” Researchers perform SAGE analysis to identify distinguishing n-grams (n=1,2) in the Before and After datasets, where the SAGE magnitude of an n-gram signals the degree of its uniqueness. They build three separate models of SAGE, each corresponding to high stress, high anxiety, and high depression posts in the Before and After Datasets, that is, the 90-day periods before and after the reported active shooter drills. For each model, researchers examine the language that is unique to the Before data but not the After data and vice versa. Additionally, they investigate language that is common to both the 90 days before and after time periods.
Georgia Tech’s Social Dynamics and WellBeing (SocWeB) Lab, founded and directed by Dr. Munmun De Choudhury, led and conducted the analysis presented in this study that aims to understand the impacts of active shooter and lockdown drills on school communities.
Dr. De Choudhury is a computer scientist who works on problems at the intersection of computer science and social science. Currently, Dr. De Choudhury is a tenured associate professor at the School of Interactive Computing at Georgia Tech. Given the diversity of her intellectual interests, she is also affiliated with the Graphics and Visualization Center (GVU), Institute for People and Technology (IPaT), and the Machine Learning Center (ML@GT).
Dr. De Choudhury’s research program of over a decade has built a variety of computational and machine learning approaches harnessing large-scale online data, alongside advances in machine learning and grounding in human-centered approaches, to answer fundamental questions relating to our social lives and human behaviors. Since 2013, she has led an innovative research agenda that has situated social media as a mechanism both to understand our mental health and to improve access to mental health care. In particular, Dr. De Choudhury’s research has pioneered the computational use of social media data for mental health, in an ethical and responsible manner. This was first shown in one of her influential papers from 2013, where, along with her collaborators, she was able to identify risk markers of major depression in individuals up to a year before their clinical diagnosis, by developing predictive models based on their Twitter data. This topic now constitutes a full-fledged research direction pursued by scholars and practitioners worldwide, and has been featured in the popular press, including the New York Times, the BBC, and NPR.
Dr. De Choudhury has been a faculty member at Georgia Tech since 2014, and her lab has continued this line of research over the years, expanding both the types of problems tackled around mental health and the nature of methodological contributions that this research direction can provide. A lot of the research in Dr. De Choudhury’s lab is therefore highly participatory involving multiple stakeholders, as well as interdisciplinary, where the power of social computing, machine learning, and natural language analysis is combined with insights and theories from fields such as psychology, sociology, psychiatry, medicine, and public health. As a consequence of this multidisciplinary approach, the developed computational techniques are evaluated by Dr. De Choudhury’s team using diverse, quantitative and qualitative human-centered methods, deriving conclusions grounded in theories and our existing understanding of human behavior.
The SocWeB Lab’s research has been graciously supported by funds from the National Institutes of Health (NIH), National Science Foundation (NSF), Intelligence Advanced Research Projects Activity (IARPA), Centers for Disease Control and Prevention (CDC), Everytown Gun Safety Support Fund, the United Nations Foundation, Templeton Foundation, Microsoft, Facebook, Mozilla, Yahoo!, and Samsung.
Everytown Research & Policy is a program of Everytown for Gun Safety Support Fund, an independent, non-partisan organization dedicated to understanding and reducing gun violence. Everytown Research & Policy works to do so by conducting methodologically rigorous research, supporting evidence-based policies, and communicating this knowledge to the American public.