Individual Submission Summary

Poster #124 - Application of Machine Learning: Analyzing Family Emotional Climate with Sentiment Analysis using R packages

Fri, March 22, 12:45 to 2:00pm, Baltimore Convention Center, Floor: Level 1, Exhibit Hall B

Integrative Statement

Children’s emotional development is associated with their family emotional climate, which is constituted by the amount of positive and negative emotion expressed in the family (Morris, Silk, Steinberg, Myers & Robinson, 2007). Children who frequently experience intense negative emotion in a family context are more likely to show emotion dysregulation and to be less emotionally secure (Cummings & Davies, 1996). Traditionally, family emotional climate has been studied through parental self-reports of emotional expressivity or through laboratory-based tasks that observe how parents talk about or express emotion; both approaches may lack ecological validity. More recent research uses naturalistic observation to study parental behaviors, yet manual coding of emotion from these observations is time-consuming. The current study demonstrates how to use sentiment analysis (SA), a popular application of machine learning, to extract emotional components from naturalistic observations of family conversation.
SA is designed to determine the subjective feelings of authors or speakers by classifying the emotional valence of their words as positive or negative. Despite the sophisticated mathematics behind machine learning, SA is becoming more accessible thanks to free, user-friendly packages available online. With these packages, researchers interested in this novel way of analyzing text data can take a first step easily.
In this study, two widely used R packages for SA, namely Syuzhet (Jockers, 2017) and SentimentAnalysis (Feuerriegel & Pröllochs, 2017), are introduced. Both packages include models trained with multiple sentiment dictionaries developed in past research and give access to robust, well-developed sentiment extraction tools. This study aims to compare the performance of the two packages. The data are drawn from a larger longitudinal study that examined familial factors in children’s emotional development. Daily conversation between mothers, fathers and children in a naturalistic home environment on a typical day was captured with a recording device when the children were 3-3.5 years old (N=53). The family conversations were transcribed, and the transcribers identified the speaker of each utterance. Transcriber notes that carry sentiment information (e.g. giggles, whining) were also included in the analysis. To evaluate performance, the outputs of the packages were compared with manual classification codes (-1 = negative, 0 = neutral, +1 = positive); approximately 30% of the transcripts were coded to assess inter-coder reliability (average kappa = .73).
The accuracy rate (i.e. the performance) is calculated as the number of correct predictions divided by the total number of predictions. Preliminary results (n=8) showed that the average accuracy rate across transcripts was 68.6% for Syuzhet, which is an acceptable accuracy rate and similar to that reported in Hladka and Holub (2015), although for different content, and 51.7% for SentimentAnalysis. Syuzhet thus outperformed SentimentAnalysis in analyzing naturalistic family conversation. Furthermore, the Syuzhet package provides plots that visualize dynamic changes in sentiment over time (see Figure 1 for a sample plot). Researchers could consider using the Syuzhet package in R to extract sentiment from naturalistic conversation in future research.
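
The accuracy rate described above can be illustrated with a minimal base R sketch. The scores and manual codes below are made-up values for illustration only, not study data. The abstract does not state how continuous package scores were mapped onto the -1/0/+1 coding scheme, so taking the sign of each score is an assumption made here.

```r
# Hypothetical continuous sentiment scores, one per utterance.
# In practice these might come from a call such as syuzhet::get_sentiment().
scores <- c(0.8, -0.5, 0, 1.2, -0.25, 0)

# Hypothetical manual codes for the same utterances
# (-1 = negative, 0 = neutral, +1 = positive).
manual <- c(1, -1, 0, 0, -1, 1)

# Assumption: collapse each score to -1, 0 or +1 by taking its sign.
predicted <- sign(scores)

# Accuracy rate = number of correct predictions / total number of predictions.
accuracy <- sum(predicted == manual) / length(manual)
print(accuracy)
```

In this toy example four of the six predictions match the manual codes. Repeating the calculation for each transcript and averaging the results would give an average accuracy rate across transcripts of the kind reported above.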

Authors