Date of Award
Spring 2025
Document Type
Thesis
Degree Name
Master of Science (MS)
Department
Computer Science
Committee Chairperson
Richard Burns, Ph.D.
Committee Member
David Cooper, Ph.D.
Committee Member
Md Amiruzzaman, Ph.D.
Abstract
The rapid evolution of language, driven by technological advancements, has created notable cultural gaps between generations, particularly in how they communicate. This gap is most apparent in the growing use of slang and emojis among younger generations. This study explores whether Reddit comments can be classified by generation based on the usage of slang and emojis, how frequently each generation uses them, and how such features might influence the meaning of traditional language. Using Reddit’s API, we collected comments from four generational subreddits and applied three machine learning models (Naïve Bayes, Neural Networks, and Decision Trees) to identify the most effective classification method. We compared standard models with improved models that focus on selected features (slang and emojis), using both imbalanced and balanced datasets. Through this research, we seek to determine whether machine learning models can effectively classify social media comments by generation based on certain linguistic features. Our findings show that the Neural Network model outperforms the other two models, making it a promising choice for future work on classifying comments by generation. To our knowledge, this is the first work to compare machine learning models for real-world generational classification of text based on specific features (slang and emojis), offering insight for applications in public social media platforms, video games, and general industry communication. It also contributes to linguistics by helping to reveal patterns in, and improve understanding of, communication differences across generations.
Recommended Citation
Dracup, James T., "Intergenerational Classification of Reddit Comments Based on Slang and Emoji Usage" (2025). West Chester University Master’s Theses. 350.
https://digitalcommons.wcupa.edu/all_theses/350