Lab 4: Conceptual Overview
Part 1: Research Overview
What is Topic Modeling?
What research questions can topic modeling answer?
What are limitations & ethical considerations?
Part 2: R Code-Along
Document Term Matrix
Latent Dirichlet Allocation (LDA)
Finding K
Figure source: Silge & Robinson, 2017
Applying Topic Modeling in STEM Education Research
“Topic modeling is a field of natural language processing that aims to extract themes by text mining a set of documents.” (Blei, 2012; Vijayan, 2021)
Figure source: Naskar, n.d.
Literature review (e.g., Chen et al., 2020)
- In what research topics was the Computers & Education community interested?
- How did these research topics evolve over time?
Assessment (e.g., Ming & Ming, 2015)
- Do the concepts students discuss, as inferred by probabilistic latent semantic analysis (pLSA), predict their course outcomes?
- How does the accuracy of these predictions change over time as more student work is analyzed?
Course/project evaluation (e.g., Akoglu et al., 2019)
- What are the similarities and differences in how professional learning team (PLT) members and non-PLT online participants engage with and meet course goals in a MOOC-Ed designed for educators?
Take a look at the dataset located here and consider the following:
- In what format is this dataset stored?
- What are some things you notice about this dataset?
- What questions do you have about this dataset?
- What similar datasets do you have access to?
- What research questions do you want to address with your dataset?
Document Term Matrix, LDA, and Finding K
[Text Mining_Topic Modeling]
Figure source: SPE3DLab, n.d.
Figure source: Ma, 2019
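The three steps named on this slide can be sketched in R roughly as follows. This is a minimal illustration, not the lab's actual code-along: it assumes the dplyr, tidytext, and topicmodels packages are installed, and the toy `posts` data, the column names, the choice of k = 2, and the use of perplexity to compare values of K are all assumptions made for the example.

```r
library(dplyr)
library(tidytext)
library(topicmodels)

# Hypothetical toy corpus; replace with your own documents.
posts <- tibble(
  doc_id = c("d1", "d2", "d3", "d4"),
  text = c(
    "Students explored forces and motion with simulations",
    "Teachers shared simulations for teaching forces",
    "The course forum discussed assessment and feedback",
    "Feedback on assessment helped teachers revise lessons"
  )
)

# 1. Document-term matrix: one row per document, one column per term,
#    each cell holding how often the term occurs in the document.
dtm <- posts %>%
  unnest_tokens(word, text) %>%           # one lowercase word per row
  anti_join(stop_words, by = "word") %>%  # drop common English stop words
  count(doc_id, word) %>%                 # term counts per document
  cast_dtm(doc_id, word, n)

# 2. LDA: fit a model with an assumed number of topics (k = 2 here).
lda_fit <- LDA(dtm, k = 2, control = list(seed = 1234))

# Per-topic word probabilities (beta); keep the highest-probability
# terms for each topic (ties can return more than three rows).
top_terms <- tidy(lda_fit, matrix = "beta") %>%
  group_by(topic) %>%
  slice_max(beta, n = 3) %>%
  ungroup()

# 3. Finding K: refit with several candidate values and compare
#    perplexity on the fitted data (lower is better, all else equal).
candidate_k <- 2:3
perplexities <- sapply(candidate_k, function(k) {
  perplexity(LDA(dtm, k = k, control = list(seed = 1234)))
})
```

With a corpus this small the fitted topics are not meaningful; the point is only the shape of the workflow. On real data, perplexity is one of several possible criteria for choosing K, and the interpretability of the resulting topics usually matters as much as the metric.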
Dr. Shiyan Jiang