Text Mining Capstone Badge

Author

LASER Institute

Published

July 20, 2024

The culminating activity for the TM Learning Labs is designed to give you space for independent analysis of a self-identified data source. To earn your TM Capstone Badge, you are required to demonstrate your ability to formulate a basic research question appropriate to a text mining context, wrangle and analyze text data, and communicate key findings. Your primary goal for this analysis is to create a simple data product that illustrates key findings by applying the knowledge and skills acquired from the essential readings and case studies.

  1. Identify a data source. For your TM Capstone Badge, you are required to identify your own text data source related to an area of professional interest. This may be data that you collected before the LASER Institute, or data that you are interested in working with for a future study, such as tweets.

  2. Formulate a question. I recommend keeping this simple and limiting yourself to no more than one or two questions. Your question(s) should be appropriate to your data set and ideally be answerable by applying concepts and skills from our essential readings and case studies. For example, you may be interested in examining researchers’ reactions to online conferences by conducting sentiment analysis of posts from X (formerly Twitter).

  3. Analyze the data. Create a new R script in the R project you cloned from the text-mining GitHub repository to use as you work through data wrangling and analysis. Your R script will likely contain code that doesn’t make it into your final data product, since you will experiment with different approaches and figure out which code works and which does not.

  4. Create a data product. When you feel you’ve wrangled and analyzed the data to your satisfaction, create a Quarto file that includes a polished sociogram, chart, and/or data table along with a brief narrative highlighting your research question, data source, key findings, and potential implications. Your Quarto file should also include a title and all code necessary to read, wrangle, and explore your data.

  5. Share your findings. Render your data product to your desired output format, such as HTML or PDF.
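
As a starting point for steps 2 and 3, the sentiment analysis example above might be sketched roughly as follows. This is only an illustration under stated assumptions: the `posts` data frame and its text are made up to stand in for your own data source, and the choice of the `tidytext` package with the Bing lexicon is one possible approach, not a requirement of the badge.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for your own data: one row per document (e.g., a post)
posts <- tibble::tibble(
  id = 1:3,
  text = c(
    "The online conference was great and the sessions were wonderful",
    "Terrible audio and a boring keynote",
    "I love the flexible schedule"
  )
)

# Tokenize: one lowercased word per row, punctuation stripped
tidy_posts <- posts %>%
  unnest_tokens(word, text)

# Drop common stop words, then count the remaining terms
word_counts <- tidy_posts %>%
  anti_join(stop_words, by = "word") %>%
  count(word, sort = TRUE)

# Join to the Bing lexicon (bundled with tidytext) and tally words by sentiment
sentiment_counts <- tidy_posts %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(sentiment)

word_counts
sentiment_counts
```

Tables like `word_counts` and `sentiment_counts` are the kind of output you could then polish into a chart or data table for your data product. With your own data, you would replace `posts` with a data frame read from your source and adjust the column names accordingly.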

If you have any questions about this badge, or run into any technical issues, don’t hesitate to email your Learning Lab Lead.