Teaching Data Science to (Complete) Newcomers

Joshua Rosenberg

Purpose

  • To discuss the challenges of teaching learning analytics to newcomers
  • To summarize some research on this topic
  • To share some strategies for teaching newcomers (or planning courses for newcomers)

Quick Poll

Some research

Research Questions

RQ #1: How do educational researchers in a doctoral program situate themselves with respect to data science?

RQ #2: How did the design of the workshop help educational researchers overcome barriers to using data science?

Study Overview

  • 44 doctoral students in a general educational research doctoral program
  • 2 one-day workshops using R and RStudio (Posit) Cloud
  • Pre/post surveys + qualitative analysis

Students want to code!

  • Around 75% of students said they were learning R because they wanted to use R/coding in their work or for their doctoral program

Students have some anxieties

  • 84% cited technical factors as primary concern

“I really do not understand the statistical stuff. I might as well be reading Russian.”

Although having two degrees in mathematics-related fields, my experience with true data analysis is limited. As a classroom teacher, I analyze student data regularly, but rarely (really, never) with advanced statistical analysis

I am fearful I will not have enough time to fully understand the programming.

Confidence Growth

  • Students grew in confidence, even through a brief (one day) workshop

  • What Worked: Pedagogy

66% cited pedagogical factors as workshop strength

Key strategies: - Patience and flexibility
- Coding together - Normalizing errors
- Plain language explanations

Student Feedback

It was helpful to practice what we were doing together so we could ask questions of each other. I also really appreciated when you explained what the code was saying be reading it out loud.

Although, I didn’t understand many things, [The instructor] stayed positive and never made me feel less intelligent.

Guideline 1: Tailot to students’ backgrounds

  • Don’t teach a course for other students – get to know yours!
  • Explicitly and deliberately build confidence
  • Do this through pedagogical strategies and a supportive attitude

Guideline 2: Use Familiar Data

Start with familiar contexts:

  • Student assessment data
  • Survey responses 
  • Gradebook

Guideline 3: Emphasize relevance

Explicitly link techniques to:

  • Educational research questions
  • Students’ work contexts
  • Real applications

Guideline 4: Create Supportive Environments

  • Normalize debugging process
  • Model problem-solving
  • Maintain patience with difficulties
  • Make errors learning opportunities

Key Takeaways

  • Students bring valuable experiences
  • Technical anxiety is addressable, but it requires explicit attention
  • Pedagogy can matter greatly!
  • Confidence can grow substantially, even over a short period

Discussion

  • At some point in the future, what do you hope to teach regarding learning analytics?

  • What challenges do you anticipate – learning-related, related to your institution, etc.?

Micro-Support

  • You’ll next see 5 different problems novices may face with R/R code
  • Your task in groups is to briefly discuss what you could say or do (including nothing!), and prepare one strategy to share

Here goes!

Problem 1

library(tidyverse)
student_data <- read_csv("student-data.csv")
Error: 'student-data.csv' does not exist in current working directory ('/Users/jrosenb8/teaching-ds-presentation').
studnet_data
Error: object 'studnet_data' not found

Problem 2

glimpse(student_data)
Rows: 20
Columns: 5
$ id     <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, …
$ name   <chr> "Student_1", "Student_2", "Student_3", "Student_4", "Student_5"…
$ gender <chr> "Female", "Male", "Male", "Male", "Female", "Female", "Male", "…
$ score  <dbl> 73.8, 92.6, 61.6, 88.6, 73.5, 96.7, 91.1, 68.0, 94.3, 70.4, 76.…
$ grade  <fct> C, A, D, B, C, A, A, D, A, C, C, D, C, B, B, D, D, D, A, A
filter(student_data, scores > 70)
Error in `filter()`:
ℹ In argument: `scores > 70`.
Caused by error:
! object 'scores' not found

Problem 3

“I’m totally lost. I can’t get my code to run, my data is ruined, and our pet’s heads are falling off!”

Problem 4

str_detect() is a function from the stringr package

str_detect(student_data, gender, "Female")
Error: object 'gender' not found
library(stringr)

Discussion

  • What resonates with your teaching experience?
  • Which strategies will you try first?
  • What additional challenges do you face?

Reference

Willet, K. B. S., & Rosenberg, J. M. (2023). The Design and Effects of Educational Data Science Workshops for Early Career Researchers. Journal of Formative Design in Learning, 7(2). https://doi.org/10.1007/s41686-023-00083-7