SNA Module 1: Code-Along
Guiding Research & Network Packages
Revisiting early work in the field of sociometry, this study by Pittinsky and Carolan (2008) assesses the level of agreement between teacher perceptions and student reports of classroom friendships among middle school students.
The central question guiding this investigation was:
Do student reports agree with teacher perceptions when it comes to classroom friendship ties and with what consequences for commonly used social network measures?
One teacher, one middle school, four classrooms
Students were given a roster and asked to evaluate their relationships with peers
Choices included best friend, friend, know-like, know, know-dislike, strongly dislike, and do not know.
Relations are valued (degrees of friendship, not just yes or no)
Data are directed (friendship nominations were not presumed to be reciprocal).
The teacher’s perceptions and students’ reports were statistically similar, yet 11–29% of possible ties did not match.
Students reported significantly more reciprocated friendship ties than the teacher perceived.
Observed level of agreement varied across classes and generally increased over time.
Let’s start by creating a new R script and loading the {tidyverse} package, which we’ll use to import our network data files:
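A minimal sketch of that first step, assuming the {tidyverse} is already installed on your machine (the commented `install.packages()` line is only needed once if it is not):

```r
# install.packages("tidyverse")  # assumption: uncomment and run once if the package is missing

# Attach the core tidyverse packages (dplyr, ggplot2, tibble, readr, and others)
library(tidyverse)
```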
Note: Tidyverse is actually a collection of R packages that share an underlying design philosophy, grammar, and data structures commonly referred to as “tidy data principles.” LASER uses the {tidyverse} extensively.
Intro to Network Data Structures
Consistent with typical data storage, node-lists often include:
identifiers like name or ID
demographic info (gender, age)
socio-economic info (job, income)
substantive info (grades, attendance)
id | gender | achievement |
---|---|---|
1 | female | high |
2 | male | average |
3 | female | average |
4 | male | high |
5 | female | average |
6 | female | average |
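To make the structure concrete, the six rows shown above can be typed in by hand as a tibble. This is only an illustrative sketch: the object name `toy_nodes` is made up for this example and is not part of the course files.

```r
library(tibble)

# One row per actor, one column per attribute (values copied from the table above)
toy_nodes <- tribble(
  ~id, ~gender,  ~achievement,
  1,   "female", "high",
  2,   "male",   "average",
  3,   "female", "average",
  4,   "male",   "high",
  5,   "female", "average",
  6,   "female", "average"
)

# A node-list should contain each actor exactly once
anyDuplicated(toy_nodes$id) == 0

# Attribute columns can be summarized like any other data frame
table(toy_nodes$gender)
```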
Radically different than typical data storage, edge-lists include:
an ego and an alter (the sender and receiver of each tie)
tie strength or frequency
edge attributes (time, event, text)
from | to | weight |
---|---|---|
1 | 2 | 1 |
1 | 4 | 1 |
1 | 5 | 1 |
1 | 6 | 1 |
1 | 7 | 1 |
1 | 8 | 1 |
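The same rows can be sketched as a tibble, which also shows why direction matters: a tie from 1 to 2 says nothing about a tie from 2 to 1. The object names (`toy_edges`, `out_degree`, `reciprocated`) are hypothetical and used only for this illustration.

```r
library(dplyr)

# Each row is one directed tie: `from` is the ego, `to` is the alter
toy_edges <- tibble(
  from   = c(1, 1, 1, 1, 1, 1),
  to     = c(2, 4, 5, 6, 7, 8),
  weight = 1
)

# Number of ties each ego sends (out-degree)
out_degree <- toy_edges %>%
  count(from, name = "out_degree")

# A tie is reciprocated only if the reversed pair also appears in the list
reciprocated <- toy_edges %>%
  semi_join(toy_edges, by = c("from" = "to", "to" = "from"))

out_degree
nrow(reciprocated)
```

In these six rows every tie is sent by student 1 and none is returned, so the reciprocated set is empty.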
Also radically different, an adjacency matrix includes:
a column for each actor
a row for each actor
a value indicating the presence/strength of a relation
  | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
3 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
5 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
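An edge-list and an adjacency matrix carry the same information in different shapes. The base R sketch below rebuilds the matrix above from its four ties (1→4, 2→3, 3→2, 5→1); the names `toy_ties`, `n_actors`, and `adj` are assumptions made for this example.

```r
# The four directed ties that appear as 1s in the matrix above
toy_ties <- data.frame(from = c(1, 2, 3, 5), to = c(4, 3, 2, 1))
n_actors <- 7

# Start with an all-zero matrix: rows are senders, columns are receivers
adj <- matrix(0, nrow = n_actors, ncol = n_actors,
              dimnames = list(1:n_actors, 1:n_actors))

# Indexing with a two-column matrix sets one cell per (row, column) pair
adj[cbind(toy_ties$from, toy_ties$to)] <- 1

adj

# In a directed network the matrix need not be symmetric
isSymmetric(adj)
```

Row 1, column 4 holds a 1 while row 4, column 1 holds a 0, which is how a directed, unreciprocated tie shows up in matrix form.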
Take a look at one of the network datasets in the data folder under the Files Tab in RStudio and consider the following:
In what format is this dataset stored?
If edge data, is it directed or undirected? Valued?
If node data, does the file contain attribute data?
What are some things you notice about this dataset?
What questions do you have about this dataset?
Let’s start by importing two Excel files that contain data about the nodes and the edges in our student friendship network:
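One way this import might look is sketched below using `read_excel()` from the {readxl} package. The file paths and the helper `read_if_present()` are hypothetical placeholders, as are the object names `student_nodes` and `student_edges`; substitute the actual file names from your data folder.

```r
# Hypothetical paths: replace with the Excel files in your own data folder
nodes_path <- "data/student-attributes.xlsx"
edges_path <- "data/student-edgelist.xlsx"

# Helper (an assumption for this sketch): read a sheet only if the file exists,
# so the script does not stop with an error when a path is wrong
read_if_present <- function(path) {
  if (!file.exists(path)) {
    message("File not found: ", path)
    return(NULL)
  }
  readxl::read_excel(path)
}

student_nodes <- read_if_present(nodes_path)
student_edges <- read_if_present(edges_path)
```

If the files are where you expect them, calling `readxl::read_excel()` directly on each path is simpler; the guard is only there so the sketch runs anywhere.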
Now let’s take a look at the data files we just imported using the View() function or another function of choice you may have learned previously:
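A short sketch of a few inspection options follows, run on a hand-typed stand-in (`toy_nodes`, a hypothetical name) rather than the imported files. `View()` opens a spreadsheet-style viewer and only makes sense in an interactive session such as RStudio, so it is guarded here.

```r
library(tibble)

# Stand-in for an imported node file (first three rows of the node-list shown earlier)
toy_nodes <- tibble(
  id          = 1:3,
  gender      = c("female", "male", "female"),
  achievement = c("high", "average", "average")
)

glimpse(toy_nodes)   # column names, types, and the first few values
head(toy_nodes, 2)   # first rows only

# Spreadsheet-style viewer, available in interactive sessions
if (interactive()) View(toy_nodes)
```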
Think about the questions below and be prepared to share your response:
What do you think the rows and columns in each file represent?
What about the values in each cell represent?
What else do you notice about the data?
What questions do you have?
Run the following code in your R script:
The tbl_graph() function creates a special network data structure called a “tidy graph” that combines our nodes and edges into a single R object.
The benefit of a “tidy graph” is that it opens up the entire suite of tidyverse tools noted earlier for manipulating and constructing network data and variables.
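The call described above can be sketched on toy data as follows, assuming the {tidygraph} package is installed. The objects `toy_nodes`, `toy_edges`, and `toy_network` are illustrative stand-ins built from rows shown earlier, not the full student files. Numeric `from` and `to` values are treated here as row positions in the node table, so every endpoint must fall within the number of node rows.

```r
library(tidygraph)

# Six actors from the node-list example
toy_nodes <- tibble::tibble(
  id          = 1:6,
  gender      = c("female", "male", "female", "male", "female", "female"),
  achievement = c("high", "average", "average", "high", "average", "average")
)

# Four of the ties from the edge-list example (endpoints 7 and 8 are left out
# because this toy node table has only six rows)
toy_edges <- tibble::tibble(
  from   = c(1, 1, 1, 1),
  to     = c(2, 4, 5, 6),
  weight = 1
)

# Combine nodes and edges into a single directed tidy graph
toy_network <- tbl_graph(nodes = toy_nodes, edges = toy_edges, directed = TRUE)

# Typing the object name prints a summary plus the node and edge tables
toy_network
```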
Using your R script, type the name of the network object we just created and run the code to produce the output on the next tab:
You should see an output that looks something like this:
# A tbl_graph: 27 nodes and 203 edges
#
# A directed simple graph with 2 components
#
# A tibble: 27 × 5
id gender achievement gender_num achievement_num
<dbl> <chr> <chr> <dbl> <dbl>
1 1 female high 1 1
2 2 male average 0 2
3 3 female average 1 2
4 4 male high 0 1
5 5 female average 1 2
6 6 female average 1 2
# ℹ 21 more rows
#
# A tibble: 203 × 3
from to weight
<int> <int> <dbl>
1 1 2 1
2 1 4 1
3 1 5 1
# ℹ 200 more rows
Think about the questions below:
What is the size of the student-reported friendship network?
What else do you notice about this network?
What questions do you have about this network summary?
Making Simple and Sophisticated Sociograms
The ggraph() function is the first function required to build a sociogram. Try running this function on our student_network and see what happens:
This function serves two critical roles:
It takes care of setting up the plot object for the network specified.
It creates the layout based on the algorithm provided.
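The two roles above can be seen in a small sketch, assuming {tidygraph} and {ggraph} are installed. It uses a toy network (`toy_network`, a hypothetical stand-in for `student_network`) and the "fr" (Fruchterman-Reingold) layout as one possible algorithm choice. Calling `ggraph()` alone yields an empty canvas; nodes and edges appear only once geoms are added.

```r
library(tidygraph)
library(ggraph)

# Toy directed network: six actors, four ties sent by actor 1
toy_network <- tbl_graph(
  nodes = tibble::tibble(id = 1:6),
  edges = tibble::tibble(from = c(1, 1, 1, 1), to = c(2, 4, 5, 6)),
  directed = TRUE
)

# ggraph() sets up the plot and computes node positions, but draws nothing yet
blank_canvas <- ggraph(toy_network, layout = "fr")

# Adding geoms draws the ties and the actors on that layout
sociogram <- blank_canvas +
  geom_edge_link() +
  geom_node_point()

sociogram
```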
The {ggraph} package allows for some fairly sophisticated sociograms…
With a fair bit of coding:
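As a hedged illustration of what that extra coding might involve (not the sociogram from the original slides), the sketch below sizes nodes by in-degree, colors them by gender, and adds arrows and labels. It again uses a toy network, and the styling choices are assumptions made for this example.

```r
library(dplyr)
library(tidygraph)
library(ggraph)

# Toy directed network with a node attribute, plus in-degree computed per node
toy_network <- tbl_graph(
  nodes = tibble::tibble(
    id     = 1:6,
    gender = c("female", "male", "female", "male", "female", "female")
  ),
  edges = tibble::tibble(from = c(1, 1, 1, 1), to = c(2, 4, 5, 6)),
  directed = TRUE
) %>%
  activate(nodes) %>%
  mutate(in_degree = centrality_degree(mode = "in"))

sociogram <- ggraph(toy_network, layout = "fr") +
  # Arrows show tie direction; end_cap stops each arrow short of the node
  geom_edge_link(
    arrow   = grid::arrow(length = grid::unit(2, "mm")),
    end_cap = circle(3, "mm"),
    alpha   = 0.5
  ) +
  # Map node attributes to visual properties
  geom_node_point(aes(size = in_degree, colour = gender)) +
  geom_node_text(aes(label = id), repel = TRUE) +
  ggplot2::theme_void()

sociogram
```

`theme_void()` is used here instead of {ggraph}'s own `theme_graph()` simply to avoid font warnings on systems without the default font.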
SNA Case Study: Who’s Friends with Who in Middle School?
Guiding Study: Behavioral versus cognitive classroom friendship networks.
This work was supported by the National Science Foundation grants DRL-2025090 and DRL-2321128 (ECR:BCSER). Any opinions, findings, and conclusions expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.