Reputation: 1
I'm working on my MSc thesis, but I'm having trouble producing the analysis and plots I want in R.
I have a data frame like this:
Subject_ID Type Speed1 Speed2 Speed3 ... Speed20
1 A 25 27 24 31
2 B 32 21 35 33
3 B 21 25 27 29
4 A 31 28 38 20
5 A 30 22 21 28
6 B 27 33 31 24
The data come from an economic game programmed in z-Tree, and I'm reading them into R. In the game, subjects choose their speed in each period, and there are 20 periods. I want to find behavioral differences between subjects of different types.
I want to characterize the behaviour of each type, using several subjects for each type. For example, Type A subjects may use higher speeds on average but with a high variance, while Type B subjects may have a lower average speed across periods but with lower variance.
Ideally, I would like to see summary statistics grouped by Type. Also, I would love to see a graph like:
Upvotes: 0
Views: 38
Reputation: 39613
This could be a good starting point. To compare both types, you should look at how speed evolves across all subjects before drawing conclusions. Your data is in wide format, so to use ggplot2 functions you first have to reshape it to long format using pivot_longer() from tidyr (part of the tidyverse). After that, you can build the plot. The code below produces a plot that is split by Type, with one colored line per speed column. The plot in your question has ribbons, but adding them requires lower and upper limit variables in your dataset. Here is the solution:
library(tidyverse)
#Code
df %>%
  # Reshape wide to long: one row per subject and speed column
  pivot_longer(cols = -c(Subject_ID,Type)) %>%
  rename(Speed=name) %>%
  # Keep the speed columns in their original order in the legend
  mutate(Speed=factor(Speed,levels = unique(Speed))) %>%
  ggplot(aes(x=factor(Subject_ID),y=value,color=Speed,group=Speed))+
  geom_point()+
  geom_line(size=1)+
  theme_bw()+
  # One panel per Type
  facet_wrap(.~Type,scales='free')+
  xlab('Subject')
Output:
Some data used:
#Data
df <- structure(list(Subject_ID = 1:6, Type = c("A", "B", "B", "A",
"A", "B"), Speed1 = c(25L, 32L, 21L, 31L, 30L, 27L), Speed2 = c(27L,
21L, 25L, 28L, 22L, 33L), Speed3 = c(24L, 35L, 27L, 38L, 21L,
31L), Speed20 = c(31L, 33L, 29L, 20L, 28L, 24L)), class = "data.frame", row.names = c(NA,
-6L))
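To see what the reshaping step does before any plotting, here is a minimal sketch on the same sample data. The column names Period and Speed and the object name long are my own choices for illustration, not something the code above relies on, and the sample only contains four of the twenty periods.

```r
library(tidyr)

# Same sample data as above (only Speed1, Speed2, Speed3 and Speed20 are present)
df <- data.frame(
  Subject_ID = 1:6,
  Type = c("A", "B", "B", "A", "A", "B"),
  Speed1 = c(25L, 32L, 21L, 31L, 30L, 27L),
  Speed2 = c(27L, 21L, 25L, 28L, 22L, 33L),
  Speed3 = c(24L, 35L, 27L, 38L, 21L, 31L),
  Speed20 = c(31L, 33L, 29L, 20L, 28L, 24L)
)

# Wide to long: one row per subject and period.
# names_prefix strips "Speed" so the period becomes a plain integer.
long <- pivot_longer(
  df,
  cols = -c(Subject_ID, Type),
  names_to = "Period",
  names_prefix = "Speed",
  names_transform = list(Period = as.integer),
  values_to = "Speed"
)

head(long, 4)
```

Having the period as an integer rather than a label like "Speed20" makes it easier to put periods on a numeric axis later.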
If you do not want the plot split by Type, you can drop the facet_wrap() line and obtain this:
#Code 2
df %>%
  pivot_longer(cols = -c(Subject_ID,Type)) %>%
  rename(Speed=name) %>%
  mutate(Speed=factor(Speed,levels = unique(Speed))) %>%
  ggplot(aes(x=factor(Subject_ID),y=value,color=Speed,group=Speed))+
  geom_point()+
  geom_line(size=1)+
  theme_bw()+
  xlab('Subject')
Output:
Update: You can use group_by() and summarise() with sum() to aggregate each subject's values across periods and then color the result by Type, with the following code:
#Code 3
df %>%
  pivot_longer(cols = -c(Subject_ID,Type)) %>%
  rename(Speed=name) %>%
  group_by(Subject_ID,Type) %>%
  summarise(value=sum(value)) %>%
  ggplot(aes(x=factor(Subject_ID),y=value,color=Type,group=Type))+
  geom_point()+
  geom_line(size=1)+
  theme_bw()+
  xlab('Subject')
Output:
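The aggregation in Code 3 is per subject. If what you need is a table of statistics per Type, as asked in the question, a sketch along the same lines could look like the following. The names type_summary, n_subjects, mean_speed, sd_speed and var_speed are only illustrative. It pools all subjects and periods within a type, so it ignores that observations from the same subject are correlated; treat it as a descriptive summary rather than a test.

```r
library(dplyr)
library(tidyr)

# Same sample data as above
df <- data.frame(
  Subject_ID = 1:6,
  Type = c("A", "B", "B", "A", "A", "B"),
  Speed1 = c(25L, 32L, 21L, 31L, 30L, 27L),
  Speed2 = c(27L, 21L, 25L, 28L, 22L, 33L),
  Speed3 = c(24L, 35L, 27L, 38L, 21L, 31L),
  Speed20 = c(31L, 33L, 29L, 20L, 28L, 24L)
)

type_summary <- df %>%
  pivot_longer(cols = -c(Subject_ID, Type),
               names_to = "Period", values_to = "Speed") %>%
  group_by(Type) %>%
  summarise(
    n_subjects = n_distinct(Subject_ID),  # subjects contributing to each type
    mean_speed = mean(Speed),             # pooled over subjects and periods
    sd_speed   = sd(Speed),
    var_speed  = var(Speed),
    .groups = "drop"
  )

type_summary
```

Grouping by Type alone gives one row per type; adding Subject_ID to group_by() would give the per-subject version instead.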
Computing the mean and SD per subject and plotting the mean with a band of one SD on either side produces this:
#Code 4
df %>%
  pivot_longer(cols = -c(Subject_ID,Type)) %>%
  rename(Speed=name) %>%
  group_by(Subject_ID,Type) %>%
  # Per-subject mean and SD across periods; the band is mean +/- 1 SD
  summarise(Mean=mean(value),SD=sd(value),
            Low=Mean-SD,Up=Mean+SD) %>%
  ggplot(aes(x=factor(Subject_ID),y=Mean,color=Type,group=Type))+
  geom_line(size=1)+
  geom_point()+
  geom_ribbon(
    aes(ymin = Low, ymax = Up,fill=Type),
    alpha = 0.2
  )+
  theme_bw()+
  xlab('Subject')
Output:
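The target graph from the question is not visible here, so this is only a guess at what it might look like: periods on the x axis, one line per Type showing the mean speed across subjects, and a ribbon of one SD on either side. The names by_period, Period, Mean, SD, Low and Up are my own, and the sample data only covers four periods.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same sample data as above
df <- data.frame(
  Subject_ID = 1:6,
  Type = c("A", "B", "B", "A", "A", "B"),
  Speed1 = c(25L, 32L, 21L, 31L, 30L, 27L),
  Speed2 = c(27L, 21L, 25L, 28L, 22L, 33L),
  Speed3 = c(24L, 35L, 27L, 38L, 21L, 31L),
  Speed20 = c(31L, 33L, 29L, 20L, 28L, 24L)
)

# Mean and SD across subjects, for each Type and period
by_period <- df %>%
  pivot_longer(cols = -c(Subject_ID, Type),
               names_to = "Period", names_prefix = "Speed",
               names_transform = list(Period = as.integer),
               values_to = "Speed") %>%
  group_by(Type, Period) %>%
  summarise(Mean = mean(Speed), SD = sd(Speed), .groups = "drop") %>%
  mutate(Low = Mean - SD, Up = Mean + SD)  # ribbon limits

p <- ggplot(by_period,
            aes(x = Period, y = Mean, color = Type, fill = Type, group = Type)) +
  geom_ribbon(aes(ymin = Low, ymax = Up), alpha = 0.2, color = NA) +
  geom_line() +
  geom_point() +
  theme_bw() +
  labs(x = "Period", y = "Mean speed")

p
```

With the full 20 periods the lines would be continuous; with this sample there is a visible jump between period 3 and period 20.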
Upvotes: 1