How to tell ggplot several rows belong to one case in a long format dataset with nested variables

Question

I have a dataset in long format in which each participant undergoes two conditions in an experiment (repeated measures), and each condition is composed of a number of trials. Participants have one score (Score) per condition (Group), but also an individual reaction time (RT) per trial.

The dataset looks like this:

library(tidyverse)

df <- data.frame(ID = c(rep(1, 6), rep(2, 6), rep(3, 6)), 
              Gender = factor(c(rep("M", 6), rep("M", 6), rep("F", 6))), 
              Group = factor(c(rep(c(rep(0, 3), rep(1, 3)), 3))), 
              Trial = factor(rep(c(1:3), 6)),
              Score = c(rep(10, 3), rep(20, 3), rep(15, 3), rep(25, 3), rep(18, 3), rep(12, 3)), 
              RT = runif(18)
                 )

I wanted to do some plotting to explore the data, focusing first on the analysis of Score, which is simpler at this stage. The problem is that a row does not represent a single case for Score: it is RT that determines how the dataset is split into rows, so each Score value is repeated once per trial. For example, if I plot a bar chart of counts by Gender, I end up with a total of 18 cases rather than the 3 participants there really are.

ggplot(data=df, aes(Gender)) + 
  geom_bar()

I thought one way to simplify the dataset would be to replace the RT rows with the mean/median per participant, but this would involve splitting my dataset in two, and I would prefer to keep that as a last option. In addition, it would not solve my problem, as there would still be two rows, and so two Gender entries, per participant (one per condition).

I know this has to be simple, but I am having trouble formulating this issue as I am still a newbie in R.

I appreciate any help!

Ronak Shah · Accepted Answer

Since you have multiple rows for each ID, keep only the unique combinations of ID and Gender before plotting, so that each participant is counted once:

library(dplyr)
library(ggplot2)

# Keep one row per participant so that each ID is counted once
df %>%
  distinct(ID, Gender) %>%
  ggplot(aes(Gender)) +
  geom_bar()
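The same idea should carry over to Score, which is recorded once per participant and condition but repeated on every trial row. As a rough sketch (not part of the original answer), the data can be collapsed to one row per ID and Group before plotting. The names `per_condition` and `mean_RT` are made up for illustration, the data frame is rebuilt in a compact form so the snippet runs on its own, and `set.seed()` is added only to make the random RT values reproducible:

```r
library(dplyr)
library(ggplot2)

# Same structure as the data in the question; set.seed() is an addition
# so that the random RT values are reproducible.
set.seed(1)
df <- data.frame(
  ID     = rep(1:3, each = 6),
  Gender = factor(rep(c("M", "M", "F"), each = 6)),
  Group  = factor(rep(rep(0:1, each = 3), 3)),
  Trial  = factor(rep(1:3, 6)),
  Score  = rep(c(10, 20, 15, 25, 18, 12), each = 3),
  RT     = runif(18)
)

# One row per participant and condition. Score is constant within each
# ID/Group pair, so first() simply picks it up; RT is averaged over trials.
per_condition <- df %>%
  group_by(ID, Gender, Group) %>%
  summarise(Score = first(Score), mean_RT = mean(RT), .groups = "drop")

# Each participant contributes exactly one point per condition.
ggplot(per_condition, aes(Group, Score, group = ID, colour = Gender)) +
  geom_line() +
  geom_point()
```

Here `per_condition` has 6 rows (3 participants x 2 conditions), and the original `df` is left untouched, so the trial-level RT values remain available for later analysis.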
