Lucas

Reputation: 51

How to tell ggplot several rows belong to one case in a long format dataset with nested variables

I have a dataset in long format in which each participant undergoes two conditions in an experiment (repeated measures), and each condition consists of a number of trials. Participants get one score (Score) per condition (Group), but also have an individual reaction time (RT) per trial.

The dataset looks like this:

library(tidyverse)

df <- data.frame(
  ID     = c(rep(1, 6), rep(2, 6), rep(3, 6)),
  Gender = factor(c(rep("M", 6), rep("M", 6), rep("F", 6))),
  Group  = factor(rep(c(rep(0, 3), rep(1, 3)), 3)),
  Trial  = factor(rep(1:3, 6)),
  Score  = c(rep(10, 3), rep(20, 3), rep(15, 3), rep(25, 3), rep(18, 3), rep(12, 3)),
  RT     = runif(18)
)

I want to do some exploratory plotting and focus first on the analysis of Score, which is simpler at this stage. The problem is that a row does not represent a single case for Score: it is RT that determines how the dataset is divided into rows. For example, if I plot a bar chart of the counts per Gender, I end up with 18 cases in total rather than the 3 participants there actually are.

ggplot(data=df, aes(Gender)) + 
  geom_bar()                       

I thought one way to simplify the dataset would be to have each RT row hold the mean/median per participant, but this would involve splitting my dataset in two, and I would prefer to keep that as a last resort. In addition, it would not solve my problem, as there would still be two rows of Gender per participant.

I know this has to be simple, but I am having trouble formulating this issue as I am still a newbie in R.

I appreciate any help!

Upvotes: 0

Views: 225

Answers (1)

Ronak Shah

Reputation: 389145

Since you have multiple rows for each ID, keep only the unique combinations of ID and Gender before plotting, so that each participant is counted once. You get something like this:

library(dplyr)
library(ggplot2)

df %>%
  distinct(ID, Gender) %>%
  ggplot(aes(Gender)) +
  geom_bar()

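The same idea should carry over to Score, which is constant within each ID and Group. Below is a minimal sketch, assuming the df from the question (rebuilt here in compact form so the block runs on its own). The names participants, scores, rt_summary and p, as well as the set.seed() call, are illustrative additions and not part of the original post.

```r
library(dplyr)
library(ggplot2)

# Rebuild the question's data frame (compact but equivalent construction).
# set.seed() is an addition so that the random RT values are reproducible.
set.seed(1)
df <- data.frame(
  ID     = rep(1:3, each = 6),
  Gender = factor(rep(c("M", "M", "F"), each = 6)),
  Group  = factor(rep(rep(0:1, each = 3), 3)),
  Trial  = factor(rep(1:3, 6)),
  Score  = rep(c(10, 20, 15, 25, 18, 12), each = 3),
  RT     = runif(18)
)

# Participant level: one row per ID (Gender is constant within ID)
participants <- df %>% distinct(ID, Gender)

# Condition level: one row per ID and Group (Score is constant within both)
scores <- df %>% distinct(ID, Gender, Group, Score)

# Optional: collapse the trial-level RT to one mean per ID and Group
rt_summary <- df %>%
  group_by(ID, Gender, Group, Score) %>%
  summarise(mean_RT = mean(RT), .groups = "drop")

# Plot Score per condition, one line per participant
p <- ggplot(scores, aes(Group, Score, group = ID, colour = Gender)) +
  geom_line() +
  geom_point()
print(p)
```

This keeps a single long data frame and only deduplicates at the point of plotting, so the trial-level RT values stay available for later analysis.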

Upvotes: 1
