Using spread to create two value columns with tidyr

Question

I have a data frame that looks just like this (see link). I'd like to take the output that is produced below and go one step further by spreading the tone variable across both the n and the average variables. It seems like this topic might bear on this, but I can't get it to work: Is it possible to use spread on multiple columns in tidyr similar to dcast?

I'd like the final table to have the source variable in one column, then then the tone-n and tone-avg variables to be in columns. So I'd like the column headers to be "source" - "For - n" - "Against - n" "For -Avg" - "Against - Avg". This is for publication, not for further calculation, so it's about presenting data. It seems more intuitive to me to present data in this way. Thank you.

#variable1
Politician.For<-sample(seq(0,4,1),50, replace=TRUE)
#variable2
Politician.Against<-sample(seq(0,4,1),50, replace=TRUE)
#Variable3
Activist.For<-sample(seq(0,4,1),50,replace=TRUE)
#variable4
Activist.Against<-sample(seq(0,4,1),50,replace=TRUE)
#dataframe
df<-data.frame(Politician.For, Politician.Against, Activist.For,Activist.Against)

#tidyr
df %>%
 #Gather all columns 
 gather(df) %>%
 #separate by the period character 
 #(default separation character is non-alpha numeric characterr) 
 separate(col=df, into=c('source', 'tone')) %>%
 #group by both source and tone  
 group_by(source,tone) %>%
 #summarise to create counts and average
 summarise(n=sum(value), avg=mean(value)) %>%
 #try to spread
 spread(tone, c('n', 'value'))

user295691 · Accepted Answer

I think what you want is another gather to break out the count and mean as separate observations, the gather(type, val, -source, -tone) below.

gather(df, who, value) %>%
    separate(who, into=c('source', 'tone')) %>%
    group_by(source, tone) %>%
    summarise(n=sum(value), avg=mean(value)) %>%
    gather(type, val, -source, -tone) %>%
    unite(stat, c(tone, type)) %>%
    spread(stat, val)

Yields

Source: local data frame [2 x 5]

      source Against_avg Against_n For_avg For_n
1   Activist        1.82        91    1.84    92
2 Politician        1.94        97    1.70    85

Using spread to create two value columns with tidyr

Answers (2)

Related Questions