Heala45

Reputation: 161

Reshaping data with reshape2 in R

I've been trying to figure out how the melt and cast functions work in the reshape2 package, but I can't get the results I'm looking for.

Here's the data:

data <- read.table(header=T, text="
  diagnosis  agrp   events  Period
  COPD  1   16  1998-1999
  COPD  2   51  1998-1999
  COPD  3   27  1998-1999
  COPD  4   9   1998-1999
  COPD  1   44  2000-2001
  COPD  2   122 2000-2001
  COPD  3   39  2000-2001
  COPD  4   12  2000-2001")

Here's what I'm trying to achieve:

diagnosis   agrp    1998-1999   2000-2001   etc...
  COPD      1          16        44
  COPD      2          51        122
  COPD      3          27        39
  COPD      4          9         12

I'd like to reshape the data so that each value of "Period" becomes its own column. Code that achieves this would be highly appreciated!

Update

This is what my data looks like:

     data <- read.table(header=T, text="
 diagnosis  agrp    1998-1999   2000-2001
KONTROLL    1     140903      72208
KONTROLL    2     88322       33704
KONTROLL    3     18175       3804
KONTROLL    4     6125        797")

This is what I'm trying to achieve:

 diagnosis  agrp    1998-1999   2000-2001   Total
KONTROLL    1     140903       72208        213111
KONTROLL    2     88322        33704        122026
KONTROLL    3     18175        3804       21979
KONTROLL    4      6125         797         6922

Upvotes: 1

Views: 104

Answers (1)

akrun

Reputation: 887991

Try reshape from base R

reshape(data, idvar=c('diagnosis', 'agrp'),
              timevar='Period', direction='wide')
#   diagnosis agrp events.1998-1999 events.2000-2001
#1      COPD    1               16               44
#2      COPD    2               51              122
#3      COPD    3               27               39
#4      COPD    4                9               12
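The base reshape() result prefixes each new column with "events.". If the bare period labels are wanted, as in the question's target layout, one option is to strip that prefix afterwards. A minimal sketch, assuming the question's first data set; the name `wide` is introduced here only for illustration:

```r
# Question's first data set, in long format (one row per diagnosis/agrp/Period)
data <- read.table(header = TRUE, text = "
  diagnosis  agrp   events  Period
  COPD  1   16  1998-1999
  COPD  2   51  1998-1999
  COPD  3   27  1998-1999
  COPD  4   9   1998-1999
  COPD  1   44  2000-2001
  COPD  2   122 2000-2001
  COPD  3   39  2000-2001
  COPD  4   12  2000-2001")

# Long to wide: one column of 'events' per value of 'Period'
wide <- reshape(data, idvar = c("diagnosis", "agrp"),
                timevar = "Period", direction = "wide")

# Drop the "events." prefix that reshape() adds to the new columns
names(wide) <- sub("^events\\.", "", names(wide))
rownames(wide) <- NULL
wide
```

Because the resulting names are not syntactic, columns have to be accessed with backticks or as strings, for example wide[["2000-2001"]].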

Or using reshape2

library(reshape2)
dcast(data, diagnosis+agrp~Period, value.var='events')
#    diagnosis agrp 1998-1999 2000-2001
#1      COPD    1        16        44
#2      COPD    2        51       122
#3      COPD    3        27        39
#4      COPD    4         9        12

Or using tidyr (in tidyr 1.0.0 and later, spread is superseded by pivot_wider)

library(tidyr)
spread(data, Period, events)
#   diagnosis agrp 1998-1999 2000-2001
#1      COPD    1        16        44
#2      COPD    2        51       122
#3      COPD    3        27        39
#4      COPD    4         9        12

Update

Based on the new data, a row total can be added with rowSums

  transform(data, Total=rowSums(data[,3:4]), check.names=FALSE)
  #  diagnosis agrp 1998-1999 2000-2001  Total
  #1  KONTROLL    1    140903     72208 213111
  #2  KONTROLL    2     88322     33704 122026
  #3  KONTROLL    3     18175      3804  21979
  #4  KONTROLL    4      6125       797   6922
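One caveat: by default, read.table() converts a header such as 1998-1999 into a syntactic name (X1998.1999), so the column names shown above only survive if the data was read with check.names = FALSE. A minimal sketch under that assumption, selecting the period columns by name instead of by position; `period_cols` is a name introduced here for illustration:

```r
# Read the updated data, keeping the non-syntactic column names as they are
data <- read.table(header = TRUE, check.names = FALSE, text = "
diagnosis  agrp    1998-1999   2000-2001
KONTROLL    1     140903      72208
KONTROLL    2     88322       33704
KONTROLL    3     18175       3804
KONTROLL    4     6125        797")

# Sum the period columns row by row and store the result as 'Total'
period_cols <- c("1998-1999", "2000-2001")
data$Total <- rowSums(data[, period_cols])
data
```

Selecting by name keeps the total correct if further columns are later added before the period columns.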

Upvotes: 3
