user1357015

Reputation: 11686

rearrange data from short to long form in R

I have data in the following way:

         level1  level2  level3  level4
controls  x_11    x_12    x_13    x_14
cases     x_21    x_22    x_23    x_24

What's the best way to put this in long form? Specifically, I want x_11 rows for level 1 with an indicator of 0, x_12 rows for level 2 with an indicator of 0, and so on.

I'm trying to pass this to the independence_test function in the coin package, and that's the form of data it requires. Thank you!

Edit

I have this matrix:

         1 2 3 4
controls 9 7 7 7
cases    0 1 1 5

How do I get a matrix that is 37x2? Each row would have a "status" and a "bin". For instance, I would get 9 rows of

0, 1 (control, bin =1)

Then 7 rows of: 0,2 (control, bin = 2)

...

0 rows of: 1,1 (case, bin = 1)

1 row of: 1,2 (case, bin = 2)

Thank you!

Edit 2: dput of the input, and of the output from one of the solutions below:

> dput(mtx)
structure(c(9L, 0L, 7L, 1L, 7L, 1L, 7L, 5L), .Dim = c(2L, 4L), .Dimnames = list(
    c("controls", "cases"), c("1", "2", "3", "4")))

> dput(long)
structure(c("1", "1", "1", "1", "1", "1", "1", "1", "1", "3", "3", "3", "3", "3", "3", "3", "4", "1", "1", "1", "1", "1", "1", "1", "2", "3", "3", "3", "3", "3", "3", "3", "4", "4", "4", "4", "4", "controls", "controls", "controls", "controls", "controls", "controls", "controls", "controls", "controls", "controls", "controls", "controls", "controls", "controls", "controls", "controls", "cases", "controls", "controls", "controls", "controls", "controls", "controls", "controls", "cases", "controls", "controls", "controls", "controls", "controls", "controls", "controls", "cases", "cases", "cases", "cases", "cases"), .Dim = c(37L, 2L), .Dimnames = list(NULL, c("", "status")))

Upvotes: 0


Answers (3)

IRTFM

Reputation: 263352

Let's assume the contingency matrix is called mtx:

     cbind( status=unlist(mapply( rep, times=mtx, rownames(mtx)[row(mtx)] )),
            bin=unlist(mapply( rep, times=mtx, colnames(mtx)[col(mtx)] ))
          )
  #--------------------------
      status     bin
 [1,] "controls" "1"   
 [2,] "controls" "1"   
 [3,] "controls" "1"   
 [4,] "controls" "1"   
 [5,] "controls" "1"   
 [6,] "controls" "1"   
 [7,] "controls" "1"   
 [8,] "controls" "1"   
 [9,] "controls" "1"   
[10,] "controls" "2"   
[11,] "controls" "2"   
[12,] "controls" "2"   
[13,] "controls" "2"   
[14,] "controls" "2"   
[15,] "controls" "2"   
[16,] "controls" "2"   
[17,] "cases"    "2"   
[18,] "controls" "3"   
[19,] "controls" "3"   
[20,] "controls" "3"   
[21,] "controls" "3"   
[22,] "controls" "3"   
[23,] "controls" "3"   
[24,] "controls" "3"   
[25,] "cases"    "3"   
[26,] "controls" "4"   
[27,] "controls" "4"   
[28,] "controls" "4"   
[29,] "controls" "4"   
[30,] "controls" "4"   
[31,] "controls" "4"   
[32,] "controls" "4"   
[33,] "cases"    "4"   
[34,] "cases"    "4"   
[35,] "cases"    "4"   
[36,] "cases"    "4"   
[37,] "cases"    "4"   

To see how this works you can play around with such a matrix:

dput(mtx)
structure(c(9, 0, 7, 1, 7, 1, 7, 5), .Dim = c(2L, 4L), .Dimnames = list(
    c("controls", "cases"), c("1", "2", "3", "4")))
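
As a rough sketch of why this works, assuming the same mtx as above: row(mtx) and col(mtx) give the row and column index of every cell in column-major order, which is also the order in which the counts in mtx are read, so each label can be repeated by its own cell count. Because rep is vectorised over its times argument, the mapply/unlist step can probably be replaced by a single call. The names cell_status, cell_bin and long2 are only illustrative.

```r
mtx <- structure(c(9, 0, 7, 1, 7, 1, 7, 5), .Dim = c(2L, 4L),
                 .Dimnames = list(c("controls", "cases"), c("1", "2", "3", "4")))

# Label of every cell, in the same column-major order as the counts in mtx
cell_status <- rownames(mtx)[row(mtx)]   # "controls" "cases" "controls" ...
cell_bin    <- colnames(mtx)[col(mtx)]   # "1" "1" "2" "2" ...

# rep() is vectorised over 'times': each label is repeated by its cell count
long2 <- cbind(status = rep(cell_status, times = as.vector(mtx)),
               bin    = rep(cell_bin,    times = as.vector(mtx)))

dim(long2)                                # 37 rows, 2 columns
table(long2[, "status"], long2[, "bin"])  # same counts as mtx, rows sorted alphabetically
```

Cells with a count of 0 (here cases in bin 1) simply contribute no rows.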

Upvotes: 1

statsRus

Reputation: 573

If you want to change your data from wide to long, the melt function from the reshape2 package is very helpful. I've created a toy data set so the question can be answered, although it may not be exactly what you intended (it is difficult to guess what somebody wants to do without a specific, reproducible example data set).

First we'll create a toy data set in R:

df.wide <- as.data.frame(matrix(1:8,2))
colnames(df.wide) <- c("Level 1", "Level 2", "Level 3", "Level 4")
rownames(df.wide) <- c("Controls", "Cases")

# creating an id variable for the rows
df.wide$id <- rownames(df.wide)

# examining the dataframe
print(df.wide)
         Level 1 Level 2 Level 3 Level 4       id
Controls       1       3       5       7 Controls
Cases          2       4       6       8    Cases

And now we convert from wide to long:

library(reshape2)
# name the id column explicitly instead of letting melt() guess it
df.long <- melt(df.wide, id.vars = "id")
print(df.long)

        id variable value
1 Controls  Level 1     1
2    Cases  Level 1     2
3 Controls  Level 2     3
4    Cases  Level 2     4
5 Controls  Level 3     5
6    Cases  Level 3     6
7 Controls  Level 4     7
8    Cases  Level 4     8
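
The melted frame still has one row per cell, with the count in value, whereas the question asks for one row per observation. A minimal base-R sketch of that extra step follows. It assumes a count-format frame shaped like df.long above, rebuilt by hand here so the snippet runs without reshape2; df.obs is an illustrative name.

```r
# Count-format data shaped like the melted output above (built by hand here)
df.long <- data.frame(
  id       = rep(c("Controls", "Cases"), times = 4),
  variable = rep(paste("Level", 1:4), each = 2),
  value    = 1:8
)

# Repeat each row index 'value' times, then drop the count column
df.obs <- df.long[rep(seq_len(nrow(df.long)), times = df.long$value),
                  c("id", "variable")]
rownames(df.obs) <- NULL

nrow(df.obs)                       # equals sum(df.long$value)
table(df.obs$id, df.obs$variable)  # cell counts match the wide table
```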

Upvotes: 2

Greg Snow

Reputation: 49640

The as.data.frame.table and rep functions together may do what you want:

> m <- matrix(1:12, 4)
> df <- as.data.frame.table(m)
> df[ rep(1:nrow(df), df$Freq), ]
      Var1 Var2 Freq
1        A    A    1
2        B    A    2
2.1      B    A    2
3        C    A    3
3.1      C    A    3
3.2      C    A    3
4        D    A    4
4.1      D    A    4
4.2      D    A    4
4.3      D    A    4
5        A    B    5
5.1      A    B    5
.
.
.

Another option may be to look at the reshape2 or plyr packages.
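
Applied to the matrix from the question, the same idea might look like the sketch below. The 0/1 coding of status and the column names status and bin follow the question's wording; counts, expanded and obs are illustrative names.

```r
mtx <- matrix(c(9L, 0L, 7L, 1L, 7L, 1L, 7L, 5L), nrow = 2,
              dimnames = list(c("controls", "cases"), c("1", "2", "3", "4")))

# One row per cell: Var1 = row label, Var2 = column label, Freq = count
counts <- as.data.frame.table(mtx)

# Repeat each cell's row Freq times (cells with a count of 0 disappear)
expanded <- counts[rep(seq_len(nrow(counts)), times = counts$Freq), ]

obs <- data.frame(
  status = as.integer(expanded$Var1 == "cases"),   # 0 = control, 1 = case
  bin    = as.integer(as.character(expanded$Var2)) # column label as a number
)

dim(obs)                    # 37 rows, 2 columns
table(obs$status, obs$bin)  # same counts as mtx
```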

Upvotes: 1
