rmflight

Reputation: 1891

Multiple non-grouped items in R tabular() output

I have a data.frame that I want to set up with tables::tabular() for nice printing in LaTeX. It has 5 items, each repeated in two groups (normal and compress). I want three of the columns (id, description and IPR.group) to appear once, ungrouped, and the rest to be grouped by which.

test_table <- structure(list(id = structure(c(2L, 3L, 5L, 1L, 4L, 2L, 3L, 5L, 
1L, 4L), .Label = c("GO:0005525", "GO:0005634", "GO:0008270", 
"GO:0019001", "GO:0046914"), class = "factor"), description = c("nucleus", 
"zinc ion binding", "transition metal ion binding", "GTP binding", 
"guanyl nucleotide binding", "nucleus", "zinc ion binding", "transition metal ion binding", 
"GTP binding", "guanyl nucleotide binding"), IPR.group = c("H", 
"W", "W", "AE", "AE", "H", "W", "W", "AE", "AE"), consistent = c(TRUE, 
TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE), p = c(4.92245771293119e-05, 
1.08157386873641e-21, 2.06049782601929e-14, 0.999999999562468, 
0.999999999985399, 1, 1, 0.999999999999996, 6.51428091733489e-09, 
2.3200965815753e-10), padjust = c(0.0166308749872604, 8.52640733187206e-19, 
1.2182693396339e-11, 1, 1, 1, 1, 1, 9.06251433499824e-07, 3.91930601101827e-08
), metal = c("zn", "zn", "zn", "mg", "mg", "ca", "ca", "ca", 
"ca", "ca"), perc = c(0.841726618705036, 0.831807780320366, 0.519281914893617, 
0.875598086124402, 0.876651982378855, 0, 0, 0, 0, 0), sig = c("TRUE", 
"TRUE", "TRUE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", 
"TRUE", "TRUE"), which = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 
1L, 1L, 1L, 1L), .Label = c("compress", "normal"), class = "factor")), .Names = c("id", 
"description", "IPR.group", "consistent", "p", "padjust", "metal", 
"perc", "sig", "which"), row.names = c(NA, -10L), class = "data.frame")

test_table
           id                  description IPR.group consistent            p      padjust metal      perc   sig    which
1  GO:0005634                      nucleus         H       TRUE 4.922458e-05 1.663087e-02    zn 0.8417266  TRUE   normal
2  GO:0008270             zinc ion binding         W       TRUE 1.081574e-21 8.526407e-19    zn 0.8318078  TRUE   normal
3  GO:0046914 transition metal ion binding         W       TRUE 2.060498e-14 1.218269e-11    zn 0.5192819  TRUE   normal
4  GO:0005525                  GTP binding        AE       TRUE 1.000000e+00 1.000000e+00    mg 0.8755981 FALSE   normal
5  GO:0019001    guanyl nucleotide binding        AE       TRUE 1.000000e+00 1.000000e+00    mg 0.8766520 FALSE   normal
6  GO:0005634                      nucleus         H       TRUE 1.000000e+00 1.000000e+00    ca 0.0000000 FALSE compress
7  GO:0008270             zinc ion binding         W       TRUE 1.000000e+00 1.000000e+00    ca 0.0000000 FALSE compress
8  GO:0046914 transition metal ion binding         W       TRUE 1.000000e+00 1.000000e+00    ca 0.0000000 FALSE compress
9  GO:0005525                  GTP binding        AE       TRUE 6.514281e-09 9.062514e-07    ca 0.0000000  TRUE compress
10 GO:0019001    guanyl nucleotide binding        AE       TRUE 2.320097e-10 3.919306e-08    ca 0.0000000  TRUE compress

So, I can start to get close if I do:

library(tables)
tabular(id ~ which*(p + padjust + metal + perc + sig)*Heading()*identity, data = test_table)




    which                                                                      
            compress                             normal                                
 id         p         padjust   metal perc sig   p         padjust   metal perc   sig  
 GO:0005525 6.514e-09 9.063e-07 ca    0    TRUE  1.000e+00 1.000e+00 mg    0.8756 FALSE
 GO:0005634 1.000e+00 1.000e+00 ca    0    FALSE 4.922e-05 1.663e-02 zn    0.8417 TRUE 
 GO:0008270 1.000e+00 1.000e+00 ca    0    FALSE 1.082e-21 8.526e-19 zn    0.8318 TRUE 
 GO:0019001 2.320e-10 3.919e-08 ca    0    TRUE  1.000e+00 1.000e+00 mg    0.8767 FALSE
 GO:0046914 1.000e+00 1.000e+00 ca    0    FALSE 2.060e-14 1.218e-11 zn    0.5193 TRUE 

But, as soon as I try to add the description column anywhere I think it should be, I start to get errors:

tabular((id + description) ~ which*(p + padjust + metal + perc + sig)*Heading()*identity, data = test_table)
# Error in term2table(rows[[i]], cols[[j]], data, n) : Duplicate values: description and p

tabular((id + IPR.group) ~ which*(p + padjust + metal + perc + sig)*Heading()*identity, data = test_table)
# Error in term2table(rows[[i]], cols[[j]], data, n) : Duplicate values: IPR.group and p

Even putting description on the right-hand side of the formula returns something odd, where the character column is shown as a number:

tabular(id ~ description + which*(p + padjust + metal + perc + sig)*Heading()*identity, data = test_table)
                        which                                                                      
                        compress                             normal                                
 id         description p         padjust   metal perc sig   p         padjust   metal perc   sig  
 GO:0005525 2           6.514e-09 9.063e-07 ca    0    TRUE  1.000e+00 1.000e+00 mg    0.8756 FALSE
 GO:0005634 2           1.000e+00 1.000e+00 ca    0    FALSE 4.922e-05 1.663e-02 zn    0.8417 TRUE 
 GO:0008270 2           1.000e+00 1.000e+00 ca    0    FALSE 1.082e-21 8.526e-19 zn    0.8318 TRUE 
 GO:0019001 2           2.320e-10 3.919e-08 ca    0    TRUE  1.000e+00 1.000e+00 mg    0.8767 FALSE
 GO:0046914 2           1.000e+00 1.000e+00 ca    0    FALSE 2.060e-14 1.218e-11 zn    0.5193 TRUE 

I can fudge it if I make a new column that is the concatenation of the columns I want displayed, but I'd have to write something to make them all look consistent:

test_table$ID <- paste0(test_table$id, " ", test_table$description, " ", test_table$IPR.group)
test_table$ID <- factor(test_table$ID)
tabular(ID ~ which*(p + padjust + metal + perc + sig)*Heading()*identity, data = test_table)


                                           which                                                                      
                                           compress                             normal                                
 ID                                        p         padjust   metal perc sig   p         padjust   metal perc   sig  
 GO:0005525 GTP binding AE                 6.514e-09 9.063e-07 ca    0    TRUE  1.000e+00 1.000e+00 mg    0.8756 FALSE
 GO:0005634 nucleus H                      1.000e+00 1.000e+00 ca    0    FALSE 4.922e-05 1.663e-02 zn    0.8417 TRUE 
 GO:0008270 zinc ion binding W             1.000e+00 1.000e+00 ca    0    FALSE 1.082e-21 8.526e-19 zn    0.8318 TRUE 
 GO:0019001 guanyl nucleotide binding AE   2.320e-10 3.919e-08 ca    0    TRUE  1.000e+00 1.000e+00 mg    0.8767 FALSE
 GO:0046914 transition metal ion binding W 1.000e+00 1.000e+00 ca    0    FALSE 2.060e-14 1.218e-11 zn    0.5193 TRUE 
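
One way to make the concatenated labels look consistent is to pad each piece to a common width before pasting. This is a minimal base R sketch, assuming fixed-width plain-text output is the goal. `pad_left_justified` is a hypothetical helper name, not part of tables, and `labels_df` just reuses three rows of the data above for illustration:

```r
# Hypothetical helper: left-justify a vector by padding with trailing spaces
# so that every element is as wide as the longest one.
pad_left_justified <- function(x) {
  x <- as.character(x)
  formatC(x, width = max(nchar(x)), flag = "-")
}

# Small illustrative subset of the question's data.
labels_df <- data.frame(
  id = c("GO:0005525", "GO:0005634", "GO:0008270"),
  description = c("GTP binding", "nucleus", "zinc ion binding"),
  IPR.group = c("AE", "H", "W"),
  stringsAsFactors = FALSE
)

# Pad each column separately, then paste with a single space in between,
# so every combined label ends up with the same total width.
labels_df$ID <- factor(paste(
  pad_left_justified(labels_df$id),
  pad_left_justified(labels_df$description),
  pad_left_justified(labels_df$IPR.group)
))

levels(labels_df$ID)
```

The padding only lines things up in monospace output. In LaTeX, runs of spaces collapse, so there the pieces would still need to be real columns.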

I thought I should be able to do it using one of the approaches above, but no luck. Any help would be appreciated. Also, any solution should remove the "which" label that is shown above compress and normal in the header of the table.

Upvotes: 0

Views: 533

Answers (1)

joran

Reputation: 173677

This seems close, at least:

tabular(id ~ Heading()*which*(description + p + padjust + metal + perc + sig)*Heading()*identity, data = test_table)

            compress                                                         
 id         description                  p         padjust   metal perc sig  
 GO:0005525 GTP binding                  6.514e-09 9.063e-07 ca    0    TRUE 
 GO:0005634 nucleus                      1.000e+00 1.000e+00 ca    0    FALSE
 GO:0008270 zinc ion binding             1.000e+00 1.000e+00 ca    0    FALSE
 GO:0019001 guanyl nucleotide binding    2.320e-10 3.919e-08 ca    0    TRUE 
 GO:0046914 transition metal ion binding 1.000e+00 1.000e+00 ca    0    FALSE

 normal                                                             
 description                  p         padjust   metal perc   sig  
 GTP binding                  1.000e+00 1.000e+00 mg    0.8756 FALSE
 nucleus                      4.922e-05 1.663e-02 zn    0.8417 TRUE 
 zinc ion binding             1.082e-21 8.526e-19 zn    0.8318 TRUE 
 guanyl nucleotide binding    1.000e+00 1.000e+00 mg    0.8767 FALSE
 transition metal ion binding 2.060e-14 1.218e-11 zn    0.5193 TRUE 

...but you may not be happy that the description column is duplicated in each which group. There might be a way to fix that by pulling the description term outside of the parentheses, but that seems to need some other magical incantation: the naive change fails with an error about duplicate values in combination with p.

Edit: So close to the magic incantation...

tabular(id ~ (description*Heading()*min)+Heading()*which*(p + padjust + metal + perc + sig)*Heading()*identity, data = test_table)

This looks right (maybe?). The issue appears to be that tabular really wants to apply a summary function to description. unique() would probably be a better choice of "dummy" summary function than min() in this case, and it seems to give the same result.
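
To see why a "dummy" summary works here, it may help to look at what each id's rows contain, using base R only. This is a small sketch on a two-id subset of the question's data; `df` and `by_id` are illustrative names. Each id has two rows (one per which group) carrying the same description, so any function that collapses them to a single value will do. The 2 in the description column of the question's output would be consistent with a row count per id, though that reading is an assumption rather than something confirmed here.

```r
# Two rows per id (one per `which` group), mirroring the question's data.
df <- data.frame(
  id = c("GO:0005634", "GO:0008270", "GO:0005634", "GO:0008270"),
  description = c("nucleus", "zinc ion binding", "nucleus", "zinc ion binding"),
  which = c("normal", "normal", "compress", "compress"),
  stringsAsFactors = FALSE
)

# The description values that fall into each id's row of the table.
by_id <- split(df$description, df$id)

# A count-like summary yields 2 for every id.
sapply(by_id, length)

# unique() and min() both collapse the repeated value to a single string.
sapply(by_id, unique)
sapply(by_id, min)
```

unique() is the safer of the two if a description could ever differ within an id: it would then return more than one value instead of silently picking the alphabetically first one.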

Edit: Latest refinement...

tabular(id ~ (description*Heading()*unique)+Heading()*which*(p + padjust + metal + perc + sig)*Heading()*identity, data = test_table)

                                         compress                            
 id         description                  p         padjust   metal perc sig  
 GO:0005525 GTP binding                  6.514e-09 9.063e-07 ca    0    TRUE 
 GO:0005634 nucleus                      1.000e+00 1.000e+00 ca    0    FALSE
 GO:0008270 zinc ion binding             1.000e+00 1.000e+00 ca    0    FALSE
 GO:0019001 guanyl nucleotide binding    2.320e-10 3.919e-08 ca    0    TRUE 
 GO:0046914 transition metal ion binding 1.000e+00 1.000e+00 ca    0    FALSE

 normal                                
 p         padjust   metal perc   sig  
 1.000e+00 1.000e+00 mg    0.8756 FALSE
 4.922e-05 1.663e-02 zn    0.8417 TRUE 
 1.082e-21 8.526e-19 zn    0.8318 TRUE 
 1.000e+00 1.000e+00 mg    0.8767 FALSE
 2.060e-14 1.218e-11 zn    0.5193 TRUE 

Upvotes: 1
