Reputation: 1891
I have a data.frame that I want to pass to tables::tabular() to set up for nice printing in LaTeX. It has 5 items repeated in two groups (normal and compress). I want three columns (id, description and IPR.group) to not be grouped, and the rest to be grouped.
test_table <- structure(list(id = structure(c(2L, 3L, 5L, 1L, 4L, 2L, 3L, 5L,
1L, 4L), .Label = c("GO:0005525", "GO:0005634", "GO:0008270",
"GO:0019001", "GO:0046914"), class = "factor"), description = c("nucleus",
"zinc ion binding", "transition metal ion binding", "GTP binding",
"guanyl nucleotide binding", "nucleus", "zinc ion binding", "transition metal ion binding",
"GTP binding", "guanyl nucleotide binding"), IPR.group = c("H",
"W", "W", "AE", "AE", "H", "W", "W", "AE", "AE"), consistent = c(TRUE,
TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE), p = c(4.92245771293119e-05,
1.08157386873641e-21, 2.06049782601929e-14, 0.999999999562468,
0.999999999985399, 1, 1, 0.999999999999996, 6.51428091733489e-09,
2.3200965815753e-10), padjust = c(0.0166308749872604, 8.52640733187206e-19,
1.2182693396339e-11, 1, 1, 1, 1, 1, 9.06251433499824e-07, 3.91930601101827e-08
), metal = c("zn", "zn", "zn", "mg", "mg", "ca", "ca", "ca",
"ca", "ca"), perc = c(0.841726618705036, 0.831807780320366, 0.519281914893617,
0.875598086124402, 0.876651982378855, 0, 0, 0, 0, 0), sig = c("TRUE",
"TRUE", "TRUE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE",
"TRUE", "TRUE"), which = structure(c(2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L), .Label = c("compress", "normal"), class = "factor")), .Names = c("id",
"description", "IPR.group", "consistent", "p", "padjust", "metal",
"perc", "sig", "which"), row.names = c(NA, -10L), class = "data.frame")
test_table
           id                  description IPR.group consistent            p      padjust metal      perc   sig    which
1  GO:0005634                      nucleus         H       TRUE 4.922458e-05 1.663087e-02    zn 0.8417266  TRUE   normal
2  GO:0008270             zinc ion binding         W       TRUE 1.081574e-21 8.526407e-19    zn 0.8318078  TRUE   normal
3  GO:0046914 transition metal ion binding         W       TRUE 2.060498e-14 1.218269e-11    zn 0.5192819  TRUE   normal
4  GO:0005525                  GTP binding        AE       TRUE 1.000000e+00 1.000000e+00    mg 0.8755981 FALSE   normal
5  GO:0019001    guanyl nucleotide binding        AE       TRUE 1.000000e+00 1.000000e+00    mg 0.8766520 FALSE   normal
6  GO:0005634                      nucleus         H       TRUE 1.000000e+00 1.000000e+00    ca 0.0000000 FALSE compress
7  GO:0008270             zinc ion binding         W       TRUE 1.000000e+00 1.000000e+00    ca 0.0000000 FALSE compress
8  GO:0046914 transition metal ion binding         W       TRUE 1.000000e+00 1.000000e+00    ca 0.0000000 FALSE compress
9  GO:0005525                  GTP binding        AE       TRUE 6.514281e-09 9.062514e-07    ca 0.0000000  TRUE compress
10 GO:0019001    guanyl nucleotide binding        AE       TRUE 2.320097e-10 3.919306e-08    ca 0.0000000  TRUE compress
So, I can start to get close if I do:
library(tables)
tabular(id ~ which*(p + padjust + metal + perc + sig)*Heading()*identity, data = test_table)
           which
           compress                             normal
id         p         padjust   metal perc sig   p         padjust   metal perc   sig
GO:0005525 6.514e-09 9.063e-07 ca    0    TRUE  1.000e+00 1.000e+00 mg    0.8756 FALSE
GO:0005634 1.000e+00 1.000e+00 ca    0    FALSE 4.922e-05 1.663e-02 zn    0.8417 TRUE
GO:0008270 1.000e+00 1.000e+00 ca    0    FALSE 1.082e-21 8.526e-19 zn    0.8318 TRUE
GO:0019001 2.320e-10 3.919e-08 ca    0    TRUE  1.000e+00 1.000e+00 mg    0.8767 FALSE
GO:0046914 1.000e+00 1.000e+00 ca    0    FALSE 2.060e-14 1.218e-11 zn    0.5193 TRUE
But as soon as I try to add the description column anywhere I think it should be, I start to get errors:
tabular((id + description) ~ which*(p + padjust + metal + perc + sig)*Heading()*identity, data = test_table)
# Error in term2table(rows[[i]], cols[[j]], data, n) : Duplicate values: description and p
tabular((id + IPR.group) ~ which*(p + padjust + metal + perc + sig)*Heading()*identity, data = test_table)
# Error in term2table(rows[[i]], cols[[j]], data, n) : Duplicate values: IPR.group and p
Even putting description on the right-hand side of the formula returns something really funny, where the character column gets turned into a number:
tabular(id ~ description + which*(p + padjust + metal + perc + sig)*Heading()*identity, data = test_table)
                       which
                       compress                             normal
id         description p         padjust   metal perc sig   p         padjust   metal perc   sig
GO:0005525 2           6.514e-09 9.063e-07 ca    0    TRUE  1.000e+00 1.000e+00 mg    0.8756 FALSE
GO:0005634 2           1.000e+00 1.000e+00 ca    0    FALSE 4.922e-05 1.663e-02 zn    0.8417 TRUE
GO:0008270 2           1.000e+00 1.000e+00 ca    0    FALSE 1.082e-21 8.526e-19 zn    0.8318 TRUE
GO:0019001 2           2.320e-10 3.919e-08 ca    0    TRUE  1.000e+00 1.000e+00 mg    0.8767 FALSE
GO:0046914 2           1.000e+00 1.000e+00 ca    0    FALSE 2.060e-14 1.218e-11 zn    0.5193 TRUE
I can fudge it if I make a new column that is the concatenation of the columns I want displayed, but I'd have to write something to make them all look consistent:
test_table$ID <- paste0(test_table$id, " ", test_table$description, " ", test_table$IPR.group)
test_table$ID <- factor(test_table$ID)
tabular(ID ~ which*(p + padjust + metal + perc + sig)*Heading()*identity, data = test_table)
                                          which
                                          compress                             normal
ID                                        p         padjust   metal perc sig   p         padjust   metal perc   sig
GO:0005525 GTP binding AE                 6.514e-09 9.063e-07 ca    0    TRUE  1.000e+00 1.000e+00 mg    0.8756 FALSE
GO:0005634 nucleus H                      1.000e+00 1.000e+00 ca    0    FALSE 4.922e-05 1.663e-02 zn    0.8417 TRUE
GO:0008270 zinc ion binding W             1.000e+00 1.000e+00 ca    0    FALSE 1.082e-21 8.526e-19 zn    0.8318 TRUE
GO:0019001 guanyl nucleotide binding AE   2.320e-10 3.919e-08 ca    0    TRUE  1.000e+00 1.000e+00 mg    0.8767 FALSE
GO:0046914 transition metal ion binding W 1.000e+00 1.000e+00 ca    0    FALSE 2.060e-14 1.218e-11 zn    0.5193 TRUE
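One way the "make them all look consistent" step might be sketched in base R is to pad each piece with format() before pasting. The toy data frame below is a hypothetical three-row stand-in for test_table (values copied from above), and the padding only helps monospaced text output; LaTeX would collapse the extra spaces.

```r
# Hypothetical stand-in for the three label columns of test_table.
toy <- data.frame(
  id = c("GO:0005634", "GO:0008270", "GO:0019001"),
  description = c("nucleus", "zinc ion binding", "guanyl nucleotide binding"),
  IPR.group = c("H", "W", "AE"),
  stringsAsFactors = FALSE
)

# format() pads each character vector to the width of its longest element,
# so the pasted labels line up piece by piece.
toy$ID <- factor(paste(format(toy$id), format(toy$description), format(toy$IPR.group)))

print(levels(toy$ID))
```

Every level of ID then has the same number of characters, so the row labels line up in the printed table.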
I thought I should be able to do it using one of the other solutions above, but not so much. Any help would be appreciated. Also, any solution should remove the which label that is shown above compress and normal in the header of the table.
Upvotes: 0
Views: 533
Reputation: 173677
This seems close, at least:
> tabular(id ~ Heading()*which*(description + p + padjust + metal + perc + sig)*Heading()*identity, data = test_table)
           compress
id         description                  p         padjust   metal perc sig
GO:0005525 GTP binding                  6.514e-09 9.063e-07 ca    0    TRUE
GO:0005634 nucleus                      1.000e+00 1.000e+00 ca    0    FALSE
GO:0008270 zinc ion binding             1.000e+00 1.000e+00 ca    0    FALSE
GO:0019001 guanyl nucleotide binding    2.320e-10 3.919e-08 ca    0    TRUE
GO:0046914 transition metal ion binding 1.000e+00 1.000e+00 ca    0    FALSE
normal
description                  p         padjust   metal perc   sig
GTP binding                  1.000e+00 1.000e+00 mg    0.8756 FALSE
nucleus                      4.922e-05 1.663e-02 zn    0.8417 TRUE
zinc ion binding             1.082e-21 8.526e-19 zn    0.8318 TRUE
guanyl nucleotide binding    1.000e+00 1.000e+00 mg    0.8767 FALSE
transition metal ion binding 2.060e-14 1.218e-11 zn    0.5193 TRUE
...but you may not be happy with the duplication of the description column in each which group. There might be a way to fix that by pulling the description term outside of the parentheses, but that seems to require some other magical incantation: the naive change fails with an error about duplicate values in combination with p.
Edit: So close to the magic incantation...
tabular(id ~ (description*Heading()*min)+Heading()*which*(p + padjust + metal + perc + sig)*Heading()*identity, data = test_table)
This looks right (maybe?). The issue appears to be that tabular really wants to apply a summary function to description. unique() would probably be a better choice of "dummy" summary function than min() in this case, and it seems to give the same result.
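To see why either function can serve as the dummy here, a base R sketch (not calling tabular itself) can look at the description values that fall under each id. The toy data frame is a hypothetical cut-down copy of test_table, and the sketch assumes the summary function is handed the description values for one id at a time.

```r
# Hypothetical cut-down copy of test_table: each id appears once per 'which' group.
toy <- data.frame(
  id = c("GO:0005634", "GO:0008270", "GO:0005634", "GO:0008270"),
  description = c("nucleus", "zinc ion binding", "nucleus", "zinc ion binding"),
  which = c("normal", "normal", "compress", "compress"),
  stringsAsFactors = FALSE
)

# The description values that belong to each id.
per_id <- split(toy$description, toy$id)

# Both functions collapse the repeated string to a single value per id.
by_unique <- vapply(per_id, unique, character(1))
by_min <- vapply(per_id, min, character(1))

print(by_unique)
```

unique() states the intent more directly, and it would return more than one value (and so fail loudly in this sketch) if an id ever carried two different descriptions, whereas min() would silently pick one.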
Edit: Latest refinement...
> tabular(id ~ (description*Heading()*unique)+Heading()*which*(p + padjust + metal + perc + sig)*Heading()*identity, data = test_table)
                                        compress
id         description                  p         padjust   metal perc sig
GO:0005525 GTP binding                  6.514e-09 9.063e-07 ca    0    TRUE
GO:0005634 nucleus                      1.000e+00 1.000e+00 ca    0    FALSE
GO:0008270 zinc ion binding             1.000e+00 1.000e+00 ca    0    FALSE
GO:0019001 guanyl nucleotide binding    2.320e-10 3.919e-08 ca    0    TRUE
GO:0046914 transition metal ion binding 1.000e+00 1.000e+00 ca    0    FALSE
normal
p         padjust   metal perc   sig
1.000e+00 1.000e+00 mg    0.8756 FALSE
4.922e-05 1.663e-02 zn    0.8417 TRUE
1.082e-21 8.526e-19 zn    0.8318 TRUE
1.000e+00 1.000e+00 mg    0.8767 FALSE
2.060e-14 1.218e-11 zn    0.5193 TRUE
Upvotes: 1