Reputation: 225
I am using the breast cancer data (brca) from the dslabs package, and kableExtra to produce nice tables. I have many columns, so I tried scale_down and splitting the summary into tables of 11-12 variables at a time, but the result is still too small to read.
Question 1: Is there a way to produce ONE table whose rows are my variables (the current columns) and whose columns are the statistics min, max, mean, ...? I am sure there is an efficient way to do this.
# my current code
library(dslabs)      # brca data
library(dplyr)       # %>% and rename()
library(knitr)       # kable()
library(kableExtra)  # kable_styling()

data(brca)
# cbind() turns the factor brca$y into its integer codes (1/2) in column V1
data = data.frame(cbind(brca$y, brca$x))
data = data %>%
  rename(Diagnosis = V1)
desc = summary(data)
kable(desc[, 1:12], caption = "Description of variables", booktabs = T) %>%
  kable_styling(latex_options = c("striped", "scale_down"))
kable(desc[, 13:21], caption = "Description of variables", booktabs = T) %>%
  kable_styling(latex_options = c("striped", "scale_down"))
kable(desc[, 22:31], caption = "Description of variables", booktabs = T) %>%
  kable_styling(latex_options = c("striped", "scale_down"))
Question 2: If you have any ideas or resources on how to include Rmd output in LaTeX, that would be appreciated. Right now I produce my tables in R, take a screenshot of each one, and insert the screenshots into my LaTeX document, which is exhausting because I have so many tables.
Thank you in advance for your help.
Upvotes: 2
Views: 683
Reputation: 18752
Question 1:
You can summarize all the columns in your data frame with dplyr::summarize_all. This outputs a wide dataset with one row, and with as many columns as the number of columns in your dataset times the number of summary statistics you request. For example, it will contain texture_mean_mean, texture_mean_med and texture_mean_max.
tidyr::pivot_longer then pivots this wide dataset into the long format you want, using the names_to and names_pattern arguments. (.*)_(.*)$ is a regular expression that captures two things: everything before the last underscore and everything after the last underscore, e.g. (texture_mean)_(mean). The first capture becomes a value in a column named "variable"; the second capture becomes the name of a new column (because of the special ".value" entry in names_to), which holds the corresponding statistic.
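To see what the pattern captures, here is a quick check in base R. The column names are just examples of the kind summarize_all would produce, and sub() is used only for illustration; pivot_longer does the matching itself.

```r
# Example names of the form <variable>_<statistic>
nms <- c("texture_mean_mean", "texture_mean_med", "fractal_dim_worst_max")

# First capture group: everything before the last underscore
variable_part <- sub("(.*)_(.*)$", "\\1", nms)

# Second capture group: everything after the last underscore
stat_part <- sub("(.*)_(.*)$", "\\2", nms)

print(variable_part)
print(stat_part)
```

Because the first (.*) is greedy, the split always happens at the last underscore, so variable names that themselves contain underscores stay intact.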
data %>%
  dplyr::summarize_all(list(mean = ~mean(.),
                            med  = ~median(.),
                            max  = ~max(.))) %>%
  tidyr::pivot_longer(everything(),
                      names_to = c("variable", ".value"),
                      names_pattern = "(.*)_(.*)$")
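Since dplyr 1.0.0, summarize_all is superseded by across(). A minimal sketch of the same idea with across(), including the min the question asked for, could look like the following. The toy data frame and its values are made up for illustration; with the real data you would start from data instead.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the brca columns (values are invented for this sketch)
toy <- data.frame(
  radius_mean  = c(10, 12, 14),
  texture_mean = c(20, 25, 30)
)

stats <- toy %>%
  # One output column per (input column, statistic), named <col>_<stat>
  summarise(across(everything(),
                   list(min = min, mean = mean, max = max),
                   .names = "{.col}_{.fn}")) %>%
  # Split at the last underscore: variable name vs. statistic name
  pivot_longer(everything(),
               names_to = c("variable", ".value"),
               names_pattern = "(.*)_(.*)$")

print(stats)
```

The result has one row per variable and one column per statistic, which is the layout the question describes.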
Question 2:
Look into the Hmisc::latex function. It outputs LaTeX code to a file, or prints it to the console when file = "":
data %>%
  # same three statistics as above, so the columns match the output shown
  dplyr::summarize_all(list(mean = ~mean(.),
                            med  = ~median(.),
                            max  = ~max(.))) %>%
  tidyr::pivot_longer(everything(),
                      names_to = c("variable", ".value"),
                      names_pattern = "(.*)_(.*)$") %>%
  Hmisc::latex(na.blank = TRUE,
               booktabs = TRUE,
               table.env = FALSE,
               center = "none",
               file = "",
               title = "")
This will output:
%latex.default(., na.blank = TRUE, booktabs = TRUE, table.env = FALSE, center = "none", file = "", title = "")%
\begin{tabular}{llrrr}
\toprule
\multicolumn{1}{l}{}&\multicolumn{1}{c}{variable}&\multicolumn{1}{c}{mean}&\multicolumn{1}{c}{med}&\multicolumn{1}{c}{max}\tabularnewline
\midrule
1&Diagnosis&$1.37258347978910e+00$&$1.000e+00$&$2.000e+00$\tabularnewline
2&radius_mean&$1.41272917398946e+01$&$1.337e+01$&$2.811e+01$\tabularnewline
3&texture_mean&$1.92896485061511e+01$&$1.884e+01$&$3.928e+01$\tabularnewline
4&perimeter_mean&$9.19690333919156e+01$&$8.624e+01$&$1.885e+02$\tabularnewline
5&area_mean&$6.54889103690685e+02$&$5.511e+02$&$2.501e+03$\tabularnewline
6&smoothness_mean&$9.63602811950791e-02$&$9.587e-02$&$1.634e-01$\tabularnewline
7&compactness_mean&$1.04340984182777e-01$&$9.263e-02$&$3.454e-01$\tabularnewline
8&concavity_mean&$8.87993158172232e-02$&$6.154e-02$&$4.268e-01$\tabularnewline
9&concave_pts_mean&$4.89191458699473e-02$&$3.350e-02$&$2.012e-01$\tabularnewline
10&symmetry_mean&$1.81161862917399e-01$&$1.792e-01$&$3.040e-01$\tabularnewline
11&fractal_dim_mean&$6.27976098418278e-02$&$6.154e-02$&$9.744e-02$\tabularnewline
12&radius_se&$4.05172056239016e-01$&$3.242e-01$&$2.873e+00$\tabularnewline
13&texture_se&$1.21685342706503e+00$&$1.108e+00$&$4.885e+00$\tabularnewline
14&perimeter_se&$2.86605922671353e+00$&$2.287e+00$&$2.198e+01$\tabularnewline
15&area_se&$4.03370790861160e+01$&$2.453e+01$&$5.422e+02$\tabularnewline
16&smoothness_se&$7.04097891036907e-03$&$6.380e-03$&$3.113e-02$\tabularnewline
17&compactness_se&$2.54781388400703e-02$&$2.045e-02$&$1.354e-01$\tabularnewline
18&concavity_se&$3.18937163444640e-02$&$2.589e-02$&$3.960e-01$\tabularnewline
19&concave_pts_se&$1.17961370826011e-02$&$1.093e-02$&$5.279e-02$\tabularnewline
20&symmetry_se&$2.05422987697715e-02$&$1.873e-02$&$7.895e-02$\tabularnewline
21&fractal_dim_se&$3.79490386643234e-03$&$3.187e-03$&$2.984e-02$\tabularnewline
22&radius_worst&$1.62691898066784e+01$&$1.497e+01$&$3.604e+01$\tabularnewline
23&texture_worst&$2.56772231985940e+01$&$2.541e+01$&$4.954e+01$\tabularnewline
24&perimeter_worst&$1.07261212653779e+02$&$9.766e+01$&$2.512e+02$\tabularnewline
25&area_worst&$8.80583128295255e+02$&$6.865e+02$&$4.254e+03$\tabularnewline
26&smoothness_worst&$1.32368594024605e-01$&$1.313e-01$&$2.226e-01$\tabularnewline
27&compactness_worst&$2.54265043936731e-01$&$2.119e-01$&$1.058e+00$\tabularnewline
28&concavity_worst&$2.72188483304042e-01$&$2.267e-01$&$1.252e+00$\tabularnewline
29&concave_pts_worst&$1.14606223198594e-01$&$9.993e-02$&$2.910e-01$\tabularnewline
30&symmetry_worst&$2.90075571177504e-01$&$2.822e-01$&$6.638e-01$\tabularnewline
31&fractal_dim_worst&$8.39458172231986e-02$&$8.004e-02$&$2.075e-01$\tabularnewline
\bottomrule
\end{tabular}
For more information, check out this question (specifically the answer that uses the latex function).
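Since the question already uses kable, another possible route for Question 2 is to have knitr::kable emit LaTeX source and write it to a file, instead of taking screenshots. This is only a sketch: the small data frame (with two means rounded from the output above) and the file name summary_table.tex are placeholders chosen for illustration.

```r
library(knitr)

# Placeholder summary table; in practice this would be the pivoted data frame
tab <- data.frame(variable = c("radius_mean", "texture_mean"),
                  mean     = c(14.13, 19.29))

# format = "latex" returns the tabular source as text instead of rendering it
tex <- kable(tab, format = "latex", booktabs = TRUE, digits = 2)

# "summary_table.tex" is an arbitrary file name used only in this sketch
writeLines(as.character(tex), "summary_table.tex")
```

In the LaTeX document you would then load the booktabs package in the preamble and pull the table in with \input{summary_table.tex}, so rerunning the R script updates the table without any manual step.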
Upvotes: 2