Jovan

Reputation: 815

Dividing Summary table Matrix into a few table Matrix in R

I have a matrix with 17 columns and 1000 rows (all numeric). When I call summary(matrix) on it, I get this:

Picture 1) Summary Matrix

My question is: is there any way to split this summary table into a few smaller tables, like these?

          V1  V2  V3  V4  V5  V6
Min
1st Qu
Median
Mean
3rd Qu
Max

          V7  V8  V9  V10  V11  V12
Min
1st Qu
Median
Mean
3rd Qu
Max

          V13  V14  V15  V16  V17
Min
1st Qu
Median
Mean
3rd Qu
Max

I need to reserve space in my R Shiny app so that these tables are displayed without colliding with each other, as they do here:

Picture 2) Summary Colliding

Note: sorry that all I can provide is a picture.

Upvotes: 2

Views: 503

Answers (2)

G. Grothendieck

Reputation: 269852

1) read.dcf/unnest The elements of the matrix are of DCF form so we can use read.dcf and then unnest that:

library(tidyr)

s <- summary(mtcars)
DF <- read.dcf(textConnection(s), all = TRUE)
res <- setNames(data.frame(t(unnest(DF)), check.names = FALSE), trimws(colnames(s)))

giving:

> res
          mpg   cyl  disp    hp  drat    wt  qsec     vs     am  gear  carb
Min.    10.40 4.000  71.1  52.0 2.760 1.513 14.50 0.0000 0.0000 3.000 1.000
1st Qu. 15.43 4.000 120.8  96.5 3.080 2.581 16.89 0.0000 0.0000 3.000 2.000
Median  19.20 6.000 196.3 123.0 3.695 3.325 17.71 0.0000 0.0000 4.000 2.000
Mean    20.09 6.188 230.7 146.7 3.597 3.217 17.85 0.4375 0.4062 3.688 2.812
3rd Qu. 22.80 8.000 326.0 180.0 3.920 3.610 18.90 1.0000 1.0000 4.000 4.000
Max.    33.90 8.000 472.0 335.0 4.930 5.424 22.90 1.0000 1.0000 5.000 8.000

2) subset columns For reduced width this could be broken up into res[1:6] and res[7:11] or more generally if there are n columns and we want k columns per group except possibly for the last group:

n <- ncol(res)
k <- 6
g <- droplevels(gl(n, k, n)) # grouping vector
lapply(split(seq_len(n), g), function(j) res[j]) # res[j] keeps the row names

giving:

$`1`
          mpg   cyl  disp    hp  drat    wt
Min.    10.40 4.000  71.1  52.0 2.760 1.513
1st Qu. 15.43 4.000 120.8  96.5 3.080 2.581
Median  19.20 6.000 196.3 123.0 3.695 3.325
Mean    20.09 6.188 230.7 146.7 3.597 3.217
3rd Qu. 22.80 8.000 326.0 180.0 3.920 3.610
Max.    33.90 8.000 472.0 335.0 4.930 5.424

$`2`
         qsec     vs     am  gear  carb
Min.    14.50 0.0000 0.0000 3.000 1.000
1st Qu. 16.89 0.0000 0.0000 3.000 2.000
Median  17.71 0.0000 0.0000 4.000 2.000
Mean    17.85 0.4375 0.4062 3.688 2.812
3rd Qu. 18.90 1.0000 1.0000 4.000 4.000
Max.    22.90 1.0000 1.0000 5.000 8.000
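
As a rough base R sketch of the same idea, applied to a matrix shaped like the one in the question: the helper names chunk_cols, summary_pieces and summary_app are made up for illustration, the numeric summary comes from apply(mat, 2, summary) rather than from read.dcf, and it assumes the matrix has no missing values (otherwise the per-column summaries would differ in length). The last function shows one way each piece might get its own output in a Shiny app, assuming the shiny package is installed.

```r
# Split the column indices 1..n into consecutive groups of at most k.
chunk_cols <- function(n, k) {
  idx <- seq_len(n)
  split(idx, ceiling(idx / k))
}

# Numeric summary with one column per variable, cut into pieces of k columns.
# Assumption: no NA values, so every column yields the same six statistics.
summary_pieces <- function(mat, k = 6) {
  s <- apply(mat, 2, summary)
  lapply(chunk_cols(ncol(s), k), function(j) s[, j, drop = FALSE])
}

# Dummy data shaped like the question's matrix: 1000 rows, 17 columns.
set.seed(1)
mat <- matrix(rnorm(17000), ncol = 17,
              dimnames = list(NULL, paste0("V", 1:17)))
pieces <- summary_pieces(mat, k = 6)
sapply(pieces, ncol)  # column counts per piece: 6, 6, 5

# Hypothetical Shiny wrapper: one verbatim text block per piece, stacked
# vertically so the pieces cannot run into each other.
summary_app <- function(mat, k = 6) {
  pieces <- summary_pieces(mat, k)
  ids <- paste0("piece_", seq_along(pieces))
  ui <- shiny::fluidPage(
    lapply(ids, shiny::verbatimTextOutput)
  )
  server <- function(input, output, session) {
    lapply(seq_along(pieces), function(i) {
      output[[ids[i]]] <- shiny::renderPrint(round(pieces[[i]], 3))
    })
  }
  shiny::shinyApp(ui, server)
}
```

In an interactive session, shiny::runApp(summary_app(mat)) should show three stacked blocks with columns V1 to V6, V7 to V12 and V13 to V17. Changing k trades table width for the number of blocks.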

3) no transpose Another alternative for reduced width is to just not transpose it:

data.frame(unnest(DF), row.names = trimws(colnames(s)), check.names = FALSE)

giving:

     Min.    1st Qu. Median  Mean    3rd Qu. Max.   
mpg    10.40   15.43   19.20   20.09   22.80   33.90
cyl    4.000   4.000   6.000   6.188   8.000   8.000
disp    71.1   120.8   196.3   230.7   326.0   472.0
hp      52.0    96.5   123.0   146.7   180.0   335.0
drat   2.760   3.080   3.695   3.597   3.920   4.930
wt     1.513   2.581   3.325   3.217   3.610   5.424
qsec   14.50   16.89   17.71   17.85   18.90   22.90
vs    0.0000  0.0000  0.0000  0.4375  1.0000  1.0000
am    0.0000  0.0000  0.0000  0.4062  1.0000  1.0000
gear   3.000   3.000   4.000   3.688   4.000   5.000
carb   1.000   2.000   2.000   2.812   4.000   8.000

4) psych::describe A simple alternative is to use psych::describe:

library(psych)

describe(mtcars)

giving:

     vars  n   mean     sd median trimmed    mad   min    max  range  skew kurtosis    se
mpg     1 32  20.09   6.03  19.20   19.70   5.41 10.40  33.90  23.50  0.61    -0.37  1.07
cyl     2 32   6.19   1.79   6.00    6.23   2.97  4.00   8.00   4.00 -0.17    -1.76  0.32
disp    3 32 230.72 123.94 196.30  222.52 140.48 71.10 472.00 400.90  0.38    -1.21 21.91
hp      4 32 146.69  68.56 123.00  141.19  77.10 52.00 335.00 283.00  0.73    -0.14 12.12
drat    5 32   3.60   0.53   3.70    3.58   0.70  2.76   4.93   2.17  0.27    -0.71  0.09
wt      6 32   3.22   0.98   3.33    3.15   0.77  1.51   5.42   3.91  0.42    -0.02  0.17
qsec    7 32  17.85   1.79  17.71   17.83   1.42 14.50  22.90   8.40  0.37     0.34  0.32
vs      8 32   0.44   0.50   0.00    0.42   0.00  0.00   1.00   1.00  0.24    -2.00  0.09
am      9 32   0.41   0.50   0.00    0.38   0.00  0.00   1.00   1.00  0.36    -1.92  0.09
gear   10 32   3.69   0.74   4.00    3.62   1.48  3.00   5.00   2.00  0.53    -1.07  0.13
carb   11 32   2.81   1.62   2.00    2.65   1.48  1.00   8.00   7.00  1.05     1.26  0.29

5) Hmisc::describe Hmisc also has a describe function:

library(Hmisc)
describe(mtcars)

giving:

mtcars 

 11  Variables      32  Observations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
mpg 
       n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75      .90      .95 
      32        0       25    0.999    20.09    6.796    12.00    14.34    15.43    19.20    22.80    30.09    31.30 

lowest : 10.4 13.3 14.3 14.7 15.0, highest: 26.0 27.3 30.4 32.4 33.9
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
cyl 
       n  missing distinct     Info     Mean      Gmd 
      32        0        3    0.866    6.188    1.948 

Value          4     6     8
Frequency     11     7    14
Proportion 0.344 0.219 0.438

...etc...

6) skimr::skim This is a new package. It can produce spark graphics as part of the summary output; however, that depends on font support which may be tricky so we have disabled that part below. Note that skim requires a data frame as input so if your input is a matrix use skim(as.data.frame(input)).

library(skimr)
skim_with(numeric = list(hist = NULL)) # omit spark histogram
skim(mtcars) 

giving:

Skim summary statistics
 n obs: 32 
 n variables: 11 

Variable type: numeric 
   variable missing complete  n   mean     sd   min    p25 median    p75    max
1        am       0       32 32   0.41   0.5   0      0      0      1      1   
2      carb       0       32 32   2.81   1.62  1      2      2      4      8   
3       cyl       0       32 32   6.19   1.79  4      4      6      8      8   
4      disp       0       32 32 230.72 123.94 71.1  120.83 196.3  326    472   
5      drat       0       32 32   3.6    0.53  2.76   3.08   3.7    3.92   4.93
6      gear       0       32 32   3.69   0.74  3      3      4      4      5   
7        hp       0       32 32 146.69  68.56 52     96.5  123    180    335   
8       mpg       0       32 32  20.09   6.03 10.4   15.43  19.2   22.8   33.9 
9      qsec       0       32 32  17.85   1.79 14.5   16.89  17.71  18.9   22.9 
10       vs       0       32 32   0.44   0.5   0      0      0      1      1   
11       wt       0       32 32   3.22   0.98  1.51   2.58   3.33   3.61   5.42

If you want to try the spark graphics see: Skimr - cant seem to produce the histograms

7) pastecs::stat.desc The pastecs package also has a function that could be used:

library(pastecs)
stat.desc(mtcars)

giving:

                     mpg         cyl         disp           hp         drat          wt        qsec          vs          am        gear       carb
nbr.val       32.0000000  32.0000000 3.200000e+01   32.0000000  32.00000000  32.0000000  32.0000000 32.00000000 32.00000000  32.0000000 32.0000000
nbr.null       0.0000000   0.0000000 0.000000e+00    0.0000000   0.00000000   0.0000000   0.0000000 18.00000000 19.00000000   0.0000000  0.0000000
nbr.na         0.0000000   0.0000000 0.000000e+00    0.0000000   0.00000000   0.0000000   0.0000000  0.00000000  0.00000000   0.0000000  0.0000000
min           10.4000000   4.0000000 7.110000e+01   52.0000000   2.76000000   1.5130000  14.5000000  0.00000000  0.00000000   3.0000000  1.0000000
max           33.9000000   8.0000000 4.720000e+02  335.0000000   4.93000000   5.4240000  22.9000000  1.00000000  1.00000000   5.0000000  8.0000000
range         23.5000000   4.0000000 4.009000e+02  283.0000000   2.17000000   3.9110000   8.4000000  1.00000000  1.00000000   2.0000000  7.0000000
sum          642.9000000 198.0000000 7.383100e+03 4694.0000000 115.09000000 102.9520000 571.1600000 14.00000000 13.00000000 118.0000000 90.0000000
median        19.2000000   6.0000000 1.963000e+02  123.0000000   3.69500000   3.3250000  17.7100000  0.00000000  0.00000000   4.0000000  2.0000000
mean          20.0906250   6.1875000 2.307219e+02  146.6875000   3.59656250   3.2172500  17.8487500  0.43750000  0.40625000   3.6875000  2.8125000
SE.mean        1.0654240   0.3157093 2.190947e+01   12.1203173   0.09451874   0.1729685   0.3158899  0.08909831  0.08820997   0.1304266  0.2855297
CI.mean.0.95   2.1729465   0.6438934 4.468466e+01   24.7195501   0.19277224   0.3527715   0.6442617  0.18171719  0.17990541   0.2660067  0.5823417
var           36.3241028   3.1895161 1.536080e+04 4700.8669355   0.28588135   0.9573790   3.1931661  0.25403226  0.24899194   0.5443548  2.6088710
std.dev        6.0269481   1.7859216 1.239387e+02   68.5628685   0.53467874   0.9784574   1.7869432  0.50401613  0.49899092   0.7378041  1.6152000
coef.var       0.2999881   0.2886338 5.371779e-01    0.4674077   0.14866382   0.3041285   0.1001159  1.15203687  1.22828533   0.2000825  0.5742933

Upvotes: 3

Uwe

Reputation: 42564

Another possibility would be to create summary() piecewise:

library(data.table)
for (x in split(i <- seq_along(mtcars), i %/% 4)) 
  as.data.table(mtcars)[, print(summary(.SD)), .SDcols = x]
      mpg             cyl             disp      
 Min.   :10.40   Min.   :4.000   Min.   : 71.1  
 1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8  
 Median :19.20   Median :6.000   Median :196.3  
 Mean   :20.09   Mean   :6.188   Mean   :230.7  
 3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0  
 Max.   :33.90   Max.   :8.000   Max.   :472.0  
       hp             drat             wt             qsec      
 Min.   : 52.0   Min.   :2.760   Min.   :1.513   Min.   :14.50  
 1st Qu.: 96.5   1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89  
 Median :123.0   Median :3.695   Median :3.325   Median :17.71  
 Mean   :146.7   Mean   :3.597   Mean   :3.217   Mean   :17.85  
 3rd Qu.:180.0   3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90  
 Max.   :335.0   Max.   :4.930   Max.   :5.424   Max.   :22.90  
       vs               am              gear            carb      
 Min.   :0.0000   Min.   :0.0000   Min.   :3.000   Min.   :1.000  
 1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
 Median :0.0000   Median :0.0000   Median :4.000   Median :2.000  
 Mean   :0.4375   Mean   :0.4062   Mean   :3.688   Mean   :2.812  
 3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
 Max.   :1.0000   Max.   :1.0000   Max.   :5.000   Max.   :8.000

or simulating OP's matrix:

# create dummy data
mat <- matrix(1:17000, ncol = 17)
# set column names
colnames(mat) <- 1:17
# print summary piecewise
for (x in split(i <- seq_len(ncol(mat)), i %/% 6)) 
  print(summary(mat[, x]))
       1                2              3              4              5       
 Min.   :   1.0   Min.   :1001   Min.   :2001   Min.   :3001   Min.   :4001  
 1st Qu.: 250.8   1st Qu.:1251   1st Qu.:2251   1st Qu.:3251   1st Qu.:4251  
 Median : 500.5   Median :1500   Median :2500   Median :3500   Median :4500  
 Mean   : 500.5   Mean   :1500   Mean   :2500   Mean   :3500   Mean   :4500  
 3rd Qu.: 750.2   3rd Qu.:1750   3rd Qu.:2750   3rd Qu.:3750   3rd Qu.:4750  
 Max.   :1000.0   Max.   :2000   Max.   :3000   Max.   :4000   Max.   :5000  
       6              7              8              9              10              11       
 Min.   :5001   Min.   :6001   Min.   :7001   Min.   :8001   Min.   : 9001   Min.   :10001  
 1st Qu.:5251   1st Qu.:6251   1st Qu.:7251   1st Qu.:8251   1st Qu.: 9251   1st Qu.:10251  
 Median :5500   Median :6500   Median :7500   Median :8500   Median : 9500   Median :10500  
 Mean   :5500   Mean   :6500   Mean   :7500   Mean   :8500   Mean   : 9500   Mean   :10500  
 3rd Qu.:5750   3rd Qu.:6750   3rd Qu.:7750   3rd Qu.:8750   3rd Qu.: 9750   3rd Qu.:10750  
 Max.   :6000   Max.   :7000   Max.   :8000   Max.   :9000   Max.   :10000   Max.   :11000  
       12              13              14              15              16              17       
 Min.   :11001   Min.   :12001   Min.   :13001   Min.   :14001   Min.   :15001   Min.   :16001  
 1st Qu.:11251   1st Qu.:12251   1st Qu.:13251   1st Qu.:14251   1st Qu.:15251   1st Qu.:16251  
 Median :11500   Median :12500   Median :13500   Median :14500   Median :15500   Median :16500  
 Mean   :11500   Mean   :12500   Mean   :13500   Mean   :14500   Mean   :15500   Mean   :16500  
 3rd Qu.:11750   3rd Qu.:12750   3rd Qu.:13750   3rd Qu.:14750   3rd Qu.:15750   3rd Qu.:16750  
 Max.   :12000   Max.   :13000   Max.   :14000   Max.   :15000   Max.   :16000   Max.   :17000

Note that in the matrix case it is recommended / required to have column names explicitly set. If the respective matrix attribute is not set, summary() uses default column names which always start at V1.
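
The effect described in this note can be checked with a small sketch on the same kind of dummy matrix (the variable names unnamed_labels and named_labels are illustrative only):

```r
mat <- matrix(1:17000, ncol = 17)  # no column names set yet

# Without column names, a piece taken from columns 7 to 9 is labelled
# from V1 again, which hides where the columns came from.
unnamed_labels <- trimws(colnames(summary(mat[, 7:9])))
unnamed_labels  # "V1" "V2" "V3"

# With explicit names, each piece keeps the original column labels.
colnames(mat) <- paste0("V", seq_len(ncol(mat)))
named_labels <- trimws(colnames(summary(mat[, 7:9])))
named_labels  # "V7" "V8" "V9"
```

trimws() is used here only because summary() pads the column labels with blanks to centre them over the values.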

Upvotes: 0
