Stata_user
Stata_user

Reputation: 564

Descriptive table with number of observations by year and group

I would like to create a publication quality descriptive table in R with the number of observations by year and by group (spec).

I have the following data:

year    npatient nclaim spec
2011    1   2   1
2012    7   5   1
2013    6   4   1
2011    4   1   2
2012    6   9   2
2013    7   5   2
2011    10  10  3
2012    7   8   3
2013    5   3   3

I want to output to look something like this:

         2011 2012  2013
Spec =1         
   npatient 1   7   6
   nclaim   2   5   4
Spec =2         
   npatient 4   6   7
   nclaim   1   9   5
Spec =3         
   npatient 10  7   5
   nclaim   10  8   3

Thanks for your help.

Upvotes: 1

Views: 73

Answers (1)

NelsonGon
NelsonGon

Reputation: 13319

We can get similar output as follows using the tidyverse.

Note however that this exact output is not as far as I know supported in R(I do believe pandas from python can display it as it is in the question). It might be possible with newer pivot_* functions from tidyr:

df %>% 
   gather(key,val,-c(spec,year)) %>% 
   spread(year,val)
  spec      key 2011 2012 2013
1    1   nclaim    2    5    4
2    1 npatient    1    7    6
3    2   nclaim    1    9    5
4    2 npatient    4    6    7
5    3   nclaim   10    8    3
6    3 npatient   10    7    5

We can feed the above into kable(not very pretty):

df %>% 
  gather(key,val,-c(spec,year)) %>% 
   spread(year,val) %>% 
  knitr::kable()


| spec|key      | 2011| 2012| 2013|
|----:|:--------|----:|----:|----:|
|    1|nclaim   |    2|    5|    4|
|    1|npatient |    1|    7|    6|
|    2|nclaim   |    1|    9|    5|
|    2|npatient |    4|    6|    7|
|    3|nclaim   |   10|    8|    3|
|    3|npatient |   10|    7|    5|

Alternatively(less useful if the aim is publication ready prints):

df %>% 
   gather(key,val,-c(spec,year)) %>% 
   spread(year,val) %>% 
  split(.$spec)
$`1`
  spec      key 2011 2012 2013
1    1   nclaim    2    5    4
2    1 npatient    1    7    6

$`2`
  spec      key 2011 2012 2013
3    2   nclaim    1    9    5
4    2 npatient    4    6    7

$`3`
  spec      key 2011 2012 2013
5    3   nclaim   10    8    3
6    3 npatient   10    7    5

Data:

df <- structure(list(year = c(2011L, 2012L, 2013L, 2011L, 2012L, 2013L, 
2011L, 2012L, 2013L), npatient = c(1L, 7L, 6L, 4L, 6L, 7L, 10L, 
7L, 5L), nclaim = c(2L, 5L, 4L, 1L, 9L, 5L, 10L, 8L, 3L), spec = c(1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L)), class = "data.frame", row.names = c(NA, 
-9L))

Upvotes: 3

Related Questions