How to generate a (Pivot) table containing calculations of shares using dplyr

Question

I would like to generate a table from a dataset. Two of the variables to be displayed are strings and the other three are numeric. The numeric variables hold absolute numbers (the sum of students starting at different universities). Next to these absolute numbers, I want the table to show the respective shares of women as percentages. The dataset contains both the totals and the numbers of women, in absolute terms, as separate variables (columns).

Grouped by x1 and x2, the table should contain columns with the total sums (x3a, x4a, x5a) and the shares of female students in percent (x3b, x4b, x5b).

As the dataset already contains the total sums in a variable, I think I just need to add that variable to the code somewhere after some sort of grouping function. The shares, however, still need to be calculated and written into a new variable/column, and I can't figure out the code for the whole table. I know it should involve the group_by, summarise and mutate functions, and I believe the strings have to be converted to factors to get the code running, but I haven't found a solution yet.

Any help would be greatly appreciated!

This is how it should look:

library(dplyr)

df <- data.frame(
  x1 = rep("chr", 10),
  x2 = rep("chr", 10),
  x3 = c(1, 0, 0, NA, 0, 1, 1, NA, 0, 1),   # x3 = year of university start
  # x3a = containing total number (of students starting university)
  # x3b = containing percentage of female students, calculated on x3a
  x4 = c(1, 1, NA, 1, 1, 0, NA, NA, 0, 1),  # x4 = year of university start
  # x4a
  # x4b
  x5 = c(1, 0, NA, 1, 0, 0, NA, 0, 0, 1)    # x5 = year of university start
  # x5a
  # x5b
)

> df

I tried summarise and mutate functions but didn't manage to get the described table with correctly calculated shares.
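
For illustration, here is a minimal sketch of the kind of pipeline described above: group by x1 and x2, sum the totals, then derive the share of women in percent. It makes several assumptions that are not in the example data. The data frame `students`, the columns `x3_total`, `x3_women`, `x4_total` and `x4_women`, and all values are hypothetical stand-ins for the "totals" and "number of women" variables. Note that group_by accepts character columns directly, so converting the strings to factors should not be necessary.

```r
library(dplyr)

# Hypothetical data: column names and values are made up for illustration.
# *_total = number of students starting in that year,
# *_women = number of women among them.
students <- data.frame(
  x1       = c("A", "A", "A", "B", "B", "B"),
  x2       = c("u", "u", "v", "u", "v", "v"),
  x3_total = c(10, 20, 30, 40, 50, NA),
  x3_women = c(5, 10, 15, 10, 20, NA),
  x4_total = c(8, 12, 20, 10, NA, 30),
  x4_women = c(4, 6, 5, 5, NA, 15)
)

shares <- students %>%
  group_by(x1, x2) %>%
  # Sum totals and women per group, ignoring missing values.
  summarise(
    x3a     = sum(x3_total, na.rm = TRUE),
    x3_wsum = sum(x3_women, na.rm = TRUE),
    x4a     = sum(x4_total, na.rm = TRUE),
    x4_wsum = sum(x4_women, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  # Share of women in percent, computed from the group sums.
  mutate(
    x3b = round(100 * x3_wsum / x3a, 1),
    x4b = round(100 * x4_wsum / x4a, 1)
  ) %>%
  # Keep only the grouping columns, totals and shares.
  select(x1, x2, x3a, x3b, x4a, x4b)

print(shares)
```

The shares are computed from the summed counts rather than by averaging per-row percentages, so larger rows weigh more. With na.rm = TRUE, a row where only one of the two counts is missing would distort the share, and a group whose total is 0 yields NaN; both cases would need separate handling.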
