Sandhya Ghildiyal
Sandhya Ghildiyal

Reputation: 273

Applying multiple formulae to selected columns of dataframe in R

With my dataset "chile2", I am doing following calculations:

Tab<-with(chile2,table(Q27,Q12_1_TEXT))
Tab<-as.data.frame.matrix(Tab)

I want the output like below (only 4 columns are shown):

I am trying below code to do manipulations to the columns, which is strangely calculating few columns but not all:

 for(col in names(T1)){
    T1[col]=(T1[col]*100)/colSums(T1)
 }

  for(t in names(T1)){
    T1[t]=paste(round(T1[t],0),"%")
  }

The "All" row can be added after this.

Any other better way to do this?

dput(chile2)

structure(list(Q27 = structure(c(2L, 1L, 2L, 2L, 1L, 2L, 2L, 
2L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L), .Label = c("Si", 
"No"), class = "factor"), Q12_1_TEXT = c("Abertis Autopistas S.a.", 
"Accenture", "Adessa Falabella", "Administradora de Fondos de Cesantía", 
"AES GENER S.A.", "AFP HABITAT S.A.", "Agrícola Ariztía", "Agrosuper S.A.", 
"Aguas de Antofagasta S.A.", "AIG Chile Compañía de Seguros Generales S.A.", 
"Arcos Dorados Restaurantes de Chile Ltda", "Artel", "Arval - Relsa", 
"Banchile Inversiones", "BANCO INTERNACIONAL", "BASF CHILE SA", 
"BBVA Chile", "BOEHRINGER INGELHEIM LTDA", "Bredenmaster S.A.", 
"Caja de Compensación 18 de Septiembre")), class = c("data.table", 
"data.frame"), row.names = c(NA, -20L), .internal.selfref = <pointer: 0x0000000000100788>, .Names = c("Q27", 
"Q12_1_TEXT"))

Upvotes: 1

Views: 81

Answers (2)

Sathish
Sathish

Reputation: 12723

dcast the chile2 data to get the length by Q27 variable. Then find the proportions using prop.table and append it with '%' character using paste0()

library('data.table')
Tab <- dcast(chile2, Q27 ~ Q12_1_TEXT, fun.aggregate = length, value.var = 'Q12_1_TEXT')
Tab[, 2:ncol(Tab) := lapply(.SD, function(x) paste0( prop.table(x)*100, "%")), .SDcols = -1 ]
Tab
#    Q27 AES GENER S.A. AFP HABITAT S.A. AIG Chile Compañía de Seguros Generales S.A. Abertis Autopistas S.a.
# 1:  Si           100%               0%                                         100%                      0%
# 2:  No             0%             100%                                           0%                    100%

Now you can add All row.


EDIT:

I converted Q27 column of Tab data table from factor to character so it is consistent. Then, I added Total column which takes row sum from column 2 to last column. The .SDcols = -1 includes all columns except the first column, because it is of character type. We want to work with numeric columns when making row sums. Then I created the ALL row, so that it can be rbind later. Then the proportions with in the data is computed and % character is added to it. Finally, the Tab and ALL data are combined together using rbindlist.

library('data.table')
Tab <- dcast(chile2, Q27 ~ Q12_1_TEXT, fun.aggregate = length, value.var = 'Q12_1_TEXT')
Tab[, Q27 := as.character(Q27)]  # convert column one from factor to character
Tab[, Total := rowSums(.SD), .SDcols = -1 ]  # add Total column
ALL <- as.list( c( Q27 = "ALL", colSums(Tab[, 2:ncol(Tab)]) ) )  # create ALL row
Tab[, 2:ncol(Tab) := lapply(.SD, function(x) paste0( prop.table(x)*100, "%")), .SDcols = -1 ]  # find proportions
Tab <- rbindlist(l = list(Tab, ALL)) # row bind Tab and ALL

Output:

str(Tab)
# Classes ‘data.table’ and 'data.frame':    3 obs. of  22 variables:
# $ Q27                                         : chr  "Si" "No" "ALL"
# $ AES GENER S.A.                              : chr  "100%" "0%" "1"
# $ AFP HABITAT S.A.                            : chr  "0%" "100%" "1"
# $ AIG Chile Compañía de Seguros Generales S.A.: chr  "100%" "0%" "1"
# $ Abertis Autopistas S.a.                     : chr  "0%" "100%" "1"
# $ Accenture                                   : chr  "100%" "0%" "1"
# $ Adessa Falabella                            : chr  "0%" "100%" "1"
# $ Administradora de Fondos de Cesantía        : chr  "0%" "100%" "1"
# $ Agrosuper S.A.                              : chr  "0%" "100%" "1"
# $ Agrícola Ariztía                            : chr  "0%" "100%" "1"
# $ Aguas de Antofagasta S.A.                   : chr  "0%" "100%" "1"
# $ Arcos Dorados Restaurantes de Chile Ltda    : chr  "0%" "100%" "1"
# $ Artel                                       : chr  "100%" "0%" "1"
# $ Arval - Relsa                               : chr  "0%" "100%" "1"
# $ BANCO INTERNACIONAL                         : chr  "100%" "0%" "1"
# $ BASF CHILE SA                               : chr  "0%" "100%" "1"
# $ BBVA Chile                                  : chr  "100%" "0%" "1"
# $ BOEHRINGER INGELHEIM LTDA                   : chr  "100%" "0%" "1"
# $ Banchile Inversiones                        : chr  "0%" "100%" "1"
# $ Bredenmaster S.A.                           : chr  "0%" "100%" "1"
# $ Caja de Compensación 18 de Septiembre       : chr  "0%" "100%" "1"
# $ Total                                       : chr  "35%" "65%" "20"
# - attr(*, ".internal.selfref")=<externalptr> 

Upvotes: 1

akrun
akrun

Reputation: 887651

We can do this with base R using addmargins and prop.table. From the table output in the OP's post, apply prop.table with margin specified as 2 (for columnwise). Create the Sum row with addmargins. Then, we multiply the rows except the last one with 100 and paste the % to the values

res <- addmargins(prop.table(Tab, 2), 1)
res[-nrow(res),] <- paste0(res[-nrow(res), ] * 100, "%")
dimnames(res)[[1]][3] <- "All"
names(dimnames(res)) <- NULL
res[,1:3]
#   Abertis Autopistas S.a. Accenture Adessa Falabella
#Si  0%                      100%      0%              
#No  100%                    0%        100%            
#All 1                       1         1           

Upvotes: 1

Related Questions