user23438

Reputation: 455

Looping over variables in a dataframe

Let's say I have a small dataset:

x1 = c(rep("A",10),rep("B",5),rep("C",20))
x2 = c(rep("D",15),rep("E",7),rep("F",13))
x3 = c(rep("H",20),rep("I",15))
y = c(rep("yes",7),rep("no",20),rep("NA",8))
data1 = data.frame(x1,x2,x3,y)

Now I want to loop over the variables x1 to x3. More precisely, I would like to do the following for each of them:

prop.table(table(data1$x1,data1$y),margin=2)
prop.table(table(data1$x2,data1$y),margin=2)
prop.table(table(data1$x3,data1$y),margin=2)

I have tried writing a loop, but I must be missing something obvious because it is not working. A quick hint would be appreciated.

Upvotes: 1

Views: 2458

Answers (3)

thelatemail

Reputation: 93803

I'll give a variation here and suggest stacking the data into long form so the tabulation is done once. Your output tables will then have the same dimensions for each variable:

data1[1:3] <- lapply(data1[1:3], as.character) # only needed if the columns are factors (the data.frame default before R 4.0.0)
long <- cbind(stack(data1[1:3]), data1[4])     # stack x1-x3 into values/ind; y is recycled alongside
with(long, table(values, y, ind))

Output:

, , ind = x1
      y
values NA no yes
     A  0  3   7
     B  0  5   0
     C  8 12   0
     D  0  0   0
     E  0  0   0
     F  0  0   0
     H  0  0   0
     I  0  0   0

, , ind = x2
      y
values NA no yes
     A  0  0   0
     B  0  0   0
     C  0  0   0
     D  0  8   7
     E  0  7   0
     F  8  5   0
     H  0  0   0
     I  0  0   0

, , ind = x3
      y
values NA no yes
     A  0  0   0
     B  0  0   0
     C  0  0   0
     D  0  0   0
     E  0  0   0
     F  0  0   0
     H  0 13   7
     I  8  7   0
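
The tables above are counts. As a minimal sketch, assuming the goal is still the column proportions asked for in the question, `prop.table` can be applied to the three-way table with `margin = c(2, 3)`, so that each `y` column within each `ind` slice sums to 1. The names `counts` and `props` are made up for illustration and are not part of the code above.

```r
# Rebuild the example data from the question
x1 <- c(rep("A", 10), rep("B", 5), rep("C", 20))
x2 <- c(rep("D", 15), rep("E", 7), rep("F", 13))
x3 <- c(rep("H", 20), rep("I", 15))
y  <- c(rep("yes", 7), rep("no", 20), rep("NA", 8))
data1 <- data.frame(x1, x2, x3, y, stringsAsFactors = FALSE)

# Stack x1-x3 into long form; y is recycled once per stacked variable
long <- cbind(stack(data1[1:3]), data1[4])

# Three-way table of counts: value x y x variable
counts <- with(long, table(values, y, ind))

# Divide each cell by its (y, ind) total, i.e. column proportions per variable
props <- prop.table(counts, margin = c(2, 3))

props[, , "x1"]
```

Rows for values that never occur in a given variable (for example `D` under `x1`) stay in the result as all zeros, which is the price of keeping every slice the same shape.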

Upvotes: 2

Onyambu

Reputation: 79188

Map(function(x) prop.table(table(x, data1$y), margin = 2), data1[-4])
$`x1`

x     NA   no  yes
  A 0.00 0.15 1.00
  B 0.00 0.25 0.00
  C 1.00 0.60 0.00

$x2

x     NA   no  yes
  D 0.00 0.40 1.00
  E 0.00 0.35 0.00
  F 1.00 0.25 0.00

$x3

x     NA   no  yes
  H 0.00 0.65 1.00
  I 1.00 0.35 0.00

or you can use

lapply(data1[-4], function(x) prop.table(table(x, data1$y), margin = 2))

Upvotes: 0

Rafael Díaz

Reputation: 2289

You can use either a for loop or the lapply function:

# Option 1: for loop over the first three columns; print() is needed inside a loop
for (i in 1:3) {
  print(prop.table(table(data1[, i], data1$y), margin = 2))
}

# Option 2
lapply(data1[,-4], function(x) prop.table(table(x,data1$y),margin=2))
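
If you want to keep the tables rather than only print them, a loop over column names is one possibility. This is only a sketch under the question's data; `vars` and `results` are illustrative names that do not appear in the question.

```r
# Rebuild the example data from the question
x1 <- c(rep("A", 10), rep("B", 5), rep("C", 20))
x2 <- c(rep("D", 15), rep("E", 7), rep("F", 13))
x3 <- c(rep("H", 20), rep("I", 15))
y  <- c(rep("yes", 7), rep("no", 20), rep("NA", 8))
data1 <- data.frame(x1, x2, x3, y)

vars <- c("x1", "x2", "x3")   # columns to tabulate against y
results <- list()             # one proportion table per variable

for (v in vars) {
  # data1[[v]] extracts the column by name; data1$v would look for a column called "v"
  results[[v]] <- prop.table(table(data1[[v]], data1$y), margin = 2)
}

results$x3
```

Indexing with `data1[[v]]` instead of `data1$v` is a common stumbling block when a loop like this "does not work", since `$` does not evaluate the loop variable.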

Upvotes: 4
