Gil33

Reputation: 123

R frequency table based on presence / absence samples

I wasn't quite sure how to search for this topic, so I apologize in advance if this question has already been asked. The existing questions about frequency tables didn't answer it.

I have the following data frame, where 1 indicates a positive result and 2 a negative one:

d1 <- data.frame( Household = c(1:5), State = c("AL","AL","AL","MI","MI"), Electricity = c(1,1,1,2,2),
Fuelwood = c(2,2,1,1,1))

I want to produce a frequency table showing, for each state, the percentage of households using only Electricity, only Fuelwood, and both Electricity and Fuelwood, such as d2:

d2 <- data.frame (State = c("AL", "MI"), Electricity = c(66.6,0), Fuelwood = c(0,100), ElectricityANDFuelwood = c(33.3,0))

Please consider that my real df has approx. 42 k households, 5 energy sources and 27 states.

Upvotes: 3

Views: 622

Answers (1)

akrun

Reputation: 887691

We can first identify the rows in d1 where both Electricity and Fuelwood are positive (1). Using that logical index, we set Electricity and Fuelwood to negative (2) in those rows, so that each household is counted in only one category. Then we create an additional column, ElectricityANDFuelwood, from the same index. Finally, we reshape from wide to long form with melt, keep only the State and variable columns for rows with a positive value, and use table and prop.table to compute the frequencies and relative frequencies.

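# Households that use both energy sources
indx <- with(d1, Electricity==1 & Fuelwood==1)

# Mark those households as negative (2) in the single-source columns; this overwrites d1
d1[indx,3:4] <- 2
# Add the combined 0/1 indicator and drop the Household column
dT <- transform(d1, ElectricityANDFuelwood= (indx)+0)[-1]

library(reshape2)
# Long format, keeping only the State/variable pairs with a positive value
dT1 <- subset(melt(dT, id.vars='State'), value==1, select=1:2)
# Row percentages per State
round(100*prop.table(table(dT1), margin=1),2)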
#      variable
#State Electricity Fuelwood ElectricityANDFuelwood
#  AL       66.67     0.00                  33.33
#  MI        0.00   100.00                   0.00

Or a data.table solution contributed by @David Arenburg:

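library(data.table)
# Start from the original d1 (re-create it if the code above already overwrote columns 3:4)
# Drop Household and flag households that use both sources
d2 <- as.data.table(d1[-1])[, ElectricityANDFuelwood :=
             (Electricity == 1 & Fuelwood == 1)]
# Set the single-source columns to negative (2) for those households
d2[(ElectricityANDFuelwood), (2:3) := 2]
# Percentage of households per State with a positive value in each column
d2[, lapply(.SD, function(x) 100*sum(x == 1)/.N), by = State]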
#   State Electricity Fuelwood ElectricityANDFuelwood
#1:    AL    66.66667        0               33.33333
#2:    MI     0.00000      100                0.00000
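
Both approaches above hard-code the two source columns. Since the real data reportedly has 5 energy sources, a more general variant might label each household with the combination of sources it uses and tabulate those labels by State. The following is only a base R sketch under assumptions: the names sources, used, combo and res are illustrative (not from the question), the "None" label for households with no positive source is my own choice, and d1 is rebuilt so the block is self-contained.

```r
# Rebuild the question's example data so the sketch is self-contained
d1 <- data.frame(Household = 1:5,
                 State = c("AL", "AL", "AL", "MI", "MI"),
                 Electricity = c(1, 1, 1, 2, 2),
                 Fuelwood = c(2, 2, 1, 1, 1))

# Assumed: the names of the energy-source columns (extend to all 5 in the real data)
sources <- c("Electricity", "Fuelwood")

# Logical matrix: TRUE where a household has a positive (1) result
used <- d1[sources] == 1

# One label per household, e.g. "Electricity" or "ElectricityANDFuelwood"
combo <- apply(used, 1, function(r) paste(sources[r], collapse = "AND"))
combo[combo == ""] <- "None"  # hypothetical label for households using no source

# Row percentages per State over all households
res <- round(100 * prop.table(table(d1$State, combo), margin = 1), 2)
res
```

For the example data this should give 66.67 (Electricity) and 33.33 (ElectricityANDFuelwood) for AL, and 100 (Fuelwood) for MI. Because every household gets exactly one label, each row sums to 100, and only combinations that actually occur appear as columns. With 5 sources there are up to 32 possible labels, so it may be worth checking how many distinct combinations the real data contains.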

Upvotes: 4
