Kenrich

Reputation: 23

R: Iterate through a for loop to print multiple tables

The house price prediction dataset has about 80 variables and 1459 observations.
To understand the data better, I have separated out the variables of character ('chr') type.

char_variables = sapply(property_train, is.character)  
char_names = names(property_train[,char_variables])  
char_names

There are 42 variables of character type.
I want to count the observations in each category of every one of these variables.
For a single variable, the code is simple:

table(property_train$Zoning_Class)  

    Commer    FVR    RHD    RLD    RMD 
        10     65     16   1150    218

But repeating this for all 42 variables would be tedious.
The for loops I've tried to print all the tables show an error.

for (val in char_names){  
    print(table(property_train[[val]]))
    }


    Abnorml AdjLand  Alloca  Family  Normal Partial 
        101       4      12      20    1197     125 

Is there a way to iterate over char_names and print all 42 tables from the data frame?

str(property_train)

    'data.frame':   1459 obs. of  81 variables:  
     $ Id                       : int  1 2 3 4 5 6 7 8 9 10 ...  
     $ Building_Class           : int  60 20 60 70 60 50 20 60 50 190 ...  
     $ Zoning_Class             : chr  "RLD" "RLD" "RLD" "RLD" ...  
     $ Lot_Extent               : int  65 80 68 60 84 85 75 NA 51 50 ...  
     $ Lot_Size                 : int  8450 9600 11250 9550 14260 14115 10084 10382..   
     $ Road_Type                : chr  "Paved" "Paved" "Paved" "Paved" ...  
     $ Lane_Type                : chr  NA NA NA NA ...  
     $ Property_Shape           : chr  "Reg" "Reg" "IR1" "IR1" ...  
     $ Land_Outline             : chr  "Lvl" "Lvl" "Lvl" "Lvl" ...  

Upvotes: 0

Views: 1418

Answers (1)

pieterbons

Reputation: 1724

Actually, for me your code does not give an error (make sure to evaluate all lines in the for-loop together):

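# stringsAsFactors = FALSE keeps b and c as character columns on R < 4.0 too
property_train <- data.frame(a = 1:10,
                             b = rep(c("A","B"),5),
                             c = LETTERS[1:10],
                             stringsAsFactors = FALSE)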

char_variables = sapply(property_train, is.character)
char_names = names(property_train[,char_variables])
char_names

table(property_train$b)

for (val in char_names){
  print(table(property_train[val])) 
}
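
A loop-free variant, as a minimal sketch: lapply() over the character columns returns a named list of tables, which can be printed in one go or indexed by column name. The toy data frame df below is made up for illustration, and useNA = "ifany" is an optional addition so that missing values (such as those in Lane_Type) are counted instead of being silently dropped.

```r
# Hypothetical toy data standing in for property_train
df <- data.frame(a = 1:6,
                 b = c("A", "B", "A", "B", "A", NA),
                 c = c("x", "x", "y", "y", "y", "y"),
                 stringsAsFactors = FALSE)

# Logical vector marking the character columns
is_chr <- sapply(df, is.character)

# One frequency table per character column, returned as a named list;
# useNA = "ifany" adds an NA category only where NAs are present
tables <- lapply(df[is_chr], table, useNA = "ifany")

print(tables)    # prints every table, labelled by column name
tables[["b"]]    # a single table can be pulled out by name
```

Selecting with df[is_chr] (no comma) keeps the result a data frame even when only one column matches, so the column names are never lost.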

You can also get this result in a more user-friendly form using dplyr and tidyr, by pivoting all the character columns into long format and counting every column-value combination:

library(dplyr)
library(tidyr)

property_train %>% 
  select(where(is.character)) %>% 
  pivot_longer(cols = everything(), names_to = "column") %>% 
  group_by(column, value) %>% 
  summarise(freq = n(), .groups = "drop")  # drop grouping to avoid the regrouping message
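
If adding packages is not an option, roughly the same long-format summary can be built in base R. This is a sketch under assumptions, not a drop-in replacement: the toy data frame df and the result name freq_long are made up for illustration, and table() is left at its default, so NA values are not counted here.

```r
# Hypothetical toy data standing in for property_train
df <- data.frame(a = 1:6,
                 b = c("A", "B", "A", "B", "A", NA),
                 c = c("x", "x", "y", "y", "y", "y"),
                 stringsAsFactors = FALSE)

# Names of the character columns
chr_cols <- names(df)[sapply(df, is.character)]

# Build one small data frame of counts per column, then stack them
freq_long <- do.call(rbind, lapply(chr_cols, function(col) {
  counts <- table(df[[col]])
  data.frame(column = rep(col, length(counts)),
             value  = as.character(names(counts)),
             freq   = as.integer(counts),
             stringsAsFactors = FALSE)
}))

print(freq_long)
```

Each row of freq_long is one column-value combination with its count, so the result can be sorted or filtered like any other data frame.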

Upvotes: 0
