OGC

Reputation: 274

Error message for a subset of dataframe: number of rows of result is not a multiple of vector length (arg 1)

I am trying to build a summary table for a subset of the variables in my data frame. The subset should contain only the numeric variables, so I created it and named it numtable. However, I am getting a warning message:

Warning message:
In cbind(Variables, Missing, Type, Min, Max, Mean, SD) :
  number of rows of result is not a multiple of vector length (arg 1)

Here is the code:

#Summary of the numeric variables in the dataset
numvars <- names(df) %in% c("age_years", "sex", "sop1", "servcode", "los", "admt", "ptstatus", "msdrg",
                            "msmdc", "sm_er", "diagnosis_order1", "diagnosis_order2", "diagnosis_order3", "diagnosis_order4", "diagnosis_order5",
                            "diagnosis_order1", "diagnosis_order1", "diagnosis_order1", "diagnosis_order1", "diagnosis_order1", "diagnosis_order1", 
                            "diagnosis_order6", "diagnosis_order7", "diagnosis_order8", "diagnosis_order9", "diagnosis_order10", "diagnosis_order11", "diagnosis_order12",
                            "diagnosis_order13", "diagnosis_order14", "diagnosis_order15", "diagnosis_order16", "diagnosis_order17", "diagnosis_order18", "diagnosis_order19", 
                            "diagnosis_order20" ,"diagnosis_order21", "diagnosis_order22", "diagnosis_order23", "diagnosis_order24", "diagnosis_order25", "diagnosis_order26", 
                            "diagnosis_order27", "diagnosis_order28", "diagnosis_order29", "diagnosis_order30", "diagnosis_order31", "diagnosis_order32", "diagnosis_order33", 
                            "diagnosis_order34", "diagnosis_order35")
numtable <- df[numvars]
Variables <- names(numtable)
Missing <- sapply(numtable, function(x) sum(is.na(x)))
Type <- sapply(numtable, function(x) class(x))
Min <- sapply(numtable, function(x) min(x, na.rm = TRUE))
Max <- sapply(numtable, function(x) max(x, na.rm = TRUE))
SD <- sapply(numtable, function(x) format(round(sd(x, na.rm=TRUE), 2), nsmall = 2))
Mean <- sapply(numtable, function(x) format(round(mean(x, na.rm=TRUE), 2), nsmall = 2))
#To get the simple table
knitr::kable(as.data.frame(cbind(Variables, Missing, Type, Min, Max, Mean, SD), row.names = FALSE))
#To get the Latex table for the rows 
knitr::kable(as.data.frame(cbind(Variables, Missing, Type, Min, Max, Mean, SD), row.names = FALSE), "latex")

Suppose I have this dataframe:

df <- data.frame(age_years = c(33, 11, 45, 67, 8, 99), sex = factor(c(0, 1, 1, 0, 0, 0)))

    > df
      age_years sex
    1        33   0
    2        11   1
    3        45   1
    4        67   0
    5         8   0
    6        99   0

Upvotes: 0

Views: 259

Answers (1)

Ronak Shah

Reputation: 388862

My guess is that one of the variables you selected manually is not numeric. What does Type return? Is every entry "numeric" (or "integer")?
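One way to check what Type returns is to look at its shape, not just its values. The sketch below uses a hypothetical three-column data frame (not the asker's data) in which every column has a two-element class, so sapply() simplifies the result to a matrix instead of a vector with one entry per column. Passing such a matrix to cbind() alongside a plain vector like Variables is one situation that can produce the "not a multiple of vector length" warning. Whether this is what happens in the original data is an assumption that only inspecting Type can confirm.

```r
# Hypothetical stand-in for numtable: every column has class c("POSIXct", "POSIXt")
numtable <- data.frame(
  a = as.POSIXct(c("2020-01-01", "2020-01-02"), tz = "UTC"),
  b = as.POSIXct(c("2020-02-01", "2020-02-02"), tz = "UTC"),
  c = as.POSIXct(c("2020-03-01", "2020-03-02"), tz = "UTC")
)

Variables <- names(numtable)
Type <- sapply(numtable, class)

# Each summary piece should be a plain vector with one element per column.
# A matrix (or a list) here means sapply() got more than one value per column.
str(Type)
is.matrix(Type)
dim(Type)

# Taking only the first class guarantees exactly one string per column
Type_first <- vapply(numtable, function(x) class(x)[1], character(1))
length(Type_first) == length(Variables)
```

The same str() check can be run on Missing, Min, Max, Mean and SD to see which piece, if any, is not a simple vector of length ncol(numtable).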

Also, you can select the variables dynamically instead of listing them by hand, which is error-prone. Try this approach:

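#as.is = TRUE keeps character columns as character and avoids the warning that recent R versions give when as.is is not specified
df <- type.convert(df, as.is = TRUE)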
numtable <- Filter(is.numeric, df)
Variables <- names(numtable)
Missing <- sapply(numtable, function(x) sum(is.na(x)))
Type <- sapply(numtable, function(x) class(x))
Min <- sapply(numtable, function(x) min(x, na.rm = TRUE))
Max <- sapply(numtable, function(x) max(x, na.rm = TRUE))
SD <- sapply(numtable, function(x) format(round(sd(x, na.rm=TRUE), 2), nsmall = 2))
Mean <- sapply(numtable, function(x) format(round(mean(x, na.rm=TRUE), 2), nsmall = 2))
#To get the simple table
knitr::kable(data.frame(Variables, Missing, Type, Min, Max, Mean, SD))
#To get the Latex table for the rows 
knitr::kable(data.frame(Variables, Missing, Type, Min, Max, Mean, SD), "latex")
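To see what the dynamic selection does, here is a small sketch on the example data frame from the question. The names before, converted and after are introduced only for illustration. It assumes that type.convert() turns the factor sex, whose levels are "0" and "1", into an integer column, which is the documented behaviour for values that parse as numbers.

```r
df <- data.frame(age_years = c(33, 11, 45, 67, 8, 99),
                 sex = factor(c(0, 1, 1, 0, 0, 0)))

# Without conversion, the factor column is not numeric and gets dropped
before <- Filter(is.numeric, df)
names(before)

# After conversion, both columns pass the is.numeric test
converted <- type.convert(df, as.is = TRUE)
after <- Filter(is.numeric, converted)
names(after)
sapply(after, class)
```

If a column that should be numeric is still missing from the result, it probably contains values that do not parse as numbers, which is worth checking before summarising.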

Upvotes: 1
