Reputation: 21282
I'm aware there are many posts on this already. I promise that I have looked at them. Nevertheless I'm struggling.
Below is a dput list which is the output of a call to lapply.
I would like a nice, easy-to-read data frame with 2 columns, one for TRUE and one for FALSE, with a row for each of the 25 list items.
Tried:
falsies <- lapply(my_list, function(x) table(tolower(x) %in% c("", "unknown", "\\?"))) %>%
  data.frame(do.call(rbind, .))
Error in data.frame(., do.call(rbind, .)) : arguments imply differing number of rows: 2, 25
falsies <- lapply(my_list, function(x) table(tolower(x) %in% c("", "unknown", "\\?"))) %>%
as.data.frame.matrix()
Error in seq_len(ncols) : argument must be coercible to non-negative integer
In addition: Warning message:
In seq_len(ncols) : first element used of 'length.out' argument
falsies <- lapply(my_list, function(x) table(tolower(x) %in% c("", "unknown", "\\?"))) %>% as.vector(t(.)) %>%
as.data.frame(Field = names(.), Value = unlist(.))
Error in as.vector(x, mode) : invalid 'mode' argument
How can I convert my list into a two-column data frame?
my_list <- structure(list(ID = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), Fiscal_Week_Date = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), FISCAL_WEEK = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), SU_CURRENT_RECORD_IND = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), PROFIT_CENTRE = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), ACTIVE_ON_BASE = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), SU_STATUS_ID = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), SU_BIRTH_DATE = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), SU_GENDER = structure(c(17193L,
13899L), .Dim = 2L, .Dimnames = structure(list(c("FALSE", "TRUE"
)), .Names = ""), class = "table"), AVERAGE_SPEND = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), CU_PAPERLESS_BILL_IND = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), SU_FIXED_MOBILE_IND = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), MMS_INDICATOR = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), INSURANCE_INDICATOR = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), INSURANCE_AMOUNT = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), PREFERRED_TOPUP_METHOD_DESC = structure(c(7672L,
23420L), .Dim = 2L, .Dimnames = structure(list(c("FALSE", "TRUE"
)), .Names = ""), class = "table"), BROADBAND_IND = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), ICT_IND = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), TENURE_IN_MONTHS = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), CONTRACT_TYPE = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), HA_DEVICE_CAPABILITY = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), Year = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), Week = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), Age = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table"), Target_New_Card = structure(31092L, .Dim = 1L, .Dimnames = structure(list(
"FALSE"), .Names = ""), class = "table")), .Names = c("ID",
"Fiscal_Week_Date", "FISCAL_WEEK", "SU_CURRENT_RECORD_IND", "PROFIT_CENTRE",
"ACTIVE_ON_BASE", "SU_STATUS_ID", "SU_BIRTH_DATE", "SU_GENDER",
"AVERAGE_SPEND", "CU_PAPERLESS_BILL_IND", "SU_FIXED_MOBILE_IND",
"MMS_INDICATOR", "INSURANCE_INDICATOR", "INSURANCE_AMOUNT", "PREFERRED_TOPUP_METHOD_DESC",
"BROADBAND_IND", "ICT_IND", "TENURE_IN_MONTHS", "CONTRACT_TYPE",
"HA_DEVICE_CAPABILITY", "Year", "Week", "Age", "Target_New_Card"
))
Upvotes: 2
Views: 2455
Reputation: 7630
There are a variety of ways to do this, but recognize that the output you requested will not be tidy, and so is not a typical or best-practice data frame. The primary challenge here is that your list is composed of tables: two of the elements (SU_GENDER and PREFERRED_TOPUP_METHOD_DESC) are tables of both FALSE and TRUE, and all of the others are tables of FALSE only. Because every element has the same total, the FALSE values alone contain all the information, but you can have your data in whatever form works for you :)
Here we don't assume that ID.FALSE equals the total count; instead we use an element of my_list with both TRUE and FALSE values to compute the total. Then we reduce every element to its FALSE count so that they are all in a compatible form, convert to a data.frame, add in the TRUE values, and voila!
total <- sum(my_list$PREFERRED_TOPUP_METHOD_DESC)
# keep only the FALSE count of each table, so every element has length 1
my_list <- lapply(my_list, function(x) x["FALSE"])
DF <- as.data.frame(unlist(my_list))
DF[2] <- total - DF[1]
names(DF) <- c("FALSE", "TRUE")
head(DF)
#                             FALSE TRUE
# ID.FALSE                    31092    0
# Fiscal_Week_Date.FALSE      31092    0
# FISCAL_WEEK.FALSE           31092    0
# SU_CURRENT_RECORD_IND.FALSE 31092    0
# PROFIT_CENTRE.FALSE         31092    0
# ACTIVE_ON_BASE.FALSE        31092    0
# a helpful pair of rows to convince yourself this worked
DF[c("SU_GENDER.FALSE", "PREFERRED_TOPUP_METHOD_DESC.FALSE"), ]
#                                   FALSE  TRUE
# SU_GENDER.FALSE                   17193 13899
# PREFERRED_TOPUP_METHOD_DESC.FALSE  7672 23420
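If you would rather not rely on every element having the same total, a more general sketch is to look up each level by name and treat a missing level as 0. This is only an illustration under assumptions: toy_list is a small made-up stand-in for my_list, and count_level is a hypothetical helper, not something from the question.

```r
# Toy stand-in for my_list (hypothetical data): one table with only FALSE,
# one table with both FALSE and TRUE, mirroring the shape of the real list.
toy_list <- list(
  ID        = table(c(FALSE, FALSE, FALSE)),
  SU_GENDER = table(c(FALSE, TRUE, TRUE))
)

# Hypothetical helper: return the count for `level`, or 0 if the table
# has no entry for that level.
count_level <- function(tab, level) {
  if (level %in% names(tab)) as.integer(tab[level]) else 0L
}

# One row per list element, one column per level.
# check.names = FALSE keeps the column names as "FALSE" and "TRUE".
counts <- data.frame(
  "FALSE" = vapply(toy_list, count_level, integer(1), level = "FALSE"),
  "TRUE"  = vapply(toy_list, count_level, integer(1), level = "TRUE"),
  check.names = FALSE
)
counts
```

Swapping toy_list for the original my_list should give one row per list item, with the item names as row names. Alternatively, wrapping the logical vector in factor(..., levels = c(FALSE, TRUE)) before calling table() in the original lapply should give every table both entries, in which case do.call(rbind, ...) ought to work directly.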
Upvotes: 0