Reputation: 125
I have this dataframe:
df <- data.frame(item_number = c(1, 2, 3, 3),
description = c("Big Wheel Agriculture Cart, 300-Lb Capacity, 32.75w X 58d X 28.25h, Black", "Dome Lids For 16\" Cater Trays, 16\" Diameter X 2.5\"h, Clear, 50/carton", "Aluminum Cater Trays, 1 Compartment, 18\" Diameter X 0.94\"h, Silver, 50/carton", "Aluminum Cater Trays, 7 Compartment Lazy Susan, 18\" Diameter X 0.94\"h, Silver, 50/carton"))
I want to split the description column (using commas as the delimiter) based on the category number. I want to split category_number 1 into 4 columns (product name, capacity, item_dimensions, color), but for category number 3 I want split into 5 columns (product_name, compartments, item dimensions, color, quantity). I want these new columns to be attached to the original df dataframe.
I tried case_when and if_else but I am unable to conditionally split as I described above. I am not sure if what I want to do is possible.
EDITED:
My desired df output from splitting:
df <- data.frame(item_number = c(1, 2, 3, 3),
description = c("Big Wheel Agriculture Cart, 300-Lb Capacity, 32.75w X 58d X 28.25h, Black", "Dome Lids For 16\" Cater Trays, 16\" Diameter X 2.5\"h, Clear, 50/carton", "Aluminum Cater Trays, 1 Compartment, 18\" Diameter X 0.94\"h, Silver, 50/carton", "Aluminum Cater Trays, 7 Compartment Lazy Susan, 18\" Diameter X 0.94\"h, Silver, 50/carton"),
product_name = c("Big Wheel Agriculture Cart", "Dome Lids For 16\" Cater Trays",
"Aluminum Cater Trays", "Aluminum Cater Trays"),
capacity = c("300-Lb Capacity", NA, NA, NA),
item_dimensions = c("32.75w X 58d X 28.25h", "16\" Diameter X 2.5\"h",
"18\" Diameter X 0.94\"h", "18\" Diameter X 0.94\"h"),
color = c("Black", "Clear", "Silver", "Silver"),
quantity = c(NA, "50/carton", "50/carton", "50/carton"),
compartment = c(NA, NA, "1 Compartment", "7 Compartment Lazy Susan"))
Upvotes: 0
Views: 928
Reputation: 21
I see the solution in:
items_list
)split_df
)separate
function)bind_rows
function)library(tidyverse)
df <- data.frame(item_number = c(1, 2, 3, 3),
description = c("Big Wheel Agriculture Cart, 300-Lb Capacity, 32.75w X 58d X 28.25h, Black", "Dome Lids For 16\" Cater Trays, 16\" Diameter X 2.5\"h, Clear, 50/carton", "Aluminum Cater Trays, 1 Compartment, 18\" Diameter X 0.94\"h, Silver, 50/carton", "Aluminum Cater Trays, 7 Compartment Lazy Susan, 18\" Diameter X 0.94\"h, Silver, 50/carton"))
items_list <- list(
c("product_name", "capacity", "item_dimensions", "color"),
c("product_name", "item_dimensions", "color", "quantity"),
c("product_name", "compartment", "item_dimensions", "color", "quantity")
)
split_df <- split(df, df$item_number)
new_df <- bind_rows(
lapply(split_df, function(df_x) {
item_n <- df_x$item_number[[1]]
df_x %>%
mutate(description_split = description) %>%
separate(col = "description_split",
into = items_list[[item_n]],
sep = ", ")
})
)
Upvotes: 1