Update column values randomly based on value in other column in R

Question

I want to add a new column, SubCategory, whose values are chosen randomly based on the value of the Category column. Here are the details:

Sub_Hair = c("Shampoo", "Conditioner", "Gel", "HairOil", "Dye")
Sub_Beauty = c("Face", "Eye", "Lips")
Sub_Nail= c("NailPolish", "NailPolishRemover", "NailArtKit", "ManiPadiKit")
Sub_Others = c("Electric", "NonElectric")

> product_data_1[1:10, c("Pcode", "Category", "MRP")]
    Pcode Category    MRP
1  16156L   Beauty  $8.88
2  16162M   Others $21.27
3  16168M   Others  $2.98
4  16169E     Nail $26.64
5  16207A     Hair  $6.38
6  17012B   Beauty $33.03
7  17012C   Beauty $20.58
8  17012F   Beauty $36.29
9  17091A     Nail $20.55
10 17107D     Nail $28.20

I'm trying the code below. However, every row in a category gets the same subcategory. For example, all rows in the "Beauty" category get "Eye" instead of values randomly selected from "Face", "Eye", and "Lips". Here are the code and output:

product_data_1 = within(product_data_1, SubCategory[Category == "Beauty"] <- sample(Sub_Beauty, 1))
product_data_1 = within(product_data_1, SubCategory[Category == "Hair"] <- sample(Sub_Hair, 1))
product_data_1 = within(product_data_1, SubCategory[Category == "Nail"] <- sample(Sub_Nail, 1))
product_data_1 = within(product_data_1, SubCategory[Category == "Others"] <- sample(Sub_Others, 1))

> product_data_1[1:10, c("Pcode", "Category", "MRP", "SubCategory")]
    Pcode Category    MRP SubCategory
1  16156L   Beauty  $8.88         Eye
2  16162M   Others $21.27    Electric
3  16168M   Others  $2.98    Electric
4  16169E     Nail $26.64  NailPolish
5  16207A     Hair  $6.38         Gel
6  17012B   Beauty $33.03         Eye
7  17012C   Beauty $20.58         Eye
8  17012F   Beauty $36.29         Eye
9  17091A     Nail $20.55  NailPolish
10 17107D     Nail $28.20  NailPolish

user10191355 · Accepted Answer

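Put your subcategory vectors in a named list, such as subcat_list <- list(Hair = Sub_Hair, Beauty = Sub_Beauty, Nail = Sub_Nail, Others = Sub_Others). You can then index subcat_list by product_data_1$Category to get one vector per row, and use sapply to call sample on each element of the resulting list. If Category is a factor, convert it with as.character() first, because indexing a list by a factor uses the factor's integer codes rather than its labels: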

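set.seed(323)  # for reproducibility only
# as.character() ensures the list is indexed by name even if Category is a factor
product_data_1$SubCategory <- sapply(subcat_list[as.character(product_data_1$Category)], sample, 1)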
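
To try the list-indexing idea in isolation, here is a minimal self-contained sketch. The data frame demo is a hypothetical stand-in for product_data_1 built from the first rows shown in the question, and vapply is swapped in for sapply only so that the result type is checked; neither is part of the original answer.

```r
# Subcategory vectors from the question, collected in a named list
subcat_list <- list(
  Hair   = c("Shampoo", "Conditioner", "Gel", "HairOil", "Dye"),
  Beauty = c("Face", "Eye", "Lips"),
  Nail   = c("NailPolish", "NailPolishRemover", "NailArtKit", "ManiPadiKit"),
  Others = c("Electric", "NonElectric")
)

# Hypothetical stand-in for product_data_1 (first rows of the question's data)
demo <- data.frame(
  Pcode    = c("16156L", "16162M", "16168M", "16169E", "16207A", "17012B"),
  Category = c("Beauty", "Others", "Others", "Nail", "Hair", "Beauty"),
  stringsAsFactors = FALSE
)

set.seed(323)  # for reproducibility only

# Indexing by name yields one list element (a character vector) per row
per_row <- subcat_list[as.character(demo$Category)]

# Draw one value from each element; vapply enforces a length-1 character result
demo$SubCategory <- vapply(per_row, sample, character(1), size = 1, USE.NAMES = FALSE)

print(demo)
```

Because sample is called once per row rather than once per category, rows that share a category can receive different subcategories.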

You can also try a slightly different approach with dplyr + purrr:

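library(tidyverse)
# map_chr() calls sample() once per row, so each row gets its own draw;
# as.character() keeps the lookup by name even if Category is a factor
product_data_1 %>%
    mutate(SubCategory = map_chr(as.character(Category), ~ sample(subcat_list[[.x]], 1)))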

Example output:

    Pcode Category    MRP SubCategory
1  16156L   Beauty  $8.88         Eye
2  16162M   Others $21.27    Electric
3  16168M   Others  $2.98    Electric
4  16169E     Nail $26.64  NailPolish
5  16207A     Hair  $6.38         Gel
6  17012B   Beauty $33.03         Eye
7  17012C   Beauty $20.58        Lips
8  17012F   Beauty $36.29        Face
9  17091A     Nail $20.55 ManiPadiKit
10 17107D     Nail $28.20  NailArtKit
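
For context on why the code in the question gives a single subcategory per category: sample(Sub_Beauty, 1) returns one value, and R recycles that value across every matching row. A minimal sketch of a fix that stays close to the original approach is to draw as many values as there are matching rows, with replacement. The small data frame demo and the helper variable is_beauty below are illustrative assumptions, not part of the question's data.

```r
Sub_Beauty <- c("Face", "Eye", "Lips")

# Hypothetical miniature data set for illustration
demo <- data.frame(
  Category = c("Beauty", "Others", "Beauty", "Beauty"),
  stringsAsFactors = FALSE
)
demo$SubCategory <- NA_character_

set.seed(323)  # for reproducibility only

# Logical mask of the rows to fill
is_beauty <- demo$Category == "Beauty"

# Draw one value per matching row; replace = TRUE allows repeats and
# lets the number of rows exceed the number of subcategories
demo$SubCategory[is_beauty] <- sample(Sub_Beauty, sum(is_beauty), replace = TRUE)

print(demo)
```

The same pattern would need to be repeated for each category, which is why the named-list approach above is more compact.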
