Modify string names in a data frame based on a condition

Question

I have a data frame with a variable called "Control_Category". The variable has six names in it, which for simplicity's sake I am going to make generic:

df <- data.frame(Control_Category = c("Really Long Name One",
"Super Really Long Name Two",
"Another Really Flippin' Long Name Three",
",Seriously, It's a Fourth Long Name",
"Definitely a Fifth Long Name",
"Finally, This guy is done, number six"))

I'm using this to make a slight joke. So, while the names are long, they are tidy in that the values for each (1-6) are consistent. In this specific character column of the data frame, there are hundreds and hundreds of entries, each of which matches one of those six.

What I need to do is replace the long names with short names. That is, wherever any of the above names appears, it should be replaced with a shorter version, like:

One Two Three Four Five Six

I tried a function using 'case_when' and it failed miserably. Any help would be appreciated.

Additional Information Based on Questions From Community

The order of the items doesn't matter. There isn't a designation of 1 - 6. There just happen to be six and I made six stupid long strings. The strings themselves are long.

So, anywhere "Super Really Long Name Two" exists, that value needs to be updated to something like "TWO" or a "Short_Name" that approximates "TWO". In reality, the category is called "Audit, Testing and Examination Results". The short name would ideally just be "AUDIT".

Tim Biegeleisen · Accepted Answer

You could just use gsub() once for each replacement:

df$Control_Category <- gsub('Really Long Name One', 'One', df$Control_Category)

You can repeat similar logic to handle the other five long/short name pairs.
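If repeating the call six times feels clunky, one possible alternative is a named lookup vector that maps each long name to its short name. This is only a sketch: the sample data below is hypothetical (three of the six names from the question plus one unmapped value), `short_names` and `matched` are names made up for illustration, and it assumes each entry matches a long name exactly rather than merely containing it.

```r
# Hypothetical sample data: three of the long names from the question,
# one of them repeated, plus one value that has no short name
df <- data.frame(
  Control_Category = c("Really Long Name One",
                       "Super Really Long Name Two",
                       "Definitely a Fifth Long Name",
                       "Super Really Long Name Two",
                       "Some Other Name"),
  stringsAsFactors = FALSE
)

# Named character vector: names are the long strings, values are the short ones
short_names <- c("Really Long Name One"         = "One",
                 "Super Really Long Name Two"   = "Two",
                 "Definitely a Fifth Long Name" = "Five")

# Look up every entry by name; entries missing from the lookup come back as NA.
# as.character() guards against the column being a factor, which would
# otherwise be indexed by its integer codes instead of its labels.
matched <- unname(short_names[as.character(df$Control_Category)])

# Keep the original value wherever no short name was found
df$Control_Category <- ifelse(is.na(matched),
                              as.character(df$Control_Category),
                              matched)
```

Unlike gsub(), which replaces the pattern anywhere inside a string, this only changes entries that equal a long name in full, and adding the remaining pairs is a matter of extending `short_names`.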
