Reputation: 109
I have a data table like the following:
library(data.table)
DF <- as.data.table(list(ID = c(125534,"122-343",312343,"12343-343FGV", 1234, 713827), Product = c('Y', NA, NA, 'Z', NA, NA), Type = c(NA, 'D', 'G', NA, NA, NA)))
ID Product Type
1: 125534 Y NA
2: 122-343 NA D
3: 312343 NA G
4: 12343-343FGV Z NA
5: 1234 NA NA
6: 713827 NA NA
I would like to create a new column named CATEGORY based on how ID is categorized. My incorrect code looks like the following:
DF$CATEGORY <- ifelse(grepl("^12[0-9]|^31[0-9]|", DF$ID), 'IN', 'OUT')
Desired result:
ID Product Type CATEGORY
1: 125534 Y NA IN
2: 122-343 NA D OUT
3: 312343 NA G IN
4: 12343-343FGV Z NA OUT
5: 1234 NA NA OUT
6: 713827 NA NA OUT
I wanted to code it so that any ID with alphabetical letters or symbols, shorter than 6 characters and do not begin with either 12 or 31 are out. The remainder are in.
Upvotes: 1
Views: 564
Reputation: 886948
We can also do this by creating the 'CATEGORY' column with "OUT" values and then specify the 'i' with logical index that only matches the criteria for "IN" cases and assign (:=
) the 'CATEGORY' to "IN"
DF[, CATEGORY := "OUT"][grepl("^(12|31)[0-9]{4,}$", ID), CATEGORY := "IN"]
DF
# ID Product Type CATEGORY
#1: 125534 Y NA IN
#2: 122-343 NA D OUT
#3: 312343 NA G IN
#4: 12343-343FGV Z NA OUT
#5: 1234 NA NA OUT
#6: 713827 NA NA OUT
Upvotes: 1
Reputation: 1890
I think you mean this:
DF[, CATEGORY := ifelse(grepl("[^0-9]", ID) |
nchar(ID) < 6 |
!grepl("^12|^31", ID),
"OUT", "IN")]
Upvotes: 2