Reputation: 71
I am trying to create a new column (D$NEW) in data.table D that matches each row of D against a whole column (D2$COLUMN1) of data.table D2 using str_subset. (My data is at the bottom.)
D[,NEW:= lapply(D[,C1],function(x)str_subset(as.character(D2$COLUMN1), x))]
This works fine, but I also want str_subset to ignore case. When I use ignore.case(x)
D[,NEW:= lapply(D[,C1],function(x)str_subset(as.character(D2$COLUMN1), ignore.case(x)))]
I get the following error
## PLEASE use (fixed|coll|regexp)(x, ignore_case=TRUE)
When I use ignore_case=TRUE
D[,F:= lapply(D[,V1],function(x) str_subset(as.character(D2$COLUMN1), x, ignore_case=TRUE))]
I get the following error:
Error in str_subset(as.character(), x, ignore_case = TRUE) : unused argument (ignore_case = TRUE)
How can I make str_subset ignore case?
Data:
D<-data.table(C1=c("a","b","c","d","e","A","B","C","a","b"), C2=c(1,2,3,4,5,6,7,8,9,10))
D2<-data.table(COLUMN1=c("a"), COLUMN2=c("b"), COLUMN3=c(1:10))
Upvotes: 1
Views: 408
Reputation: 627536
The first error tells you that you cannot use ignore.case() as a function. The second error is related to the fact that the str_subset function does not seem to have any ignore_case argument.
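One way to check this for yourself is to inspect the formal arguments of both functions. The snippet below is only a sketch and assumes a reasonably recent stringr is installed; the variable names are made up for illustration.

```r
library(stringr)

# Names of the formal arguments each function accepts
subset_args <- names(formals(str_subset))
regex_args  <- names(formals(regex))

# str_subset() itself has no ignore_case argument ...
has_in_subset <- "ignore_case" %in% subset_args

# ... but the regex() modifier function does
has_in_regex <- "ignore_case" %in% regex_args

print(subset_args)
print(regex_args)
```

So the option has to go on the pattern (through a modifier function or an inline flag), not on str_subset.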
Use an inline case-insensitive modifier (?i):
D[,NEW:= lapply(D[,C1],function(x)str_subset(as.character(D2$COLUMN1), paste0("(?i)",x)))]
                                                                       ^^^^^^^^^^^^^^^^
The inline case-insensitive modifier (?i) does the same thing that ignore.case / ignore_case are meant to do: it makes matching case-insensitive. See more details on inline modifiers at regular-expressions.info. When placed somewhere inside the pattern, it makes only the part after it match case-insensitively. So, by placing it at the start of the pattern, you make the whole pattern case-insensitive.
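To illustrate how the position of the modifier matters, here is a small sketch. The sample strings are hypothetical and not taken from the question's data.

```r
library(stringr)

# Hypothetical sample values, used only for illustration
x <- c("apple", "Apple", "APPLE", "aPPLE")

# (?i) at the start: the whole pattern is matched case-insensitively
whole <- str_subset(x, "(?i)apple")

# (?i) after the first letter: "a" stays case-sensitive,
# only "pple" is matched case-insensitively
partial <- str_subset(x, "a(?i)pple")

print(whole)
print(partial)
```

Note that paste0("(?i)", x) still treats x as a regular expression, so any metacharacters in x keep their special meaning.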
Alternatively, you may pass TRUE to the regex function:
D[,NEW:= lapply(D[,C1],function(x)str_subset(as.character(D2$COLUMN1), regex(x, TRUE)))]
                                                                       ^^^^^^^^^^^^^^
The TRUE is the value of the ignore_case argument (you may write it as regex(x, ignore_case=TRUE)). See more details on the options you may use in the stri_opts_regex section here. Passing case_insensitive=TRUE instead does not work. I got an error:
Error in stri_opts_regex(case_insensitive = ignore_case, multiline = multiline, :
  formal argument "case_insensitive" matched by multiple actual arguments
The message suggests why: regex() already passes its own ignore_case value to stri_opts_regex as case_insensitive, so supplying case_insensitive again gives that argument twice. So, I had to use ignore_case instead.
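As a rough sketch of the modifier-function approach, the example below compares regex() with fixed(), which the first error message also mentions. The haystack values are invented for illustration; fixed() is shown as an option for cases where the search term should be taken literally rather than as a regular expression.

```r
library(stringr)

# Hypothetical values, not the question's data
haystack <- c("a", "A", "b", "a.c", "ABC")

# Regular expression, case-insensitive: "." matches any character
as_regex <- str_subset(haystack, regex("A.C", ignore_case = TRUE))

# Literal string, case-insensitive: "." only matches a real dot
as_fixed <- str_subset(haystack, fixed("A.C", ignore_case = TRUE))

# Single letter, case-insensitive: every element containing a or A
letter_a <- str_subset(haystack, regex("a", ignore_case = TRUE))

print(as_regex)
print(as_fixed)
print(letter_a)
```

If the values in C1 can contain characters such as "." or "(", the fixed() form avoids accidental regex interpretation.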
Result:
> D
C1 C2 NEW
1: a 1 a,a,a,a,a,a,
2: b 2
3: c 3
4: d 4
5: e 5
6: A 6 a,a,a,a,a,a,
7: B 7
8: C 8
9: a 9 a,a,a,a,a,a,
10: b 10
Upvotes: 1