Reputation: 3045
I am trying to match a regex that captures several values and assign them in place to several new columns in a data.table:
library(data.table)
library(stringr)
fruit_regex <- "(\\d+): apples=(.*), oranges=(.*)"
DT <- data.table(V1=c("1: apples=0.1, oranges=0.01",
"2: apples=0.2, oranges=0.02",
"3: apples=0.3, oranges=0.03",
"4: apples=0.4, oranges=0.04",
"5: apples=0.5, oranges=0.05"))
DT[, c("txt","id","apples", "oranges"):= as.list(str_match_all(V1, fruit_regex))]
This, of course, fails, and I get:
>Warning messages:
>1: In `[.data.table`(DT, , `:=`(c("txt", "id", "apples", "oranges"), :
> Supplied 4 columns to be assigned a list (length 5) of values (1 unused)
str_match_all() is said to be vectorized over patterns and strings, but for some reason I cannot get it to work.
There's another known issue with my regex: it returns a redundant full match, which can be cured with lookaround assertions.
Desired result (ignoring the redundant V1 and txt fields):
id apples oranges
1 0.1 0.01
2 0.2 0.02
3 0.3 0.03
4 0.4 0.04
5 0.5 0.05
Upvotes: 3
Views: 251
Reputation: 1001
str_match_all() returns a list with one matrix per input string, so you need to combine those results into something that can be assigned column by column, such as a data frame. For example, using ldply() from the "plyr" package:
library(data.table)
library(stringr)
library(plyr)
fruit_regex <- "(\\d+): apples=(.*), oranges=(.*)"
DT <- data.table(V1=c("1: apples=0.1, oranges=0.01",
"2: apples=0.2, oranges=0.02",
"3: apples=0.3, oranges=0.03",
"4: apples=0.4, oranges=0.04",
"5: apples=0.5, oranges=0.05"))
DT[, c("txt","id","apples", "oranges"):= ldply(str_match_all(V1, fruit_regex))]
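If you would rather not pull in plyr, here is a minimal sketch of an alternative, assuming each string contains at most one match. It swaps str_match_all() for str_match(), which returns a single character matrix (one row per input string, full match in the first column), so it is not the same technique as the ldply() call. Dropping the first column also avoids the redundant txt field. The variable name m is just for illustration.

```r
library(data.table)
library(stringr)

fruit_regex <- "(\\d+): apples=(.*), oranges=(.*)"
DT <- data.table(V1 = c("1: apples=0.1, oranges=0.01",
                        "2: apples=0.2, oranges=0.02",
                        "3: apples=0.3, oranges=0.03",
                        "4: apples=0.4, oranges=0.04",
                        "5: apples=0.5, oranges=0.05"))

# str_match() gives one character matrix: a row per string,
# column 1 is the full match, columns 2..4 are the capture groups.
m <- str_match(DT$V1, fruit_regex)

# Drop the full-match column and assign the three captures by reference.
DT[, c("id", "apples", "oranges") := as.data.table(m[, -1, drop = FALSE])]

# Captures come back as character; convert them if numbers are needed.
DT[, id := as.integer(id)]
DT[, c("apples", "oranges") := lapply(.SD, as.numeric),
   .SDcols = c("apples", "oranges")]
```

The type conversion at the end is optional; without it the new columns stay character, as they do with the ldply() version.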
Upvotes: 3