Reputation: 1599
I have these strings in every row of one column.
example_df <- tibble(string = c("[{\"positieVergelekenMetSchooladvies\":\"boven niveau\",\"percentage\":9.090909090909092,\"percentageVergelijking\":19.843418733556412,\"volgorde\":10},{\"positieVergelekenMetSchooladvies\":\"op niveau\",\"percentage\":81.81818181818181,\"percentageVergelijking\":78.58821425834631,\"volgorde\":20},{\"positieVergelekenMetSchooladvies\":\"onder niveau\",\"percentage\":9.090909090909092,\"percentageVergelijking\":1.5683670080972694,\"volgorde\":30}]"))
I'm only interested in the numbers. This regex works:
example_df %>%
.$string %>%
str_extract_all(., "[0-9]+\\.[0-9]+")
Instead of the separate() function, I want to use the extract() function. My understanding is that the two differ in what the regex matches: for separate(), it matches the separator between the values, while for extract(), it matches the values that should populate the new columns. But where separate() splits on every match of sep=, extract() seems to match only one group.
example_df %>%
extract(string,
into = c("boven_niveau_school",
"boven_niveau_verg",
"op_niveau_school",
"op_niveau_verg",
"onder_niveau_school",
"onder_niveau_verg"),
regex = "([0-9]+\\.[0-9]+)")
What am I doing wrong?
Upvotes: 0
Views: 78
Reputation: 887118
We can use regmatches/gregexpr from base R:
out <- regmatches(example_df$string, gregexpr("\\d+\\.\\d+", example_df$string))[[1]]
example_df[paste0("new", seq_along(out))] <- as.list(out)
example_df
# A tibble: 1 x 7
# string new1 new2 new3 new4 new5 new6
# <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#1 "[{\"positieVergelekenMetSchooladvies\":\"boven niveau\",\"percentage\":9… 9.09090909… 19.84341873… 81.8181818… 78.588214… 9.0909090… 1.56836700…
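Note that the new columns above are character (`<chr>`). If numeric columns are wanted, one option is to convert before assigning. A minimal sketch, using a shortened stand-in string rather than the full data (the `toy` data frame and its values are hypothetical):

```r
# Shortened stand-in for the original string (hypothetical data)
toy <- data.frame(
  string = "{\"percentage\":9.5,\"percentageVergelijking\":19.25}",
  stringsAsFactors = FALSE
)

# Extract every decimal number from the single row
out <- regmatches(toy$string, gregexpr("\\d+\\.\\d+", toy$string))[[1]]

# Convert from character to numeric before creating the new columns
toy[paste0("new", seq_along(out))] <- as.list(as.numeric(out))

str(toy)
```

As in the answer above, the `[[1]]` means this only handles a single row; with several rows the matches would have to be processed row by row.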
Upvotes: 0
Reputation: 388982
Instead of separate or extract, I would extract all the numbers from the string and then use unnest_wider to create new columns.
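library(tidyverse)
example_df %>%
  # str_extract_all() returns a list column holding every decimal number per row
  mutate(temp = str_extract_all(string, "[0-9]+\\.[0-9]+")) %>%
  # names_sep yields temp_1, temp_2, ...; newer tidyr versions need it for unnamed values
  unnest_wider(temp, names_sep = "_")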
You can rename the columns as per your choice.
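As for the extract() call in the question: according to the tidyr documentation, the regex should contain one capture group for each element of into, and it is matched once per string rather than repeatedly. A single group therefore cannot fill six columns. A minimal sketch of the idea, using a shortened, hypothetical string with only two numbers (the `toy` tibble and the column names `school` and `verg` are made up for illustration):

```r
library(tidyr)
library(tibble)

# Shortened stand-in for the original string (hypothetical data)
toy <- tibble(string = "{\"percentage\":9.5,\"percentageVergelijking\":19.25}")

# One capture group per new column; the lazy ".*?" skips the text in between
res <- tidyr::extract(
  toy, string,
  into = c("school", "verg"),
  regex = "([0-9]+\\.[0-9]+).*?([0-9]+\\.[0-9]+)",
  convert = TRUE  # turn the captured strings into numeric columns
)
res
```

For the full string, the same pattern would need six groups, which gets unwieldy; that is one reason to prefer str_extract_all() with unnest_wider().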
Upvotes: 1