I'm trying to extract names from a list produced by SharePoint.
Each item in the list contains at least one name and a numeric id which varies in length.
The format of the list looks like:
all_projects %>%
select(contact_names)
A tibble: 116 x 1
contact_names
<chr>
1 last_name, first_name;#6903;#last_name, first_name;#36606
2 last_name, first_name;#8585
3 ...
4 last_name, first_name;#14801
Using stringr, I've managed to get the numbers out with the following:
str_replace_all(string, pattern = ";#?\\d*", ";")
But it results in:
\"last_name, first_name;;last_name, first_name;\",
Which would be OK but for the double ;;. Replacing with an empty string ("") instead,
str_replace_all(string, pattern = ";#?\\d*", "")
returns:
\"last_name, first_namelast_name, first_name;\",
Ideally I'd like to separate the first and last names into two columns.
Any help greatly appreciated.
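We could use separate_rows() to split each entry on ";", drop the pieces that are only an id, and then use separate() to split each name into two columns:
library(tidyverse)
separate_rows(df1, contact_names, sep = ";") %>%
  # drop the pieces that are just an id, e.g. "#6903"
  filter(!grepl("#\\d+", contact_names)) %>%
  # strip the leftover "#" in front of the second and later names
  mutate(contact_names = str_replace_all(contact_names, "#", "")) %>%
  # split on the comma plus any following space so `first` has no leading blank
  separate(contact_names, into = c("last", "first"), sep = ",\\s*", remove = FALSE)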
# A tibble: 4 x 3
# contact_names last first
#* <chr> <chr> <chr>
#1 last_name, first_name last_name first_name
#2 last_name, first_name last_name first_name
#3 last_name, first_name last_name first_name
#4 last_name, first_name last_name first_name
Sample data:
df1 <- tribble(
~contact_names,
"last_name, first_name;#6903;#last_name, first_name;#36606",
"last_name, first_name;#8585",
"last_name, first_name;#14801")
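As for why the original attempt leaves a double ;; behind: in ";#?\\d*" both the "#" and the digits are optional, so the pattern also matches the bare ";#" that sits between two names. A stringr-first variant, offered only as a sketch, is to require at least one digit (";#\\d+") so that only the ids are removed, and then treat the remaining ";#" as the separator. It assumes every id looks like ";#" followed by digits and that no name itself starts with a digit; the `result` name is just for illustration.

```r
library(dplyr)
library(stringr)
library(tidyr)

# same sample rows as above
df1 <- tibble(contact_names = c(
  "last_name, first_name;#6903;#last_name, first_name;#36606",
  "last_name, first_name;#8585",
  "last_name, first_name;#14801"
))

result <- df1 %>%
  # remove only the ids: ";#" followed by one or more digits
  mutate(contact_names = str_remove_all(contact_names, ";#\\d+")) %>%
  # any ";#" still present sits between two names, so split rows there
  separate_rows(contact_names, sep = ";#") %>%
  # split "last, first" on the comma and any whitespace after it
  separate(contact_names, into = c("last", "first"), sep = ",\\s*")

print(result)
```

On the sample data this should give one row per person, with the two names from the first entry landing on separate rows.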