Dom

Reputation: 1053

Regex and SharePoint names in R

I'm trying to extract names from a list produced by SharePoint.

Each item in the list contains at least one name and a numeric id which varies in length.

The format of the list looks like:

all_projects %>% 
  select(contact_names)

# A tibble: 116 x 1
                                                contact_names
                                                       <chr>
 1 last_name, first_name;#6903;#last_name, first_name;#36606
 2                               last_name, first_name;#8585
 3                                                       ...
 4                              last_name, first_name;#14801

Using stringr I've managed to get the numbers out with the following:

str_replace_all(string, pattern = ";#?\\d*", ";")

But it results in:

"last_name, first_name;;last_name, first_name;", 

That would be fine except for the double ;;. Using an empty string as the replacement, str_replace_all(string, pattern = ";#?\\d*", ""), returns:

"last_name, first_namelast_name, first_name;", 

Ideally I'd like to separate the first and last names into two columns.

Any help greatly appreciated.

Upvotes: 1

Views: 55

Answers (1)

akrun

Reputation: 887741

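We could use separate_rows to split each entry on ";" into one row per piece, drop the pieces that are numeric ids, and then use separate to split each name into two columns.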

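library(tidyverse)
# one row per ";"-delimited piece (names and "#id" pieces)
separate_rows(df1, contact_names, sep = ";") %>%
        # drop the pieces that are just an id, e.g. "#6903"
        filter(!grepl("#\\d+", contact_names)) %>% 
        # strip the leading "#" left on the second and later names
        mutate(contact_names = str_replace_all(contact_names, "#", "")) %>%
        # split on the comma plus any following space so `first` has no leading blank
        separate(contact_names, into = c("last", "first"), sep = ",\\s*", remove = FALSE)
# A tibble: 4 x 3
#          contact_names      last      first
#*                 <chr>     <chr>      <chr>
#1 last_name, first_name last_name first_name
#2 last_name, first_name last_name first_name
#3 last_name, first_name last_name first_name
#4 last_name, first_name last_name first_name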
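
As for the double ;; in the question: in ";#?\\d*" both the "#" and the digits are optional, so the pattern matches the id (";#6903") and also the bare ";#" in front of the next name, and each match is replaced by its own ";". A minimal stringr-only sketch that avoids this is below. It assumes every id is ";#" followed by at least one digit and that no name starts with a digit; the vector `x` and the names `no_ids`, `people`, `parts` and `result` are only for illustration.

```r
library(stringr)

# Sample strings in the format shown in the question (illustrative data)
x <- c("last_name, first_name;#6903;#last_name, first_name;#36606",
       "last_name, first_name;#8585")

# Remove each numeric id: ";#" followed by one or more digits
no_ids <- str_remove_all(x, ";#\\d+")

# Consecutive names are still joined by ";#", so split on that
people <- unlist(str_split(no_ids, ";#"))

# Split each "last, first" pair on the comma and any following space
parts <- str_split_fixed(people, ",\\s*", n = 2)
result <- data.frame(last = parts[, 1], first = parts[, 2])
```

Requiring "\\d+" rather than "\\d*" is what keeps the ";#" between two names from being treated as an id.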

data

df1 <- tribble(
        ~contact_names,
        "last_name, first_name;#6903;#last_name, first_name;#36606",
        "last_name, first_name;#8585",
        "last_name, first_name;#14801")

Upvotes: 1
