user321627

Reputation: 2564

How can I extract elements from a bracketed list?

I currently have a dataset with elements such as

df <- data.frame(id = c(1,2), bracketedList = c("[235.223,636.11115,7453.773]","[2355.66377,6362.1645,7633.7473]"))

How can I extract the first number (235.223, 2355.66377) from each element of bracketedList, and likewise the second number, and so forth?

Upvotes: 1

Views: 56

Answers (3)

akrun

Reputation: 886928

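We can use base R: strip the brackets with gsub, then let read.csv parse the remaining comma-separated text into numeric columns.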

cbind(df, read.csv(text = gsub("[][]", "", df$bracketedList), header = FALSE))

-output

 id                    bracketedList       V1        V2       V3
1  1     [235.223,636.11115,7453.773]  235.223  636.1111 7453.773
2  2 [2355.66377,6362.1645,7633.7473] 2355.664 6362.1645 7633.747
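
To see why this works, the two steps can be run separately. This is only a minimal sketch, assuming the df from the question; the names stripped and nums are made up for illustration.

```r
df <- data.frame(id = c(1, 2),
                 bracketedList = c("[235.223,636.11115,7453.773]",
                                   "[2355.66377,6362.1645,7633.7473]"))

# Step 1: the character class [][] matches either bracket; remove both
stripped <- gsub("[][]", "", df$bracketedList)

# Step 2: each string is now one line of comma-separated text, so
# read.csv parses it into numeric columns V1, V2, V3
nums <- read.csv(text = stripped, header = FALSE)

nums$V1    # first number of each row
nums[[2]]  # second number of each row, and so on
```

The values in the printed output above (636.1111, 2355.664) are rounded only for display; the stored numbers keep their full precision.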

Upvotes: 2

Anoushiravan R

Reputation: 21908

You could also use str_extract with a lookbehind for the opening bracket and a lookahead for the first comma:

library(dplyr)
library(stringr)

df %>%
  mutate(ext = str_extract(bracketedList, "(?<=\\[).*?(?=,)"))

  id                    bracketedList        ext
1  1     [235.223,636.11115,7453.773]    235.223
2  2 [2355.66377,6362.1645,7633.7473] 2355.66377

Or in base R:

gsub("^\\[(?<=\\[)(.*?)(?=,).*", "\\1", df$bracketedList, perl = TRUE)

[1] "235.223"    "2355.66377"

In case you would like to extract all the numbers:

library(tidyr)

df %>%
  mutate(ext = str_extract_all(bracketedList, "[^,\\[\\]]+")) %>%
  unnest_wider(ext) %>%
  rename_with(~ gsub("\\.\\.\\.(\\d)", "\\1", .), contains("."))

# A tibble: 2 x 5
     id bracketedList                    `1`        `2`       `3`      
  <dbl> <chr>                            <chr>      <chr>     <chr>    
1     1 [235.223,636.11115,7453.773]     235.223    636.11115 7453.773 
2     2 [2355.66377,6362.1645,7633.7473] 2355.66377 6362.1645 7633.7473
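
Note that the extracted columns above are character (<chr>), not numeric. If numbers are needed, one possible sketch is to collect the matches into a matrix and convert it, assuming the same df and the same pattern; the name m is just for illustration.

```r
library(stringr)

df <- data.frame(id = c(1, 2),
                 bracketedList = c("[235.223,636.11115,7453.773]",
                                   "[2355.66377,6362.1645,7633.7473]"))

# simplify = TRUE returns a character matrix: one row per string,
# one column per match
m <- str_extract_all(df$bracketedList, "[^,\\[\\]]+", simplify = TRUE)

# convert the matrix in place from character to double
storage.mode(m) <- "double"

m[, 1]  # first number of each row
m[, 2]  # second number of each row
```

This relies on every string holding the same number of values; with ragged input, simplify = TRUE pads the shorter rows with empty strings, which become NA after conversion.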

Upvotes: 2

Ronak Shah

Reputation: 388797

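You can use readr::parse_number, which returns the first number it finds in each string. Be careful, though: by default it treats the comma as a grouping mark, so digits after the first comma are merged into the result. That is why the first value below prints as 235.2236 rather than 235.223. Supplying a locale with a different grouping mark, for example locale = readr::locale(grouping_mark = " "), should avoid this.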

readr::parse_number(df$bracketedList)
#[1]  235.2236 2355.6638

To extract all (or any) of the numbers from each string, you can parse the string as JSON, since a bracketed, comma-separated list of numbers is a valid JSON array.

cbind(df[1], do.call(rbind, lapply(df$bracketedList, jsonlite::fromJSON)))

#  id        1         2        3
#1  1  235.223  636.1111 7453.773
#2  2 2355.664 6362.1645 7633.747
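
If only one position is needed, the same JSON idea can be wrapped in a small helper. This is a sketch under the assumption that every string is a valid JSON array of numbers; nth_number is a hypothetical name, not part of jsonlite.

```r
df <- data.frame(id = c(1, 2),
                 bracketedList = c("[235.223,636.11115,7453.773]",
                                   "[2355.66377,6362.1645,7633.7473]"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: parse each string as a JSON array and keep element n
nth_number <- function(x, n) {
  vapply(x, function(s) jsonlite::fromJSON(s)[n], numeric(1),
         USE.NAMES = FALSE)
}

nth_number(df$bracketedList, 1)  # first number of each row
nth_number(df$bracketedList, 2)  # second number of each row
```

Because fromJSON already returns a numeric vector, no conversion from character is needed afterwards.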

Upvotes: 1
