user7096526
user7096526

Reputation:

Split value of a column into different columns

I have a variable in a R data frame with this format, are are have a mixed format:

98 (56:78)
34 (45:67)
45
67 (34:98)

I want do divide that column into 3 values,and to put each value in a new column of the data frame.

98 56 78
34 45 67
45 NA NA
67 34 98

how can I do that? thank you very much!

Upvotes: 0

Views: 165

Answers (3)

Stijn
Stijn

Reputation: 96

A tidy solution using extract():

library(tidyverse)

df <- data_frame(a = c("98 (56:78)", "10"))

df %>%
  extract(a, into = c("x", "y", "z"), regex = "(\\d*) ?\\(?(\\d*):?(\\d*)") %>%
  mutate_at(vars(x, y, z), as.integer)

Which results in:

# A tibble: 2 x 3
      x     y     z
  <int> <int> <int>
1    98    56    78
2    10    NA    NA

Upvotes: 0

MrFlick
MrFlick

Reputation: 206596

With your sample data

dd <- data.frame(a=c("98 (56:78)", "45"))

You can do this using the tidyverse (and the tidyr package in particular) using

library(tidyverse)
dd %>% separate(a, c("x","y","z"), extra="drop")
#    x    y    z
# 1 98   56   78
# 2 45 <NA> <NA>

Upvotes: 3

lmo
lmo

Reputation: 38520

Here is a longer base R method that returns a nice looking data.frame.

setNames(do.call(rbind.data.frame,
                 regmatches(temp, regexec("(\\d+) \\((\\d+):(\\d+)\\)$", temp)))[-1],
         paste0("V", 1:3))

Here, regexec selects the desired numbers using the regular expression "(\d+) \((\d+):(\d+)\)$" and regmatches returns a list of length equal to its inputs with each element a length of 4. This list is turned into a data.frame with do.call and rbind.data.frame and the first column, which contains the full string, is removed. Finally, setNames replaces the ugly default names with more manageable ones.

This returns

  V1 V2 V3
1 98 56 78
2 91 53 76
3 99 53 75

Note that the variables are factor variables in this context, which are usually not very useful. To convert these to numeric, you can use the dat[] <- lapply(dat, as.numeric) paradigm.

data

temp <- c("98 (56:78)", "91 (53:76)", "99 (53:75)")

Upvotes: 0

Related Questions