Gabriel Sartori

Reputation: 33

Regex between first and second "_"

Hello, I have the following data:

data
proprio_com_luz
proprio_sem_ola_acabo

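I want to create two new variables from it: "condition" (the text between the first and second "_") and "variable" (everything after the second "_"):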

data                   condition variable
proprio_com_luz           com        luz
proprio_sem_ola_acabo     sem        ola_acabo

What regex would help me here?

Upvotes: 2

Views: 60

Answers (2)

Jilber Urbina

Reputation: 61204

If you are not familiar with regex, you can use this (longer) approach, which splits each string on "_" and pastes everything after the second piece back together:

> string <- c("proprio_com_luz", "proprio_sem_ola_acabo")
> out <- do.call(rbind, lapply(strsplit(string, "_"), function(x) c(x[2], paste0(x[-c(1,2)], collapse="_"))))
> data.frame(data=string, condition=out[, 1], variable=out[, 2])
                   data condition  variable
1       proprio_com_luz       com       luz
2 proprio_sem_ola_acabo       sem ola_acabo
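To make the one-liner above easier to follow, here is a minimal sketch that traces the same steps for a single string. The names parts, condition and variable are only illustrative and are not part of the original answer.

```r
# Split one of the example strings on "_"; strsplit() returns a list,
# so [[1]] extracts the character vector of pieces.
parts <- strsplit("proprio_sem_ola_acabo", "_")[[1]]

# The second piece is the condition.
condition <- parts[2]

# Everything after the second piece is glued back together with "_",
# which restores underscores that belong to the variable name.
variable <- paste(parts[-(1:2)], collapse = "_")

print(condition)
print(variable)
```

The lapply() call in the answer applies these same steps to every element of the vector, and do.call(rbind, ...) stacks the results into a two-column matrix.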

Upvotes: 1

acylam

Reputation: 18691

With extract from tidyr:

library(tidyr)

extract(df, data, c("condition", "variable"),
        regex = "^[^_]+_([^_]+)_(.+)$", remove = FALSE)

or with base R:

pattern <- "^[^_]+_([^_]+)_(.+)$"

df$condition <- sub(pattern, "\\1", df$data)
df$variable <- sub(pattern, "\\2", df$data)

Output:

                   data condition  variable
1       proprio_com_luz       com       luz
2 proprio_sem_ola_acabo       sem ola_acabo
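For readers new to the pattern: "^[^_]+_" skips everything up to and including the first "_", "([^_]+)" captures the text up to the second "_", and "_(.+)$" captures the rest of the string, underscores included. One thing to be aware of is that sub() returns the input unchanged when the pattern does not match. The sketch below is one possible way to get NA instead, using base R's regexec() and regmatches(). The third input string and the helper pick() are hypothetical, added only for illustration.

```r
# Same pattern as in the answer: skip field 1, capture field 2, capture the rest.
pattern <- "^[^_]+_([^_]+)_(.+)$"

# The third string is a made-up input without any "_" (assumption for illustration).
x <- c("proprio_com_luz", "proprio_sem_ola_acabo", "semunderscore")

# regexec() + regmatches() give, per string, the full match followed by each
# capture group; strings that do not match give character(0).
m <- regmatches(x, regexec(pattern, x))

# Hypothetical helper: return element i, or NA when there was no match.
pick <- function(parts, i) if (length(parts) == 0) NA_character_ else parts[i]

condition <- vapply(m, pick, character(1), i = 2)  # first capture group
variable  <- vapply(m, pick, character(1), i = 3)  # second capture group

print(data.frame(data = x, condition = condition, variable = variable))
```

If every string is known to contain at least two underscores, the plain sub() version is simpler and gives the same result.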

Data:

df <- data.frame(data = c("proprio_com_luz",
                          "proprio_sem_ola_acabo"))

Upvotes: 4
