bensir

Reputation: 35

Why is my line of code giving the wrong character location in a string?

I am working with a dataset where the data in my variable column is in the format blah blah [text I actually want]. I have written a line that tries to replace every value in the variable column with just the text I actually want.

I thought I finally cracked it, but it does not seem to actually be working so far.

melted$variable = str_sub(
    melted$variable, start = gregexpr(
        pattern ="\\[",melted$variable)[1], end = (
            str_length(melted$variable) - 1
        )
    )

Here melted is my data set and variable is the column name.

Upvotes: 3

Views: 49

Answers (2)

akrun

Reputation: 887028

In base R, we can do

regmatches(x, regexpr("(?<=\\[)[^]]+", x, perl = TRUE))
#[1] "ex1" "ex2"

data

x <- c("blah blah [ex1]", "blah blah [ex2]")
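One caveat worth noting: with regexpr, regmatches returns only the elements that matched, so the result can be shorter than the input if some rows have no brackets. A minimal sketch of one way to keep the original length, assuming NA is an acceptable placeholder for non-matching rows (the vector y and the variable names here are made up for illustration):

```r
# Hypothetical input where the middle element has no bracketed text
y <- c("blah blah [ex1]", "no brackets here", "blah blah [ex2]")

# regexpr() gives the position of the first match per element, or -1 if none
m <- regexpr("(?<=\\[)[^]]+", y, perl = TRUE)

# Pre-fill with NA, then fill in only the positions that matched
out <- rep(NA_character_, length(y))
out[m != -1] <- regmatches(y, m)
out
#[1] "ex1" NA    "ex2"
```

This matters when assigning back into a data frame column, since the replacement has to have the same number of rows.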

Upvotes: 0

Ronak Shah

Reputation: 388862

We can use sub to extract everything between [ and ]:

sub(".*\\[(.*)\\].*", "\\1", x)
#[1] "ex1" "ex2"

Or using str_extract

stringr::str_extract(x, "(?<=\\[).*(?=\\])")
#[1] "ex1" "ex2"

where x is

x <- c("blah blah [ex1]", "blah blah [ex2]")

which can be replaced with melted$variable.
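As for why the original attempt misbehaves: gregexpr returns a list with one element per string, so indexing it with [1] yields a one-element list rather than a vector of positions, and even a correct position would point at the [ itself rather than the character after it. A rough sketch of the position-based approach, assuming every string ends with ] and using base substr and nchar in place of str_sub and str_length (the small data frame and the name open_pos are stand-ins for illustration):

```r
# Hypothetical stand-in for the asker's data set
melted <- data.frame(
  variable = c("blah blah [ex1]", "blah blah [ex2]"),
  stringsAsFactors = FALSE
)

# regexpr() returns one integer position per string (first match only)
open_pos <- as.integer(regexpr("\\[", melted$variable))

# Start one character after "[" and stop one character before the end,
# which assumes "]" is the last character of every string
melted$variable <- substr(melted$variable, open_pos + 1L, nchar(melted$variable) - 1L)
melted$variable
#[1] "ex1" "ex2"
```

The regex versions above are more forgiving, since they do not depend on ] being the final character.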

Upvotes: 1
