Mehdi Zare

Reputation: 1381

Regex in R: extracting a word at the beginning of a string up to a special character

I have a string like this "JOHN_DOE" and want to extract "JOHN". JOHN has a variable length.

I tried regmatches("^[A-Z]_", "JOHN_DOE") but it doesn't work.

Upvotes: 1

Views: 81

Answers (3)

user12864379

Reputation: 15

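You could use the code below to get the desired output. Note that the character class should be [A-Za-z] rather than [A-z] (the latter also matches the underscore), and the lookahead must not contain a space:

str_extract(x, "^[A-Za-z]+(?=_)")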

Upvotes: 0

akrun

Reputation: 887118

We can use sub to match the _ character together with the word characters that follow it, and replace the match with ""

sub("_\\w+", "", "JOHN_DOE")
#[1] "JOHN"

If more characters follow the second word, add .* to match everything that comes after the word (\\w+)

sub("_\\w+.*", "", "JOHN_DOE.M")
#[1] "JOHN"
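
For comparison, the regmatches approach from the question can also be made to work. The following is only a minimal sketch in base R, assuming the prefix consists of uppercase letters: regmatches takes the string first and the match data from regexpr second, and the character class needs a + quantifier. The lookahead (?=_) requires perl = TRUE.

```r
x <- "JOHN_DOE"

# Locate the run of uppercase letters at the start that is followed by "_"
m <- regexpr("^[A-Z]+(?=_)", x, perl = TRUE)

# Extract the matched substring
result <- regmatches(x, m)
result
#[1] "JOHN"
```

Keep in mind that regmatches silently drops elements with no match, so on a vector the result can be shorter than the input.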

Upvotes: 1

r.bot

Reputation: 5424

One doesn't need stringr to do this, but I find it convenient.

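The pattern anchors at the start of the string (^), matches zero or more upper- or lower-case letters ([a-zA-Z]*), and then requires one more character that is not an underscore ([^\\_]). For "JOHN_DOE" the letter run backs off by one so that the final N satisfies [^\\_], which gives "JOHN". Note that a string such as "JOHN.DOE" would return "JOHN.", because the dot also counts as a non-underscore character.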

library(stringr)

x <- "JOHN_DOE"
str_extract(x, pattern = "^[a-zA-Z]*[^\\_]")
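
To see how the trailing non-underscore class behaves on other inputs, here is a rough base R sketch. The helper first_match is hypothetical (it is not part of stringr or base R) and uses regexpr with perl = TRUE, which is assumed to agree with stringr's engine for patterns this simple. It compares the pattern above with a stricter one that requires an underscore right after the letters.

```r
# Hypothetical helper: first regex match per element, NA where nothing matches
first_match <- function(x, pattern) {
  m <- regexpr(pattern, x, perl = TRUE)
  hit <- as.integer(m) != -1L          # TRUE where a match was found
  out <- rep(NA_character_, length(x))
  out[hit] <- regmatches(x, m)         # regmatches returns matched elements only
  out
}

x <- c("JOHN_DOE", "JOHN.DOE", "J_D")

# Letters, then any single character that is not an underscore
loose <- first_match(x, "^[a-zA-Z]*[^_]")

# Letters only, and an underscore must follow immediately
strict <- first_match(x, "^[a-zA-Z]+(?=_)")

loose
#[1] "JOHN"  "JOHN." "J"
strict
#[1] "JOHN" NA     "J"
```

Which variant is preferable depends on whether inputs without an underscore should yield a value or NA.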

Upvotes: 0
