Rana Usman

Reputation: 1051

Extracting a specific word that follows another in R

I have text descriptions of open positions. I want to extract the grade from each description and put it in an adjacent column. This can be done by fetching the word that follows "Grade:" in the description text.

Simulation

  structure(list(description = structure(2:1, .Label = c("Grade: L3 Position title bla bla bla", 
"Head of xxxxxxxx Grade: L5 Last Date to Apply: 22nd July 2019"
), class = "factor"), division = structure(2:1, .Label = c("ABC", 
"XYZ"), class = "factor")), class = "data.frame", row.names = c(NA, 
-2L))

Requested Result

Description     Division     Grade
sdsdsdsd         XYZ          L5
asdasdsadas      ABC          L3

I found the solution below. It extracts the word, but it does not put it in a column.

Extract text that follows a specific word/s in R

Upvotes: 0

Views: 61

Answers (2)

Ronak Shah

Reputation: 388982

You can use sub() to extract the word after "Grade", allowing zero or more whitespace characters before and after the colon:

sub(".*Grade\\s*:\\s*(\\w+).*", "\\1", df$description)
#[1] "L5" "L3"
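
To get the result into a column, as the question asks, the same pattern can be assigned to a new column. The following is a minimal sketch rather than part of the original answer. It assumes the data frame is named df and adds one hypothetical row without a grade, because sub() returns its input unchanged when the pattern does not match.

```r
# Sample data modelled on the question, plus one hypothetical row
# (the third) that contains no "Grade:" marker.
df <- data.frame(
  description = c(
    "Head of xxxxxxxx Grade: L5 Last Date to Apply: 22nd July 2019",
    "Grade: L3 Position title bla bla bla",
    "Position without a grade"
  ),
  division = c("XYZ", "ABC", "DEF"),
  stringsAsFactors = FALSE
)

pattern <- ".*Grade\\s*:\\s*(\\w+).*"

# sub() leaves non-matching strings untouched, so check for a match
# with grepl() first and fall back to NA where there is none.
has_grade <- grepl(pattern, df$description)
df$Grade <- ifelse(has_grade,
                   sub(pattern, "\\1", df$description),
                   NA_character_)

print(df)
```

Without the grepl() guard, the third row would end up with its full description in the Grade column.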

Upvotes: 2

Cettt

Reputation: 11981

You can use the stringr package like this:

library(stringr)
df[,"Grade"] <- sub("Grade: ", "", str_extract(df$description, "Grade: [^ ]+"))
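
As a possible variation, not part of the original answer, str_match() with a capture group avoids the second sub() call. This sketch assumes stringr is installed and uses character columns instead of factors for simplicity.

```r
library(stringr)

# Sample data modelled on the question.
df <- data.frame(
  description = c(
    "Head of xxxxxxxx Grade: L5 Last Date to Apply: 22nd July 2019",
    "Grade: L3 Position title bla bla bla"
  ),
  division = c("XYZ", "ABC"),
  stringsAsFactors = FALSE
)

# str_match() returns a character matrix: column 1 holds the full match,
# column 2 holds the first capture group (NA where nothing matched).
df$Grade <- str_match(df$description, "Grade:\\s*(\\S+)")[, 2]

print(df)
```

Rows without a "Grade:" marker get NA here, so no separate guard is needed.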

Data:

df <- structure(list(
  description = structure(2:1, .Label = c(
    "Grade: L3 Position title bla bla bla",
    "Head of xxxxxxxx Grade: L5 Last Date to Apply: 22nd July 2019"
  ), class = "factor"),
  division = structure(2:1, .Label = c("ABC", "XYZ"), class = "factor")
), class = "data.frame", row.names = c(NA, -2L))

EDIT: I have just seen that there are far better answers in the comments. It is better to use one of them, since they do not rely on an extra package.

Upvotes: 2
