move_slow_break_things

Reputation: 847

Substring and gsub in R

I have strings formatted like \t\tloc: 'Silver Spring, MD', that I extracted from a website, and I want to retrieve just the city name and state abbreviation, e.g. Silver Spring, MD. I was thinking about combining gsub and substr, but the length of the city name varies with the data, so giving substr fixed start and end indices wouldn't work. Here is the code I have tried so far:

# Would like to extract the string "Silver Spring, MD"
# What I tried:
ldata <- "\t\tloc: 'Silver Spring, MD',"
dt <- gsub(".*: ", "", ldata)
# Produces: "'Silver Spring, MD',"

However, the string always has the same layout, with the city name and state in the 'ABCDE, FG' part of the segment. I'm new to R, so I'd like to know if there's a more efficient way to do this.

Upvotes: 2

Views: 912

Answers (2)

akrun

Reputation: 887571

Another option, without using a capture group, is:

gsub("^[^']+'|',$", '', ldata)
#[1] "Silver Spring, MD"
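
To see why this pattern works, it may help to apply its two alternatives one at a time. The sketch below is only an illustration of the same regex split into two sub() calls; the names step1 and step2 are made up for this example and are not part of the answer.

```r
ldata <- "\t\tloc: 'Silver Spring, MD',"

# First alternative: ^[^']+' matches everything from the start of the
# string up to and including the first single quote.
step1 <- sub("^[^']+'", "", ldata)
step1
#[1] "Silver Spring, MD',"

# Second alternative: ',$ matches the closing quote and trailing comma
# at the very end of the string.
step2 <- sub("',$", "", step1)
step2
#[1] "Silver Spring, MD"
```

gsub applies both alternatives in a single pass, which is why one call is enough in the answer above.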

Upvotes: 1

Shenglin Chen

Reputation: 4554

dt <- sub(".*'(.*)'.*", "\\1", ldata)
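
As a rough sketch of how this behaves on more than one string: sub is vectorized, so the same call should work on a whole character vector. The second element below is invented for illustration and does not come from the question.

```r
# Hypothetical input: the second element is made up for this example.
ldata <- c("\t\tloc: 'Silver Spring, MD',",
           "\t\tloc: 'Baltimore, MD',")

# (.*) captures the text between the two single quotes; "\\1" replaces
# the whole match with just that captured text.
dt <- sub(".*'(.*)'.*", "\\1", ldata)
dt
#[1] "Silver Spring, MD" "Baltimore, MD"
```

Note that if an element contains no quoted part, the pattern does not match and sub returns that element unchanged, so it may be worth checking the input first.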

Upvotes: 1
