Reputation: 309
I'm trying to figure out how to use regex to pull city names from an array of strings. Here's how the strings are formatted:
City of Covina Police Department, Covina, CA 91728
Right now I'm pulling state abbreviations by looping through each string, then looping through an array of US state abbreviations to see if the string includes one of them, like so:
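states = %w[CA NY TX]  # stand-in for my full array of state abbreviations
string = "City of Covina Police Department, Covina, CA 91728"  # the current string from the array
counter = Hash.new(0)  # tallies default to 0
states.each do |state|
  if string.include?(state)
    counter[state] += 1
  end
end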
Based on how the strings are formatted, how would I use regex to find the city in each string? Since I've already found the state, and the city always immediately precedes the state, I should be able to use that to find it, but I'm not well versed in regex, so I'm having trouble working out the pattern. Any ideas?
Upvotes: 0
Views: 356
Reputation: 110685
If:
then you can write:
str.split(',')[-2].strip
Examples:
str = "City of Covina Police Department, Covina, CA 91728"
str.split(',')[-2].strip #=> "Covina"
str = "City of Covina, Police Department, Covina, CA 91728"
str.split(',')[-2].strip #=> "Covina"
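To apply this to the whole array and count cities the way the question counts states, something like the following might work. It is only a sketch: `city_from`, `addresses`, and `city_counts` are names made up here, and it assumes the city is always the second-to-last comma-separated field.

```ruby
# Hypothetical helper: returns the second-to-last comma-separated field,
# or nil when the string has fewer than two fields.
def city_from(address)
  fields = address.split(',')
  return nil if fields.size < 2
  fields[-2].strip
end

# The two example strings from this answer.
addresses = [
  "City of Covina Police Department, Covina, CA 91728",
  "City of Covina, Police Department, Covina, CA 91728"
]

# Tally cities; Hash.new(0) makes missing keys start at 0.
city_counts = Hash.new(0)
addresses.each do |address|
  city = city_from(address)
  city_counts[city] += 1 if city
end
```

With the two strings above, `city_counts` ends up as `{"Covina"=>2}`. The `nil` guard only skips strings with no comma at all; it does not validate that the field really is a city.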
Upvotes: 2
Reputation: 4292
I'm not from the US, so I have no idea whether the state code is always in the format XX and the zip code always 5 digits, but based on that assumption, here it is:
/\w+(?=, \w{2} \d{5}$)/
(?=...$) is a positive lookahead anchored at the end of the string
\w{2} matches the state code
\d{5} matches the zip code
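As a rough sketch of how the pattern could be used in Ruby: `String#[]` with a regex returns the matched text, or `nil` if nothing matches. Note that `\w+` stops at spaces, so a multi-word city would only yield its last word. The second pattern below is my own assumed variant, not part of the answer; it captures everything between the last two commas and uses `\z` for the end of the string. The Los Angeles address is invented for illustration.

```ruby
# Pattern from the answer: word characters followed by ", XX 12345" at the end.
CITY_PATTERN = /\w+(?=, \w{2} \d{5}$)/

str = "City of Covina Police Department, Covina, CA 91728"
city = str[CITY_PATTERN]

# Assumed variant (not from the answer): capture the whole field before
# ", XX 12345", so multi-word cities survive. Capture group 1 is the city.
MULTIWORD_CITY_PATTERN = /([^,]+), [A-Z]{2} \d{5}\z/

# Hypothetical address, made up for illustration.
other = "Los Angeles Police Department, Los Angeles, CA 90012"
short_match = other[CITY_PATTERN]                       # last word only
full_match  = other[MULTIWORD_CITY_PATTERN, 1]&.strip   # whole city field
```

Here `city` is `"Covina"`, `short_match` is `"Angeles"`, and `full_match` is `"Los Angeles"`. The `&.strip` removes the leading space left by the comma separator and stays safe when there is no match.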
Upvotes: 0