Reputation: 55
I need to match only the first country name in the string below. The country names are given in all upper-case letters. I used the following pattern to get the matches, but it matches all of the countries.
'\\b[A-Z]{2,}.\\b'
E.g., in the string below, I just want UNITED KINGDOM:
x = "~ London, Greater London ~ UNITED KINGDOM;~ Ottawa, Ontario ~ CANADA;~,~ AUSTRALIA;~,~ POLAND;~,~ USA"
Upvotes: 0
Views: 65
Reputation: 66819
This seems to work:
regmatches(x, regexpr('\\b[A-Z ]{2,}\\b', x))
# [1] "UNITED KINGDOM"
I added a space to the character set, making it [A-Z ], and dropped the . before the closing \\b. Note that regexpr gets the first match while gregexpr gets all of them (similar to sub vs gsub).
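To make the regexpr/gregexpr difference concrete, here is a minimal sketch using the same x and pattern as above. The variable names pattern, first_match and all_matches are only illustrative, and the results assume R's default regex engine behaves on this input as it does in the output shown above.

```r
# Same input string as in the question
x <- "~ London, Greater London ~ UNITED KINGDOM;~ Ottawa, Ontario ~ CANADA;~,~ AUSTRALIA;~,~ POLAND;~,~ USA"

# Runs of two or more upper-case letters or spaces, bounded by word boundaries
pattern <- "\\b[A-Z ]{2,}\\b"

# regexpr() reports only the first match in each string
first_match <- regmatches(x, regexpr(pattern, x))

# gregexpr() reports every match; regmatches() then returns a list
# with one element per input string, so take the first element
all_matches <- regmatches(x, gregexpr(pattern, x))[[1]]

print(first_match)
print(all_matches)
```

Indexing the gregexpr result, e.g. all_matches[1], would also give the first country, but regexpr avoids scanning for the matches you do not need.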
For more info, I recommend the official docs at ?regexpr.
Upvotes: 2