Reputation: 825

Find just part of string with a regex

I have a string like so:

"@[30:Larry Middleton]"

I want to return just 30. The 30 part will always be digits, and can be any length (one or more digits).

I've tried:

user_id = result.match(/@\[(\d+):.*]/)

But that returns everything. How can I get back just 30?

Upvotes: 1

Answers (5)

Phrogz

Reputation: 303500

If that's really your whole string, you don't need to match the rest of the pattern; just match the first run of consecutive digits:

irb(main):001:0> result = "@[30:Larry Middleton]"
#=> "@[30:Larry Middleton]"
irb(main):002:0> result[/\d+/]
#=> "30"

However, if you need to match this as part of a larger string that might have digits elsewhere:

irb(main):004:0> result[/@\[(\d+):.*?\]/]
#=> "@[30:Larry Middleton]"
irb(main):005:0> result[/@\[(\d+):.*?\]/,1]
#=> "30"
irb(main):006:0> result[/@\[(\d+):.*?\]/,1].to_i
#=> 30

If you need the name also:

irb(main):002:0> m = result.match /@\[(\d+):(.*?)\]/
#=> #<MatchData "@[30:Larry Middleton]" 1:"30" 2:"Larry Middleton">
irb(main):003:0> m[1]
#=> "30"
irb(main):004:0> m[2]
#=> "Larry Middleton"

In Ruby 1.9 and later you can even name the captures, instead of using the capture number:

irb(main):005:0> m = result.match /@\[(?<id>\d+):(?<name>.*?)\]/
#=> #<MatchData "@[30:Larry Middleton]" id:"30" name:"Larry Middleton">
irb(main):006:0> m[:id]
#=> "30"
irb(main):007:0> m[:name]
#=> "Larry Middleton"

And if you need to find many of these:

irb(main):008:0> result = "First there was @[30:Larry Middleton], age 17, and then there was @[42:Phrogz], age unknown."
irb(main):015:0> result.scan /@\[(\d+):.*?\]/
#=> [["30"], ["42"]]
irb(main):016:0> result.scan(/@\[(\d+):.*?\]/).flatten.map(&:to_i)
#=> [30, 42]
irb(main):017:0> result.scan(/@\[(\d+):(.*?)\]/).each{ |id,name| puts "#{name} is #{id}" }
Larry Middleton is 30
Phrogz is 42
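
The multi-match case above could be wrapped up roughly as follows. This is only a sketch: the method name `extract_mentions` and the hash keys `:id` and `:name` are hypothetical, chosen for illustration.

```ruby
# Hypothetical helper: collect every "@[id:name]" mention found in a string.
def extract_mentions(text)
  # The pattern has two groups, so scan yields one [id, name] pair per match.
  text.scan(/@\[(\d+):(.*?)\]/).map do |id, name|
    { id: id.to_i, name: name }
  end
end

text = "First there was @[30:Larry Middleton], age 17, and then there was @[42:Phrogz], age unknown."
extract_mentions(text).each { |m| puts "#{m[:name]} is #{m[:id]}" }
```

Because the name group is non-greedy (`.*?`), each match stops at the first closing bracket, so the two mentions stay separate.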

Upvotes: 5

bkempner

Reputation: 652

I prefer String#scan for most of my regex needs. Here's what I would do:

result.scan(/@\[(\d+):/).flatten.map(&:to_i).first

For your second question about getting the name:

result.scan(/(\d+):([A-Za-z ]+)\]$/).flatten[1]

Scan always returns an array of substring matches:

"@[123:foo bars]".scan(/\d+/) #=> ['123']

If the pattern contains groups in parentheses, each match becomes a sub-array holding one string per group:

"@[123:foo bars]".scan(/(\d+):(\w+)/) #=> [['123', 'foo']]

That's why we have to call flatten on results involving sub-patterns:

[['123', 'foo']].flatten #=> ['123', 'foo']

Also, scan always returns strings, which is why the conversion to integer is needed in the first example:

['123'].map(&:to_i) #=> [123]
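
To see the scan behaviour described here in one place, a small runnable sketch may help. The method name `ids_from` is hypothetical and only there for illustration.

```ruby
s = "@[123:foo bars]"

# No groups: scan returns one string per match.
p s.scan(/\d+/)

# With groups: scan returns one sub-array per match, one element per group.
p s.scan(/(\d+):(\w+)/)

# Hypothetical helper: flatten the per-match sub-arrays, then convert
# each captured string to an Integer.
def ids_from(text)
  text.scan(/@\[(\d+):/).flatten.map(&:to_i)
end

p ids_from(s)
```

Note that `to_i` is called on each string via `map`; an Array itself has no `to_i` method.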

Hope this is helpful.

Upvotes: 1

Niet the Dark Absol

Reputation: 324790

I don't know Ruby, but if it supports lookbehinds and lookaheads:

user_id = result.match(/(?<=@\[)\d+(?=:)/)[0]

If not, you should have some way of retrieving a subpattern from the match; again, I wouldn't know how.
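
For what it's worth, Ruby 1.9 and later supports both lookaround and capture groups. A minimal sketch of the two approaches, with hypothetical method names used only for illustration:

```ruby
# Hypothetical helper: lookbehind (?<=@\[) requires "@[" before the digits and
# lookahead (?=:) requires ":" after them. Neither is part of the match,
# so the whole match is just the digits.
def id_by_lookaround(text)
  text[/(?<=@\[)\d+(?=:)/]
end

# Hypothetical helper: the same result with a capture group; the second
# argument to String#[] selects group 1.
def id_by_capture(text)
  text[/@\[(\d+):/, 1]
end

result = "@[30:Larry Middleton]"
puts id_by_lookaround(result)
puts id_by_capture(result)
```

Both return a String, or nil when the pattern is not found.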

Upvotes: 1

WarHog

Reputation: 8710

You forgot to escape ']', and you need [1] to pick out the captured group:

user_id = result.match(/@\[(\d+):.*\]/)[1]

Upvotes: 1

Jon M

Reputation: 11705

match returns a MatchData object, so index it with [1] to get the first capture group:

user_id = result.match(/@\[(\d+):.*]/)[1]
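
One caveat: match returns nil when the pattern is not found, so indexing with [1] directly would raise NoMethodError on a string with no mention. A guarded sketch, where the method name `mention_id` is hypothetical:

```ruby
# Hypothetical helper: return the id as an Integer, or nil if the
# string contains no "@[id:name]" mention.
def mention_id(text)
  m = text.match(/@\[(\d+):.*\]/)
  # m is nil when nothing matched, so guard before indexing the capture.
  m && m[1].to_i
end

p mention_id("@[30:Larry Middleton]")
p mention_id("no mention here")
```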

Upvotes: 2
