Reputation: 2758
For the strings:
text::handle:[email protected]::text
text::chat_identifier:chat0123456789&text
I have the current regex:
m/(handle:|chat_identifier:)(.+?)(:{2}|&)/
And I am currently using $2 in order to obtain the value I want (in the first string, [email protected], and in the second, chat0123456789).
Is there a better/faster/simpler way to solve this problem, though?
Upvotes: 4
Views: 122
Reputation: 6204
If the values you want are always in the same position and it's safe to split on : and &, then perhaps the following will work for you:
use Modern::Perl;
say +( split /[:&]+/ )[2] for <DATA>;
__DATA__
text::handle:[email protected]::text
text::chat_identifier:chat0123456789&text
Output:
[email protected]
chat0123456789
Upvotes: 1
Reputation: 3675
Looks like you already have a lot of good solutions here. The split method seems like the simplest. But depending on your requirements, you could also use a more generic regex that breaks the string into its basic pieces. It will also work for datatypes and property names other than the ones in your examples.
([^:]+)::([^:]+):([^:&]+)(?:::|&)\1
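The capture groups are as follows: $1 is the datatype (text in your examples), $2 is the property name (handle or chat_identifier), and $3 is the value.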
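As a rough sketch of how that generic regex could be used, the snippet below wraps it in a helper and prints the three captures for both sample strings. The name parse_record is hypothetical, and user@example.com is a placeholder standing in for the address in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: break a record into datatype, property name, and value.
sub parse_record {
    my ($line) = @_;
    # \1 requires the trailing datatype to repeat the leading one (text ... text).
    if ( $line =~ /([^:]+)::([^:]+):([^:&]+)(?:::|&)\1/ ) {
        return ( $1, $2, $3 );
    }
    return;    # empty list when the line does not match
}

# user@example.com is a placeholder address, not the one from the question.
for my $line ( 'text::handle:user@example.com::text',
               'text::chat_identifier:chat0123456789&text' ) {
    my ( $type, $name, $value ) = parse_record($line);
    print "$type / $name / $value\n";
}
```

Note that, because of the [^:&] class, this pattern will not match values that themselves contain a colon or an ampersand.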
Upvotes: 1
Reputation: 44279
For a regex solution, this one is slightly simpler and doesn't need to backtrack:
m/(handle|chat_identifier):([^:&]+)/
Note the slight difference: yours allows single colons within the value, mine doesn't (it stops at the first colon encountered). If that is not a problem, you can use my variant. Or, as I mentioned in a comment, split at : and use the fourth element in the result.
An equivalent version that does only stop at double colons is this:
m/(handle|chat_identifier):((?:(?!::|&).)+)/
Not so beautiful, but it still avoids backtracking (the lookahead might make it slower, though... you will need to profile that, if speed matters at all).
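To make the difference between the two patterns concrete, here is a minimal sketch that runs both against a made-up value containing a single colon. The helper names value_simple and value_lookahead are hypothetical, and the patterns use non-capturing groups for the key so the value ends up in $1 rather than $2.

```perl
use strict;
use warnings;

# Stops at the first ':' or '&' after the key.
sub value_simple {
    my ($line) = @_;
    return $line =~ /(?:handle|chat_identifier):([^:&]+)/ ? $1 : undef;
}

# Stops only at '::' or '&', so single colons may appear in the value.
sub value_lookahead {
    my ($line) = @_;
    return $line =~ /(?:handle|chat_identifier):((?:(?!::|&).)+)/ ? $1 : undef;
}

my $line = 'text::handle:a:b::text';    # made-up input with a colon in the value
print value_simple($line),    "\n";     # prints "a"
print value_lookahead($line), "\n";     # prints "a:b"
```

For the two strings in the question, both helpers should return the same value, since neither value contains a single colon.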
Upvotes: 2
Reputation: 11963
Whether it's "better" or not depends on the context, but you could take this approach: split the string on ":" and take the fourth element of the resulting list. That's arguably more readable than the regex and more robust if the third field can be something other than "handle" or "chat_identifier".
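A minimal sketch of that approach follows. One caveat it has to handle: in the second sample string the value is terminated by & rather than ::, so the fourth field comes back as chat0123456789&text and the tail needs trimming. The helper name fourth_field is hypothetical, and user@example.com is a placeholder address.

```perl
use strict;
use warnings;

# Hypothetical helper: take the fourth ':'-separated field of a record.
sub fourth_field {
    my ($line) = @_;
    my $field = ( split /:/, $line )[3];
    return unless defined $field;    # fewer than four fields
    $field =~ s/&.*//s;              # drop a trailing "&text" when '&' ends the value
    return $field;
}

print fourth_field('text::handle:user@example.com::text'),       "\n";
print fourth_field('text::chat_identifier:chat0123456789&text'), "\n";
```

Splitting on /[:&]+/ instead, as in the other answer, avoids the extra trimming step, at the cost of shifting the index to 2 because the empty fields disappear.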
I think the speed would be very similar for either approach, and probably for almost any implementation in Perl. I'd want to show that speed was critical for this step before worrying about it.
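If speed ever does turn out to matter, the core Benchmark module can compare the candidates directly. This is only a sketch: the labels, the iteration count, and the placeholder input are assumptions, and the numbers will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $line = 'text::handle:user@example.com::text';    # placeholder input

# Each candidate returns the extracted value so the variants stay comparable.
my %candidates = (
    lazy      => sub { my ($v) = $line =~ /(?:handle:|chat_identifier:)(.+?)(?::{2}|&)/; $v },
    negated   => sub { my ($v) = $line =~ /(?:handle|chat_identifier):([^:&]+)/; $v },
    lookahead => sub { my ($v) = $line =~ /(?:handle|chat_identifier):((?:(?!::|&).)+)/; $v },
    split     => sub { my $v = ( split /[:&]+/, $line )[2]; $v },
);

# Run every candidate a fixed number of times and print a comparison table.
cmpthese( 200_000, \%candidates );
```

Measuring against a realistic sample of your own input is likely to tell you more than a single short string does.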
Upvotes: 4