pvomps

Reputation: 13

return nth match from string using regex

I am using Tableau to create a visualization and need to apply a regex to string values in my data set. I'm trying to return the nth match from this string of data: b29f3b2f2b2f3b3f1r2f3+b3x#. The data will always be on one line. I need to break it into substrings each time one of the characters b, s, f, or d is encountered, and then return the nth substring. For example, when identifying which number match to return the following will match:

I can get the n=1 match to return the proper value using bfsd(?=[bfsd]) and have tried to get the subsequent values to return using lookahead, but can't find a regex which works. Any help is appreciated.

Upvotes: 1

Views: 4118

Answers (3)

Wiktor Stribiżew

Reputation: 627327

Your item pattern is [bfsd][^bfsd]*.

You may use ^(?:.*?([bfsd][^bfsd]*)){n} to get what you need: replace n with the number of the match you want, and read the value from capture group 1.

This pattern will get you the second value:

^(?:.*?([bfsd][^bfsd]*)){2}

See regex demo.

Details

  • ^ - start of string
  • (?:.*?([bfsd][^bfsd]*)){2} - two occurrences of
    • .*? - any 0+ chars, as few as possible
    • ([bfsd][^bfsd]*) - b, f, s or d followed by 0+ chars other than b, f, s and d.
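
To check the pattern outside Tableau, here is a minimal Python sketch. The helper name nth_segment is hypothetical and only for illustration; it builds the repeat count from n and reads capture group 1, which after a repeated group holds only the last iteration, i.e. the nth item.

```python
import re
from typing import Optional

SAMPLE = "b29f3b2f2b2f3b3f1r2f3+b3x#"


def nth_segment(text: str, n: int) -> Optional[str]:
    """Return the nth (1-based) item, or None if there are fewer than n items."""
    # Same pattern as in the answer, with the quantifier {n} built from n.
    pattern = r"^(?:.*?([bfsd][^bfsd]*)){" + str(n) + "}"
    match = re.match(pattern, text)
    # A repeated capture group keeps only its last iteration: the nth item.
    return match.group(1) if match else None


print(nth_segment(SAMPLE, 2))   # f3
print(nth_segment(SAMPLE, 8))   # f1r2
print(nth_segment(SAMPLE, 11))  # None (the sample has only 10 items)
```

Python's re module is an assumption here; Tableau uses its own regex engine, so test the pattern there as well.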

Upvotes: 3

karakfa

Reputation: 67547

If you have gawk, FPAT (a gawk extension that defines fields by their content rather than by separators) will partition the input into fields as you specified:

$ awk -v FPAT='[bfsd][0-9rx#+]+' '{$1=$1}1'

$ echo "b29f3b2f2b2f3b3f1r2f3+b3x#" |
  awk -v FPAT='[bfsd][0-9rx#+]+' '{for(i=1;i<=NF;i++) print i " -> " $i}'


1 -> b29
2 -> f3
3 -> b2
4 -> f2
5 -> b2
6 -> f3
7 -> b3
8 -> f1r2
9 -> f3+
10 -> b3x#

Upvotes: 0

Poul Bak

Reputation: 10930

You can use this regex:

[bsfd][^bsfd]*

Use the 'global' flag.

This will create matches that each start with one of the four letters, followed by zero or more characters that are not one of those letters.

The result will be an array with all the matches. Note that the array is zero-indexed, so the nth match is at index n - 1.
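
As a rough illustration of the "global" approach, here is a short Python sketch. It assumes Python's re.findall as the equivalent of the global flag, and the helper name nth_of_all is made up for this example. It collects every match and then picks the nth one by subtracting 1 from n.

```python
import re
from typing import Optional

SAMPLE = "b29f3b2f2b2f3b3f1r2f3+b3x#"

# findall returns every non-overlapping match, like a regex run with the global flag.
segments = re.findall(r"[bsfd][^bsfd]*", SAMPLE)


def nth_of_all(items: list, n: int) -> Optional[str]:
    """Return the nth (1-based) match from a zero-indexed list, or None if out of range."""
    return items[n - 1] if 1 <= n <= len(items) else None


print(segments[0])              # b29
print(nth_of_all(segments, 3))  # b2
```

This only helps if the tool exposes the full list of matches; where only a single match can be returned, the quantified pattern from the other answer is the more direct route.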

Upvotes: 0
