Jake McAllister

Reputation: 1075

How to match text in-between a character pattern with regular expression

Below is the text that I have:

Etiam porta sem malesuada magna mollis euismod. Praesent commodo cursus magna,
vel scelerisque nisl consectetur et. Nulla vitae elit libero, a pharetra augue. 
Donec sed odio dui. Donec id elit non mi porta gravida at eget metus.
|------|------| 
|6 | TEXT | 
|7 | TEXT | 
|8,9 | TEXT | 
|------|------|
Etiam porta sem malesuada magna mollis euismod. Praesent commodo cursus magna,
vel scelerisque nisl consectetur et. Nulla vitae elit libero, a pharetra 

I want to match just this part. How would I do that with a regular expression?

|6 | TEXT | 
|7 | TEXT | 
|8,9 | TEXT |

Here is what I have so far

How can I achieve this?

Upvotes: 1

Views: 211

Answers (4)

Wiktor Stribiżew

Reputation: 626738

I would use your current pattern as the delimiter and a lazy dot pattern, wrapped in a capturing group, to match the text between the delimiters. String#scan then returns the captured text:

/^\|-+\|-+\|\p{Zs}*\s*(.*?)(?=\s*^\|-+\|-+\|)/m

See the regex demo. I added some subpatterns to trim the output on the fly. The /m modifier makes . match any character, including a newline. ^\|-+\|-+\|\p{Zs}*\s* matches the leading delimiter plus any whitespace after it, (.*?) matches and captures the shortest string up to the next delimiter, and the lookahead (?=\s*^\|-+\|-+\|) checks for that delimiter without consuming it, so the same delimiter line can also open the next match (in case you want overlapping matches). Remove (?= and the last ) to consume the closing delimiter and avoid overlapping matches.

rx = /^\|-+\|-+\|\p{Zs}*\s*(.*?)(?=\s*^\|-+\|-+\|)/m
s = "Donec sed odio dui. Donec id elit non mi porta gravida at eget metus.\n|------|------| \n|6 | TEXT | \n|7 | TEXT | \n|8,9 | TEXT | \n|------|------|\nEtiam porta sem malesuada magna mollis euismod. Praesent commodo cursus magna,\nvel scelerisque nisl consectetur et. Nulla vitae elit libero, a pharetra "
puts s.scan(rx)

IDEONE demo
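
To illustrate the difference between the lookahead form and the consuming form of the pattern, here is a minimal sketch. The two-table input and the variable names are made up for illustration and are not taken from the question.

```ruby
# Hypothetical input with two tables (not from the question).
text = <<~TEXT
  intro
  |------|------|
  |1 | A |
  |------|------|
  middle
  |------|------|
  |2 | B |
  |------|------|
  outro
TEXT

# Lookahead form: the closing delimiter is not consumed, so it can
# also act as the opening delimiter of the next match.
with_lookahead = /^\|-+\|-+\|\p{Zs}*\s*(.*?)(?=\s*^\|-+\|-+\|)/m

# Consuming form: "(?=" and the final ")" removed, so each match
# uses up both of its delimiter lines.
consuming = /^\|-+\|-+\|\p{Zs}*\s*(.*?)\s*^\|-+\|-+\|/m

# scan returns one array per match when the pattern has a group,
# so flatten gives a plain list of captured strings.
overlapping = text.scan(with_lookahead).flatten
paired      = text.scan(consuming).flatten

p overlapping
p paired
```

With this input, the lookahead form should also return the text between the two tables ("middle"), because the delimiter that closes the first table is reused as an opening delimiter. The consuming form returns only the two table bodies.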

Upvotes: 0

Cary Swoveland

Reputation: 110675

Must you use a regular expression? If your string is str, you can write:

puts str.split('|------|------|')[1]
  # |6 | TEXT | 
  # |7 | TEXT | 
  # |8,9 | TEXT | 

Upvotes: 1

sawa

Reputation: 168081

string.split(/^[-|]+\s*\n/)[1]
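
As a rough sketch of how this split behaves, the snippet below runs it on a shortened version of the question's text. The sample string and the names parts and rows are assumptions for illustration; the final strip step is an optional extra, not part of the one-liner above.

```ruby
# Shortened sample text; note the trailing space after the first delimiter.
string = "Donec sed odio dui.\n" \
         "|------|------| \n" \
         "|6 | TEXT | \n" \
         "|7 | TEXT | \n" \
         "|8,9 | TEXT | \n" \
         "|------|------|\n" \
         "Etiam porta sem.\n"

# A delimiter is a line made only of "-" and "|", followed by optional
# whitespace and a newline, so the trailing space is absorbed as well.
parts = string.split(/^[-|]+\s*\n/)

# parts[0] is the text before the table, parts[1] the table body,
# parts[2] the text after it.
table = parts[1]

# Optional: one stripped string per row.
rows = table.lines.map(&:strip)

puts table
p rows
```

Data rows such as "|6 | TEXT |" are not treated as delimiters, because the character class stops at the first digit and the pattern then requires only whitespace up to the end of the line.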


Upvotes: 0

RedLaser

Reputation: 680

The following pattern matches the rows you need:

\|\d(,\d)* \| .+ \|

It matches a |, then a digit, then zero or more repetitions of a comma followed by a digit, then a space and a |, then a space, any text, a space, and a final |.

As shown here: https://regex101.com/r/eB0vI3/2
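
Since the other answers use Ruby, here is a hedged sketch of how this pattern could be used with String#scan. The group is written as non-capturing, (?:,\d)*, because scan returns the captured groups instead of the whole match when the pattern contains a capturing group. The sample string and variable names are assumptions for illustration.

```ruby
# Shortened sample text based on the question.
s = "Donec sed odio dui.\n" \
    "|------|------| \n" \
    "|6 | TEXT | \n" \
    "|7 | TEXT | \n" \
    "|8,9 | TEXT | \n" \
    "|------|------|\n" \
    "Etiam porta sem.\n"

# Same pattern as above, but with a non-capturing group so that
# scan returns each whole matched row.
row_pattern = /\|\d(?:,\d)* \| .+ \|/
rows = s.scan(row_pattern)

# With the original capturing group, scan yields only the group
# contents for each match (nil when the group did not participate).
groups_only = s.scan(/\|\d(,\d)* \| .+ \|/)

p rows
p groups_only
```

Note that \d matches a single digit, which fits the sample rows; numbers with more than one digit would need \d+ instead.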

Upvotes: 1
