tahwos

Reputation: 574

Regular expression syntax

I have a problem similar to a previously asked question, but similar practices apparently do not produce similar results.

Previous Question

New question: I want to match each line beginning with T as the first match, and the lines beginning with X that follow it as the second match (as one whole string, to be matched later by another regex).

What I have so far is (^T(\d+)\n(.*?)(?:the_problem)/m). I don't know what to replace "the_problem" with, or even whether that is the issue. I assumed some rendition of (?:\n|\z), but apparently not. Nothing I tried would treat the next occurrence of ^T(\d+) as the start of a new group while still capturing all of the lines between one occurrence and the next.

Sample text:
T01C0.025
T02C0.035
T03C0.055
T04C0.150
T05C0.065
T06C0.075
%
G05
G90
T01
X011200Y004700
X011200Y009700
X018500Y011200
X013500Y-011200
X023800Y019500
T02
X034800Y017800
X-033800Y-017800
X032800Y017800
T03
X036730Y003000
X038700Y003000
X040668Y-003000
X059230Y003000
T04
X110580Y017800
X023800Y027300
X095500Y028500
X005500Y-006500
X021500Y-006500
T05
X003950Y002000
X003950Y004500
X003950Y007000
T06
X026300Y027300
M30

I only want to capture the short form (T01, T02, ... T0n), not the longer form at the top of the file, and then the entire run of ^X(-?\d+)Y(-?\d+) lines that follows it, as another match.

Result 1.
Match 1. T01
Match 2. X011200Y004700
         X011200Y009700
         X018500Y011200
         X013500Y-011200
         X023800Y019500

Result 2.         
Match 1. T02
Match 2. X034800Y017800
         X-033800Y-017800
         X032800Y017800

Result 3.         
Match 1. T03
Match 2. X036730Y003000
         X038700Y003000

         ....etc....

Thanks in advance for any help ;-) Note: I prefer to use plain Ruby, without extensions or plugins. My version of Ruby is 1.8.6.

Upvotes: 2

Views: 155

Answers (3)

Emily

Reputation: 18193

Try this instead:

^(T[^\s]+)[\n\r\s]((?:(?:X\S+)[\n\r\s])+)

It turns the group for each X line into a non-capturing group, then wraps all the repetitions of that pattern in a single capturing group, so all the X lines end up in one capture.

You can test this using Rubular (an indispensable tool for developing regular expressions) http://rubular.com/r/PRnurKy64Q
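As a rough sketch of how this pattern might be used from plain Ruby, the snippet below runs it through String#scan on a shortened copy of the sample text and then makes the second pass the question mentions. The variable names (text, pattern, results, parsed) are illustrative assumptions, not part of the original answer.

```ruby
# Shortened copy of the sample text from the question.
text = [
  "T01C0.025",
  "T02C0.035",
  "%",
  "G05",
  "G90",
  "T01",
  "X011200Y004700",
  "X011200Y009700",
  "T02",
  "X034800Y017800",
  "X-033800Y-017800",
  "M30"
].join("\n") + "\n"

# The pattern from this answer. In Ruby, ^ matches at the start of
# every line, so no extra flag is needed.
pattern = /^(T[^\s]+)[\n\r\s]((?:(?:X\S+)[\n\r\s])+)/

# String#scan returns one [tool, x_lines] pair per match.
results = text.scan(pattern)

# Second pass: pull the signed X and Y values out of each block.
parsed = results.map do |tool, x_lines|
  [tool, x_lines.scan(/^X(-?\d+)Y(-?\d+)/)]
end

parsed.each { |tool, coords| puts "#{tool}: #{coords.inspect}" }
```

The long header lines such as T01C0.025 are skipped because no X line follows them directly, so only the short T01, T02 entries produce a match.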

Upvotes: 2

Justin Morgan

Reputation: 30715

I'm not totally sure I understand your problem, but I'll give this a shot. It looks like you want:

/^(T\d+)$\n((?:^X[-A-Z\d]+$\n?)+)/

This will have to be run under multiline mode so that ^ and $ match after and before newlines. Word of caution: I don't have much practice with multiline regex, so you might want to do a sanity check on the use of ^ and $.

Also, I notice you didn't include the lines similar to T01C0.025 in your sample results, so I made the T\d+ assumption based on that.
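For the sanity check on ^ and $ suggested above, here is a small Ruby sketch. It relies on Ruby's own regex behaviour: ^ and $ match at line boundaries without any flag, \A anchors to the start of the whole string, and the /m flag only makes . match newlines. The sample string and variable names are assumptions made for illustration.

```ruby
text = "T01\nX011200Y004700\nT02\n"

# ^ and $ match at every line boundary with no flag at all.
line_anchored = text.scan(/^T\d+$/)

# \A anchors to the start of the whole string instead.
string_anchored = text.scan(/\AT\d+$/)

# /m only changes what . matches: with it, . also matches "\n".
dot_without_m = text[/T01.X/]
dot_with_m    = text[/T01.X/m]

# $ and ^ are zero-width, so the newline between two lines still
# has to be consumed explicitly.
no_newline   = (text =~ /^T\d+$^X/)
with_newline = (text =~ /^T\d+$\n^X/)

puts line_anchored.inspect
puts string_anchored.inspect
puts dot_without_m.inspect
puts dot_with_m.inspect
puts no_newline.inspect
puts with_newline.inspect
```

The last two results are the ones that matter for a pattern spanning several lines: putting $ directly before ^ never matches, because neither anchor consumes the newline between them.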

Upvotes: 0

David Chan

Reputation: 7505

This seems to work:

^(T[^\s]+)[\n\r\s]((X[^\s]+)[\n\r\s]){1,}
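One thing to watch with this pattern: a capturing group that is repeated keeps only its last repetition, so the numbered groups hold just the final X line. The sketch below, on a shortened sample, shows that and one possible way to recover the whole X block from the full match. The variable names and the String#sub step are assumptions for illustration, not part of the original answer.

```ruby
text = [
  "T01",
  "X011200Y004700",
  "X011200Y009700",
  "X018500Y011200",
  "T02",
  ""
].join("\n")

pattern = /^(T[^\s]+)[\n\r\s]((X[^\s]+)[\n\r\s]){1,}/

m = pattern.match(text)

tool   = m[1]   # the T line
last_x = m[3]   # only the last repetition of the inner group survives
whole  = m[0]   # the full match still spans every X line

# Strip the leading T line from the full match to get the X block.
x_block = whole.sub(/\AT[^\s]+[\n\r\s]/, "")

puts tool
puts last_x
puts x_block
```

Making the inner groups non-capturing and wrapping the repetition in one outer group avoids the extra step, at the cost of a slightly longer pattern.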

Upvotes: 0
