Flo Bayer

Reputation: 1260

regex: matching only first occurrence per line

I want to select the first " - " from each line:

123 - foo - asdf
234 - bar - abcdefg
345 - foobar and hello world

If you use \s-\s it will select both occurrences from the first 2 lines.

What I want is to match one space, followed by a hyphen, followed by another space (\s-\s), not just the hyphen, and only the first time it occurs on each line. Replacing that match with test should turn the first line into 123testfoo - asdf.

I think you have to add a ? to make it non-greedy, but I don't know how.

Thanks.

Edit: Here's the goal:

I've got a huge file of IDs and texts and I want to create a MySQL INSERT statement. So I want to replace the first \s-\s occurrence on each line with , ' (in that part).

Upvotes: 24

Views: 49756

Answers (4)

d0dulk0

Reputation: 112

/(?<=(^[^-]*?))-/gm

This worked for me on the regexr website. Unfortunately, I'm not sure which regex flavors it works with; it relies on a variable-length lookbehind, which not every engine supports.

It combines a lookbehind containing a group with the global and multiline flags. It seems to match the first occurrence of a given character on each line; to target a different character, replace both "-" in the pattern with it.
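
Since the flavor matters here, a quick way to check a given engine is to try compiling the pattern. The sketch below does that for Python's built-in re module, which is only one example engine and is not mentioned in the answer; the helper name is_supported is made up for illustration.

```python
import re

def is_supported(pattern: str) -> bool:
    """Return True if Python's built-in re module can compile the pattern."""
    try:
        re.compile(pattern, re.MULTILINE)
    except re.error:
        # Raised for constructs the engine rejects, such as a
        # lookbehind whose width is not fixed.
        return False
    return True

# The lookbehind pattern from this answer (without the /.../gm wrapper).
LOOKBEHIND_PATTERN = r"(?<=(^[^-]*?))-"

# An anchored pattern without lookbehind, for comparison.
ANCHORED_PATTERN = r"^[^-]*-"

print(is_supported(LOOKBEHIND_PATTERN))  # False
print(is_supported(ANCHORED_PATTERN))    # True
```

Python's re insists on fixed-width lookbehind, so the pattern is rejected there even though it runs in engines that allow variable-length lookbehind.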

Upvotes: 0

Alvaro Rodriguez Scelza

Reputation: 4164

If you have a similar problem but cannot solve it with the accepted answer, try this.

I was trying to remove the first

','

from an SQL-exported data file, so as to merge people's names with their surnames. Example data:

('MARTIN','MULLER QUEIJO',46574180,'1996-06-12','Dirección no definida','Dirección no definida','',3)
,('ALFREDO','ACOSTA',342,'23423','asdas','asdasd','',3)
,('JULIO','RODRIGUEZ',3223424202569,'42423','23fs','asdas','',3)
,('EVELIS (43)','MAIDANA CASCO',2342,'asdas','dfgdfg','3ggfd','',3)

To do so I ended up using this regex:

(',')(?=.*(',').*(',').*(','))

As you can see, I match the first occurrence of my token (',') and use the lookahead to require that the same token still appears later in the line as many times as you expect it to remain. So this only works if you already know how many times the token appears in each line, and as long as that count is the same for every line (4 times in this case, so the lookahead checks for 3 more).
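
A minimal sketch of this lookahead approach in Python, using two of the sample rows above. The function name merge_name_fields and the choice of a single space as the replacement are assumptions made for illustration; the answer does not say what the token was replaced with.

```python
import re

# Pattern from this answer: match ',' only when three more ',' tokens
# follow on the same line ("." does not cross newlines by default).
FIRST_SEPARATOR = re.compile(r"(',')(?=.*(',').*(',').*(','))")

def merge_name_fields(line: str) -> str:
    """Replace the first ',' on the line with a space (assumed replacement)."""
    return FIRST_SEPARATOR.sub(" ", line)

rows = [
    ",('ALFREDO','ACOSTA',342,'23423','asdas','asdasd','',3)",
    ",('JULIO','RODRIGUEZ',3223424202569,'42423','23fs','asdas','',3)",
]
for row in rows:
    print(merge_name_fields(row))
# ,('ALFREDO ACOSTA',342,'23423','asdas','asdasd','',3)
# ,('JULIO RODRIGUEZ',3223424202569,'42423','23fs','asdas','',3)
```

Only the first token has three more after it, so the later ones are left alone even without limiting the replacement count.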

Upvotes: 1

ctwheels

Reputation: 22817

To match the first occurrence on every line you need to anchor the pattern to the start of the line.


^([^-]*?)\s*-\s*
  • ^ Assert position at the start of the line
  • ([^-]*?) Capture any character except - any number of times, but as few as possible into capture group 1
  • \s*-\s* Match any number of whitespace characters, followed by the hyphen - character, followed by any number of whitespace characters

Replacement: $1, '

The token $1 is a reference to the text that was most recently captured by the first capture group.
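
As a rough illustration, here is how the pattern and replacement above could be applied in Python to the sample lines from the question. Python is an assumption (the answer does not name a language), and there the backreference is written \1 rather than $1; the function name to_insert_fragment is made up for this sketch.

```python
import re

# Pattern from the answer: lazily capture everything before the first
# hyphen, then match the hyphen and any whitespace around it.
# re.MULTILINE makes ^ match at the start of every line.
FIRST_HYPHEN = re.compile(r"^([^-]*?)\s*-\s*", re.MULTILINE)

def to_insert_fragment(text: str) -> str:
    # \1 is Python's spelling of the $1 backreference.
    return FIRST_HYPHEN.sub(r"\1, '", text)

sample = (
    "123 - foo - asdf\n"
    "234 - bar - abcdefg\n"
    "345 - foobar and hello world"
)
print(to_insert_fragment(sample))
# 123, 'foo - asdf
# 234, 'bar - abcdefg
# 345, 'foobar and hello world
```

Because the pattern is anchored with ^, a second hyphen later in the same line can never start a match, so no replacement count is needed.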

Upvotes: 21

Justinas Marozas

Reputation: 2654

Use a negative lookbehind to make sure there are no hyphens earlier in the line than your match:

(?<![-].*)(\s[-]\s)
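
This pattern needs an engine that accepts variable-length lookbehind. Where that is not available (Python's built-in re module is one such case), a different technique gives the same result: process the text line by line and cap the replacement at one per line. A minimal sketch, with the function name replace_first_per_line chosen for illustration:

```python
import re

def replace_first_per_line(text: str, replacement: str) -> str:
    """Replace only the first space-hyphen-space on each line."""
    return "\n".join(
        # count=1 stops after the first match on the line.
        re.sub(r"\s-\s", replacement, line, count=1)
        for line in text.split("\n")
    )

sample = (
    "123 - foo - asdf\n"
    "234 - bar - abcdefg\n"
    "345 - foobar and hello world"
)
print(replace_first_per_line(sample, "test"))
# 123testfoo - asdf
# 234testbar - abcdefg
# 345testfoobar and hello world
```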

Upvotes: 0
