Finkelson

Reputation: 3013

How to capture parts of a string into groups using regex?

I need to capture 4 groups from:

John.7200_24.6.txt.gz

Output:

Group1: John
Group2: 7200
Group3: 24
Group4: 6

Here is my regex: ([^.|_|data|gz]+)

It captures a single group with multiple matches. How can I fix it?

Upvotes: 0

Views: 3271

Answers (1)

The fourth bird

Reputation: 163362

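This pattern ([^.|_|data|gz]+) can be written as ([^._datagz|]+). Inside a character class, | is a literal pipe and words like data or gz are read as individual characters, so the negated character class matches 1+ chars other than the single chars listed.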
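As a quick check of that equivalence, here is a minimal Python sketch (Python and the variable names are my choice for illustration, the question does not name a language). It runs both character classes over the sample string and also shows why this approach is fragile: the x from txt slips through as an extra match.

```python
import re

name = "John.7200_24.6.txt.gz"

# Both classes exclude the same single characters: . | _ d a t g z
original = re.findall(r"([^.|_|data|gz]+)", name)
simplified = re.findall(r"([^._datagz|]+)", name)

print(original)    # ['John', '7200', '24', '6', 'x']
print(simplified)  # ['John', '7200', '24', '6', 'x']
```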

You use a single capture group that matches repeatedly, which effectively splits the string. If you want 4 separate groups, create 4 groups and match the structure of the string instead of splitting.

^(\w+)\.(\d+)_(\d+)\.(\d+)
  • ^ Start of string
  • (\w+)\. Capture 1+ word chars in group 1 and match .
  • (\d+)_ Capture 1+ digits in group 2 and match _
  • (\d+)\. Capture 1+ digits in group 3 and match .
  • (\d+) Capture 1+ digits in group 4
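
The pattern can be tried out like this. This is only a sketch assuming Python's re module; the names pattern, match and groups are illustrative.

```python
import re

# Four explicit capture groups, anchored at the start of the string
pattern = re.compile(r"^(\w+)\.(\d+)_(\d+)\.(\d+)")

match = pattern.match("John.7200_24.6.txt.gz")
# Fall back to an empty tuple if the string does not have the expected shape
groups = match.groups() if match else ()

print(groups)  # ('John', '7200', '24', '6')
```

Each value is then available separately, for example match.group(1) for the name and match.group(4) for the last number.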


Or matching the full example string:

^(\w+)\.(\d+)_(\d+)\.(\d+)\.\w+\.gz$
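
A small sketch of the stricter variant, again assuming Python; the helper parse and the second file name are hypothetical and only there to show a non-matching case.

```python
import re

# Same four groups, but the rest of the name must be ".<word>.gz"
full = re.compile(r"^(\w+)\.(\d+)_(\d+)\.(\d+)\.\w+\.gz$")

def parse(name):
    """Return the four captured parts, or None if the name does not match."""
    m = full.match(name)
    return m.groups() if m else None

print(parse("John.7200_24.6.txt.gz"))   # ('John', '7200', '24', '6')
print(parse("John.7200_24.6.txt.zip"))  # None
```

Anchoring on the full file name is useful when other files with a similar prefix should be skipped rather than partially matched.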

Upvotes: 2
