James Cook

Reputation: 344

Regex for all characters up to, but not including, \n

Here I have a text string.

Serial#......... 12345678910123456\nCust#........... 654321\nCustomer Name... Some Customer\nBILL TO NO NAME. Bill To: 123456 - Some Company Pty Ltd\nDATE...... 01/01/00

I want to capture 2 parts of this string.

Cust#........... 654321 BILL TO NO NAME. Bill To: 123456 - Some Company Pty Ltd

using regex.

So far I have Cust#.*?\d+ which captures

Cust#........... 654321

However, I don't think this is the best approach.

Note: this is one string out of thousands, so the data within each string varies. Can I capture everything up to the end-of-line \n character to achieve my result?

Upvotes: 0

Views: 56

Answers (2)

The fourth bird

Reputation: 163362

You might use 2 capturing groups. In the first group, use your pattern without the lazy quantifier, as the digits are at the end of the line.

Then match (without capturing) all the lines that do not start with BILL.

After that, capture in group 2 the whole line that starts with BILL.

^(Cust#.*\d+)(?:\r?\n(?!BILL ).*)*\r?\n(BILL .*)

Explanation

  • ^ Start of line (this needs the multiline flag, since Cust# is not on the first line of the sample string)
  • ( Capture group 1
    • Cust#.*\d+ The pattern to match Cust# with the digits at the end
  • ) Close group
  • (?:\r?\n(?!BILL ).*)*\r?\n Match all lines that do not start with BILL
  • ( Capture group 2
    • BILL .* Match the line that starts with BILL
  • ) Close group
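
To try the pattern outside a regex tester, here is a minimal Python sketch. It assumes the \n sequences in the question are real newline characters and that the multiline flag is on, so ^ can match at the start of the Cust# line. The function name extract_cust_and_bill is only an illustrative name, not something from the question.

```python
import re

# Pattern from this answer. re.MULTILINE is an assumption added here so that
# ^ can match at the start of the "Cust#" line, which is not the first line.
PATTERN = re.compile(
    r"^(Cust#.*\d+)(?:\r?\n(?!BILL ).*)*\r?\n(BILL .*)",
    re.MULTILINE,
)


def extract_cust_and_bill(text):
    """Return (cust_line, bill_line), or None when the pattern does not match."""
    match = PATTERN.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Sample string from the question, with \n taken as real newlines.
sample = (
    "Serial#......... 12345678910123456\n"
    "Cust#........... 654321\n"
    "Customer Name... Some Customer\n"
    "BILL TO NO NAME. Bill To: 123456 - Some Company Pty Ltd\n"
    "DATE...... 01/01/00"
)

result = extract_cust_and_bill(sample)
if result is not None:
    cust_line, bill_line = result
    print(cust_line)
    print(bill_line)
```

Because the lines between Cust# and BILL are skipped by the non-capturing group, the number of lines in between can vary from string to string.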


Upvotes: 0

Gary_W

Reputation: 10360

Try this regex: ^.*?\n(.*?)\n.*?\n(.*?)\n.*$ At the very least, it should give you a different way of looking at the problem.

It describes the entire string, using newlines as element delimiters. The parentheses define the groups you want to save, which here are the 2nd and 4th elements (lines).

Of course this depends on the elements you want always being the 2nd and 4th and being delimited by the newlines.
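
As a rough illustration of the positional approach, here is a small Python sketch. It assumes the \n sequences are real newlines and that no regex flags are set. The helper name extract_second_and_fourth is hypothetical.

```python
import re

# Positional pattern from this answer: capture the 2nd and 4th lines.
# No flags are used, so ^ and $ anchor to the whole string and "." does not
# cross a newline.
POSITIONAL = re.compile(r"^.*?\n(.*?)\n.*?\n(.*?)\n.*$")


def extract_second_and_fourth(text):
    """Return (second_line, fourth_line), or None when the shape differs."""
    match = POSITIONAL.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Sample string from the question, with \n taken as real newlines.
sample_text = (
    "Serial#......... 12345678910123456\n"
    "Cust#........... 654321\n"
    "Customer Name... Some Customer\n"
    "BILL TO NO NAME. Bill To: 123456 - Some Company Pty Ltd\n"
    "DATE...... 01/01/00"
)

print(extract_second_and_fourth(sample_text))
```

Used this way, the pattern only matches strings with exactly five lines. A string with fewer or more lines returns None, which is one way to spot records that do not follow the expected layout.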

https://regex101.com/r/harmzn/1

Upvotes: 1
