Reputation: 3025
I have a text file that denotes remarks with a single '. Some lines have two quotes, but I need to get everything from the first instance of a ' to the line feed.
I AL01 ' A-LINE '091398 GDK 33394178
402922 0831850 ' '091398 GDK 33394179
I AL02 ' A-LINE '091398 GDK 33394180
400722 0833118 ' '091398 GDK 33394181
I A10A ' A-LINE 102 ' 53198 DJ 33394182
395335 0832203 ' ' 53198 DJ 33394183
I A10B ' A-LINE 102 ' 53198 DJ 3339418
Upvotes: 173
Views: 354479
Reputation: 170
https://regex101.com/r/Jjc2xR/1
/(\w*\(Hex\): w*)(.*?)(?= |$)/gm
I'm sure this one works: it captures the hex serial number in the badly structured multiline text below.
Space Reservation: disabled
Serial Number: wCVt1]IlvQWv
Serial Number (Hex): 77435674315d496c76515776
Comment: new comment
I'm an eternal newbie in regex, but I'll try to explain this one:
(\w*\(Hex\): w*) : Finds the text in the line that contains "(Hex): "
(.*?) : This is the second captured group; it matches everything after that, taking as few characters as possible
(?= |$) : A lookahead that ends the match just before the next space or at the end of the line
So the second group holds the value.
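As a minimal sketch of how the two groups behave, here is the same pattern run with Python's `re` module. Python is an assumption here (the answer only links to regex101), and `re.MULTILINE` stands in for the `m` flag; the `g` flag has no direct equivalent and is not needed for a single `search`.

```python
import re

# Sample text copied from the answer above.
text = (
    "Space Reservation: disabled\n"
    "Serial Number: wCVt1]IlvQWv\n"
    "Serial Number (Hex): 77435674315d496c76515776\n"
    "Comment: new comment"
)

# Same pattern as in the answer; MULTILINE lets $ match at each line end.
pattern = re.compile(r"(\w*\(Hex\): w*)(.*?)(?= |$)", re.MULTILINE)

match = pattern.search(text)
label = match.group(1)   # the "(Hex): " marker
value = match.group(2)   # everything up to the next space or end of line
print(repr(label), value)
```

Because the hex value contains no spaces, the lazy group runs until the end of that line.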
Upvotes: 0
Reputation: 469
In your example I'd go for the following pattern:
'([^\n]+)$
Use the multiline and global options to match all occurrences.
To include the linefeed in the match you could use:
'[^\n]+\n
But this might miss the last line if it has no linefeed.
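For a single line, if you don't need to match the linefeed, I'd prefer to use:
'[^$]+$
Note that inside a character class $ is a literal dollar sign, so [^$] means "any character except $" and also matches line breaks; this is only safe on a single line that contains no $.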
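To illustrate the difference between the first two patterns, here is a small sketch in Python (an assumed language; the answer does not name one). `re.findall` already returns every match, so only the multiline flag is needed. The sample uses three lines from the question, with the last line deliberately left without a trailing linefeed.

```python
import re

# Three lines from the question; the last one has no trailing "\n".
sample = (
    "I AL01 ' A-LINE '091398 GDK 33394178\n"
    "402922 0831850 ' '091398 GDK 33394179\n"
    "I A10A ' A-LINE 102 ' 53198 DJ 33394182"
)

# Everything after the first quote on each line, without the linefeed.
remarks = re.findall(r"'([^\n]+)$", sample, flags=re.MULTILINE)

# Same idea, but the match must end with a linefeed.
with_newline = re.findall(r"'[^\n]+\n", sample)

print(len(remarks), len(with_newline))
```

The second pattern finds one match fewer, which is the "might miss the last line" caveat mentioned above.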
Upvotes: 13
Reputation: 201
When I tried '.* in Notepad++ on Windows, it matched everything from the first ' to the end of the last line.
To capture everything until end of that line I typed the following:
'.*?\n
This captures only the text from the ' to the end of that line.
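The behaviour described here is what you would expect when the dot is allowed to match line breaks (Notepad++ has a ". matches newline" option; whether it was enabled in this case is an assumption). A rough Python sketch of the same effect, using `re.DOTALL` as the stand-in for that option:

```python
import re

# Two shortened sample lines, each ending in a linefeed.
sample = "I AL01 ' A-LINE '091398\n402922 0831850 ' '091398\n"

# Greedy dot with DOTALL runs from the first quote to the end of the text.
greedy = re.search(r"'.*", sample, flags=re.DOTALL).group(0)

# Lazy dot followed by \n stops at the first linefeed.
lazy = re.search(r"'.*?\n", sample, flags=re.DOTALL).group(0)

print(repr(greedy))
print(repr(lazy))
```

On files with Windows line endings, the lazy match would also include the carriage return that precedes the linefeed.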
Upvotes: 20
Reputation: 39836
The appropriate regex would be the ' char followed by any number of any chars [including zero chars] ending with an end of string/line token:
'.*$
And if you wanted to capture everything after the ' char but not include it in the output, you would use:
(?<=').*$
This basically says give me all characters that follow the ' char until the end of the line.
Edit: It has been noted that $ is implicit when using .* (by default . does not match a line break, so .* already runs to the end of the line) and is therefore not strictly required. The pattern:
'.*
is technically correct, however it is clearer to be specific and avoid confusion for later code maintenance, hence my use of the $. It is my belief that it is always better to declare explicit behaviour than rely on implicit behaviour in situations where clarity could be questioned.
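A short sketch comparing the three patterns on one line from the question, written in Python as an assumed language (the answer itself is language-neutral):

```python
import re

line = "I AL01 ' A-LINE '091398 GDK 33394178"

# Quote included in the match.
with_quote = re.search(r"'.*$", line).group(0)

# Lookbehind: the quote must precede the match but is not part of it.
without_quote = re.search(r"(?<=').*$", line).group(0)

# Without the explicit $, the greedy .* still runs to the end of the line.
implicit = re.search(r"'.*", line).group(0)

print(repr(with_quote))
print(repr(without_quote))
print(with_quote == implicit)
```

Both variants start at the first ' on the line, because the regex engine takes the leftmost position where a match is possible.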
Upvotes: 132
Reputation: 106332
This will capture everything up to the ' in backreference 1, and everything after the ' in backreference 2. You may need to escape the apostrophes, though, depending on the language (\').
/^([^']*)'?(.*)$/
Quick modification: if the line doesn't have an ' - backreference 1 should still catch the whole line.
^ - start of string
([^']*) - capture any number of not ' characters
'? - match the ' 0 or 1 time
(.*) - capture any number of characters
$ - end of string
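The breakdown above can be sketched in Python (an assumed language; the pattern is given with generic `/.../` delimiters) by applying the pattern to one line at a time:

```python
import re

# Same pattern as above, without the surrounding slashes.
pattern = re.compile(r"^([^']*)'?(.*)$")

# A line that contains a quote: split around the first one.
before, after = pattern.match("I AL01 ' A-LINE '091398 GDK 33394178").groups()

# A line without any quote: group 1 takes the whole line, group 2 is empty.
no_quote = pattern.match("no remark here").groups()

print(repr(before), repr(after))
print(no_quote)
```

Since `[^']` also matches line breaks, this sketch assumes the input has already been split into individual lines.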
Upvotes: 5
Reputation: 7451
'.*$
Starting with a single quote ('), match any character (.) zero or more times (*) until the end of the line ($).
Upvotes: 35