Reputation: 575
Here is the string:
Acanthite (Y: 1855) 02.BA.35 [18] [19] [20]
(IUPAC: Disilver sulfide)
Acetamide (1974-039) 10.AA.20 [21] [22] [23]
(IUPAC: Acetic acid amide)
Achalaite (2013-103) 04.?? [24] [no] [no]
Achavalite (Y: 1939
Here's my regex:
([^B35\[1-9\] 0:Y\(\)\n-.?])+
I've also tried:
^[a-z]+
What I would like as output, one name per line, is the following (no particular programming language is used):
Acanthite
Acetamide
Achalaite
Achavalite
Upvotes: 1
Views: 57
Reputation: 626926
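Since you have a multiline string as input and need to remove everything but the first word on each line that starts with a Latin letter, you can use the following trick: match from the start of each such line (the ^ start-of-string anchor together with the /m multiline modifier), capture the first word, consume the rest of that line plus any following lines that do not start with a letter, and replace each match with the first capture group. The regex is: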
(?im)^([a-z]+).*(\r?\n[^a-z].*)*
See the demo
The (?im) is the inline representation of the m (multiline) and i (ignorecase) flags.
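The substitution described above can be sketched in Python. The choice of Python is an assumption, since the question names no language, and the variable names `text`, `pattern`, and `result` are illustrative only. Each match is replaced with capture group 1, which leaves just the leading word.

```python
import re

# Sample input copied from the question (its last line is truncated there too).
text = """Acanthite (Y: 1855) 02.BA.35 [18] [19] [20]
(IUPAC: Disilver sulfide)
Acetamide (1974-039) 10.AA.20 [21] [22] [23]
(IUPAC: Acetic acid amide)
Achalaite (2013-103) 04.?? [24] [no] [no]
Achavalite (Y: 1939"""

# The regex from this answer, with the inline i and m flags.
pattern = re.compile(r"(?im)^([a-z]+).*(\r?\n[^a-z].*)*")

# Replace every match with group 1, i.e. the first word of the line.
result = pattern.sub(r"\1", text)
print(result)
```

The newline in front of each following name is not part of any match, so the names stay on separate lines.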
The regex breakdown:
^ - start of line
([a-z]+) - 1 or more Latin letters
.* - the rest of the line
(\r?\n[^a-z].*)* - 0 or more sequences of...
  \r?\n - a newline
  [^a-z] - a symbol other than a Latin letter
  .* - the rest of the line

Note that to match and remove the non-welcome lines from the start of the string, you need to add the (?:[^a-z].*\r?\n)* subpattern to the beginning:
(?im)^(?:[^a-z].*\r?\n)*([a-z]+).*(\r?\n[^a-z].*)*
      ^^^^^^^^^^^^^^^^^^
See another demo
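To illustrate why the extra prefix matters, here is a small Python sketch (Python is an assumption, as the question names no language). The input `text_with_prefix` is hypothetical: it puts two non-letter lines before the first name, which the sample in the question does not have.

```python
import re

# Hypothetical input: two lines that do not start with a letter come first.
text_with_prefix = (
    "[1] header\n"
    "(IUPAC: x)\n"
    "Acanthite (Y: 1855)\n"
    "(IUPAC: Disilver sulfide)\n"
    "Acetamide (1974-039)"
)

basic = re.compile(r"(?im)^([a-z]+).*(\r?\n[^a-z].*)*")
extended = re.compile(r"(?im)^(?:[^a-z].*\r?\n)*([a-z]+).*(\r?\n[^a-z].*)*")

# The basic pattern leaves the leading non-letter lines untouched.
kept_prefix = basic.sub(r"\1", text_with_prefix)

# The extended pattern consumes them as part of the first match.
cleaned = extended.sub(r"\1", text_with_prefix)
print(cleaned)
```

With the basic pattern the first two lines survive in the output; with the extended one only the names remain.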
Upvotes: 1
Reputation: 829
Use this pattern:
A\w*e\s
See demo: https://regex101.com/r/hH8xD4/1
Upvotes: 0
Reputation: 174716
Just add the case-insensitive modifier (along with the multiline modifier), or include A-Z inside the character class.
/^[a-z]+/im
or
(?im)^[a-z]+
or
(?m)^[a-zA-Z]+
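A minimal sketch of this extraction approach in Python (an assumption, since the question names no language; `text`, `names`, and `original_attempt` are illustrative names). Rather than removing the unwanted text, it collects the leading word of every line that starts with a letter.

```python
import re

# Sample input copied from the question (its last line is truncated there too).
text = """Acanthite (Y: 1855) 02.BA.35 [18] [19] [20]
(IUPAC: Disilver sulfide)
Acetamide (1974-039) 10.AA.20 [21] [22] [23]
(IUPAC: Acetic acid amide)
Achalaite (2013-103) 04.?? [24] [no] [no]
Achavalite (Y: 1939"""

# With (?m), ^ matches at the start of every line, not only at the start of the string.
names = re.findall(r"(?m)^[a-zA-Z]+", text)
print("\n".join(names))

# The attempt from the question has neither flag: ^ only matches at the very
# start of the string, and the uppercase "A" is not in [a-z], so nothing matches.
original_attempt = re.findall(r"^[a-z]+", text)
```

Lines that start with "(" produce no match, because the character class must match immediately after the line start.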
Upvotes: 0