user3425894

Reputation: 11

Find all lines where the first word (and only the 1st word) contains a digit

Problem: I have a text with multiple lines. One line can contain multiple sentences. I need a regex that matches only the lines whose first word contains a number of any length (it could be 1 or 2234234).

For example:

I have to admit that I am a n00b at regex. So far I have found the following:

^(.*)?[0-9](.*)?

However, it also matches when the number is in, say, the third word but not the first. I see that ^(.*)? matches anything from the start of the line, including any text up to the third word that contains the number.
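To make the over-matching concrete, here is a minimal Python sketch. The two sample lines are made up for illustration and are not from the question; only the pattern is the original one.

```python
import re

# The pattern from the question: the leading group may consume any text,
# spaces included, before it reaches a digit.
question_pattern = re.compile(r"^(.*)?[0-9](.*)?")

# Hypothetical sample lines (assumptions for illustration).
digit_in_first_word = "abc1 def ghi"
digit_in_third_word = "abc def gh1"

first_hit = question_pattern.match(digit_in_first_word) is not None
third_hit = question_pattern.match(digit_in_third_word) is not None

# Both lines match, although only the first has a digit in its first word.
print(first_hit, third_hit)
```

Because . also matches spaces, the leading group is free to run past the first word, which is why the position of the digit makes no difference.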

To make it more complicated, the first word could also contain special characters (?/&%$"§ or any other).

If I used a character class such as ^[a-zA-Z]? instead of ^(.*)?, everything would be fine as far as I can see, but it would not catch whitespace or special characters, nor more than one character in front of the number.

Upvotes: 1

Views: 1472

Answers (2)

Tim Pietzcker

Reputation: 336158

You can use this:

^\s*\S*[0-9].*

Explanation:

^     # Start of the string (or of each line in multiline mode)
\s*   # Match optional whitespace at the start of the line
\S*   # Match any number of characters except whitespace
[0-9] # Match a digit
.*    # Match the rest of the string

See it live on regex101.com.
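For anyone who wants to try this outside regex101, here is a small Python sketch that filters lines with the pattern above. The sample lines are invented for illustration, and testing each line separately (rather than using multiline mode on the whole text) is my assumption about how the input would be processed.

```python
import re

# Pattern from this answer: optional leading whitespace, then a run of
# non-whitespace characters that must contain a digit.
first_word_digit = re.compile(r"^\s*\S*[0-9].*")

# Hypothetical sample lines (assumptions for illustration).
sample_lines = [
    "r2d2 is a droid",
    "the 2nd word has it",
    "  x9 leading spaces",
    "plain words only",
    "$5% special chars",
]

# Keep only the lines whose first word contains a digit.
matching = [line for line in sample_lines if first_word_digit.match(line)]
for line in matching:
    print(line)
```

Checking one line at a time also sidesteps the fact that \s can match a newline, which could otherwise let a match start on an empty line and continue onto the next one.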

Upvotes: 3

dewd

Reputation: 4429

I think you need to check for whitespace. Try: ^\s*\S*[0-9]+\S*\s

^ can mean "anything except" when it is the first character inside a character class (e.g. [^9] matches any character except the digit 9), or it can mean "match from the beginning of the string", as it does here.

\s* matches optional whitespace, i.e. \s matches a whitespace character and * means zero or more times.

\S* matches optional non-whitespace, i.e. any characters other than whitespace such as newlines, carriage returns, spaces and tabs.

[0-9]+ matches one or more digits, i.e. [0-9] matches a digit and + means one or more times.

\S* - same as \S* above.

\s matches one whitespace character.
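One consequence of the final \s is that a line consisting of a single word with no trailing whitespace will not match. Here is a rough Python sketch of that behaviour. The sample lines are invented, and the relaxed variant with (?:\s|$) is only my suggestion for one possible adjustment, not part of the pattern above.

```python
import re

# Pattern from this answer: the trailing \s demands whitespace after the first word.
strict = re.compile(r"^\s*\S*[0-9]+\S*\s")

# Assumed variant (not from the answer): accept whitespace OR the end of the line.
relaxed = re.compile(r"^\s*\S*[0-9]+\S*(?:\s|$)")

# Hypothetical sample lines (assumptions for illustration).
samples = ["abc1 def", "abc1", "abc def1 ghi"]

strict_hits = [s for s in samples if strict.match(s)]
relaxed_hits = [s for s in samples if relaxed.match(s)]

print(strict_hits)
print(relaxed_hits)
```

Neither pattern matches "abc def1 ghi", since the digit sits in the second word; they differ only on the one-word line "abc1".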

Upvotes: 0
