zoggbot

Reputation: 13

Regex match a hash that has been split over multiple lines

I want to match a hash that has been word wrapped by an author, and received over multiple lines.

Example:

SHA256: AB76235776BC87DBAB76235776BC87DBAB76235776BC87
DBAB76235776BC87DB

My usual regex to match a SHA-256 hash like this is, of course: [0-9A-Fa-f]{64}

But this does not work when the hash is wrapped. I would like to leave the file unmodified while searching for this match. Any ideas on how to match the split hash without removing newlines?

I'd like to have a regex that basically says 'look for 64 sequential hexadecimal values, but allow for one or more newlines in the mix, kthx'

Thanks in advance. C# is the language.

Upvotes: 1

Views: 906

Answers (2)

Alan Moore

Reputation: 75222

Try this:

\b(?:[a-fA-F0-9]\s*){64}\b

It allows any kind of whitespace, not just line separators. If it really has to allow only line separators, you can use this:

\b(?:[a-fA-F0-9][\r\n]*){64}\b

This will also include the line separator following the hash if there is one and the next line starts with a word character, because the final \b can then match after the separator. You can prevent that like this:

\b(?:[a-fA-F0-9][\r\n]*){63}[a-fA-F0-9]\b
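
As a rough sketch of how that last pattern might be used from C# (the class and method names here are made up for illustration, and the sample input is the one from the question with assumed \r\n line endings): the match is taken against the text as read, and line separators are stripped only from the matched value.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitHashDemo
{
    // 63 hex digits, each optionally followed by line separators, then one
    // final hex digit so that no trailing line separator ends up in the match.
    private static readonly Regex SplitSha256 =
        new Regex(@"\b(?:[a-fA-F0-9][\r\n]*){63}[a-fA-F0-9]\b");

    // Hypothetical helper: returns true and the 64-digit hash (line separators
    // removed from the matched value only) if a split hash is found.
    public static bool TryFindHash(string text, out string hash)
    {
        Match m = SplitSha256.Match(text);
        hash = m.Success
            ? m.Value.Replace("\r", "").Replace("\n", "")
            : string.Empty;
        return m.Success;
    }

    public static void Main()
    {
        // Assumed input: the example from the question, wrapped with \r\n.
        string input =
            "SHA256: AB76235776BC87DBAB76235776BC87DBAB76235776BC87\r\n" +
            "DBAB76235776BC87DB\r\n";

        if (TryFindHash(input, out string hash))
        {
            Console.WriteLine(hash);
        }
    }
}
```

The input string is never changed; only the extracted match is cleaned up, so the position information in the Match object still refers to the original text.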

Upvotes: 2

Vogel612

Reputation: 5647

Change your regex to include newline characters:

[A-Fa-f0-9\r\n ]{64,}

You could modify the upper bound to restrict the number of line breaks.
In this case, keep in mind that a line break can be two characters long, depending on the platform (\r\n on Windows, \n on Unix-like systems).

1 linebreak --> 66 chars
2 linebreaks --> 68 chars
Continue as much as you like.

On a side note: parsing a file generally leaves the file itself untouched. All your modifications are made to the variables you read the file into, which is why I do not see the point of keeping the line breaks.
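
A minimal sketch of that idea, assuming the file contents have already been read into a string (the class and method names are invented for this example): line breaks are removed from an in-memory copy, and the asker's original pattern is run against that copy.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InMemoryCopyDemo
{
    // The asker's original pattern, unchanged.
    private static readonly Regex Sha256 = new Regex("[0-9A-Fa-f]{64}");

    // Hypothetical helper: searches a flattened copy of the text and returns
    // the first 64-digit hex run, or an empty string if there is none.
    public static string FindHashInCopy(string fileContents)
    {
        // Strings are immutable in .NET, so Replace returns a new string;
        // fileContents and the file it came from are left as they were.
        string flattened = fileContents.Replace("\r", "").Replace("\n", "");

        Match m = Sha256.Match(flattened);
        return m.Success ? m.Value : string.Empty;
    }

    public static void Main()
    {
        // Assumed input: the example from the question, wrapped with \r\n.
        string contents =
            "SHA256: AB76235776BC87DBAB76235776BC87DBAB76235776BC87\r\n" +
            "DBAB76235776BC87DB\r\n";

        Console.WriteLine(FindHashInCopy(contents));
    }
}
```

One trade-off of flattening everything: a line that ends in hex digits and a following line that starts with hex digits are joined together, so unrelated text can form a false match, and match positions no longer map back to the original text.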

Upvotes: 1
