Reputation: 1003
I want to use a regular expression to find strings in a file where a part that should be numeric contains non-numeric characters.
This would be a good string IDxxxxxx0123456789.
This would be a bad string IDxxxxxx01234?6789.
The file I am grepping has many different lines of text, and I am specifically interested in the ones that contain IDxxxxxx, after which I expect 10 digits. I want to find the lines where those 10 characters are not all digits.
I have this so far,
grep "ID.\{6\}[^0-9]" myFile
This works fine if the first character after the IDxxxxxx is non-numeric. So I extended it as follows:
grep "ID.\{6\}[^0-9]\{1,10\}" myFile
which I hoped would mean IDxxxxxx followed by 1 to 10 non-numeric characters. This again works if the first character is non-numeric, but not if only the second one is.
I think I must be getting close, but not close enough. Can anyone steer me a little on this one, please? I shall keep at this, and if I find an answer before anyone else does, I will post what I find.
Thanks in anticipation
(Update - I want to grep out all the bad strings)
Upvotes: 2
Views: 6946
Reputation: 11890
You're writing [^0-9], but a ^ at the start of a bracket expression means "any character except the ones that follow".
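So you have to change it like this (use grep -E so the braces need no backslashes):
"ID.{6}[0-9]{10}\b"
Your pattern matches only when the first character after IDxxxxxx is non-numeric, because the {1,10} run of non-numeric characters has to start at that position.
Moreover, you need to add \b. Otherwise the pattern also accepts an ID followed by more than 10 digits. With \b, you're saying that after the digits there must be a space, comma, or something else that ends the word, not another letter or digit. Note that the count has to be exactly {10}: with {1,10}, the pattern would still match your second string, because 01234 followed by ? already ends at a word boundary. This pattern matches the good strings, so invert the match with grep -v to list the other lines.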
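To see what the \b anchor changes, here is a minimal sketch. It assumes GNU grep (where \b is available as an extension), uses -E so the braces need no backslashes, and requires exactly 10 digits as the question describes. The 11-digit line is a hypothetical case that does not appear in the question.

```shell
# Sample lines from the question, plus a hypothetical 11-digit case.
sample=$(printf '%s\n' \
  'This would be a good string IDxxxxxx0123456789' \
  'This would be a bad string IDxxxxxx01234?6789' \
  'Hypothetical extra digit IDxxxxxx01234567890')

# Without \b the 11-digit line is accepted, because its first 10 digits match.
without_boundary=$(printf '%s\n' "$sample" | grep -E -c 'ID.{6}[0-9]{10}')

# With \b the run of 10 digits must end at a word boundary.
with_boundary=$(printf '%s\n' "$sample" | grep -E -c 'ID.{6}[0-9]{10}\b')

printf 'without boundary: %s, with boundary: %s\n' "$without_boundary" "$with_boundary"
```

With these three lines, the count should drop from 2 to 1 once \b is added, since only the good string has exactly 10 digits before the end of the word.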
Upvotes: 0
Reputation: 21972
Here are your strings:
$> cat ./text
This would be a good string IDxxxxxx0123456789
This would be a bad string IDxxxxxx01234?6789
The idea is to use the --invert-match flag.
$> grep --perl-regexp --invert-match "ID.{6}[0-9]{10}" ./text
This would be a bad string IDxxxxxx01234?6789
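One caveat with inverting the match: every line that has no ID at all is printed too. If the file mixes many kinds of lines, as the question says, an alternative is to match the bad IDs directly. The following is only a sketch under assumptions: the unrelated third line is hypothetical, and an ID with fewer than 10 digits at the end of a line is treated as bad.

```shell
# Sample lines from the question, plus a hypothetical line with no ID at all.
sample=$(printf '%s\n' \
  'This would be a good string IDxxxxxx0123456789' \
  'This would be a bad string IDxxxxxx01234?6789' \
  'An unrelated line without an identifier')

# Inverting the match also reports the unrelated line.
inverted=$(printf '%s\n' "$sample" | grep -E -v 'ID.{6}[0-9]{10}')

# Match bad IDs directly: at most 9 digits, then a non-digit or the end of the line.
bad_only=$(printf '%s\n' "$sample" | grep -E 'ID.{6}[0-9]{0,9}([^0-9]|$)')

printf '%s\n' "$bad_only"
```

Here bad_only holds just the IDxxxxxx01234?6789 line, while inverted also contains the unrelated line.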
Upvotes: 0