dig_123

Reputation: 2358

egrep regex operation not working as expected

I have a file with content like this:

[TEXT_ID=2]
[TEXT_REV=3]
[NO_OF_BYTES=16]
0010002$%!003000040000000010100
[TXT]
FF FF
[TXT_ID=2$@]
[TXT_REV=3]
[NO_OF_BYTES=17]
0010002003000040000000010100
 [TXT]
 FF FF
$%^&

I want to treat any character other than 0-9, a-z, A-Z, space, Enter (newline) and tab as a junk character.

However, an =, [ or ] must count as a valid character when it appears as part of a [CONTEXT=val] line. If it appears on any other line, it is a junk character.

For example, if an =, [ or ] appears on the 9th line of my file, it is junk:

0010002003000040000000010100=[

So I'm using the command below:

egrep -v "^[' '0-9a-zA-Z\t\n\v\f\r]*$|^[ ]*\[[A-Z].*\_*[A-Z]*=*[0-9]*\][ ]*$" SSPR.240

which gives this output:

0010002$%!003000040000000010100
$%^&

However, it does not report the line:

[TXT_ID=2$@]

How can I modify my egrep statement?

Upvotes: 0

Views: 52

Answers (1)

Casimir et Hippolyte

Reputation: 89547

You can try something like:

 egrep -v '^([[:space:]]*\[[[:alnum:]_]+=?[[:alnum:]_]*][[:space:]]*|[[:alnum:][:space:]_]*)$' file
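
To see why the original command misses [TXT_ID=2$@], note that the .* in its bracket alternative matches any characters, so $@ between the brackets is still accepted as a valid line. The pattern above instead restricts what may appear between [ and ] to letters, digits and underscores. The sketch below checks this against the sample from the question; the temporary file and the variable names (sample, greedy, valid, out) are only illustrative, and grep -E is used on the assumption that it behaves like egrep, which recent GNU grep releases flag as obsolescent.

```shell
#!/bin/sh
# Recreate the sample from the question in a scratch file (path chosen by mktemp).
sample=$(mktemp)
cat > "$sample" <<'EOF'
[TEXT_ID=2]
[TEXT_REV=3]
[NO_OF_BYTES=16]
0010002$%!003000040000000010100
[TXT]
FF FF
[TXT_ID=2$@]
[TXT_REV=3]
[NO_OF_BYTES=17]
0010002003000040000000010100
 [TXT]
 FF FF
$%^&
EOF

# Reduced form of the question's bracket alternative: ".*" matches any
# characters, so "$@" inside the brackets still counts as a valid line.
greedy=$(printf '%s\n' '[TXT_ID=2$@]' | grep -E -c '^\[[A-Z].*\]$' || true)

# Pattern from the answer: only letters, digits and "_" may appear between
# "[" and "]", with at most one "=". Lines fitting neither alternative are junk.
valid='^([[:space:]]*\[[[:alnum:]_]+=?[[:alnum:]_]*][[:space:]]*|[[:alnum:][:space:]_]*)$'
out=$(grep -E -v "$valid" "$sample")

echo "lines matched by the reduced .* pattern: $greedy"
printf '%s\n' "$out"
rm -f "$sample"
```

On this sample the junk lines reported should be 0010002$%!003000040000000010100, [TXT_ID=2$@] and $%^&. Note that the second alternative also accepts _ on ordinary lines, which the rule in the question does not list as valid; removing _ from that character class would be a stricter variant if that matters.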

Upvotes: 1
