Problems with multipliers using grep

Question

I have the following file

1:10177 rs367896724 A AC
1:10352 rs555500075 T TA
1:10616 rs376342519 CCGCCGTTGCAAAGGCGCGCCG C
1:11012 rs544419019 C G
1:11063 rs561109771 T G
1:13110 rs540538026 G A
1:13116 rs62635286 T G
1:13118 rs62028691 A G
1:13273 rs531730856 G C
1:13284 rs548333521 GT A

The last two columns contain only the characters A, C, G, and T. I want to grep all the lines where each of the last two columns is exactly one letter.

Expected output:

1:11012 rs544419019 C G
1:11063 rs561109771 T G
1:13110 rs540538026 G A
1:13116 rs62635286 T G
1:13118 rs62028691 A G
1:13273 rs531730856 G C

I've tried the following commands, but none of them gives the expected output:

grep -F '[ACTG]?\s[ACTG]?$' file | head

grep '[ACTG]?\s[ACTG]?$' file | head

grep -E '.?\s.?$' file

With the last command, I got the following:

1:10616 rs376342519 CCGCCGTTGCAAAGGCGCGCCG C
1:11012 rs544419019 C G
1:11063 rs561109771 T G
1:13110 rs540538026 G A
1:13116 rs62635286 T G
1:13118 rs62028691 A G
1:13273 rs531730856 G C
1:13284 rs548333521 GT A

Thanks for the help!

Jonathon K · Accepted Answer

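If you want exactly one character in each of the last two columns, put a whitespace character in front of each one. Without it, the pattern can match just the last letter of a longer field, which is why the line with CCGCCGTTGCAAAGGCGCGCCG matched your third attempt. From your description neither character should be optional, so drop the ? as well. Your first two attempts return nothing because -F treats the whole pattern as a literal string, and without -E grep treats ? as an ordinary character rather than a multiplier.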

grep -E '\s.\s.$' file

Or

grep -E '(\s[ACTG]){2}$' file

Either should work.
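As a rough sketch of how to check this against the sample from the question: the script below recreates the ten input lines in a temporary file and runs a variant of the second pattern. It assumes you may want to avoid \s, which is not part of POSIX regular expressions, so it uses [[:space:]] instead. The variable names (sample, grep_out, awk_out) are only illustrative, and the awk line is an alternative I am adding for comparison, not something from the question.

```shell
# Recreate the sample data from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
1:10177 rs367896724 A AC
1:10352 rs555500075 T TA
1:10616 rs376342519 CCGCCGTTGCAAAGGCGCGCCG C
1:11012 rs544419019 C G
1:11063 rs561109771 T G
1:13110 rs540538026 G A
1:13116 rs62635286 T G
1:13118 rs62028691 A G
1:13273 rs531730856 G C
1:13284 rs548333521 GT A
EOF

# Two repetitions of "whitespace followed by one of A, C, G, T", anchored
# at the end of the line. [[:space:]] is the POSIX spelling of \s.
grep_out=$(grep -E '([[:space:]][ACGT]){2}$' "$sample")

# Alternative (assumes whitespace-separated fields): test columns 3 and 4
# directly instead of matching on the end of the line.
awk_out=$(awk '$3 ~ /^[ACGT]$/ && $4 ~ /^[ACGT]$/' "$sample")

printf '%s\n' "$grep_out"

rm -f "$sample"
```

The awk form may be easier to adapt if the file later gains extra columns, since it does not depend on the two letters being at the end of the line.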
