user3446084

Reputation: 149

How to extract rows from a big table based on a list file of specific names in Linux

I have a really big data table (DataTable.txt); a snapshot is shown below:

SNPname chr position sample1 sample2 sample3 sample4 ....sample2000
rs1 1 1000 A A B B ..... A
rs2 2 1500 B A B A ..... B
rs3 3 1503 B B A A ..... A
.
.
.
.
rs99999 22 999999 A A A ...... B

I also have a list of SNPnames that I want to include in my output table (SNPnames not in this list should be excluded). The list (list.txt) looks like this:

rs4560
rs4780
rs6
rs798
rs2634
rs987
rs1839
rs3948
rs2423
rs232

How can I produce a new output table that contains only the SNPnames listed in the list file?

Please advise, thank you. :)

Upvotes: 2

Views: 2455

Answers (2)

Kent

Reputation: 195079

Give this a try:

grep -Fwf list.txt DataTable.txt
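One thing the command above does not do is keep the header row (SNPname chr position ...), since no pattern in the list matches it. A minimal sketch of one way to preserve it, assuming the header is always the first line of the table; the temporary directory, the tiny sample files, and the output name filtered.txt are made up for illustration:

```shell
# Hypothetical miniature copies of the two files, created in a scratch
# directory so that no real data is overwritten.
demo=$(mktemp -d)
printf '%s\n' 'SNPname chr position sample1 sample2' \
              'rs1 1 1000 A A' \
              'rs6 3 1503 B B' \
              'rs60 4 1600 A B' > "$demo/DataTable.txt"
printf '%s\n' 'rs6' 'rs798' > "$demo/list.txt"

# Print the header line first, then filter only the data rows.
{
  head -n 1 "$demo/DataTable.txt"
  tail -n +2 "$demo/DataTable.txt" | grep -Fwf "$demo/list.txt"
} > "$demo/filtered.txt"

cat "$demo/filtered.txt"
```

With this sample data the result contains the header plus the rs6 row; rs60 is left out because -w only accepts whole-word matches.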

Upvotes: 1

fedorqui

Reputation: 289775

For example, you can use:

grep -wFf list.txt DataTable.txt
  • -w matches whole words only, so rs6 does not match rs60.
  • -f reads the patterns from the file list.txt, one per line.
  • -F treats the patterns as fixed strings, not as regular expressions.

Based on your sample input, and changing rs3 to rs6 to have a match, this is what I get:

$ grep -wFf list.txt DataTable.txt
rs6 3 1503 B B A A ..... A
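Note that grep -w looks for the name anywhere on the line and treats punctuation as a word boundary, so a hypothetical ID such as rs6.2 would also be selected by the pattern rs6. If that could matter for your data, an exact comparison against the first column is an alternative. This is a sketch using awk rather than grep; the sample files and the output name filtered.txt are invented for illustration, and it assumes whitespace-separated columns with the SNP name in column 1:

```shell
# Hypothetical miniature copies of the two files, created in a scratch
# directory so that no real data is overwritten.
demo=$(mktemp -d)
printf '%s\n' 'SNPname chr position sample1 sample2' \
              'rs6 3 1503 B B' \
              'rs6.2 5 1700 A A' \
              'rs60 4 1600 A B' > "$demo/DataTable.txt"
printf '%s\n' 'rs6' 'rs798' > "$demo/list.txt"

# First file (NR==FNR): remember every name from the list.
# Second file: print a row only if its first field is exactly one of them.
awk 'NR==FNR { keep[$1]; next } $1 in keep' \
    "$demo/list.txt" "$demo/DataTable.txt" > "$demo/filtered.txt"

cat "$demo/filtered.txt"
```

Here only the rs6 row is printed. The list is held in memory as an array, and the big table is read once, line by line.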

Upvotes: 3
