Reputation: 627
Hopefully I'm going to make sense here...
I have a huge file where each line represents data for a different individual. I want to grep (or extract) out the lines for certain individuals, but I don't want to grep each individual out separately and then append it all together at the end. I was curious whether there's a loop I can set up by providing a text file with the IDs (i.e. ID001, ID002, ... ID100), or some other variable that is unique to each individual. I'm fairly new to programming, so I'm not sure what I should be googling to find the answer - but is this possible in shell?
Apologies for what might be a simple question.
Thanks!
EDIT 1: I'm adding a little more info here: the formatting might differ slightly, but essentially it's a genetics file with the following format:
FAM001 ID001 A A T T TC T A…… A G
FAM001 ID002 A A T T C C A G…… T C
FAM004 ID003 A A T G T G A A…… A G
.
.
FAM100 ID100 G A C T C G T G…… T G
Is it possible to set up a loop, say, similar to (or including) this:
for f in $( cat ~/FAMID.txt )
With the FAMID.txt as:
FAM001
FAM050
FAM087
to be able to run a certain analysis on the individuals with a given FAMID, running the program only on the families in the list provided?
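For what it's worth, the loop idea above can be sketched like this. The file names (data.txt, FAMID.txt), the tiny sample data, and the analysis command are placeholders; a `while read` loop is also safer here than `for f in $( cat ... )`, since it reads one ID per line without word-splitting surprises.

```shell
#!/bin/sh
# Placeholder inputs standing in for the real genetics file and ID list.
printf 'FAM001 ID001 A A T T\nFAM004 ID003 A A T G\nFAM100 ID100 G A C T\n' > data.txt
printf 'FAM001\nFAM100\n' > FAMID.txt

# Read one family ID per line from FAMID.txt.
while IFS= read -r famid; do
    # Extract that family's lines into their own file (e.g. FAM001.txt)...
    grep -w "^$famid" data.txt > "$famid.txt"
    # ...then run the analysis on just that subset (hypothetical command):
    # your_analysis_program "$famid.txt"
done < FAMID.txt
```

This produces one per-family file for each ID in the list, which you can then feed to the analysis program one at a time.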
Hope that makes sense.
Upvotes: 2
Views: 3028
Reputation: 247042
This is all you need:
grep -wFf FAMID.txt data.txt
where:
-f FAMID.txt
tells grep to read the patterns from the file
-F
tells grep that the patterns are plain strings, so it can pick an appropriate matching engine
-w
tells grep to only match patterns that form a whole word (so if you accidentally get "FAM" in the pattern file, you don't match every line of the data file)
Upvotes: 1
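To see it in action on the sample rows from the question (the file names data.txt and FAMID.txt are assumptions):

```shell
#!/bin/sh
# Recreate a small version of the data file and the ID list.
printf 'FAM001 ID001 A A T T\nFAM004 ID003 A A T G\nFAM100 ID100 G A C T\n' > data.txt
printf 'FAM001\nFAM100\n' > FAMID.txt

# Print every line whose family ID appears (as a whole word) in FAMID.txt.
grep -wFf FAMID.txt data.txt
```

This prints the FAM001 and FAM100 lines and skips FAM004, since FAM004 is not in the pattern file.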