Reputation: 251
I am having problems trying to create a table containing a master list of names that have been matched and counted in two separate groups.
The Input_list.txt contains a master list of names and looks like this:
-5S_rRNA
-7SK
-ABCA8
-AC002480.4
-AC002978.1
-RP11-129B22.2
These names have been grep'd and counted in two separate data groups, group1_data.txt and group2_data.txt, which look like this:
group1_data.txt
-5S_rRNA 20
-7SK 25
-AC002480.4 1
-AC002978.1 2
group2_data.txt
-5S_rRNA 1
-ABCA8 1
I would like to create a table that lists the names from the master Input_list.txt alongside their matched counts from the two data.txt files. If a name has no match in a group, I would like a value of 0 returned for that group. The result should look like this:
Input group1 group2
5S_rRNA 20 1
7SK 25 0
ABCA8 0 1
AC002480.4 1 0
AC002978.1 2 0
The number of matched names is not equal between Input_list.txt and the two data.txt files.
I've tried sort but I'm really stuck. Any suggestions would be great!
Upvotes: 0
Views: 20
Reputation: 98028
Using join (all three files must already be sorted on the name field):
join -e 0 -a 1 -o '1.1 2.2' Input_list.txt group1_data.txt | \
join -a 1 -e 0 -o '1.1 1.2 2.2' - group2_data.txt | \
sed '/ 0 0$/d'
Prints:
-5S_rRNA 20 1
-7SK 25 0
-ABCA8 0 1
-AC002480.4 1 0
-AC002978.1 2 0
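For anyone who wants to try this end to end, here is a minimal sketch of the same join pipeline as a self-contained script. It makes a few assumptions that are not in the question: the sample files are recreated in a temporary directory, the names are kept exactly as shown in the listings (leading dash included), the `.sorted` file names are made up for illustration, and `LC_ALL=C` is set so that sort and join agree on ordering. It also prints the header row the question asked for.

```shell
#!/bin/sh
# Sketch only: rebuilds the sample data from the question in a temp dir.
# LC_ALL=C is an assumption so that sort and join use the same byte ordering.
export LC_ALL=C
tmp=$(mktemp -d)

cat > "$tmp/Input_list.txt" <<'EOF'
-5S_rRNA
-7SK
-ABCA8
-AC002480.4
-AC002978.1
-RP11-129B22.2
EOF

cat > "$tmp/group1_data.txt" <<'EOF'
-5S_rRNA 20
-7SK 25
-AC002480.4 1
-AC002978.1 2
EOF

cat > "$tmp/group2_data.txt" <<'EOF'
-5S_rRNA 1
-ABCA8 1
EOF

# join needs its inputs sorted on the join field, so sort each file first.
# The ".sorted" names are hypothetical, chosen for this example.
for f in Input_list group1_data group2_data; do
  sort "$tmp/$f.txt" > "$tmp/$f.sorted"
done

result=$(
  # Header row requested in the question.
  echo "Input group1 group2"
  # -a 1 keeps unmatched names from the first input; -e 0 fills the gap with 0.
  join -a 1 -e 0 -o '1.1 2.2' "$tmp/Input_list.sorted" "$tmp/group1_data.sorted" |
    join -a 1 -e 0 -o '1.1 1.2 2.2' - "$tmp/group2_data.sorted" |
    # Drop names that matched in neither group.
    sed '/ 0 0$/d'
)
printf '%s\n' "$result"

rm -rf "$tmp"
```

The final sed step is what removes -RP11-129B22.2, which has no count in either group; leave it out if every name from Input_list.txt should appear in the table.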
Upvotes: 1