Reputation: 93
I have a (very) basic understanding of AWK. I have tried a few ways of doing this, but all of them print far more lines than I want.
I have 10 lines in file.1:
chr10 234567
chr20 123456
...
chrX 62312
I want to convert these to uppercase and match them against the first 2 columns of file.2, so that the first line below matches the second line above. I don't want to get the second line below, which matches the third line above for position but not chr, and I don't want the first line below to match the first line above.
CHR20 123456 ... 234567
CHR28 234567 ... 62312
I have:
$ cat file.1 | tr '[:lower:]' '[:upper:]' | <grep? awk?>
and would love to know how to proceed. I had used a simple grep - previously, but the second column of file.1 matches more in the searched file, so I get hundreds of lines returned. I want to match only on the first 2 columns (they correspond to the first 2 columns in file.2).
Hope that's clear enough for you. I look forward to your answers =)
Upvotes: 0
Views: 6549
Reputation: 171263
If the files are sorted by the first column you can do:
join -i file.1 file.2 | awk '$3==$2{ $3=""; print}'
If they're not sorted, sort them first.
The -i flag says to ignore case.
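As a rough sketch of the sort-then-join steps, assuming small sample files modelled on the question (the `foo`/`bar` values stand in for the elided columns, and the temporary directory and `.sorted` file names are made up for illustration). Sorting with `-f` folds case so the order agrees with `join -i`, and `LC_ALL=C` is an assumption added here to keep both tools on plain byte ordering:

```shell
# Hypothetical sample data; "foo" and "bar" are placeholders for the elided columns.
export LC_ALL=C
tmp=$(mktemp -d)
printf 'chr10 234567\nchr20 123456\nchrX 62312\n' > "$tmp/file.1"
printf 'CHR20 123456 foo 234567\nCHR28 234567 bar 62312\n' > "$tmp/file.2"

# Sort both files on the first column, ignoring case, so join -i sees them in order.
sort -f -k1,1 "$tmp/file.1" > "$tmp/file.1.sorted"
sort -f -k1,1 "$tmp/file.2" > "$tmp/file.2.sorted"

# After the join, $2 is the position from file.1 and $3 is the position from file.2.
joined=$(join -i "$tmp/file.1.sorted" "$tmp/file.2.sorted" | awk '$3==$2{ $3=""; print}')
printf '%s\n' "$joined"
```

Only the CHR20 line with position 123456 should survive. Blanking `$3` leaves two adjacent spaces in the printed line, since awk rebuilds the record with the field still present but empty.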
That won't work if there are multiple lines with the same field in the first column. To make that work, you would need something more complicated.
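One possible alternative, which is a different technique from the join above rather than an extension of it, is a single awk pass that remembers every (chromosome, position) pair from file.1 and then prints the lines of file.2 whose first two columns form one of those pairs. It needs no sorting and tolerates repeated values in the first column. The sample data, the `wanted` array name, and the extra `baz` line are assumptions for illustration:

```shell
# Hypothetical sample data; file.2 repeats CHR20 to show duplicate keys are handled.
tmp=$(mktemp -d)
printf 'chr10 234567\nchr20 123456\nchrX 62312\n' > "$tmp/file.1"
printf 'CHR20 123456 foo 234567\nCHR28 234567 bar 62312\nCHR20 123456 baz 111\n' > "$tmp/file.2"

matches=$(awk '
    # First file: record each uppercased chromosome together with its position.
    NR == FNR { wanted[toupper($1), $2]; next }
    # Second file: print lines whose first two columns were recorded.
    (toupper($1), $2) in wanted
' "$tmp/file.1" "$tmp/file.2")
printf '%s\n' "$matches"
```

Because the lookup key is built from columns 1 and 2 only, a position that appears in some later column of file.2 cannot cause a match.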
Upvotes: 4