Banana Mannock

Reputation: 37

How to find lines using patterns in a file in UNIX

I am trying to use a .txt file with around 5000 patterns (one per line) to search through another file of 18000 lines for any matches. So far I've tried every form of grep and awk I can find on the internet and none of them return any matches, so I am completely stumped.

Here's some text from each file.

Pattern.txt

rs2622590
rs925489
rs2798334
rs6801957
rs6801957
rs13137008
rs3807989
rs10850409
rs2798269
rs549182

There are no extra spaces or anything.

File.txt

snpid hg18chr bp a1 a2 zscore pval CEUmaf
rs3131972       1       742584  A       G       0.289   0.7726  .
rs3131969       1       744045  A       G       0.393   0.6946  .
rs3131967       1       744197  T       C       0.443   0.658   .
rs1048488       1       750775  T       C       -0.289  0.7726  .
rs12562034      1       758311  A       G       -1.552  0.1207  0.09167
rs4040617       1       769185  A       G       -0.414  0.6786  0.875
rs4970383       1       828418  A       C       0.214   0.8303  .
rs4475691       1       836671  T       C       -0.604  0.5461  .
rs1806509       1       843817  A       C       -0.262  0.7933  .

The file.txt was downloaded directly from a med directory.

I'm pretty new to UNIX so any help would be amazing!

Sorry edit: I have definitely tried every single thing you guys are recommending and the result is blank. Am I maybe missing a syntax issue or something in my text files?

P.P.S. I know there are matches, because grepping for individual patterns works. I'll move this question to unix.stackexchange. Thanks for your answers, guys, I'll try them all out.

Issue solved: my files had DOS line endings (a carriage return before each newline). I didn't know about this before, so thank you to everyone who answered. For future users who have this issue, here is the solution that worked:

dos2unix *

awk 'NR==FNR{p[$0];next} $1 in p' Patterns.txt File.txt > Output.txt
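If dos2unix is not installed, the carriage returns can probably also be stripped inside the awk command itself. The following is only a sketch, assuming both files may have CRLF line endings; the sample files and the scratch directory are made up for illustration:

```shell
# Build two tiny sample files with DOS (CRLF) line endings in a scratch directory.
tmp=$(mktemp -d)
printf 'rs3131969\r\nrs4040617\r\n' > "$tmp/Patterns.txt"
printf 'snpid hg18chr bp\r\nrs3131972 1 742584\r\nrs3131969 1 744045\r\nrs4040617 1 769185\r\n' > "$tmp/File.txt"

# Original command: each stored pattern still ends in a carriage return,
# so it never equals the first field of File.txt.
before=$(awk 'NR==FNR{p[$0];next} $1 in p' "$tmp/Patterns.txt" "$tmp/File.txt" | wc -l | tr -d ' ')

# Strip a trailing carriage return from every line of both files, then match as before.
after=$(awk '{sub(/\r$/, "")} NR==FNR{p[$0];next} $1 in p' "$tmp/Patterns.txt" "$tmp/File.txt")

echo "matches before stripping: $before"
echo "matches after stripping:"
echo "$after"

rm -rf "$tmp"
```

Unlike dos2unix, this leaves the files on disk unchanged and only cleans the lines as awk reads them.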

Upvotes: 0

Views: 2110

Answers (3)

anubhava

Reputation: 784998

You can use grep -Fw here:

grep -Fw -f Pattern.txt File.txt

Options used are:

  • -F - Fixed string search, to treat the patterns as literal text rather than regexes
  • -w - Match full words only
  • -f file - Read patterns from a file, one per line
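To see why -w matters for IDs like these, here is a small sketch. The two IDs rs123 and rs1234 and the sample files are made up for illustration; they are not from the question's data:

```shell
# Two hypothetical IDs where one is a prefix of the other.
tmp=$(mktemp -d)
printf 'rs123\n' > "$tmp/Pattern.txt"
printf 'rs123 1 100\nrs1234 1 200\n' > "$tmp/File.txt"

# -F alone matches substrings, so the rs1234 line is reported too.
loose=$(grep -F -f "$tmp/Pattern.txt" "$tmp/File.txt")

# Adding -w requires the match to be a whole word, so only the rs123 line is reported.
strict=$(grep -Fw -f "$tmp/Pattern.txt" "$tmp/File.txt")

printf 'without -w:\n%s\n\nwith -w:\n%s\n' "$loose" "$strict"

rm -rf "$tmp"
```

Note that -w still matches the word anywhere on the line, not only in the first column.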

Upvotes: 3

Ed Morton

Reputation: 203254

idk if it's what you want or not, but this will print every line from File.txt whose first field equals a string from Patterns.txt:

awk 'NR==FNR{p[$0];next} $1 in p' Patterns.txt File.txt

If that is not what you want, tell us what you do want. If it is what you want but it doesn't produce the output you expect, then one or both of your files contain control characters courtesy of being created in Windows, so run dos2unix or similar on them both first.
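One way to check for those control characters before converting anything is sketched below. It assumes the culprit is a carriage return, and the sample file is made up for illustration:

```shell
# A sample pattern file written with DOS (CRLF) line endings.
tmp=$(mktemp -d)
printf 'rs2622590\r\nrs925489\r\n' > "$tmp/Patterns.txt"

# Hold a literal carriage return in a variable (works in any POSIX shell).
cr=$(printf '\r')

# Count the lines that contain a carriage return; anything above 0 suggests DOS line endings.
crlf_lines=$(grep -c "$cr" "$tmp/Patterns.txt")
echo "lines with a carriage return: $crlf_lines"

# Dump the first line byte by byte; a DOS-formatted line ends in \r \n rather than just \n.
head -n 1 "$tmp/Patterns.txt" | od -c

rm -rf "$tmp"
```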

Upvotes: 0

ACCurrent

Reputation: 385

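Use a shell script that reads the patterns from a file and passes them to grep -F (the modern spelling of fgrep).

#!/bin/bash

# $1: the file holding the patterns, one per line
FILENAME=$1

# Pass the patterns to grep on stdin (-f -) and search File.txt for them as fixed strings
grep -F -f - File.txt < "$FILENAME"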

Upvotes: -1
