jake reading

Reputation: 79

grep -Ff producing invalid output

I'm running this command:

 grep -Ff list.txt C:/data/*.txt > found.txt

but it keeps producing invalid output: the lines it returns don't contain any of the emails I supplied.

list.txt contains -

[email protected]
[email protected]
[email protected]
[email protected]
[email protected]

and so on, with one email to match per line.

The files being searched contain -

user1:phonenumber1:[email protected]:last-active:recent
user2:phonennumber2:[email protected]:last-active:inactive
user3:phonenumber3:[email protected]:last-active:never

then another may contain -

blublublu         [email protected]         phonenumber         subscribed
nanananana        [email protected]      phonenumber         unsubscribed
useruser          [email protected]       phonenumber      pending

So what I'm trying to do is give grep a list of emails (strings) in list.txt, have it search the files in the given directory for each string, and output every whole line that contains a match.

The expected output in this case would be -

user1:phonenumber1:[email protected]:last-active:recent
user2:phonennumber2:[email protected]:last-active:inactive
blublublu         [email protected]         phonenumber         subscribed
nanananana        [email protected]      phonenumber         unsubscribed

It should not output the other two lines -

 user3:phonenumber3:[email protected]:last-active:never
 useruser          [email protected]       phonenumber      pending

because neither of those lines contains a string from list.txt.

Upvotes: -4

Views: 598

Answers (2)

Thomas Smyth

Reputation: 5644

I think your list.txt may contain blank lines, which cause grep to match every line in the files matched by C:/data/*.txt. To fix this, either delete the empty lines manually or run sed -i '/^$/d' list.txt, where the -i flag edits the file in place.

The issue may also be caused by DOS carriage returns. Try running cat -v list.txt and check whether the lines end with ^M:

[email protected]^M
[email protected]^M

If so, you will need to fix the file using either dos2unix or tr -d '\r' < list.txt > output.txt, and then pass the cleaned file to grep -Ff.
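The two fixes above can be combined into one cleanup pass. This is only a sketch under assumptions: the addresses, the temporary directory, and the file name list.clean.txt are made up for illustration and are not from the question.

```shell
# Hypothetical sample data in a scratch directory (not the asker's real files).
workdir=$(mktemp -d)
# A list file with DOS line endings and one blank line in the middle.
printf 'alice@example.com\r\n\r\nbob@example.com\r\n' > "$workdir/list.txt"
printf 'user1:111:alice@example.com:recent\nuser3:333:carol@example.com:never\n' > "$workdir/data.txt"

# Strip carriage returns first, then drop the lines that are now empty.
tr -d '\r' < "$workdir/list.txt" | sed '/^$/d' > "$workdir/list.clean.txt"

# Search with the cleaned list; only lines containing a listed address remain.
grep -Ff "$workdir/list.clean.txt" "$workdir/data.txt" > "$workdir/found.txt"
cat "$workdir/found.txt"
```

The order matters here: a line holding only a carriage return is not empty until tr has removed the \r, so the sed step has to come second.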

Upvotes: 0

choroba

Reputation: 241918

The file list.txt probably contains empty lines, or lines consisting only of one of the separators. When I added a line containing just : to list.txt, all the lines from the first sample started to match. Similarly, adding a line with a single space made all the lines from the second sample match. Adding @ causes the same symptoms.

Try running grep -oFf ... (if your grep supports -o) to see the exact matching parts. If there are empty lines in list.txt, the number of matches will be lower than without -o, since empty matches print nothing. Search the output of -o for extremely short strings to spot suspicious patterns. You can also examine the shortest lines in list.txt:
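To reproduce the effect on a small scale, something like the following should work with GNU grep. The data and file names are invented for the demonstration; only the -c, -F and -f options are relied on.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf 'user1:111:alice@example.com\nuser2:222:bob@example.com\nuser3:333:carol@example.com\n' > "$workdir/data.txt"

# A clean list: only the line with the listed address should match.
printf 'alice@example.com\n' > "$workdir/clean.txt"
clean_count=$(grep -cFf "$workdir/clean.txt" "$workdir/data.txt")

# A stray separator on its own line matches every line that contains ':'.
printf 'alice@example.com\n:\n' > "$workdir/stray.txt"
stray_count=$(grep -cFf "$workdir/stray.txt" "$workdir/data.txt")

# An empty line is an empty pattern, which matches every line.
printf 'alice@example.com\n\n' > "$workdir/empty.txt"
empty_count=$(grep -cFf "$workdir/empty.txt" "$workdir/data.txt")

echo "clean=$clean_count stray=$stray_count empty=$empty_count"
```

If the second and third counts equal the total number of lines in the data file, the list is the problem rather than grep.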

while IFS= read -r line ; do printf '%s %s\n' "${#line}" "$line" ; done < list.txt | sort -nk1,1
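As a rough illustration of the -o check, the matched parts can be tallied so that an over-eager pattern floats to the top. The sample files below are hypothetical, and -h is added only to suppress file name prefixes when several files are searched.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf 'user1:111:alice@example.com\nuser2:222:bob@example.com\nuser3:333:carol@example.com\n' > "$workdir/data.txt"
# The list contains one real address and one stray ':' line.
printf 'alice@example.com\n:\n' > "$workdir/list.txt"

# Print only the matched parts, count how often each occurs, most frequent first.
tally=$(grep -ohFf "$workdir/list.txt" "$workdir/data.txt" | sort | uniq -c | sort -rn)
echo "$tally"

# The most frequent match, normalised to "count pattern".
top=$(echo "$tally" | head -n 1 | awk '{print $1, $2}')
echo "most frequent match: $top"
```

A very short string with a very high count at the top of the tally is a good candidate for the line to remove from list.txt.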

Upvotes: 0
