Search one file's lines for a partial match in another file

Question

I have 2 files, the first one:

values.txt

test@
test1@
test3@
test4@
test6@
test7@
test8@
test9@
test10@

data.csv

"username","email"
"user","test@gmail.com"
"user1","test1@gmail.com"
"user2","test3@gmail.com"
"user4","test4@gmail.com"
"user456","loka@gmail.com"
"user789","lopa@gmail.com"
"user5","test7@gmail.com"
"user","xpos@gmail.com"
"user5","test9@gmail.com"
"user","xpx@gmail.com"

I want the output to be like this:

"user","test@gmail.com"
"user1","test1@gmail.com"
"user2","test3@gmail.com"
"user4","test4@gmail.com"
"user5","test7@gmail.com"
"user5","test9@gmail.com"

What I have been able to do so far:

$ awk -F, -v q='"' 'NR==FNR{a[q $0 q]; next} 
                    $2 in a' values.txt data.csv > test1.csv

This works only when values.txt contains the full email (e.g. test9@gmail.com) rather than just test9@. In that case it writes a new file, test1.csv, containing:

"user5","test9@gmail.com"
 ....
 ....

I couldn't figure out how to do this with a partial (substring) match in awk.

anubhava · Accepted Answer

You may use this awk:

awk -F, 'NR==FNR {a[$1]; next} {ea = $2; gsub(/^"|@.*$/, "", ea)} ea "@" in a' values.txt data.csv

"user","test@gmail.com"
"user1","test1@gmail.com"
"user2","test3@gmail.com"
"user4","test4@gmail.com"
"user5","test7@gmail.com"
"user5","test9@gmail.com"

A more readable version:

awk -F, 'NR == FNR {
   a[$1]                   # from values.txt store each value in array a
   next
}
{
   ea = $2                 # copy $2 into ea (email address)
   gsub(/^"|@.*$/, "", ea) # strip leading " and everything from @ onward
}
ea "@" in a                # check if ea + "@" exists in array a
' values.txt data.csv
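
If the entries in values.txt are not guaranteed to end in @, or may carry stray trailing whitespace, a plain prefix test with index() is one possible alternative. This is only a sketch, not part of the command above: the sample_* file names, the prefixes array, and the trimmed-down sample data are made up for illustration, and it assumes the email is always the second comma-separated field with no commas inside the quoted values.

```shell
# Hypothetical sample files, a trimmed-down copy of the data in the question.
printf '%s\n' 'test@' 'test1@' 'test9@' > sample_values.txt
printf '%s\n' '"username","email"' \
              '"user","test@gmail.com"' \
              '"user456","loka@gmail.com"' \
              '"user5","test9@gmail.com"' > sample_data.csv

awk -F, '
NR == FNR {
   sub(/[ \t\r]+$/, "")            # drop trailing blanks or a CR from the value
   if ($0 != "") prefixes[$0]      # remember each non-empty prefix
   next
}
{
   email = $2
   gsub(/^"|"$/, "", email)        # strip the surrounding double quotes
   for (p in prefixes)
      if (index(email, p) == 1) {  # p occurs at position 1, i.e. email starts with p
         print
         next                      # one match is enough, move to the next row
      }
}
' sample_values.txt sample_data.csv > sample_matches.csv

cat sample_matches.csv
```

This loops over every stored prefix for each row, so it is slower than the array lookup above on large files; the trade-off is that it does not depend on where the @ sits in the stored value.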
