Reputation: 1
I am trying to match data from two files and create a new file with the results.
File 1 has data that looks like this:
19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
19XPT32-wipedrive-2016.05.03-05.50AM-d0.pdf
19XPT32-wipedrive-2016.07.06-08.32PM-d0.pdf
1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf
File 2 just has the first 7 characters, like so:
19V17R1
1BC6062
The final file should look like this:
19V17R1 19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
1BC6062 1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf
I can match the files by creating a file with just the first 7 characters and then doing:
awk 'FNR==NR{!a[$1]++;next}$0 in a' /RMAs.txt /sortedWipelogs.txt > matches.text
What I can't figure out is how to output the entire filename in the second column. Thanks.
Upvotes: 0
Views: 66
Reputation: 26471
There are many ways to do this. There is already a join answer. Here is a grep one:
$ grep -F -f file2 file1
19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf
Because the patterns are not anchored, this could also match an ID that appears elsewhere in a line, but if you are certain of the format, this will do it. You also do not really need the first column, since it just repeats the start of each line. If you want the first column anyway, you can add it like this:
$ grep -F -f file2 file1 | awk '{print substr($0,1,7), $0 }'
19V17R1 19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
1BC6062 1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf
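To illustrate the caveat about grep -F matching anywhere in the line, here is a small sketch. It assumes the IDs contain no regex metacharacters and that every filename has a - right after the ID; the ZZZZZZZ line and the patterns file are made up for the demonstration.

```shell
#!/bin/sh
# Sample data in a scratch directory (names and the ZZZZZZZ line are hypothetical).
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' '19V17R1' '1BC6062' > file2
printf '%s\n' \
  '19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf' \
  '19XPT32-wipedrive-2016.05.03-05.50AM-d0.pdf' \
  'ZZZZZZZ-copy-of-1BC6062-d0.pdf' \
  '1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf' > file1

# Unanchored fixed-string match: the ZZZZZZZ line is returned as well,
# because 1BC6062 appears in the middle of it.
loose=$(grep -F -f file2 file1)

# Turn each ID into the regex ^ID- so it can only match at the start of a line.
sed 's/.*/^&-/' file2 > patterns
strict=$(grep -f patterns file1)

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```

With this sample, loose holds three lines (including the ZZZZZZZ one) and strict only the two that begin with an ID.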
or do it all in awk:
$ awk '(NR==FNR){a[$1];next}(substr($0,1,7) in a){ print substr($0,1,7), $0 }' file2 file1
or even shorter, with - as the delimiter (set only for file1, to avoid possible problems with blanks in file2):
$ awk '(NR==FNR){a[$1];next}($1 in a){ print $1, $0 }' file2 FS="-" file1
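The reason FS="-" can sit between the two file names is that awk processes a command-line assignment when it reaches that argument, so it only affects files listed after it. A quick sketch to check this on the sample data (the scratch directory and variable names are just for illustration):

```shell
#!/bin/sh
# Recreate the sample files in a scratch directory.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' '19V17R1' '1BC6062' > file2
printf '%s\n' \
  '19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf' \
  '19XPT32-wipedrive-2016.05.03-05.50AM-d0.pdf' \
  '19XPT32-wipedrive-2016.07.06-08.32PM-d0.pdf' \
  '1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf' > file1

# Field count on the first line of each file: file2 is split on blanks
# (default FS), file1 is split on "-" because the assignment precedes it.
nf=$(awk 'FNR==1 { print FILENAME, NF }' file2 FS="-" file1)

# The matching command itself, unchanged.
matches=$(awk '(NR==FNR){a[$1];next}($1 in a){ print $1, $0 }' file2 FS="-" file1)

printf '%s\n\n%s\n' "$nf" "$matches"
```

The first awk call reports 1 field for file2 and 5 for file1, which is why $1 is the whole ID in one file and the part before the first - in the other.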
Upvotes: 0
Reputation: 8711
Using Perl
perl -lne ' BEGIN { $x=join("|", map{chomp;$_} qx(cat mweb2.txt)) } s/^($x)/$1 $1/g and print '
with the inputs
$ cat mweb1.txt
19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
19XPT32-wipedrive-2016.05.03-05.50AM-d0.pdf
19XPT32-wipedrive-2016.07.06-08.32PM-d0.pdf
1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf
$ cat mweb2.txt
19V17R1
1BC6062
$ perl -lne ' BEGIN { $x=join("|", map{chomp;$_} qx(cat mweb2.txt)) } s/^($x)/$1 $1/g and print ' mweb1.txt
19V17R1 19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
1BC6062 1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf
Upvotes: 0
Reputation: 67467
If both files are sorted as shown, then simply:
$ join -t- file1 file2
19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf
For the desired output format, this might be easier than setting join's -o options:
$ join <(awk '{print substr($0,1,7) "\t" $0}' file1) file2
19V17R1 19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
1BC6062 1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf
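If the files are not already sorted, join will miss matches or complain, so both inputs need sorting on the key first. A rough sketch of that, assuming no filename contains blanks; it uses a pipe and - for standard input instead of <( ) so it also runs in a plain POSIX sh, and the .sorted file name is made up:

```shell
#!/bin/sh
# Deliberately unsorted sample data in a scratch directory.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' '1BC6062' '19V17R1' > file2
printf '%s\n' \
  '1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf' \
  '19XPT32-wipedrive-2016.07.06-08.32PM-d0.pdf' \
  '19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf' \
  '19XPT32-wipedrive-2016.05.03-05.50AM-d0.pdf' > file1

# Sort the ID list; LC_ALL=C keeps sort and join agreeing on the order.
LC_ALL=C sort file2 > file2.sorted

# Prefix each filename with its 7-character key, sort on that key,
# then join against the sorted ID list ("-" means standard input).
joined=$(awk '{ print substr($0,1,7), $0 }' file1 |
  LC_ALL=C sort -k1,1 |
  LC_ALL=C join - file2.sorted)

printf '%s\n' "$joined"
```

On this sample the result is the same two-column output as above, in sorted key order.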
Upvotes: 1
Reputation: 133458
Could you please try the following?
awk 'FNR==NR{a[$0]=$0;next} a[$1]{print a[$1],$0}' Input_file2 FS="-" Input_file1
Explanation: here is a commented version of the above code.
awk '
FNR==NR{ ##Checking condition FNR==NR which will be true when first Input_file named Input_file2 is being read.
a[$0]=$0 ##Creating an array named a whose index is $0 and value is $0.
next ##Using next will skip all further statements from here.
} ##Closing block for FNR==NR here.
a[$1]{ ##Checking condition if a[$1] is NOT NULL then do following.
print a[$1],$0 ##Printing value of array a whose index is $1 of current line, along with the current line.
}' Input_file2 FS="-" Input_file1 ##Closing block, naming Input_file2 first, then setting FS="-" before naming Input_file1.
Upvotes: 0
Reputation: 881323
That's as simple as creating the following go.awk:
NR==FNR { lookup[substr($0,1,7)] = $0 }
NR!=FNR { print $0" "lookup[$0] }
Then you run it with:
awk -f go.awk file1.txt file2.txt
The first command is executed for each line in the first input file and it simply stores the entire line in an associative array, keyed on the first seven characters, for later lookup.
The second command, for each line in the second and subsequent input files, outputs the line and the related entry in the associative array. The output you see is exactly what you asked for:
19V17R1 19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf
1BC6062 1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf
Now I prefer using scripts since it means I don't have to go searching in my history for arbitrarily complex awk commands but, if you want a one-liner to do the same thing:
awk 'NR==FNR{lookup[substr($0,1,7)]=$0}NR!=FNR{print $0" "lookup[$0]}' file1.txt file2.txt
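Two things worth knowing about this lookup approach: an ID in the second file with no matching filename still gets printed (with an empty second column), and if several filenames share the same first seven characters, only the last one read is kept. Below is a possible guarded variant as a sketch; the NOMATCH ID is invented for the demonstration, and the guard only addresses the first point.

```shell
#!/bin/sh
# Sample data in a scratch directory; NOMATCH is a made-up ID with no file.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' \
  '19V17R1-wipedrive-2016.05.23-07.25PM-d0.pdf' \
  '19XPT32-wipedrive-2016.05.03-05.50AM-d0.pdf' \
  '19XPT32-wipedrive-2016.07.06-08.32PM-d0.pdf' \
  '1BC6062-wipedrive-2018.07.26-08.34AM-d0.pdf' > file1.txt
printf '%s\n' '19V17R1' '19XPT32' 'NOMATCH' > file2.txt

# The one-liner as given: NOMATCH is printed followed by a bare space.
plain=$(awk 'NR==FNR{lookup[substr($0,1,7)]=$0}NR!=FNR{print $0" "lookup[$0]}' file1.txt file2.txt)

# Guarded variant: only print IDs that were actually stored from file1.txt.
guarded=$(awk 'NR==FNR { lookup[substr($0,1,7)] = $0; next }
               ($0 in lookup) { print $0" "lookup[$0] }' file1.txt file2.txt)

printf 'plain:\n%s\n\nguarded:\n%s\n' "$plain" "$guarded"
```

In both versions 19XPT32 maps to the 2016.07.06 file only, since that line overwrote the earlier one in the array.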
Upvotes: 0