Reputation: 53
I have two files with two columns separated by space
cat file1.txt
281475225437349 33,32,21,17,20,22,18,30,19,16,23,31
281475550885480 35,32,33,21,39,40,57,36,41,17,20,38,34,37,16,99
cat file2.txt
281475550885480 16,17,20,21,32,33,34,35,36,37,38,39,40,41
281475225437349 16,17,18,19,20,21,22,23,24,25,30,31,32,33
i want to compare values from file1 column2 with file2 column2 for same value in column1. and print only those values which does exist in file1 column2 but not in file2 column2 along and not vice versa along with respective value in column1
Desired o/p
it should not print anything for 281475225437349 as all values from file1 column2 present in file2 column 2 for 281475225437349
it should only print values for 281475550885480 which present in file1 column2 but not present in file2 column2 . Like values 57 and 99 for 281475550885480
So o/p file like:
cat output.txt
281475550885480 57,99
I have tried sorting the file and the comparing using sdiff but it give difference and its taking time
sdiff file1.txt file2.txt
Upvotes: 2
Views: 1275
Reputation: 67467
another awk
$ awk -v c=, 'NR==FNR {a[$1]=$2; next}
{for(i=2;i<=NF;i++)
{if(c a[$1] c !~ c $i c) p=(p==""?$1" ":p c) $i}
if(p) print p; p=""}' file2 FS=' |,' file1
281475550885480 57,99
Upvotes: 1
Reputation: 203189
$ cat tst.awk
BEGIN { FS="[ ,]" }
NR==FNR {
for (i=2; i<=NF; i++) {
file2[$1,$i]
}
next
}
{
diff = ""
for (i=2; i<=NF; i++) {
if ( !(($1,$i) in file2) ) {
diff = (diff == "" ? "" : diff ",") $i
}
}
if (diff != "") {
print $1, diff
}
}
$ awk -f tst.awk file2 file1
281475550885480 57,99
Upvotes: 0
Reputation: 58371
This might work for you (GNU sed):
sed -r 's#^(\S+)\s(\S+)$#/^\1 /s/$/,\\n\2,/#' file2 |
sed -rnf - -e ':a;s/(\b[0-9]+,)(.*\n.*)\1/\2/;ta;s/(.*),\n.*/\1/p' file1
The solution can be split into two parts.
First file2 is turned into a sed script which appends the values for each key to the same key in file1. Additional ,
's are added to make matching easier and will act as indicator of whether a line from file1 has values to be printed.
The script generated from file2 is piped into a second sed invocation and an additional subscript uses substitution and a loop to remove matching values from file1.
On a match failing, and because the second sed invocation uses the -n
switch to make printing explicit, a final match removes the introduced ,
and non-matched values following the newline and prints the required result.
Upvotes: 0
Reputation: 26471
This is straightforward with awk:
awk '(NR==FNR) { a[$1]=","$2","; next }
{ delete b } # clear array for new run
{ split($2,f,",") } # split string of file1 in array f
{ for(i in f) if (! match(a[$1],"," f[i] ",")) b[f[i]] }
{ c=$1" "; for(i in b) {printf c i; c="," }; if (c==",") printf "\n" }' file2 file1
which returns:
281475550885480 57,99
The idea is to store the string of column two with an extra two <comma>'s to ensure that every number is between commas. Then search for the comma-sandwiched substring.
If you have to do a real number check and you might have to compare numbers "5" with "005" and they should be equal, then you should do the following:
awk '(NR==FNR) { a[$1]=$2; next }
{ delete b } # clear array
{ # split strings in number elements
split($2,f,","); for(i in f) f1[f[i]+0];
split(a[$1],f,","); for(i in f) f2[f[i]+0]
}
{ for(i in f1) if (! (i in f2)) b[i] }
{ c=$1" "; for(i in b) {printf c i; c="," }; if (c==",") printf "\n" }' file2 file1
Upvotes: 0
Reputation: 241768
Perl solution: create a hash of hashes from the second file. The key is the large number, the inner keys are the smaller numbers from the comma separated list. Then iterate over the first file and check what numbers aren't mentioned in the remembered structure.
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
open my $f1, '<', 'file1' or die $!;
open my $f2, '<', 'file2' or die $!;
my %seen;
while (<$f2>) {
my ($key, $value_string) = split ' ';
my @values = split /,/, $value_string; #/
undef @{ $seen{$key} }{@values};
}
while (<$f1>) {
my ($key, $value_string) = split ' ';
my @values = split /,/, $value_string;
my %surplus;
undef @surplus{@values};
delete @surplus{ keys %{ $seen{$key} } };
say $key, ' ', join ',', keys %surplus
if keys %surplus;
}
BTW, when you switch the files, the output will be
281475225437349 24,25
because 24 and 25 aren't present in file1.
Upvotes: 3
Reputation: 11
You must use two loop in your code. Also split ',' character and set two array, after
foreach(var itemOne as arrayOne){
boolean isExist=false;
foreach(var itemTwo as arrayTwo)
if(itemOne==itemTwo) isExist=true;
if(isExist) console.log(itemOne+" is exist");
}
Upvotes: 1