Reputation: 175
I have two CSV files, old.csv and new.csv. I need only the new or updated records from new.csv: any record in new.csv that also exists in old.csv should be deleted.
old.csv
"R","abc","london","1234567"
"S","def","london","1234567"
"T","kevin","boston","9876"
"U","krish","canada","1234567"
new.csv
"R","abc","london","5678"
"S","def","london","1234567"
"T","kevin","boston","9876"
"V","Bell","tokyo","2222"
Output in new.csv
"R","abc","london","5678"
"V","Bell","tokyo","2222"
Note: if all records in new.csv are the same as in old.csv, new.csv should end up empty.
Upvotes: 5
Views: 17298
Reputation: 19982
When the files are sorted:
comm -13 old.csv new.csv
When they are not sorted, and sorting is allowed:
comm -13 <(sort old.csv) <(sort new.csv)
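The `<(...)` form is process substitution, which plain `sh` does not have. A minimal sketch of the same idea with temporary files, using the question's rows (with new.csv deliberately shuffled); the names `old.sorted`, `new.sorted` and `only_new` are made up for illustration:

```shell
#!/bin/sh
# Sketch: comm -13 without process substitution, run in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

cat > old.csv <<'EOF'
"R","abc","london","1234567"
"S","def","london","1234567"
"T","kevin","boston","9876"
"U","krish","canada","1234567"
EOF
# Same rows as the question's new.csv, but out of order on purpose.
cat > new.csv <<'EOF'
"V","Bell","tokyo","2222"
"T","kevin","boston","9876"
"S","def","london","1234567"
"R","abc","london","5678"
EOF

# sort and comm must agree on collation, so pin the locale for both.
LC_ALL=C sort old.csv > old.sorted
LC_ALL=C sort new.csv > new.sorted

# -1 hides lines found only in old, -3 hides lines found in both;
# what remains are the lines found only in new.
only_new=$(LC_ALL=C comm -13 old.sorted new.sorted)
printf '%s\n' "$only_new"
```

Note that the result comes out in sorted order, not in the original order of new.csv.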
Upvotes: 5
Reputation: 37394
Use, for example, grep:
$ grep -v -f old.csv new.csv # > the_new_new.csv
"R","abc","london","5678"
"V","Bell","tokyo","2222"
and:
$ grep -v -f old.csv old.csv
$ # see, no differences between two identical files
From man grep:
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. The empty file
contains zero patterns, and therefore matches nothing. (-f is
specified by POSIX.)
-v, --invert-match
Invert the sense of matching, to select non-matching lines. (-v
is specified by POSIX.)
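One caveat worth checking on real data: with `-f` alone, each line of old.csv is read as a regular expression and may match anywhere in a line. A small sketch with made-up rows (not from the question) showing the difference when `-F` (fixed strings) and `-x` (whole-line match) are added:

```shell
#!/bin/sh
# Sketch with invented data: the old record contains a '.', a regex wildcard.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

printf '%s\n' '"R","a.c","london","1234567"' > old.csv
printf '%s\n' '"R","abc","london","1234567"' '"V","Bell","tokyo","2222"' > new.csv

# Patterns treated as regular expressions: "a.c" also matches "abc",
# so the changed record is dropped by mistake.
regex_result=$(grep -v -f old.csv new.csv)

# -F: patterns are fixed strings; -x: a pattern must match the whole line.
fixed_result=$(grep -v -F -x -f old.csv new.csv)

printf 'regex match:\n%s\n' "$regex_result"
printf 'fixed, whole-line match:\n%s\n' "$fixed_result"
```

Also keep in mind that grep exits with status 1 when it selects no lines, which is exactly the "new.csv should be empty" case, so a script running under `set -e` would need to allow for that.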
Then again, you could use awk for it:
$ awk 'NR==FNR{a[$0];next} !($0 in a)' old.csv new.csv
"R","abc","london","5678"
"V","Bell","tokyo","2222"
Explained:
awk '
NR==FNR {          # the records in the first file are hashed to memory
    a[$0]
    next
}
!($0 in a)         # the records which are not found in the hash are printed
' old.csv new.csv  # > the_new_new.csv
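Since the question wants the result in new.csv itself, here is a rough sketch of writing it back through a temporary file (redirecting straight into new.csv would truncate it before awk reads it). It also swaps `NR==FNR` for `FILENAME == ARGV[1]`, because `NR==FNR` is true for every line of new.csv when old.csv happens to be empty. The names `seen` and `new.csv.tmp` are illustrative only:

```shell
#!/bin/sh
# Sketch: replace new.csv with only its new or changed records.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

cat > old.csv <<'EOF'
"R","abc","london","1234567"
"S","def","london","1234567"
"T","kevin","boston","9876"
"U","krish","canada","1234567"
EOF
cat > new.csv <<'EOF'
"R","abc","london","5678"
"S","def","london","1234567"
"T","kevin","boston","9876"
"V","Bell","tokyo","2222"
EOF

# Lines of the first file are remembered; lines of the second file are
# printed only if they were not seen. Replace new.csv only if awk succeeded.
awk 'FILENAME == ARGV[1] { seen[$0]; next } !($0 in seen)' old.csv new.csv > new.csv.tmp &&
  mv new.csv.tmp new.csv

cat new.csv
```

Unlike the comm approach, this keeps the surviving records in their original order.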
Upvotes: 9