Reputation: 31
I'm trying my best to learn PHP and hack things out myself. But this part has me stuck.
I have two CSV files with hundreds of rows each.
CSV 1 looks like this:
name, email, interest
CSV 2 looks like this:
email only
I'm trying to write a script to compare the two files looking for duplicates. I only want to keep the duplicates. But as you can see, CSV 2 only contains an email. If an email in CSV 1 DOES NOT EXIST in CSV 2, then the row containing that email in CSV 1 should be deleted.
The end result can either overwrite CSV 1 or create a fresh new file called "final.csv"... whatever is easiest.
I would be grateful for the help.
I tried something along these lines with no luck:
egrep -v $(cat csv2.csv | tr '\n' '|' | sed 's/.$//') csv1.csv
and
grep -v -f csv22.csv csv1.csv >output-file
cheers,
marc
Upvotes: 1
Views: 4705
Reputation: 31
Solved! The problem was with Mac line breaks. Look at the code below to see the additions at the beginning and end of the code to fix that problem. Thank you Crayon Violent for all of your help!
ini_set('auto_detect_line_endings',TRUE);
if (($file3 = fopen("output.csv", "w")) !== FALSE) {
if (($file1 = fopen("dirty.csv", "r")) !== FALSE) {
while (($file1Row = fgetcsv($file1)) !== FALSE) {
if (($file2 = fopen("clean.csv", "r")) !== FALSE) {
while (($file2Row = fgetcsv($file2)) !== FALSE) {
if ( strtolower(trim($file2Row[0])) == strtolower(trim($file1Row[1])) )
fputcsv($file3, $file1Row);
}
fclose($file2);
}
}
fclose($file1);
}
fclose($file3);
}
ini_set('auto_detect_line_endings',FALSE);
Upvotes: 0
Reputation: 32532
Here is a script that will loop through both files and output a 3rd file where email addresses in file2 are found in file1.
if (($file3 = fopen("file3.csv", "w")) !== FALSE) {
if (($file1 = fopen("file1.csv", "r")) !== FALSE) {
while (($file1Row = fgetcsv($file1)) !== FALSE) {
if (($file2 = fopen("file2.csv", "r")) !== FALSE) {
while (($file2Row = fgetcsv($file2)) !== FALSE) {
if ( strtolower(trim($file2Row[0])) == strtolower(trim($file1Row[1])) )
fputcsv($file3, $file1Row);
}
fclose($file2);
}
}
fclose($file1);
}
fclose($file3);
}
Couple of notes:
$file1Row[1]
and $file2Row[0]
. Column # starts at 0.$file3
fopen to "a" instead of "w"Example:
file1.csv:
john,[email protected],blah
mary,[email protected],something
jane,[email protected],blarg
bob,[email protected],asdfsfd
file2.csv
[email protected]
[email protected]
file3.csv (generated)
mary,[email protected],something
bob,[email protected],asdfsfd
Upvotes: 2