Andrew

Reputation: 243

Removing duplicate lines from multiple (2) text files in PHP

I have two .txt files. The first .txt file holds data fetched with curl (by a robot), and it always contains 2000 lines, including the new ones,

and the second .txt file should hold only the new data from the first .txt file. I use the second .txt file for the script.

I can't remove the duplicates. (I mean that I am trying to get only the new values, compared with the old values.) As a result, the script always uses both the new and the old data.

Is there a way to open all the files, remove the duplicates, and save the remaining lines to the second file?

THERE ARE THREE REFRESH EXAMPLES

Here is the FIRST refresh and the 2 .txt files

first .txt file (imagine it has 2000 lines), refreshed by the curl robot

Something here10
Something here9
Something here8
Something here7
Something here6
Something here5
Something here4
Something here3
Something here2
Something here1

second .txt file that I will use

Something here10
Something here9
Something here8
Something here7
Something here6
Something here5
Something here4
Something here3
Something here2
Something here1

Here is the SECOND refresh and the 2 .txt files

first .txt file (imagine it has 2000 lines), refreshed by the curl bot

Something here14
Something here13
Something here12
Something here11
Something here10
Something here9
Something here8
Something here7
Something here6
Something here5

second .txt file that I will use

Something here14
Something here13
Something here12
Something here11

Here is the THIRD refresh and the 2 .txt files

first .txt file (imagine it has 2000 lines), refreshed by the curl bot

Something here16
Something here15
Something here14
Something here13
Something here12
Something here11
Something here10
Something here9
Something here8
Something here7

second .txt file that I will use

Something here16
Something here15

EDIT: I posted two more refreshes

Here is the FOURTH refresh and the 2 .txt files

first .txt file (imagine it has 2000 lines), refreshed by the curl bot

Something here20
Something here19
Something here18
Something here17
Something here16
Something here15
Something here14
Something here13
Something here12
Something here11

second .txt file that I will use

Something here20
Something here19
Something here18
Something here17

Here is the FIFTH refresh and the 2 .txt files

first .txt file (imagine it has 2000 lines), refreshed by the curl bot

Something here24
Something here23
Something here22
Something here21
Something here20
Something here19
Something here18
Something here17
Something here16
Something here15

second .txt file that I will use

Something here24
Something here23
Something here22
Something here21

Upvotes: 1

Views: 581

Answers (2)

OldPadawan

Reputation: 1251

(Reading and interpreting the comments.) I think you need the following code, which uses PHP's array_push:

<?php

error_reporting(E_ALL); ini_set('display_errors', 1);

$array1 = array('here9', 'here8', 'here7', 'here6', 'here5', 'here4', 'here3', 'here2', 'here1');
$array2 = array('here4', 'here3', 'here2', 'here1');

echo"Array 1:<br />"; // just checking -> will be removed
print_r($array1); // just checking -> will be removed

echo"<br /><br />Array 2:<br />"; // just checking -> will be removed
print_r($array2); // just checking -> will be removed

echo"<br /><br />"; // will be removed

$newarray = array(); // create new empty array to receive new data

foreach ($array1 as $value) { /* parse array */

    // here, we'll make use of PHP array_push
    if (!in_array($value, $array2)) { // if value is not in 2nd array
        array_push($newarray, $value); // we add to new array we created
    }
}

echo"New array with duplicate removed:<br />"; // just checking -> will be removed
print_r($newarray); // just checking -> will be removed

file_put_contents('output.txt', implode(PHP_EOL, $newarray) . PHP_EOL); // we write new content of array to file, one value per line

?>
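
The arrays above are hard-coded. As a rough sketch only, the same comparison could be run against real files with file() and array_diff. It assumes a third file that keeps a copy of the previous fetch, which the question does not mention; the function name writeNewLines and all three paths are hypothetical placeholders, not part of the original setup.

```php
<?php
// Sketch only: the function name and all three paths are hypothetical.
// $fetchedPath  - the file the curl robot overwrites on every refresh
// $previousPath - a copy of the fetched file from the previous refresh (assumed)
// $newOnlyPath  - the file the script reads; receives only the new lines
function writeNewLines(string $fetchedPath, string $previousPath, string $newOnlyPath): array
{
    $flags = FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES;

    $fetched = file($fetchedPath, $flags);
    if ($fetched === false) {
        throw new RuntimeException("Cannot read $fetchedPath");
    }

    // On the very first refresh there is no previous fetch yet.
    $previous = is_file($previousPath) ? file($previousPath, $flags) : array();
    if ($previous === false) {
        $previous = array();
    }

    // array_diff keeps the values of $fetched that do not occur in $previous;
    // array_values renumbers the keys from 0.
    $newLines = array_values(array_diff($fetched, $previous));

    // Write one value per line (an empty file if nothing is new).
    $content = $newLines ? implode(PHP_EOL, $newLines) . PHP_EOL : '';
    file_put_contents($newOnlyPath, $content);

    // Remember this fetch so the next refresh is compared against it.
    copy($fetchedPath, $previousPath);

    return $newLines;
}
```

Comparing against the previous fetch, rather than against the second file itself, appears to match the refresh examples in the question: the second file only ever holds the latest batch, so it cannot serve as the record of what has already been seen.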

Upvotes: 1

MackProgramsAlot

Reputation: 593

I tried to keep this as high level as possible, but in essence: push each line onto an array and then use array_unique to remove duplicates:

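    $line_array = array();
    // Placeholder file names; replace them with the paths of your .txt files.
    $files = array('first.txt', 'second.txt');
    foreach($files as $file)
    {
        // file() returns the lines of the file as an array
        $lines = file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
        foreach($lines as $line)
        {
            $line_array[] = $line;
        }
    }
    $without_duplicates = array_unique($line_array);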
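
To make the result of array_unique concrete, here is a small sketch on in-memory data shortened from the question's examples (the sample values and the $output variable are illustrative only). Note that this approach yields every distinct line from all files once, which is not the same as keeping only the lines that are new in one file.

```php
<?php
// Illustrative data only, shortened from the question's examples.
$second = array('here14', 'here13');
$first  = array('here14', 'here13', 'here12', 'here11');

// All lines from both lists in one array: 14, 13, 14, 13, 12, 11.
$merged = array_merge($second, $first);

// array_unique keeps the first occurrence of each value and preserves the
// original keys (here 0, 1, 4, 5), so array_values renumbers them from 0.
$without_duplicates = array_values(array_unique($merged));

// One line per value, in a form that could be passed to file_put_contents.
$output = implode(PHP_EOL, $without_duplicates) . PHP_EOL;
echo $output;
```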

Upvotes: 0
