Hexana

Reputation: 1135

Remove Duplicate Lines in Text File

I have a text file from which I am trying to remove duplicate lines.

Text file example:

new featuredProduct('', '21640'), 
new featuredProduct('', '24664'), 
new featuredProduct('', '22142'), 
new featuredProduct('', '22142'), 
new featuredProduct('', '22142'), 
new featuredProduct('', '22142'), 
new featuredProduct('', '22142'), 

The PHP Code I've tried:

$lines = file('textfile.txt');
$lines = array_unique($lines);
file_put_contents('textfile.txt', implode($lines));

The PHP file is called duplicates.php and the textfile is in the same directory. I would like to be left with only:

new featuredProduct('', '21640'), 
new featuredProduct('', '24664'), 
new featuredProduct('', '22142'),  

The file() function reads the file into the $lines array, then array_unique() should remove the duplicate entries, and the filtered result is written back to the same file.

Upvotes: 4

Views: 3443

Answers (3)

Axalix

Reputation: 2871

I know this question is about PHP, and I don't know whether you use Linux / Unix or Windows, but there is a really nice command-line solution using awk that gets rid of duplicates and, I think, will work much faster for big files. It prints the de-duplicated lines to standard output. You can even execute it from PHP with a system call:

awk '!a[$0]++' input.txt
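
As a rough sketch of the "system call" idea, the awk command could be wrapped like this. The function name dedupeWithAwk and the file names are made up for illustration, and it assumes awk is installed and on the PATH, which is usually not the case on a stock Windows machine. The input and output files must be different, because the shell empties the output file before awk starts reading.

```php
<?php
// Hypothetical helper: runs awk on $input and writes the de-duplicated
// lines to $output. Assumes awk is available on the PATH.
function dedupeWithAwk(string $input, string $output): bool
{
    // escapeshellarg() quotes the file names so spaces or quotes in them
    // cannot break (or inject into) the shell command.
    $cmd = sprintf(
        "awk '!a[\$0]++' %s > %s",
        escapeshellarg($input),
        escapeshellarg($output)
    );

    // exec() reports the exit status of the command through $exitCode.
    exec($cmd, $ignoredOutput, $exitCode);

    return $exitCode === 0;
}
```

A call such as dedupeWithAwk('textfile.txt', 'textfile.dedup.txt') would leave the original file untouched and put the result in the second file. Note that if the file uses Windows (CRLF) line endings and the last line has no line ending, awk may still treat that last line as different from the others.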

Upvotes: 3

Mahadeva Prasad

Reputation: 709

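Try this:

// Read the whole file into a string (file_get_contents, not file_put_contents)
$string = file_get_contents('textfile.txt');
// Split into entries on the "')," that ends each one
$splitstr = explode('),', $string);
// Drop duplicate entries and join the rest back together
$str = implode('),', array_unique($splitstr));
var_dump($str);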

Upvotes: 0

Rizier123

Reputation: 59691

The problem is the newline characters at the end of each line. Because the last line has no newline character at the end, it is not identical to the other '22142' lines, so array_unique() keeps it.

So just remove them when you read the file and then add them back when you save the file again:

$lines = file('test.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$lines = array_unique($lines);
file_put_contents('test.txt', implode(PHP_EOL, $lines));

If you do var_dump($lines); right after the file() call in your original code (without the flags), you will see it:

array(7) {
  [0]=>
  string(36) "new featuredProduct('', '21640'), 
"
  [1]=>
  string(36) "new featuredProduct('', '24664'), 
"
  [2]=>
  string(36) "new featuredProduct('', '22142'), 
"
  [3]=>
  string(36) "new featuredProduct('', '22142'), 
"
  [4]=>
  string(36) "new featuredProduct('', '22142'), 
"
  [5]=>
  string(36) "new featuredProduct('', '22142'), 
"
  [6]=>
  string(34) "new featuredProduct('', '22142'), "
       //^^ See here                            ^ And here
}
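
To see the difference for yourself, here is a small self-contained sketch. It writes sample data modelled on the question to a temporary file (the temp file and the variable names $raw and $clean are only for illustration) and counts the unique lines with and without the flags.

```php
<?php
// Sample data modelled on the question: the last line has no trailing newline.
$path = tempnam(sys_get_temp_dir(), 'dup');
file_put_contents(
    $path,
    "new featuredProduct('', '21640'), \n" .
    "new featuredProduct('', '24664'), \n" .
    "new featuredProduct('', '22142'), \n" .
    "new featuredProduct('', '22142'), \n" .
    "new featuredProduct('', '22142'), "
);

// Without flags every element keeps its newline, so the last '22142' line
// (which has none) is not equal to the other '22142' lines.
$raw = array_unique(file($path));
echo count($raw), PHP_EOL;   // prints 4

// With FILE_IGNORE_NEW_LINES the newlines are stripped before comparing.
$clean = array_unique(file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES));
echo count($clean), PHP_EOL; // prints 3

unlink($path);
```

The first count includes the last '22142' line as a separate entry, the second one does not.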

Upvotes: 6
