Reputation: 509
What is a Perl one-liner that prints only the lines that appear exactly once in a file (that is, it skips any line that appears more than once and leaves only the truly unique lines)?
For example, if I have a file that contains duplicate lines:
line1
line2
line2
line3
line1
line4
line5
The output should be:
line3
line4
line5
I can run
perl -ne 'print if $a{$_}++' file
to see only the lines that are duplicates...
line2
line1
I can swap the if for its antonym, unless, and see only one occurrence of each line in the file...
perl -ne 'print unless $a{$_}++' file
line1
line2
line3
line4
line5
I'm assuming I have to slurp the entire file in and process it using single \n delimiters for each line, maybe into a hash? I'm just not sure how to do that.
Upvotes: 1
Views: 1038
Reputation: 109
This should work although it's a Unix solution:
sort file1|uniq -u
line3
line4
line5
where file1 has:
line1
line2
line2
line3
line1
line4
line5
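The -u option of the uniq command prints only the lines that are not repeated. uniq compares adjacent lines only, so the input has to be sorted first, which also means the original line order is not preserved.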
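To see why the sort step matters, here is a small sketch using the sample data from the question. The temporary file and the variable names are only for illustration.

```shell
# Sample data from the question, written to a temporary file.
sample=$(mktemp)
printf '%s\n' line1 line2 line2 line3 line1 line4 line5 > "$sample"

# Without sorting, only the adjacent pair of line2 entries is treated
# as a duplicate; both copies of line1 survive.
unsorted=$(uniq -u "$sample")

# Sorting first makes all duplicates adjacent, so uniq -u drops them.
sorted=$(sort "$sample" | uniq -u)

printf 'uniq -u alone:\n%s\n\n' "$unsorted"
printf 'sort | uniq -u:\n%s\n' "$sorted"
rm -f "$sample"
```

On this input, uniq -u alone still prints both copies of line1, while the sorted pipeline prints only line3, line4 and line5.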
Upvotes: 1
Reputation: 126772
As mentioned above, to filter the file in this way and keep the lines in order, you need either to read through the file twice or to store line number information while you read.
This one-liner seems to be the best option:
perl -e '@a = @ARGV; ++$c{$_} while <>; @ARGV = @a; $c{$_} == 1 and print while <>;' myfile.txt
line3
line4
line5
This is a slightly shorter alternative, but it uses roughly double the amount of memory, because it keeps every line in an array as well as in the hash:
perl -e '@l = <>; ++$c{$_} for @l; $c{$_} == 1 and print for @l;' myfile.txt
Upvotes: 2
Reputation: 18444
Another way to do it:
perl -e'@a=<>; $d{$_}++ for @a; print grep {$d{$_}<2} @a' file
Upvotes: 4
Reputation: 386696
You only know if a line is unique after you've read all the lines, so you can't possibly start printing before you've reached the end of the file!
# Varying order
perl -nle'++$lines{$_}; END { print for grep $lines{$_}==1, keys %lines; }' file
or
# Sorted
perl -nle'++$lines{$_}; END { print for sort grep $lines{$_}==1, keys %lines; }' file
or
# Original order
perl -nle'
   if ( my $orig_line_num = $line_nums_by_line{$_} ) {
      $lines_by_line_num[$orig_line_num] = undef;
   } else {
      $lines_by_line_num[$.] = $_;
      $line_nums_by_line{$_} = $.;
   }
   END { print for grep defined, @lines_by_line_num; }
' file
Upvotes: 2