Reputation: 5927
I have a file, strings.txt, that lists strings, and I process it like this:
sort strings.txt | uniq -c | sort -n > uniq.counts
The resulting file uniq.counts lists the unique strings sorted in ascending order by count, something like this:
1 some string with spaces
5 some-other,string
25 most;frequent:string
Note that the strings in strings.txt may contain spaces, commas, semicolons, and other separators, but never a tab. How can I get uniq.counts to be in this format:
1<tab>some string with spaces
5<tab>some-other,string
25<tab>most;frequent:string
Upvotes: 4
Views: 3305
Reputation: 785641
You can do:
sort strings.txt | uniq -c | sort -n | sed -E 's/^ *//; s/ /\t/' > uniq.counts
sed first removes all leading spaces at the beginning of the line (the padding before the count) and then replaces the first remaining space, the one after the count, with a tab character.
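One caveat: interpreting \t in the replacement is a GNU sed extension, so the command above may insert a literal "t" with other sed implementations. A minimal sketch of a more portable variant follows, assuming a POSIX shell; the function name count_with_tab and the sample input are made up for illustration.

```shell
# Hypothetical helper: reads strings on stdin, writes "count<TAB>string" lines.
count_with_tab() {
  # Store a literal tab in a variable so the sed script does not depend on \t.
  tab=$(printf '\t')
  sort | uniq -c | sort -n |
    sed -e 's/^ *//' -e "s/ /$tab/"   # strip padding, then first space -> tab
}

# Sample input standing in for strings.txt.
printf '%s\n' 'most;frequent:string' 'some string with spaces' 'most;frequent:string' |
  count_with_tab
# prints:
# 1<TAB>some string with spaces
# 2<TAB>most;frequent:string
```

Only the first space after the count is replaced, so spaces inside the strings themselves are left alone.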
Upvotes: 5
Reputation: 84579
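You can simply pipe the output of the final sort to sed before writing to uniq.counts, e.g. add:
| sed -e 's/^ *\([0-9][0-9]*\) \(.*\)$/\1\t\2/' > uniq.counts
The pattern skips the leading padding that uniq -c puts before the count, captures the count and the rest of the line, and puts a tab between them in place of the single separating space.
The full expression would be:
$ sort strings.txt | uniq -c | sort -n | \
sed -e 's/^ *\([0-9][0-9]*\) \(.*\)$/\1\t\2/' > uniq.counts
(line continuation included for clarity)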
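The same capture-and-rejoin idea can also be written with awk instead of sed, which avoids depending on how a particular sed treats \t. This is only a sketch of an alternative, not part of the answer above; the function name count_with_tab_awk is hypothetical.

```shell
# Hypothetical helper: reads strings on stdin, writes "count<TAB>string" lines.
count_with_tab_awk() {
  sort | uniq -c | sort -n |
    awk '{
      count = $1                # first field is the count; padding is ignored
      sub(/^ *[0-9]+ /, "")     # drop padding, count, and the one separator space
      print count "\t" $0       # awk expands \t inside string literals
    }'
}

# Example: runs of spaces inside a string are preserved.
printf '%s\n' 'x  y' 'z' 'x  y' | count_with_tab_awk
```

Because only the prefix up to the first space after the count is removed, any spacing inside the original string survives unchanged.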
Upvotes: 3
Reputation: 88776
With GNU sed:
sort strings.txt | uniq -c | sort -n | sed -r 's/^ *([0-9]+) /\1\t/' > uniq.counts
Output to uniq.counts:
1<tab>some string with spaces
5<tab>some-other,string
25<tab>most;frequent:string
If you want to edit your file "in place", use sed's -i option.
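Since -i operates on a named file rather than on piped input, using it here means writing the raw counts first and converting them in a second step. A rough sketch, assuming GNU sed (BSD/macOS sed expects an argument after -i and may not interpret \t); the scratch directory and sample data are made up for illustration.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
printf '%s\n' 'b c' 'a' 'b c' > "$workdir/strings.txt"

# Step 1: write the raw uniq -c output (space-padded counts) to the file.
sort "$workdir/strings.txt" | uniq -c | sort -n > "$workdir/uniq.counts"

# Step 2: rewrite that file in place; -i and -r as used here are GNU sed options.
sed -i -r 's/^ *([0-9]+) /\1\t/' "$workdir/uniq.counts"

cat "$workdir/uniq.counts"
```

The single-pipeline form is usually simpler; the in-place form is mainly useful when uniq.counts already exists in the space-separated format.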
Upvotes: 2