Reputation: 269
I'm following a programming course and I'm trying to do a practice activity but I'm stuck. I have a file with the following list:
Monday day
Tuesday day
Easter holiday
Monday day
christmas holiday
Tuesday day
Friday day
Thursday day
thanksgiving holiday
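What I'm trying to do is sort the lines alphabetically regardless of case, count how many times each line occurs, and print the count and the line separated by a tab.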
This would be my desired output:
1 christmas holiday
1 Easter holiday
1 Friday day
2 Monday day
1 thanksgiving holiday
1 Thursday day
2 Tuesday day
I have tried using the following line of code:
cat my_file | sort | uniq -c | less
My problem is that words are not really sorted because words starting with capital letters would come before words starting with lowercase letters. Also, I don't know how to add the tab between the number and the word (in my output, there's only a space between them).
Could you help me?
Upvotes: 2
Views: 59
Reputation: 17721
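You can use sort's -f option to sort case-insensitively, and replace the spaces with tabs using sed(1). cat can be omitted from the pipe:
sort -f my_file | uniq -c | sed $'s/  */\t/g' | less
Note: the dollar sign in front of the sed parameter ($'...' quoting) makes the shell expand \t to a real tab character instead of passing the two characters \ and t to sed. The pattern is two spaces followed by *, i.e. one or more spaces; with a single space before the *, the pattern would also match the empty string and sed would insert tabs where there are no spaces at all.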
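To see what the $'...' quoting does on its own, here is a minimal sketch, assuming bash (or another shell that supports this quoting). The variable names plain and ansi are only for illustration.

```shell
#!/usr/bin/env bash
# Plain single quotes: the shell keeps backslash and t as two separate characters.
plain='\t'
# $'...' quoting: the shell expands \t to a single tab character before sed ever runs.
ansi=$'\t'

echo "${#plain}"   # length of the plain string
echo "${#ansi}"    # length of the expanded string
```

The first length is 2 and the second is 1, which is why the sed expression needs the dollar sign when sed itself does not translate \t.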
If the leading tab on each line is annoying, you can strip the leading spaces with sed first:
sort -f my_file | uniq -c | sed 's/^ *//' | sed $'s/  */\t/g' | less
This produces:
1 christmas holiday
1 Easter holiday
1 Friday day
2 Monday day
1 thanksgiving holiday
1 Thursday day
2 Tuesday day
Finally, if you want to keep the spaces between the second and the third column, omit the g (replace all occurrences of the search pattern) from the second sed invocation, so that only the first run of spaces on each line is replaced:
sort -f my_file | uniq -c | sed 's/^ *//' | sed $'s/  */\t/' | less
Result:
1 christmas holiday
1 Easter holiday
1 Friday day
2 Monday day
1 thanksgiving holiday
1 Thursday day
2 Tuesday day
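If you want to try this last variant without creating my_file first, here is a self-contained sketch, assuming bash. The function name count_lines and the temporary file are only for illustration, and LC_ALL=C is my own assumption to make the sort order the same on every machine. The two sed invocations are merged into one call with two -e expressions, which should behave the same.

```shell
#!/usr/bin/env bash
# count_lines is a hypothetical helper name, not a standard command.
count_lines() {
  # LC_ALL=C is an assumption: it pins the collation so -f gives a reproducible order.
  # First expression strips the padding uniq -c puts before the count;
  # second expression turns the first run of spaces (two spaces + *) into one tab.
  LC_ALL=C sort -f "$1" | uniq -c | sed -e 's/^ *//' -e $'s/  */\t/'
}

# Recreate the sample list from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' 'Monday day' 'Tuesday day' 'Easter holiday' 'Monday day' \
  'christmas holiday' 'Tuesday day' 'Friday day' 'Thursday day' \
  'thanksgiving holiday' > "$sample"

result=$(count_lines "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Note that uniq -c only merges identical lines, so two lines that differ only in case (say, Monday day and monday day) would still be counted separately.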
Upvotes: 2