Cynass

Reputation: 55

Shell: How to extract the first occurrence based on the first word of each line?

I have a log file recording specific actions, where the first word of each line is the ID of the action. I want to extract the first occurrence of each ID so that I can display the first action for each ID.

I'm not sure that's quite clear, so let's say I have a file that monitors the actions of a group of people and gets updated every time someone does something:

Alice ate an apple
Eve fell asleep
Bob watched TV
Bob sat on a chair
Alice went to the kitchen
Dave drank coffee
Carol bought a car
Eve fed the cat
Eve took out the trash
Dave took a shower
Bob washed the dishes
Alice read a book
Carol played the piano
...

Let's say I want to see the first action done by each person, so the desired output would be:

Alice ate an apple
Eve fell asleep
Bob watched TV
Dave drank coffee
Carol bought a car

I tried some combinations of uniq and grep, but there is a problem: to use the uniq command I would need to sort the lines first, which defeats my purpose of getting the first occurrence (in this example, "Eve fed the cat" would come before "Eve fell asleep").

Is there a better way to achieve this?

Thank you all for taking the time to read this.

Upvotes: 0

Views: 71

Answers (2)

Steve

Reputation: 54392

The idiomatic way:

awk '!seen[$1]++' file

This uses an associative array called 'seen', keyed by the first field, $1, together with the postfix increment operator. The value is zero the first time a key is encountered, so negating it returns true and awk prints the line. The increment then makes the value nonzero, so later lines with the same key are skipped.
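To make the logic of the one-liner explicit, here is a longhand sketch of the same test. The function name first_by_id is a hypothetical helper introduced only for illustration, and it reads the log from standard input rather than from a file.

```shell
# Longhand sketch of: awk '!seen[$1]++' file
# first_by_id is a hypothetical helper name; it reads the log from stdin.
first_by_id() {
  awk '{
    if (seen[$1] == 0) {   # an unset entry compares equal to 0
      print                # first line for this ID, so print it
    }
    seen[$1]++             # mark this ID as seen
  }'
}

# Small sample taken from the question's log
printf '%s\n' \
  'Alice ate an apple' \
  'Eve fell asleep' \
  'Alice went to the kitchen' \
  'Eve fed the cat' | first_by_id
```

With this sample, only the "Alice ate an apple" and "Eve fell asleep" lines are printed, in their original order.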

Upvotes: 0

dawg

Reputation: 103764

With awk this is simple:

$ awk '++arr[$1]==1' file

Prints:

Alice ate an apple 
Eve fell asleep 
Bob watched TV 
Dave drank coffee 
Carol bought a car 

Works this way:

awk '++arr[$1]==1' file
       ^^^          arr is an associative array of key/value pairs
          ^^^^      a new key $1 (the first column) starts with a value of 0
     ^^             the prefix ++ adds 1 before the value is returned
              ^^    equal to
                ^   1, meaning this is the first time the key is seen
    ^            ^  if this evaluates to true (column 1 seen for the first time), print the line
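To see the counter at work, the following sketch prints the value of ++arr[$1] in front of every line instead of filtering on it. The function name count_per_id is a hypothetical helper used only for this illustration, and it reads from standard input.

```shell
# count_per_id is a hypothetical helper name; it reads the log from stdin
# and prefixes each line with the running count for its first field.
count_per_id() {
  awk '{ n = ++arr[$1]; print n, $0 }'
}

# Small sample taken from the question's log
printf '%s\n' \
  'Bob watched TV' \
  'Bob sat on a chair' \
  'Dave drank coffee' | count_per_id
```

The lines prefixed with 1 are exactly the ones that the ==1 test lets through.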

You can do this with other tools (Bash, Ruby, Perl, Python, etc.), but almost all easy solutions will use that tool's version of an associative array to count the number of times X has been seen.

Upvotes: 4
