Rachit Jain

Reputation: 302

Bash script to filter contents of a file

I have a file that looks like:

Location1 Person1 []
Location1 Person1 [place1, place2]
Location2 Person1 [place1]

I want the output to be:

Location1 Person1 [place1, place2]
Location2 Person1 [place1]

That is, I want to tell awk (or any other tool) that for a unique key of Location and Person, if there are two entries, it should keep the entry that has something in the brackets.

Currently I am doing this, but it's not helping:

awk '!seen[$1$2]++' "$FileName" > temp.txt

Upvotes: 0

Views: 84

Answers (3)

retrography

Reputation: 6812

Take it easy, you don't need awk for that!

sort -r file | sort -t" " -k1,2 -u

Gives you:

Location1 Person1 [place1, place2]
Location2 Person1 [place1]

My assumption is that you can't have several entries with values within brackets for the same person at the same location.

Explanation:

  • -r: reverse
  • -t: column separator
  • -k: key fields
  • -u: unique

With the unique switch, GNU sort keeps the first line of each group of rows whose keys compare equal (POSIX only says that one line per group is kept). The row you want, the one with a value within brackets, sorts after the row with empty brackets, so you have to sort the data in reverse order before feeding them into the unique sort. That way the row with the value is the first one the unique sort sees for each key.
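To look at the two stages separately, here is a minimal sketch that rebuilds the sample input in a temporary file. The temporary file, the variable names, and the `LC_ALL=C` setting are my additions rather than part of the command above. It also assumes GNU sort, which keeps the first line of each group of equal keys.

```shell
#!/usr/bin/env bash
# Rebuild the sample input from the question in a temporary file.
input=$(mktemp)
printf '%s\n' \
  'Location1 Person1 []' \
  'Location1 Person1 [place1, place2]' \
  'Location2 Person1 [place1]' > "$input"

# Stage 1: reverse sort, so "[place1, place2]" comes before "[]" within a key.
# LC_ALL=C pins byte-order collation (an assumption; the original relies on the default locale).
stage1=$(LC_ALL=C sort -r "$input")

# Stage 2: unique on fields 1-2 only; GNU sort keeps the first line of each group.
result=$(printf '%s\n' "$stage1" | LC_ALL=C sort -t' ' -k1,2 -u)

printf '%s\n' "$result"
rm -f "$input"
```

If your sort is not GNU sort, check which duplicate it keeps before relying on this.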

Upvotes: 2

karakfa

Reputation: 67467

An alternative awk solution that prints the line with the highest number of values for each unique key:

$ awk '{k=$1 FS $2; s=$0; sub(/^[^[]*\[/,"",s); sub(/\].*$/,"",s); n=(s==""?0:split(s,t,","))}
           !(k in v)||n>c[k]{c[k]=n; v[k]=$0}
                         END{for(k in v) print v[k]}' file

Location1 Person1 [place1, place2]
Location2 Person1 [place1]

In case of a tie, this prints the first line (change n>c[k] to n>=c[k] to print the last one instead).
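The tie rule can be tried on a hypothetical two-line input. This is a sketch, not the command above verbatim: it counts the values between the brackets explicitly, and the `pick_most` function and its `keep_last` switch are names I made up for the illustration.

```shell
#!/usr/bin/env bash
# Hypothetical input: the same key twice, each line with exactly one value (a tie).
input=$(mktemp)
printf '%s\n' \
  'Location2 Person1 [place1]' \
  'Location2 Person1 [place3]' > "$input"

# pick_most FILE KEEP_LAST: print, per key, the line with the most bracketed values.
# KEEP_LAST=0 behaves like n>c[k] (first line wins a tie),
# KEEP_LAST=1 behaves like n>=c[k] (last line wins a tie).
pick_most() {
  awk -v keep_last="$2" '
    {
      k = $1 FS $2
      s = $0
      sub(/^[^[]*\[/, "", s)                 # drop everything up to and including "["
      sub(/\].*$/, "", s)                    # drop "]" and anything after it
      n = (s == "" ? 0 : split(s, t, ","))   # empty brackets count as 0 values
    }
    !(k in v) || n > c[k] || (keep_last && n == c[k]) { c[k] = n; v[k] = $0 }
    END { for (k in v) print v[k] }
  ' "$1"
}

first=$(pick_most "$input" 0)
last=$(pick_most "$input" 1)
printf 'first wins: %s\nlast wins:  %s\n' "$first" "$last"
rm -f "$input"
```

With more than one key, the order of the output lines is whatever `for (k in v)` yields, so pipe the result through sort if the order matters.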

Upvotes: 0

Ed Morton

Reputation: 203209

This might be what you want:

$ cat tst.awk
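# Lines with at least one character between the brackets: print now and remember the key.
/[[][^]]+[]]/ { print; printed[$1,$2]; next }
# Lines with empty brackets: keep as a fallback for this key.
{ saved[$1,$2] = $0 }
END {
    # Print a fallback only if no non-empty line was printed for that key.
    for (key in saved) {
        if ( !(key in printed) ) {
            print saved[key]
        }
    }
}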

$ awk -f tst.awk file
Location1 Person1 [place1, place2]
Location2 Person1 [place1]

It just depends on your requirements and input samples you haven't shared with us yet.
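To illustrate why the requirements matter, here is a sketch with a hypothetical input in which one key has two different non-empty entries and another key has only an empty one. It inlines the same logic as tst.awk and, for comparison, runs a reverse sort followed by a unique sort on the first two fields. The input and the variable names are made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical input: Location1/Person1 has an empty entry and two non-empty ones;
# Location3/Person2 has only an empty entry.
input=$(mktemp)
printf '%s\n' \
  'Location1 Person1 []' \
  'Location1 Person1 [place1]' \
  'Location1 Person1 [place2]' \
  'Location3 Person2 []' > "$input"

# Same logic as tst.awk, inlined: print every non-empty line, and fall back to
# the empty line only for keys that never had a non-empty one.
all_nonempty=$(awk '
  /[[][^]]+[]]/ { print; printed[$1,$2]; next }
  { saved[$1,$2] = $0 }
  END { for (key in saved) if (!(key in printed)) print saved[key] }
' "$input")

# For comparison: at most one line per key.
one_per_key=$(LC_ALL=C sort -r "$input" | LC_ALL=C sort -t' ' -k1,2 -u)

printf 'awk:\n%s\n\nsort:\n%s\n' "$all_nonempty" "$one_per_key"
rm -f "$input"
```

The awk version prints both non-empty lines for Location1/Person1, while the sort pipeline keeps just one line per key. Which of the two is right depends on what you want for such input.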

Upvotes: 2
