compguy24

Reputation: 957

Bash csv delete line if previous line matches

I have a csv file like so:

"a", "b", "c" 
"1", "b", "4"
"3", "g", "f"

Wherever "b" appears as the second column value in two consecutive lines, I want to delete the second of those lines, resulting in:

"a", "b", "c" 
"3", "g", "f"

This at least gets me started with the parsing:

awk -F "," '$1' file.csv

Upvotes: 2

Views: 1337

Answers (2)

John1024

Reputation: 113814

This deletes a line whenever its second column repeats the second column of the previous line:

$ awk -F, '$2==last{next} {last=$2} 1' file.csv
"a", "b", "c" 
"3", "g", "f"

How it works

  • $2==last{next}

    If the second column, $2, is equal to the previous line's second column, last, then skip this line and go on to the next one.

  • last=$2

    Update the value of last.

  • 1

    This is cryptic shorthand for "print the line": a pattern that is always true, with no action, so awk runs its default action, which is print.
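For anyone who finds the one-liner dense, the same three rules can be written one per line with comments. This is only a sketch: the rows are the ones from the question, and the variable names sample and result are placeholders chosen for illustration.

```shell
# Rows copied from the question, fed to awk on standard input.
sample='"a", "b", "c"
"1", "b", "4"
"3", "g", "f"'

result=$(printf '%s\n' "$sample" | awk -F, '
    $2 == last { next }   # second field repeats the previous line: skip this line
    { last = $2 }         # remember the second field for the next comparison
    { print }             # explicit form of the bare 1 pattern
')

printf '%s\n' "$result"
# "a", "b", "c"
# "3", "g", "f"
```

Because a skipped line always has the same second field as last, it does not matter that last is not updated on skipped lines.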

Variation 1

If we only want to delete lines that have the second column equal to "b" when that is a repeat of the previous line, then:

awk -F, '$2==last && $2==" \"b\"" {next} {last=$2} 1' file.csv
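To see how Variation 1 differs from the first command, it may help to run both on input that also contains a repeated value other than "b". In the sketch below the fourth row is an assumption added for illustration, as are the variable names. Note the leading space in " \"b\"": with -F, the space after each comma stays in the field, so $2 is a space followed by "b".

```shell
# Question data plus one extra row (an assumption) that repeats "g".
sample='"a", "b", "c"
"1", "b", "4"
"3", "g", "f"
"5", "g", "h"'

# First command: drops every row whose second field repeats the previous one.
base=$(printf '%s\n' "$sample" | awk -F, '$2==last{next} {last=$2} 1')

# Variation 1: drops a repeat only when the repeated field is "b".
var1=$(printf '%s\n' "$sample" | awk -F, '$2==last && $2==" \"b\"" {next} {last=$2} 1')

printf 'first command:\n%s\n\nvariation 1:\n%s\n' "$base" "$var1"
```

The first command drops both the "1" row and the "5" row, while Variation 1 drops only the "1" row and keeps the repeated "g".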

Variation 2

Suppose that we only want to remove every second line in a run of consecutive "b" lines:

awk -F, '$2==last && $2==" \"b\"" {last="";next} {last=$2} 1' file.csv
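The difference from Variation 1 only shows up when "b" appears three or more times in a row. A small sketch, assuming an extra "2" row that is not in the question's data (the variable names are placeholders as well):

```shell
# Three consecutive "b" rows; the "2" row is an assumption added for illustration.
sample='"a", "b", "c"
"1", "b", "4"
"2", "b", "5"
"3", "g", "f"'

# Variation 1: last keeps its value on a skip, so the whole run collapses to its first row.
var1=$(printf '%s\n' "$sample" | awk -F, '$2==last && $2==" \"b\"" {next} {last=$2} 1')

# Variation 2: last is cleared on a skip, so the row after a dropped one is kept again.
var2=$(printf '%s\n' "$sample" | awk -F, '$2==last && $2==" \"b\"" {last="";next} {last=$2} 1')

printf 'variation 1:\n%s\n\nvariation 2:\n%s\n' "$var1" "$var2"
```

Variation 1 keeps only the "a" and "3" rows here, while Variation 2 keeps the "a", "2" and "3" rows.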

Variation 3

Suppose we want to skip any line with second column of "b" if it is followed by a line with the same second column. Then:

awk -F, '$2==last && $2==" \"b\"" {line=$0;next} NR>1{print line} {last=$2;line=$0} END{print line}' file.csv
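Variation 3 works by holding each line back in the variable line and printing it only once the following line has been seen. A sketch of it on the question's data, with placeholder variable names:

```shell
# Rows copied from the question.
sample='"a", "b", "c"
"1", "b", "4"
"3", "g", "f"'

result=$(printf '%s\n' "$sample" | awk -F, '
    # A repeated "b": overwrite the held line, which drops the earlier one.
    $2==last && $2==" \"b\"" { line=$0; next }
    # Otherwise the held line is safe to print (nothing is held before row 1).
    NR>1 { print line }
    # Hold the current line until the next one has been checked.
    { last=$2; line=$0 }
    # Flush the final held line.
    END { print line }
')

printf '%s\n' "$result"
# "1", "b", "4"
# "3", "g", "f"
```

Here the first "b" line is dropped and the second is kept, the mirror image of the first command. One side effect of the END rule is that completely empty input still produces one blank line.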

Upvotes: 8

ShellFish

Reputation: 4551

Try this:

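awk -F', ' 'BEGIN{OFS=FS} {
    if ($2 == "\"b\"") {       # second column is "b"
        if (!var) {            # first "b" of a pair: keep it
            print
            var=1
        } else {               # second "b" in a row: drop it and reset
            var=""
        }
    } else {                   # any other row: keep it and reset
        print
        var=""
    }
}' file.csv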
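This answer compares $2 against "\"b\"" with no leading space, whereas the other answer compares against " \"b\"". The reason is the field separator: -F', ' consumes the space after the comma, while -F, leaves it in the field. A quick sketch to check this on one row from the question (the variable names are placeholders):

```shell
row='"1", "b", "4"'

# Brackets make any leading space in the second field visible.
with_comma=$(printf '%s\n' "$row" | awk -F, '{ print "[" $2 "]" }')
with_comma_space=$(printf '%s\n' "$row" | awk -F', ' '{ print "[" $2 "]" }')

printf '%s\n%s\n' "$with_comma" "$with_comma_space"
# [ "b"]
# ["b"]
```

Whichever separator is used, the string in the comparison has to match it, otherwise the "b" test never succeeds.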

Upvotes: 0
