Bash csv delete line if previous line matches

Question

I have a csv file like so:

"a", "b", "c" 
"1", "b", "4"
"3", "g", "f"

Wherever "b" appears as the second column value in two consecutive lines, I want to delete the second of those lines, resulting in:

"a", "b", "c" 
"3", "g", "f"

This at least gets me started with the parsing:

awk -F "," '$1' file.csv

John1024 · Accepted Answer

This deletes a line whenever its second column repeats the second column of the previous line:

$ awk -F, '$2==last{next} {last=$2} 1' file.csv
"a", "b", "c" 
"3", "g", "f"

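$2==last{next}

If the second column, $2, is equal to the previous line's second column, stored in last, then skip this line and go on to the next one.

last=$2

Update the value of last.

1

This is cryptic shorthand for "print the line": a pattern that is always true and has no action, so awk applies its default action, which is to print.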
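
For readers who find the one-liner terse, here is a sketch of the same program written out with one rule per line and run on the question's sample rows. Piping the rows in with printf instead of reading file.csv, and the variable name result, are choices made for this illustration only.

```shell
# Sample rows from the question, piped in on stdin instead of read from file.csv.
result=$(printf '%s\n' '"a", "b", "c"' '"1", "b", "4"' '"3", "g", "f"' |
awk -F, '
    $2 == last { next }   # second field repeats the previous line: skip it
    { last = $2 }         # remember the second field of this line
    { print }             # same effect as the bare pattern 1
')
printf '%s\n' "$result"
```

This should print the first and third rows, since the second row repeats the "b" of the first.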

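If we only want to delete lines that have the second column equal to "b" when that is a repeat of the previous line, then (note the leading space in the comparison string: with -F, each field keeps the space that follows the comma):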

awk -F, '$2==last && $2==" \"b\"" {next} {last=$2} 1' file.csv
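
To see how this differs from the first command, the sketch below adds a fourth row that repeats "g". That extra row and the variable name only_b are made up for the illustration; they are not in the question.

```shell
# The last row repeats "g", which this variant should leave alone.
only_b=$(printf '%s\n' '"a", "b", "c"' '"1", "b", "4"' '"3", "g", "f"' '"5", "g", "h"' |
awk -F, '$2==last && $2==" \"b\"" {next} {last=$2} 1')
printf '%s\n' "$only_b"
```

Only the repeated "b" row is dropped; both "g" rows survive.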

Suppose that we only want to remove every second line in a run of consecutive "b" lines:

awk -F, '$2==last && $2==" \"b\"" {last="";next} {last=$2} 1' file.csv
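
A run of three "b" rows shows the effect of resetting last. The extra "b" row and the variable name every_second are assumptions for this sketch, not data from the question.

```shell
# Three consecutive "b" rows: only the middle one should be removed.
every_second=$(printf '%s\n' '"a", "b", "c"' '"1", "b", "4"' '"2", "b", "5"' '"3", "g", "f"' |
awk -F, '$2==last && $2==" \"b\"" {last="";next} {last=$2} 1')
printf '%s\n' "$every_second"
```

Because last is cleared when a line is skipped, the third "b" row no longer matches and is printed. The previous command would have dropped it as well.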

Suppose instead that we want to drop a line whose second column is "b" when the next line has the same second column, so that the later line is the one kept. Then:

awk -F, '$2==last && $2==" \"b\"" {line=$0;next} NR>1{print line} {last=$2;line=$0} END{print line}' file.csv
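
Here is that look-ahead variant spread over several lines and run on the question's sample rows. The if (NR) guard in END is an addition for this sketch, not part of the command above: it avoids printing an empty line when the input is empty. The variable name keep_later is likewise only illustrative.

```shell
# Each line is held back by one step so that it can be replaced by a repeat.
keep_later=$(printf '%s\n' '"a", "b", "c"' '"1", "b", "4"' '"3", "g", "f"' |
awk -F, '
    $2 == last && $2 == " \"b\"" { line = $0; next }  # repeated b: replace the held line
    NR > 1 { print line }                               # emit the line held from before
    { last = $2; line = $0 }                            # hold the current line
    END { if (NR) print line }                          # flush; guard added for empty input
')
printf '%s\n' "$keep_later"
```

On the sample data this keeps the second and third rows, dropping the first "b" row in favor of the one after it.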
