Reputation: 957
I have a csv file like so:
"a", "b", "c"
"1", "b", "4"
"3", "g", "f"
Wherever "b" appears as the second column value in two consecutive lines, I want to delete the second line, resulting in:
"a", "b", "c"
"3", "g", "f"
This at least gets me started with the parsing:
awk -F "," '$1' file.csv
Upvotes: 2
Views: 1337
Reputation: 113814
This deletes a line any time its second column repeats the second column of the previous line:
$ awk -F, '$2==last{next} {last=$2} 1' file.csv
"a", "b", "c"
"3", "g", "f"
$2==last{next}
    If the second column, $2, is equal to the previous second column, last, then skip this line and go to the next line.
last=$2
    Update the value of last.
1
    This is cryptic shorthand for "print the line": a pattern that is always true, with no action, triggers awk's default action, which is print.
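To make the three parts easier to follow, here is a sketch of the same one-liner written with one rule per line and run against the sample data from the question. The variable names sample and dedup_out are only illustrative, and the input is fed through a pipe instead of file.csv.

```shell
# Sample data from the question, held in a shell variable (illustrative).
sample='"a", "b", "c"
"1", "b", "4"
"3", "g", "f"'

# Same logic as the one-liner, one rule per line.
dedup_out=$(printf '%s\n' "$sample" | awk -F, '
    $2 == last { next }   # same second column as the last kept line: skip it
    { last = $2 }         # otherwise remember this second column
    { print }             # and print the line (the long form of "1")
')

printf '%s\n' "$dedup_out"
```

This prints the "a" row and the "3" row, the same result as shown above.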
If we only want to delete lines that have the second column equal to "b" when that is a repeat of the previous line, then:
awk -F, '$2==last && $2==" \"b\"" {next} {last=$2} 1' file.csv
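The comparison string in that command starts with a space because, with -F, the space after each comma stays attached to the field that follows it. A small sketch to check this, assuming every comma in the file is followed by exactly one space (field and field2 are illustrative names):

```shell
# With -F, the space after each comma stays attached to the following field.
field=$(printf '%s\n' '"1", "b", "4"' | awk -F, '{ print "[" $2 "]" }')
printf '%s\n' "$field"    # prints: [ "b"]

# With a comma-plus-space separator, the field has no leading space.
# (Assumes every comma in the input is followed by exactly one space.)
field2=$(printf '%s\n' '"1", "b", "4"' | awk -F', ' '{ print "[" $2 "]" }')
printf '%s\n' "$field2"   # prints: ["b"]
```

Either form works as long as the separator and the comparison string agree with each other.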
Suppose that we only want to remove every second consecutive occurrence of "b":
awk -F, '$2==last && $2==" \"b\"" {last="";next} {last=$2} 1' file.csv
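The difference between the last two commands only shows up when "b" appears on three or more consecutive lines. The sketch below runs both on a made-up four-line input (the "2" row is not in the question's data; the variable names are illustrative):

```shell
# Hypothetical input with three consecutive "b" rows.
run='"a", "b", "c"
"1", "b", "4"
"2", "b", "5"
"3", "g", "f"'

# Drop every repeated "b" row after the first one.
all_repeats=$(printf '%s\n' "$run" |
    awk -F, '$2==last && $2==" \"b\"" {next} {last=$2} 1')

# Drop only every second "b" row; resetting last lets the third one through.
every_second=$(printf '%s\n' "$run" |
    awk -F, '$2==last && $2==" \"b\"" {last="";next} {last=$2} 1')

printf 'drop all repeats:\n%s\n\ndrop every second:\n%s\n' \
    "$all_repeats" "$every_second"
```

The first command keeps only the "a" and "3" rows, while the second also keeps the "2" row.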
Suppose we want to skip any line with second column of "b" if it is followed by a line with the same second column. Then:
awk -F, '$2==last && $2==" \"b\"" {line=$0;next} NR>1{print line} {last=$2;line=$0} END{print line}' file.csv
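This version works by holding each line back by one step: the previous line is only printed once the current line turns out not to be a repeated "b". A sketch of it on the sample data from the question (sample and keep_last are illustrative names):

```shell
# Sample data from the question, held in a shell variable (illustrative).
sample='"a", "b", "c"
"1", "b", "4"
"3", "g", "f"'

keep_last=$(printf '%s\n' "$sample" | awk -F, '
    # repeated "b": overwrite the buffered line instead of printing it
    $2==last && $2==" \"b\"" { line=$0; next }
    # any other line: flush the buffered previous line first
    NR>1 { print line }
    # then buffer the current line
    { last=$2; line=$0 }
    # the final buffered line is still pending at end of input
    END { print line }
')

printf '%s\n' "$keep_last"
```

Here the "a" row is dropped and the "1" row is kept, the opposite of the earlier commands.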
Upvotes: 8
Reputation: 4551
Try this:
awk -F', ' 'BEGIN{OFS=FS} {
    if ($2 == "\"b\"") {
        if (!var) {
            # first "b" of a pair: print it and remember it
            print
            var=1
        } else {
            # second consecutive "b": drop it and reset
            var=""
        }
    } else {
        print
        var=""
    }
}' file.csv
Upvotes: 0