ibav

Reputation: 171

Bash script - remove lines by looking ahead

I have a csv file where some rows have an empty first field, and some rows have content in the first field. The rows with content in the first field are header rows.

I would like to remove every unnecessary header row. The best way I can see of doing this is by deleting every row for which:

  1. First field is not empty
  2. First field in the following row is not empty

I do not necessarily need to keep the data in the same file, so I can see this being possible using grep, awk, or sed, but none of my attempts have come close to working.

Example input:

header1,value1,etc
,value2,etc
header2,value3,etc
header3,value4,etc
,value5,etc

Desired output:

header1,value1,etc
,value2,etc
header3,value4,etc
,value5,etc

Since the header2 line is not followed by a line with an empty field 1, it is an unnecessary header row.

Upvotes: 0

Views: 36

Answers (2)

glenn jackman

Reputation: 246807

These kinds of tasks are often conceptually easier if you reverse the file and check whether the previous line was a header:

tac file |
  awk -F, '$1 && have_header {next} {print; have_header = length($1)}' |
  tac
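
As a quick check, here is a sketch of the same pipeline wrapped in a function and run on the sample input from the question. The function name `keep_last_headers` is made up for illustration, and the explicit test `$1 != ""` is swapped in for the bare `$1` so that a first field of `0` still counts as a header. It assumes `tac` (GNU coreutils) is available.

```shell
# Hypothetical wrapper (the name is an assumption, not part of the answer).
# Reads CSV on stdin and drops every header that is directly followed by
# another header in the original order.
keep_last_headers() {
  tac |
    awk -F, '
      # Reversed order: a header seen right after another header is the
      # earlier, unneeded one, so skip it.
      $1 != "" && have_header { next }
      # Otherwise print the line and remember whether it was a header.
      { print; have_header = ($1 != "") }
    ' |
    tac
}

printf '%s\n' 'header1,value1,etc' ',value2,etc' 'header2,value3,etc' \
              'header3,value4,etc' ',value5,etc' | keep_last_headers
```

On the sample this prints the four lines of the desired output. Note that a header on the very last line of the file is kept, because in reversed order it is the first line seen.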

Upvotes: 0

rici

Reputation: 241721

awk -F, '$1{h=$0;next}h{print h;h=""}1' file

-F,: Use comma as a field separator

$1{h=$0;next}: If the first field has data (other than 0, which awk treats as false), save the line and go on to the next line.

h{print h;h=""}: If there is a saved header line, print it and forget it. (This can only execute if there is nothing in $1, because of the next above.)

1: print the current line.
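
To see the steps above in action, here is a sketch of the same logic spelled out and run on the sample input from the question. The function name `drop_orphan_headers` and the flag variable `have` are made up for illustration, and the explicit test `$1 != ""` is swapped in for the bare `$1` so that a first field of `0` is still treated as a header.

```shell
# Hypothetical helper (the name is an assumption, not part of the answer).
# Reads CSV on stdin and keeps only the last header before each data row.
drop_orphan_headers() {
  awk -F, '
    $1 != "" { h = $0; have = 1; next }   # header: remember it, print nothing yet
    have     { print h; have = 0 }        # data row: flush the pending header once
    { print }                             # then print the data row itself
  '
}

printf '%s\n' 'header1,value1,etc' ',value2,etc' 'header2,value3,etc' \
              'header3,value4,etc' ',value5,etc' | drop_orphan_headers
```

On the sample this prints the four lines of the desired output. Because a header is only printed once a data row follows it, a header on the very last line of the file is dropped.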

Upvotes: 4