Reputation: 477
In the data below I want to correctly distinguish the indented lines. Each line consists of 2 fields that are tab separated so each indented line starts with an invisible tab.
I would like to know why the following script, which tests for non-whitespace in the first field, prints only the second and second-to-last lines of the data pasted below instead of all lines that are not indented. Suggestions for a solution are welcome, but I would mainly like to know what is wrong with what I wrote.
Here is the script
BEGIN {FS="\t"; OFS="\t"}
/\s*(directors)\s*$/ {type=$1; next}
$1~/\S/ {print}
Data.
directors
Özkul, Ahmet Salih Ii 2013
'Abd Al-Hamid, Ja'far A Two Hour Delay 2001
Badgeless sur la Croisette 2012
Just Outside the Frame: The Profilmic Event and Beyond 2008
Mesocafe 2009
Mesocafé 2011
'D.J'Arlia, Domenic She'll Never Know 2012
Cantarella 2011
Makhno Beer 2010
'Kid Niagara' Kallet, Harry Drug Demon Romance 2012
'Kusare, Mak (I) Baby Beautiful 2013/II
Comrade 2008
'Kusare, Mak (II) A Play Called a Temple Made of Clay 2014
'Legend' Spivey, Larry The Crime City Diaries: Entry 1 - Crooked 2012
'Noble Julz'Hamilton, Ulia Church Hurt 2015
Upvotes: 1
Views: 1548
Reputation: 784998
Use POSIX regex character classes for space rather than PCRE \s or \S:
awk 'BEGIN {FS=OFS="\t"}
/[[:space:]]*directors[[:space:]]*$/ {type=$1; next}
$1~/[^[:space:]]/' file
Note the use of [[:space:]] instead of \s and [^[:space:]] instead of \S.
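As for why the original printed only two lines: \s and \S are GNU awk extensions, and an awk that does not support them will most likely read \S as a literal capital S. That would be consistent with the output, since the second and second-to-last lines are the only ones whose first field contains an "S" (Salih, Spivey). Below is a minimal sketch of the POSIX version run against made-up sample data. The names and titles are hypothetical, and the layout assumes two tab-separated fields with continuation lines starting with a tab.

```shell
# Hypothetical sample data: header line, then name<TAB>title lines;
# continuation lines begin with a tab, so their first field is empty.
out=$(printf 'directors\nSmith, Jo\tFirst Film 2001\n\tSecond Film 2002\nDoe, Al\tOther Film 2010\n' |
  awk 'BEGIN { FS = OFS = "\t" }
       # Header line: remember it and skip printing.
       /[[:space:]]*directors[[:space:]]*$/ { type = $1; next }
       # Print only lines whose first field has a non-whitespace character.
       $1 ~ /[^[:space:]]/
  ')
printf '%s\n' "$out"
```

Only the two lines that start with a name should be printed. The indented "Second Film" line is skipped because its first field is empty.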
Upvotes: 4