Reputation: 201
I was wondering how to filter the following lines in AWK:
DSL -
1. Digital Simulation Language. Extensions to FORTRAN to simulate analog
computer functions. "DSL/90 - A Digital Simulation Program for Continuous
System Modelling", Proc SJCC 28, AFIPS (Spring 1966). Version: DSL/90 for
the IBM 7090. Sammet 1969, p.632.
FLIP -
1. Early assembly language on G-15. Listed in CACM 2(5):16 (May 1959).
2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.
3. Formal LIst Processor. Early language for pattern-matching on LISP
structures. Similar to CONVERT. "FLIP, A Format List Processor", W.
Teitelman, Memo MAC-M-263, MIT 1966.
So I can get something like this:
DSL
FLIP
I am using the following statements in AWK:
BEGIN { RS = "\n\n\n" ; FS = " - " }
{ print $1 }
But what I get is just this:
DSL
Thanks in advance!
Upvotes: 0
Views: 805
Reputation: 7802
Assuming the format is constant (no spaces in first entry):
awk '{ if ($2 == "-") print $1 }' file
Edit: but if you had an entry like:
Objective C -
...
You would need something like:
awk '{ if ($NF == "-") { $NF = ""; print } }' file
awk is really good at parsing flat files that are in a predictable format.
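As a minimal sketch of the second variant, assuming header lines end in a lone - and indented body lines never do. The sample data (including the body text) is hypothetical, and the sub() call that trims the blank left behind by emptying $NF is an addition for illustration, not part of the one-liner above:

```shell
# Hypothetical sample file: one multi-word header and one single-word header.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Objective C -
    1. Some description.
FLIP -
    1. Another description.
EOF

result=$(awk '
    # Header lines are those whose last field is a lone "-".
    $NF == "-" {
        $NF = ""          # empty the "-" field; awk rebuilds the record with single spaces
        sub(/ +$/, "")    # trim the trailing blank that rebuilding leaves behind
        print
    }
' "$sample")
rm -f "$sample"

printf '%s\n' "$result"
```

Without the sub() call, each printed name would carry one trailing space.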
Upvotes: 2
Reputation: 203209
@JonathanLeffler gave you a good awk answer to your specific question. If you're going to be working with files in that format a lot, though, you may want to consider reformatting them so that records are separated by blank lines and each list item is on a single line, e.g.:
$ cat file
DSL -
1. Digital Simulation Language. Extensions to FORTRAN to simulate analog
computer functions. "DSL/90 - A Digital Simulation Program for Continuous
System Modelling", Proc SJCC 28, AFIPS (Spring 1966). Version: DSL/90 for
the IBM 7090. Sammet 1969, p.632.
FLIP -
1. Early assembly language on G-15. Listed in CACM 2(5):16 (May 1959).
2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.
3. Formal LIst Processor. Early language for pattern-matching on LISP
structures. Similar to CONVERT. "FLIP, A Format List Processor", W.
Teitelman, Memo MAC-M-263, MIT 1966.
$ awk '!/^[[:space:]]*$/{printf "%s%s", (NF==2 && /-[[:space:]]*$/ ? rs rs : (/^ +[[:digit:]]+\./ ? rs : "")), $0; rs="\n"} END{print ""}' file
DSL -
1. Digital Simulation Language. Extensions to FORTRAN to simulate analog computer functions. "DSL/90 - A Digital Simulation Program for Continuous System Modelling", Proc SJCC 28, AFIPS (Spring 1966). Version: DSL/90 for the IBM 7090. Sammet 1969, p.632.
FLIP -
1. Early assembly language on G-15. Listed in CACM 2(5):16 (May 1959).
2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.
3. Formal LIst Processor. Early language for pattern-matching on LISP structures. Similar to CONVERT. "FLIP, A Format List Processor", W. Teitelman, Memo MAC-M-263, MIT 1966.
That way you can process the output easily to print or do whatever else you want, e.g.
1) to print every header line plus first bullet item:
$ awk '...' file | awk 'BEGIN{RS=""; ORS="\n\n"; FS=OFS="\n"} {print $1,$2}'
DSL -
1. Digital Simulation Language. Extensions to FORTRAN to simulate analog computer functions. "DSL/90 - A Digital Simulation Program for Continuous System Modelling", Proc SJCC 28, AFIPS (Spring 1966). Version: DSL/90 for the IBM 7090. Sammet 1969, p.632.
FLIP -
1. Early assembly language on G-15. Listed in CACM 2(5):16 (May 1959).
2) to print the header line plus the second bullet item of just the "FLIP" record:
$ awk '...' file | awk 'BEGIN{RS=""; ORS="\n\n"; FS=OFS="\n"} /^FLIP -/{print $1,$3}'
FLIP -
2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.
3) to print the header line plus a count of the bullet items for that header:
$ awk '...' file | awk 'BEGIN{RS=""; FS=OFS="\n"} {print $1 NF-1}'
DSL - 1
FLIP - 3
etc., etc.
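To illustrate why the blank-line-separated layout is convenient, here is a small sketch of awk's paragraph mode on hypothetical, already-reformatted data (the entry text is made up). Setting RS to the empty string makes each blank-line-separated block one record, and FS="\n" makes each line of the block one field:

```shell
# Hypothetical, already-reformatted sample: records separated by a blank line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
DSL -
    1. First DSL entry.

FLIP -
    1. First FLIP entry.
    2. Second FLIP entry.
    3. Third FLIP entry.
EOF

result=$(awk '
    BEGIN { RS = ""; FS = "\n" }     # one record per block, one field per line
    {
        name = $1
        sub(/ *- *$/, "", name)      # strip the trailing " -" from the header line
        print name ": " (NF - 1) " item(s)"
    }
' "$sample")
rm -f "$sample"

printf '%s\n' "$result"
```

Because every list item is a single field, $2, $3 and so on address individual items directly, and NF-1 is the item count.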
Upvotes: 1
Reputation: 1
If all the lines you want to skip start with a space, this will work:
awk -F"-" '{if (substr($1,1,1)!=" ")print $1}'
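Two things to watch for with this one-liner: the first field of a header such as "DSL -" keeps the blank that precedes the dash, and a completely empty line also passes the substr test and is printed as an empty line. A hedged variant, assuming headers start in column one and body lines are indented with spaces; the NF guard, the sub() trim, and the sample data are additions for illustration:

```shell
# Hypothetical sample with an empty line between the two records.
sample=$(mktemp)
cat > "$sample" <<'EOF'
DSL -
    1. First DSL entry.

FLIP -
    1. First FLIP entry.
EOF

result=$(awk -F'-' '
    # Skip empty lines (NF is 0) and body lines that start with a space.
    NF && substr($0, 1, 1) != " " {
        name = $1
        sub(/ +$/, "", name)   # remove the blank that preceded the "-"
        print name
    }
' "$sample")
rm -f "$sample"

printf '%s\n' "$result"
```

A name that itself contains a hyphen would still be cut short, since - is the field separator here.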
Upvotes: 0
Reputation: 753525
It appears that you're looking for a line with only two words on it, where the second word is -. If so, then you could write:
awk 'NF == 2 && $2 == "-" { print $1 }'
You could further qualify it to insist that $1 starts at the beginning of the line (no leading blanks):
awk '$0 !~ /^ / && NF == 2 && $2 == "-" { print $1 }'
Both of these produce lines containing just DSL and FLIP on the given data.
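For a quick check, the stricter command can be run against an excerpt of the question's data. This sketch assumes the body lines are indented in the real file; the indentation used here is a guess, since it is not visible in the question:

```shell
# Excerpt of the data from the question; the four-space indentation is assumed.
sample=$(mktemp)
cat > "$sample" <<'EOF'
DSL -
    1. Digital Simulation Language. Extensions to FORTRAN to simulate analog
    computer functions.
FLIP -
    1. Early assembly language on G-15. Listed in CACM 2(5):16 (May 1959).
EOF

# Print the first word of unindented two-word lines whose second word is "-".
result=$(awk '$0 !~ /^ / && NF == 2 && $2 == "-" { print $1 }' "$sample")
rm -f "$sample"

printf '%s\n' "$result"
```

The NF == 2 test is what rejects body lines such as the one containing "DSL/90 - A Digital Simulation Program", where - appears as a later field.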
Upvotes: 1