Reputation: 637
I have a file that looks something like this:
# a mess of text
Hello. Student Joe Deere has
id number 1. Over.
# some more messy text
Hello. Student Steve Michael Smith has
id number 2. Over.
# etc.
I want to record the pairs `(Joe Deere, 1)`, `(Steve Michael Smith, 2)`, etc. into a list (or two separate lists with the same order), since I will need to loop over those pairs and do something with the names and ids.
(Names and ids are on distinct lines, but appear in the order `name1`, `id1`, `name2`, `id2`, etc. in the text.) I am able to extract the lines of interest with
VAR=$(awk '/Student/,/Over/' filename.txt)
I think I know how to extract the names and ids with `grep`, but it will give me the result as one big block like
`Joe Deere 1 Steve Michael Smith 2 ...`
(and maybe even with a separator between names and ids). I am not sure at this point how to go forward with this, and in any case it doesn't feel like the right approach.
I am sure that there is a one-liner in `awk` that will do what I need. The possibilities are infinite and the documentation monumental.
Any suggestion?
Upvotes: 1
Views: 105
Reputation: 246942
grep -oP 'Hello. Student \K.+(?= has)|id number \K\d+' file | paste - -
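To loop over the pairs, the tab-separated output of `paste - -` can be read line by line. The following is only a sketch, assuming GNU grep (`-P` and `\K` are not part of POSIX grep); the function names `list_pairs` and `process_pairs` are made up for illustration, and the dot in `Hello.` is escaped here so that it matches a literal dot.

```shell
# list_pairs FILE: print one "name<TAB>id" line per student.
# Assumes GNU grep built with PCRE support (needed for -P and \K).
list_pairs() {
    grep -oP 'Hello\. Student \K.+(?= has)|id number \K\d+' "$1" | paste - -
}

# A literal tab, used as the only field separator so that
# multi-word names such as "Steve Michael Smith" stay in one piece.
tab=$(printf '\t')

# process_pairs FILE: loop over the pairs and do something with each one.
process_pairs() {
    list_pairs "$1" | while IFS="$tab" read -r name id; do
        printf 'name=%s id=%s\n' "$name" "$id"   # replace with the real work
    done
}
```

Calling `process_pairs filename.txt` on the sample from the question should print one `name=... id=...` line per student. Because the loop sits on the right-hand side of a pipeline, variables assigned inside it are generally not visible after the loop ends.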
Upvotes: 0
Reputation: 133545
Could you please try the following too.
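awk '
/id number/{
  sub(/\./,"",$3)      # drop the trailing dot from the id, e.g. "1." -> "1"
  print val", "$3      # val holds the name saved from an earlier line
  val=""
  next
}
{
  gsub(/Hello\. Student | has.*/,"")   # strip everything around the name
  val=$0
}
' Input_file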
Upvotes: 1
Reputation: 203712
$ cat tst.awk
/^id number/ {
    gsub(/^([^ ]+ ){2}| [^ ]+$/,"",prev)
    printf "(%s, %d)\n", prev, $3
}
{ prev = $0 }
$ awk -f tst.awk file
(Joe Deere, 1)
(Steve Michael Smith, 2)
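If the looping itself can stay inside awk, the same previous-line idea can fill two parallel arrays that are walked in an `END` block. This is a rough sketch rather than a drop-in solution: the function name `collect_pairs` and the array names `names` and `ids` are illustrative, and it assumes each name line looks exactly like `Hello. Student <name> has` with no trailing whitespace.

```shell
# collect_pairs FILE: gather names and ids into two parallel awk arrays,
# then loop over them in the END block.
collect_pairs() {
    awk '
        /^id number/ {
            name = prev
            sub(/^Hello\. Student /, "", name)   # strip the leading words
            sub(/ has$/, "", name)               # strip the trailing " has"
            names[++n] = name
            ids[n] = $3 + 0                      # "1." is converted to 1
        }
        { prev = $0 }                            # remember the previous line
        END {
            for (i = 1; i <= n; i++)
                printf "%d: %s has id %d\n", i, names[i], ids[i]
        }
    ' "$1"
}
```

The `printf` in `END` is a placeholder for whatever needs to be done with each name and id; the index `i` keeps both arrays in the same order as the input.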
Upvotes: 2