Reputation: 1422
I have a restricted bash environment (it has grep and sed, among other tools, but not awk) that I'm trying to use to automate some routine work. I'm currently using "grep keyword filename -B3" and would like to figure out how to do this more efficiently with the very limited tools I have.
How do I use bash to grep for the symbol "111AA2026" and get the "record" name, which sits 3 lines above the matching line, together with the matched line itself, for an XML file like this:
<record name="111111H2" />
<items>
<field name="Electronic Identifier" value="1"/>
<field name="Symbol" value="111AA2026"/>
<field name="Full Symbol" value="111AA202622MARFUT"/>
<field name="System Identifier" value="1"/>
<field name="System Identifier Description" value="Description"/>
</items>
<record name="111111N1" />
<items>
<field name="Electronic Identifier" value="2"/>
<field name="Symbol" value="111AA2026"/>
<field name="Full Symbol" value="111AA202621JULFUT"/>
<field name="System Identifier" value="2"/>
<field name="System Identifier Description" value="Description"/>
</items>
<record name="111111Q1" />
<items>
<field name="Electronic Identifier" value="3"/>
<field name="Symbol" value="111AA2026"/>
<field name="Full Symbol" value="111AA202621AUGFUT"/>
<field name="System Identifier" value="3"/>
<field name="System Identifier Description" value="Description"/>
</items>
<record name="111111U1" />
<items>
<field name="Electronic Identifier" value="4"/>
<field name="Symbol" value="111AA2026"/>
<field name="Full Symbol" value="111AA202621SEPFUT"/>
<field name="System Identifier" value="4"/>
<field name="System Identifier Description" value="Description"/>
</items>
<record name="111111Z1" />
<items>
<field name="Electronic Identifier" value="5"/>
<field name="Symbol" value="111AA2026"/>
<field name="Full Symbol" value="111AA202621DECFUT"/>
<field name="System Identifier" value="5"/>
<field name="System Identifier Description" value="Description"/>
</items>
Note that the actual file contains multiple different "Symbol" values.
Sample output:
<record name="111111H2" />
<field name="Symbol" value="111AA2026"/>
--
<record name="111111N1" />
<field name="Symbol" value="111AA2026"/>
--
<record name="111111Q1" />
<field name="Symbol" value="111AA2026"/>
--
<record name="111111U1" />
<field name="Symbol" value="111AA2026"/>
--
<record name="111111Z1" />
<field name="Symbol" value="111AA2026"/>
The key challenge is getting grep to return the matching line plus the line 3 lines above it. I'm less concerned with how to extract attributes from an XML file.
Upvotes: 1
Views: 116
Reputation: 58430
This might work for you (GNU sed):
sed -nE '/record/{:a;N;/Symbol/!ba;/111AA2026/s/(\n).*(\1.*)/\2\1--/p}' file
Gather up the lines between record and Symbol, and if those lines contain the literal 111AA2026, print the first and last lines of the collection plus a delimiter --.
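To make the steps described above easier to follow, here is the same command spread over several lines and run on a single record. This is only a sketch: the sample text is copied from the question, the variable names are made up for illustration, and it assumes GNU sed (for -E with a back-reference in the pattern).

```shell
# Hypothetical sample: one record copied from the question's XML.
sample='<record name="111111H2" />
<items>
<field name="Electronic Identifier" value="1"/>
<field name="Symbol" value="111AA2026"/>
<field name="Full Symbol" value="111AA202622MARFUT"/>
</items>'

collapsed=$(printf '%s\n' "$sample" | sed -nE '
# Start collecting when a record line is seen.
/record/{
  :a
  # Append the next input line to the pattern space.
  N
  # Keep appending until a line containing Symbol has been added.
  /Symbol/!ba
  # If the collected lines contain the symbol, keep only the first and
  # last line, append the -- delimiter, and print.
  /111AA2026/s/(\n).*(\1.*)/\2\1--/p
}')

printf '%s\n' "$collapsed"
```

The substitution works because the first group captures the first newline and the greedy .* pushes the second group to start at the last newline, so everything between the record line and the Symbol line is dropped.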
Alternative using grep only:
grep -B3 '111AA2026' file | grep 'record\|"Symbol"\|--'
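One thing worth knowing about the bare pattern: it also matches the "Full Symbol" lines, because their values begin with the same string. The second grep filters those out again, but the first grep can be made stricter if desired. A small sketch, using two lines from the question as a hypothetical sample (the variable names are made up):

```shell
# Hypothetical two-line sample taken from one record in the question.
sample='<field name="Symbol" value="111AA2026"/>
<field name="Full Symbol" value="111AA202622MARFUT"/>'

# Bare pattern: counts both the "Symbol" line and the "Full Symbol" line.
loose=$(printf '%s\n' "$sample" | grep -c '111AA2026')

# Pattern tied to the attribute name and the closing quote:
# counts only the "Symbol" line.
strict=$(printf '%s\n' "$sample" | grep -c '"Symbol" value="111AA2026"')

echo "loose=$loose strict=$strict"
```

With the stricter pattern, grep -B3 only emits context for the Symbol lines, so the record line is always the first line of each group.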
Upvotes: 1
Reputation: 179129
Not sure if this is what you're looking for, but it outputs something very similar to what you gave in the sample output.
grep -B3 '"111AA2026"' temp.xml \
| sed -n '/<record/p;/"Symbol/p'
# The -n flag disables printing of all lines, which is what sed
# does by default, so we need to handle printing ourselves using
# the "p" command.
sed -n '
# [p]rint all lines that contain: <record
/<record/ p
# [p]rint all lines that contain: "Symbol
/"Symbol/ p
'
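If the record line is not always exactly 3 lines above the Symbol line, a sed-only variant that remembers the most recent record line in the hold space may be more robust. This is a different technique from the grep -B3 pipeline above, offered as a sketch: the sample data and the names tmp and result are assumptions for illustration, and the output has no -- separators.

```shell
# Hypothetical sample file; the second record has a different symbol and
# the third has an extra line between <record> and the Symbol field.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<record name="111111H2" />
<items>
<field name="Electronic Identifier" value="1"/>
<field name="Symbol" value="111AA2026"/>
<field name="Full Symbol" value="111AA202622MARFUT"/>
</items>
<record name="222222X1" />
<items>
<field name="Symbol" value="222BB2026"/>
</items>
<record name="111111N1" />
<items>
<field name="Extra" value="x"/>
<field name="Electronic Identifier" value="2"/>
<field name="Symbol" value="111AA2026"/>
</items>
EOF

# h      : copy each <record> line into the hold space.
# x;p    : on a matching Symbol line, swap in the saved record line and print it.
# x;p    : swap the Symbol line back and print it as well.
result=$(sed -n '/<record /h; /"Symbol" value="111AA2026"/{x;p;x;p;}' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Because the record line is tracked explicitly, the number of lines between it and the Symbol field no longer matters.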
Upvotes: 2