Reputation: 8170
I have a very large input set that looks something like this:
Label: foo, Other text: text description...
<insert label> Item: item description...
<insert label> Item: item description...
Label: bar, Other text:...
<insert label> Item:...
Label: baz, Other text:...
<insert label> Item:...
<insert label> Item:...
<insert label> Item:...
...
I'd like to transform this to pull out the label name (e.g. "foo") and replace the tag "<insert label>" on the following lines with the actual label:
Label: foo, Other text: text description...
foo Item: item description...
foo Item: item description...
Label: bar, Other text:...
bar Item:...
Label: baz, Other text:...
baz Item:...
baz Item:...
baz Item:...
...
Can this be done with sed, awk, or another Unix tool? If so, how might I do it?
Upvotes: 3
Views: 566
Reputation: 36292
One solution using sed:
Content of script.sed:
## When the line begins with the 'Label' string.
/^Label/ {
## Save the whole line to 'hold space'.
h
## Keep only the label name: drop the first word and the space after it,
## then everything from the comma onwards.
s/^[^ ]* \([^,]*\).*$/\1/
## Exchange: the label name goes to 'hold space' and the original
## line comes back to 'pattern space'.
x
## Branch to the end of the script: print and read the next line.
b
}
###--- Commented this wrong behaviour ---###
#--- G
#--- s/<[^>]*>\(.*\)\n\(.*\)$/\2\1/
###--- And fixed with this ---###
## When the line contains '<insert label>'
/<insert label>/ {
## Append the label name to the line.
G
## And substitute the '<insert label>' string with it.
s/<insert label>\(.*\)\n\(.*\)$/\2\1/
}
Content of infile:
Label: foo, Other text: text description...
<insert label> Item: item description...
<insert label> Item: item description...
Label: bar, Other text:...
<insert label> Item:...
Label: baz, Other text:...
<insert label> Item:...
<insert label> Item:...
<insert label> Item:...
Run it like:
sed -f script.sed infile
And result:
Label: foo, Other text: text description...
foo Item: item description...
foo Item: item description...
Label: bar, Other text:...
bar Item:...
Label: baz, Other text:...
baz Item:...
baz Item:...
baz Item:...
Upvotes: 2
Reputation: 786359
You can use awk like this:
awk '$1=="Label:" {label=$2; sub(/,$/, "", label);}
$1=="<insert" && $2=="label>" {sub(/<insert label>/, label);}
{print $0;}' file
Upvotes: 2
Reputation: 40803
Here is my label.awk file:
/^Label:/ {
label = $2
sub(/,$/, "", label)
}
/<insert label>/ {
sub(/<insert label>/, label)
}
1
To invoke:
awk -f label.awk data.txt
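If you want to try this without creating label.awk and data.txt first, the same logic can be wrapped in a shell function that reads standard input. This is only a sketch: the function name relabel and the inline sample lines are made up for illustration, and it assumes the label is a single word followed by a comma, as in the question.

```shell
#!/bin/sh
# relabel: hypothetical helper name, not part of the original answer.
# Reads lines on stdin, remembers the most recent "Label:" name, and
# substitutes it for the literal "<insert label>" tag on later lines.
relabel() {
    awk '
        /^Label:/ {
            label = $2              # second field, e.g. "foo,"
            sub(/,$/, "", label)    # drop the trailing comma
        }
        { sub(/<insert label>/, label) }   # no-op when the tag is absent
        1                           # always-true pattern: print the line
    '
}

# Made-up sample input, mirroring the question.
printf '%s\n' \
    'Label: foo, Other text: text description...' \
    '<insert label> Item: item description...' \
    'Label: bar, Other text:...' \
    '<insert label> Item:...' | relabel
```

Because sub() only changes the matched text, the rest of each line (including its spacing) is left alone, which is why this is a bit safer than assigning to $1 and $2.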
Upvotes: 5