Dave Rager

Reputation: 8170

Transforming text with 'sed' or 'awk'

I have a very large input set that looks something like this:

Label: foo, Other text: text description...
   <insert label> Item: item description...
   <insert label> Item: item description...
Label: bar, Other text:...
   <insert label> Item:...
Label: baz, Other text:...
   <insert label> Item:...
   <insert label> Item:...
   <insert label> Item:...
...

I'd like to transform this to pull out the label name (e.g. "foo") and replace the tag "<insert label>" on the following lines with the actual label.

Label: foo, Other text: text description...
   foo Item: item description...
   foo Item: item description...
Label: bar, Other text:...
   bar Item:...
Label: baz, Other text:...
   baz Item:...
   baz Item:...
   baz Item:...
...

Can this be done with sed, awk, or another Unix tool? If so, how might I do it?

Upvotes: 3

Views: 566

Answers (3)

Birei

Reputation: 36292

One solution using sed:

Content of script.sed:

## When the line begins with 'Label'.
/^Label/ {
    ## Save the whole line to 'hold space'.
    h

    ## Reduce the line to the label name: drop the first word and the
    ## space after it, and keep everything up to the comma.
    s/^[^ ]* \([^,]*\).*$/\1/

    ## Exchange contents: the label goes to 'hold space' and the
    ## original line comes back to 'pattern space'.
    x

    ## Branch to the end of the script, which prints the line
    ## and reads the next one.
    b
}
###--- Commented this wrong behaviour ---###
#--- G
#--- s/<[^>]*>\(.*\)\n\(.*\)$/\2\1/

###--- And fixed with this ---###
## When the line contains '<insert label>'
/<insert label>/ {
    ## Append a newline and the label name to the line.
    G

    ## And substitute the '<insert label>' string with it.
    s/<insert label>\(.*\)\n\(.*\)$/\2\1/
}

Content of infile:

Label: foo, Other text: text description...
   <insert label> Item: item description...
   <insert label> Item: item description...
Label: bar, Other text:...
   <insert label> Item:...
Label: baz, Other text:...
   <insert label> Item:...
   <insert label> Item:...
   <insert label> Item:...

Run it like:

sed -f script.sed infile

Result:

Label: foo, Other text: text description...
   foo Item: item description...
   foo Item: item description...
Label: bar, Other text:...
   bar Item:...
Label: baz, Other text:...
   baz Item:...
   baz Item:...
   baz Item:...
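
For comparison, here is a shorter variant of the same hold-space idea, offered only as a sketch. Instead of reducing the Label line to the bare name up front, it keeps the whole Label line in hold space and extracts the name at substitution time. The function name `relabel` is made up for illustration, and it assumes every placeholder line is preceded by a line starting with "Label: " that has a comma after the name.

```shell
# Sketch: the /^Label: /h command remembers the most recent Label line.
# On a placeholder line, G appends a newline plus that remembered line,
# and the s command swaps the tag for the text between "Label: " and the
# first comma, discarding the appended line in the same step.
relabel() {
  sed '
    /^Label: /h
    /<insert label>/ {
      G
      s/<insert label>\(.*\)\nLabel: \([^,]*\).*/\2\1/
    }
  ' "$@"
}

# Try it on a few lines of the sample input without creating any files.
printf '%s\n' \
  'Label: foo, Other text: text description...' \
  '   <insert label> Item: item description...' \
  'Label: bar, Other text:...' \
  '   <insert label> Item:...' | relabel
```

Because only the tag itself is replaced, the original indentation of each item line is left untouched.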

Upvotes: 2

anubhava

Reputation: 786359

You can use awk like this:

awk '$1=="Label:" {label=$2; sub(/,$/, "", label)}   # remember the label, minus its trailing comma
     $1=="<insert" && $2=="label>" {sub(/<insert label>/, label)}   # edit in place to keep the indentation
     {print}' file
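
One caveat with field-based edits in awk: assigning to any field makes awk rebuild $0 from its fields joined by OFS (a single space by default), so leading indentation and runs of spaces are lost. Editing the record with sub() avoids that. A small sketch to see the difference; the function names and the sample line are made up for illustration:

```shell
# Assigning to $1 and $2 rebuilds the record: fields are rejoined with
# single spaces, so the 3-space indent and the wide gaps disappear.
rebuild_demo() {
  printf '   <insert label> Item:   spaced   text\n' |
    awk -v label=foo '{ $1 = " "; $2 = label; print }'
}

# sub() rewrites only the matched text and leaves the rest of the
# record, including all whitespace, exactly as it was.
inplace_demo() {
  printf '   <insert label> Item:   spaced   text\n' |
    awk -v label=foo '{ sub(/<insert label>/, label); print }'
}

rebuild_demo   # prints "  foo Item: spaced text"
inplace_demo   # prints "   foo Item:   spaced   text"
```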

Upvotes: 2

Hai Vu

Reputation: 40803

Here is my label.awk file:

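# On a Label line, remember the second field without its trailing comma.
/^Label:/ {
    label = $2
    sub(/,$/, "", label)
}

# On a placeholder line, swap the tag for the remembered label.
/<insert label>/ {
    sub(/<insert label>/, label)
}

# An always-true pattern with no action: print every line.
1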

To invoke:

awk -f label.awk data.txt
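
One edge case worth knowing about: in the replacement string of sub(), an `&` stands for the matched text, so a label such as "R&D" would not be inserted literally. If your labels might contain `&` or backslashes, a literal splice with index() and substr() sidesteps that. This is only a sketch; the function name `relabel_literal` is made up, and it still assumes the label is a single word in the second field:

```shell
relabel_literal() {
  awk '
    /^Label:/ {
      label = $2
      sub(/,$/, "", label)        # strip the trailing comma
    }
    {
      tag = "<insert label>"
      pos = index($0, tag)        # 0 when the tag is absent
      if (pos > 0)
        # Splice the label in as plain text; no characters are special here.
        $0 = substr($0, 1, pos - 1) label substr($0, pos + length(tag))
      print
    }
  ' "$@"
}

printf '%s\n' 'Label: R&D, Other text:...' '   <insert label> Item:...' |
  relabel_literal
```

It takes file names the same way awk does, e.g. `relabel_literal data.txt`.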

Upvotes: 5
