causita
causita

Reputation: 1708

Move headers as part of detail with shell

I have an input file that I need to convert to be DB friendly.

Original:

H|cust_id|order_no
D|itemcode|itemdesc
D|itemcode|itemdesc

Desired output:

itemcode|itemdesc|cust_id|order_no
itemcode|itemdesc|cust_id|order_no

I'm reading about sed but can't understand it.

Upvotes: 1

Views: 43

Answers (2)

mklement0
mklement0

Reputation: 438813

Here's a possibly easier to understand solution using two sed commands (assumes input file file):

# Extract header values from the 1st line, dropping the 1st column.
headerValues=$(sed -E '1 s/[^|]*\|//; q' file)

# Append header values to all subsequent lines, dropping the 1st column.
sed -En '2,$ s/[^|]*\|(.*)/\1|'"$headerValues"'/p' file

# Additional requirement, added later by the OP:
# Store the first header value separately in shell variable $firstHeaderValue.
IFS='|' read -r firstHeaderValue rest <<<"$headerValues"

Note:

  • Assumes that each file has only one header line, which must be the first line, followed by detail lines.
  • Makes no assumptions about the width or number of columns.
  • The sed commands should work on both Linux and OSX/FreeBSD.

Upvotes: 0

Jonathan Leffler
Jonathan Leffler

Reputation: 754280

It's a whole heap easier in awk than sed:

awk -F'|' '$1=="H" { c=$2; o=$3 } $1=="D" { printf("%s|%s%|%s|%s\n", $2, $3, c, o) }'

If it has to be done in sed, then the hold space is the key. In a file script.sed:

sed -e '/^H|/{ s/^H|//; h; d; }' \
    -e '/^D|/{ s/^D|//; G; s/\n/|/; }'

The first line matches H lines, removes the H|, and saves what is left in the hold space, and deletes the line to prevent anything being printed.

The second line matches D lines, removes the D|, appends a newline and the contents of the hold space to the pattern space, replaces the newline with a pipe, and lets the result be printed.

Given input file data:

H|cust_id-1|order_no-11
D|itemcode-1|itemdesc-1
D|itemcode-2|itemdesc-2
H|cust_id-2|order_no-22
D|itemcode-3|itemdesc-3
D|itemcode-4|itemdesc-4

Sample output:

$ awk -F '|' '$1 == "H" { c=$2; o=$3 } $1 == "D" { printf("%s|%s%|%s|%s\n", $2, $3, c, o) }' data
itemcode-1|itemdesc-1|cust_id-1|order_no-11
itemcode-2|itemdesc-2|cust_id-1|order_no-11
itemcode-3|itemdesc-3|cust_id-2|order_no-22
itemcode-4|itemdesc-4|cust_id-2|order_no-22
$ sed -e '/^H|/{ s/^H|//; h; d; }' -e '/^D|/{ s/^D|//; G; s/\n/|/; }' data
itemcode-1|itemdesc-1|cust_id-1|order_no-11
itemcode-2|itemdesc-2|cust_id-1|order_no-11
itemcode-3|itemdesc-3|cust_id-2|order_no-22
itemcode-4|itemdesc-4|cust_id-2|order_no-22
$

Upvotes: 3

Related Questions