Kyle

Reputation: 269

awk/sed/shell to merge/concatenate data

I'm trying to merge some data. The input looks like this:

foo bar
foo baz boo
abc def
abc ghi

And I would like the output to look like:

foo bar baz boo
abc def ghi

I have some ideas using some arrays in a shell script, but I was looking for a more elegant or quicker solution.

Upvotes: 3

Views: 1475

Answers (6)

gustaf

Reputation: 1

Based on fgm's pure Bash snippet:

text='
foo bar
foo baz boo
abc def
abc ghi
'

oneline=""
firstword=""
while read -r -a line ; do
   (( ${#line[@]} )) || continue   # skip blank lines
   if [[ -n "$oneline" && "$firstword" == "${line[0]}" ]]; then
      unset 'line[0]' # remove first word of line
      oneline="${oneline} ${line[*]}"
   else
      # new first word: print the finished group, then start the next one
      [[ -n "$oneline" ]] && printf "%s\n" "${oneline}"
      oneline="${line[*]}"
      firstword="${line[0]}"
   fi
done <<< "$text"
# print the last group, which the loop itself never emits
[[ -n "$oneline" ]] && printf "%s\n" "${oneline}"

Upvotes: 0

Fritz G. Mehner

Reputation: 17188

Pure Bash, for truly alternating lines:

infile="paste.dat"

toggle=0
while read -r -a line ; do
  if [ "$toggle" -eq 0 ] ; then
    printf '%s' "${line[*]}"
  else
    unset 'line[0]'             # remove first element (quoted to avoid globbing)
    printf ' %s\n' "${line[*]}"
  fi
  ((toggle=1-toggle))
done < "$infile"

Upvotes: 0

glenn jackman

Reputation: 246807

An awk solution:

awk '
    {key=$1; $1=""; x[key] = x[key] $0}
    END {for (key in x) {print key x[key]}}
' filename

Upvotes: 2

Jerry Coffin

Reputation: 490138

While pixelbeat's answer works, I can't say I'm very enthused about it. I think I'd use awk something like this:

awk '
    # append every field after the first to the entry for that key
    { for (i = 2; i <= NF; i++) lines[$1] = lines[$1] " " $i }
    END { for (i in lines) printf("%s%s\n", i, lines[i]) }
' filename

This shouldn't require pre-sorting the data, and should work fine regardless of the number or length of the fields (short of overflowing memory, of course). Its only obvious shortcoming is that its output is in an arbitrary order. If you need it sorted, you'll need to pipe the output through sort (but getting back to the original order would be something else).
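If the original order does matter, one way to keep it is to record each key the first time it shows up and print the keys in that order. A minimal sketch, assuming blank lines can be ignored; `merge_in_order`, `order` and `n` are names made up for this illustration:

```shell
# merge_in_order is a hypothetical helper name, not from the answers above.
merge_in_order() {
    awk '
        # Ignore blank lines.
        NF == 0 { next }
        # Remember each key the first time it is seen.
        !($1 in lines) { order[++n] = $1; lines[$1] = "" }
        # Append every field after the key.
        { for (i = 2; i <= NF; i++) lines[$1] = lines[$1] " " $i }
        # Print the keys in first-seen order.
        END { for (j = 1; j <= n; j++) printf("%s%s\n", order[j], lines[order[j]]) }
    ' "$@"
}

# Sample data from the question, fed on standard input.
printf '%s\n' 'foo bar' 'foo baz boo' 'abc def' 'abc ghi' | merge_in_order
```

On the sample data this prints `foo bar baz boo` followed by `abc def ghi`. The cost is one extra array holding the keys, so memory use grows with the number of distinct keys, as in the version above.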

Upvotes: 2

pixelbeat

Reputation: 31718

How about join?

file="file"
join -a1 -a2 <(sort "$file" | sed -n 1~2p) <(sort "$file" | sed -n 2~2p)

The sed commands there just split the sorted file into odd and even lines.
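The `first~step` address form is a GNU sed extension, and `<(...)` needs bash, so here is a rough sketch of the same odd/even split using awk and temporary files. It assumes exactly two lines per key, as in the sample; `merge_pairs` is a hypothetical name:

```shell
# merge_pairs is a hypothetical helper; $1 is the input file.
merge_pairs() {
    odd=$(mktemp)
    even=$(mktemp)
    # LC_ALL=C keeps the sort order consistent with what join expects.
    LC_ALL=C sort "$1" | awk 'NR % 2 == 1' > "$odd"    # lines 1, 3, 5, ...
    LC_ALL=C sort "$1" | awk 'NR % 2 == 0' > "$even"   # lines 2, 4, 6, ...
    # Join on the first field; -a1 -a2 also keeps unpaired lines.
    LC_ALL=C join -a1 -a2 "$odd" "$even"
    rm -f "$odd" "$even"
}
```

Called as `merge_pairs file`, this prints the merged lines in sorted key order. As with the original, a key that appears on more than two lines is not fully merged.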

Upvotes: 3

soulmerge

Reputation: 75704

If the length of the first field is fixed, you can use uniq with the -w option. Otherwise you might want to use awk (warning: untested code):

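awk '
    {
        if (NR > 1 && $1 == last) {
            # same key as the previous line: append the remaining fields
            for (i = 2; i <= NF; i++) printf " %s", $i
        } else {
            # new key: end the previous output line and start a new one
            if (NR > 1) printf "\n"
            printf "%s", $0
            last = $1
        }
    }
    END { if (NR > 0) printf "\n" }'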

Upvotes: 0
