Greg Rov

Reputation: 347

Insert a space to separate a database

Good morning. I have the following data set, but with thousands more rows:

215 22221121110110110101 
212 22221121110110110101  
468 22221121110110110101
1200 22221121110110110101 
400 22221121110110110101 
100 22221121110110110101 
200 22221121110110110101

And I need to separate it into columns this way:

215 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1 
212 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1 
468 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
1200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
400 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
100 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1

I tried a simple sed command, but it doesn't work:

sed -i -e 's// /g'

Upvotes: 0

Views: 122

Answers (10)

Thor

Reputation: 47099

How about coreutils:

paste -d ''                                \
  <(cut -d' ' -f1 infile                 ) \
  <(cut -d' ' -f2 infile | sed 's/./ &/g')

Output:

215 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
212 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
468 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
1200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
400 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
100 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1

Upvotes: 1

potong

Reputation: 58430

This might work for you (GNU sed):

sed 's/ /\n/;h;s/\B/ /g;H;g;s/\n.*\n/ /' file

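Replace the first space with a newline, copy the line to the hold space, insert a space at every non-word boundary (that is, between adjacent characters), append the changed line to the copy, and then rearrange the result so that only the original first column and the spaced-out second column remain.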

Upvotes: 1

ctac_

Reputation: 2471

Another approach with bash

while read -r a b; do
  printf '%s' "$a"
  while read -r -n1 c; do
    printf ' %s' "$c"
  done <<< "$b"
  echo
done < lefile
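
For comparison, here is a minimal sketch of the same idea in plain POSIX sh, assuming every line has exactly two whitespace-separated columns. The function name space_out and the variable names are made up for illustration. It peels characters off the second column with parameter expansion instead of read -n1, and it leaves no space at the end of the line.

```shell
# Hypothetical helper: reads two-column lines on stdin, writes them to stdout
# with a space in front of every character of the second column.
space_out() {
  # "|| [ -n "$first" ]" also handles a last line without a trailing newline.
  while read -r first rest || [ -n "$first" ]; do
    out=$first
    while [ -n "$rest" ]; do
      remainder=${rest#?}               # everything after the first character
      out="$out ${rest%"$remainder"}"   # append a space and the first character
      rest=$remainder
    done
    printf '%s\n' "$out"
  done
}

printf '215 2221\n1200 0101 \n' | space_out
```

Run against the real data with: space_out < lefile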

Upvotes: 1

Akshay Hegde

Reputation: 16997

Using awk's gsub(regexp, replacement [, target])

awk '{gsub(/./," &",$2); print $1 $2}' infile

Explanation:

  • gsub(/./," &",$2) matches every character in the second field of the current record and replaces it with a single space followed by that same character.

The Dot Matches (Almost) Any Character. In regular expressions, the dot or period is one of the most commonly used metacharacters. The dot matches a single character, without caring what that character is. The only exception are line break characters.

  • If the special character & appears in replacement, it stands for the precise substring that was matched by regexp.
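
A small sketch of how & behaves in the replacement string, wrapped in a shell function so it can be tried on stdin. The function name space_second_column is only illustrative. Putting the space before & (rather than after it) is what lets $1 and $2 be concatenated directly without leaving a space at the end of the line.

```shell
# Hypothetical helper: reads two-column lines on stdin.
space_second_column() {
  # In the replacement, & stands for whatever /./ matched, so " &" turns
  # every character of the second field into " <character>".
  awk '{ gsub(/./, " &", $2); print $1 $2 }'
}

# Trailing blanks in the input are dropped by awk's default field splitting.
printf '215 2221\n1200 0101 \n' | space_second_column
```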

Test Results:

$ cat infile
215 22221121110110110101 
212 22221121110110110101  
468 22221121110110110101
1200 22221121110110110101 
400 22221121110110110101 
100 22221121110110110101 
200 22221121110110110101

$ awk '{gsub(/./," &",$2); print $1 $2}' infile
215 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
212 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
468 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
1200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1 
400 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
100 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1

Upvotes: 2

Sundeep

Reputation: 23667

Speed comparison of some of the answers:

$ perl -0777 -ne 'print $_ x 1000000' ip.txt > f1
$ du -h f1
169M    f1

Times are given for two consecutive runs:

$ time perl -lane 'push @F, split //, pop @F; print "@F"' f1 > t1
real    0m34.004s
real    0m33.729s

$ time perl -lane 'print join " ",$F[0],split //,$F[1]' f1 > t2
real    0m23.291s
real    0m23.935s

$ time LC_ALL=C awk '{gsub(/./," &",$2); print $1 $2}' f1 > t3
real    0m30.834s
real    0m30.723s


$ diff -s t1 t2
Files t1 and t2 are identical
$ diff -s t1 t3
Files t1 and t3 are identical

Upvotes: 1

RavinderSingh13

Reputation: 133528

Could you please try the following with GNU awk and let me know if it helps.

awk '{num=split($2,a,""); printf "%s",$1; for(i=1;i<=num;i++){printf(" %s",a[i])}; print ""}' Input_file

Upvotes: 2

karakfa

Reputation: 67507

To avoid the extra space that some of the other solutions leave at the end of each line, you can use this (gensub requires GNU awk):

$ awk '{print $1 gensub(/./," &","g",$2)}'

Upvotes: 3

thanasisp

Reputation: 5975

You can use GNU awk's gensub function:

gawk '{$2=gensub(/./, "& ", "g", $2)}1' file

Upvotes: 3

choroba

Reputation: 241898

Perl to the rescue!

perl -lane 'push @F, split //, pop @F; print "@F"'
  • -n reads the input line by line
  • -l removes newlines from input and adds them back to output
  • -a splits each line on whitespace into the @F array
  • pop removes the last element of an array and returns it, in this case it returns the second "word"
  • split turns a string into a list, with // it splits the string into individual characters
  • push is dual to pop, it adds the elements to the end of an array (in this case, it adds individual characters to the array currently containing only the first column)
  • when printing an array in double quotes, by default the members are separated by spaces.

Upvotes: 5

Adam Wright

Reputation: 49376

Try

sed -i -e 's/\(.\)/\1 /g'

That is, capture each character, then replace the capture with itself plus a space. Note that this spaces out every character on the line, including those in the first column.

Upvotes: 0
