Reputation: 347
Good morning, I have the following data set, but with thousands more lines:
215 22221121110110110101
212 22221121110110110101
468 22221121110110110101
1200 22221121110110110101
400 22221121110110110101
100 22221121110110110101
200 22221121110110110101
And I need to separate it into columns this way:
215 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
212 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
468 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
1200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
400 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
100 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
I tried a simple sed, but it doesn't work:
sed -i -e 's// /g'
Upvotes: 0
Views: 122
Reputation: 47099
How about coreutils:
paste -d '' \
<(cut -d' ' -f1 infile ) \
<(cut -d' ' -f2 infile | sed 's/./ &/g')
Output:
215 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
212 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
468 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
1200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
400 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
100 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
Upvotes: 1
Reputation: 58430
This might work for you (GNU sed):
sed 's/ /\n/;h;s/\B/ /g;H;g;s/\n.*\n/ /' file
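Replace the first space with a newline, copy the line to the hold space, insert a space at every non-word boundary, append the changed line to the copy, and then remove everything between the first and last newline so that only the first column and the spaced-out second column remain.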
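The same script, spread over several lines with one comment per step, may make the hold-space shuffle easier to follow. This is only a sketch: it assumes GNU sed (\n in the replacement and \B are GNU extensions), and split_cols is a name made up for this illustration.

```shell
# Sketch only; assumes GNU sed. split_cols is a hypothetical helper name.
split_cols() {
  sed '
    # turn the first space into a newline: "215<NL>2221"
    s/ /\n/
    # save that version in the hold space
    h
    # insert a space wherever two word characters are adjacent
    s/\B/ /g
    # append the spaced-out version to the saved copy
    H
    # bring it all back: "215<NL>2221<NL>2 1 5<NL>2 2 2 1"
    g
    # replace everything from the first to the last newline with one space
    s/\n.*\n/ /
  ' "$@"
}

printf '215 2221\n' | split_cols
```

The last substitution relies on .* being greedy, so it swallows both the untouched second column and the spaced-out first column in one go.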
Upvotes: 1
Reputation: 2471
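Another approach with bash:
while read -r a b; do
    printf '%s' "$a"
    # read the second column one character at a time
    while read -r -n1 c; do
        # skip the newline that the here-string appends
        [[ -n $c ]] && printf ' %s' "$c"
    done <<< "$b"
    echo
done < lefile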
Upvotes: 1
Reputation: 16997
Using awk's gsub(regexp, replacement [, target]):
awk '{gsub(/./," &",$2); print $1 $2}' infile
Explanation:
gsub(/./," &",$2)
matches every character in the second field of the current record and replaces it with a space followed by that same character. In regular expressions, the dot is one of the most commonly used metacharacters: it matches a single character, without caring what that character is. Many regex flavors exclude line-break characters from the dot; that makes no difference here because each record is a single line.
When & appears in the replacement, it stands for the precise substring that was matched by regexp.
Test Results:
$ cat infile
215 22221121110110110101
212 22221121110110110101
468 22221121110110110101
1200 22221121110110110101
400 22221121110110110101
100 22221121110110110101
200 22221121110110110101
$ awk '{gsub(/./," &",$2); print $1 $2}' infile
215 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
212 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
468 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
1200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
400 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
100 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
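To see what & does in the replacement, it can help to compare putting the space before it and after it. The sketch below is only an illustration: lead_space and trail_space are made-up names, and the sample line is shortened.

```shell
# Sketch only; the function names are hypothetical.

# space BEFORE each matched character: $2 becomes " 2 2 2 1",
# so $1 and $2 can be glued together without a separator
lead_space() {
  awk '{gsub(/./," &",$2); print $1 $2}' "$@"
}

# space AFTER each matched character: $2 becomes "2 2 2 1 ",
# so a separator is needed and a trailing space is left behind
trail_space() {
  awk '{gsub(/./,"& ",$2); print $1, $2}' "$@"
}

printf '215 2221\n' | lead_space
printf '215 2221\n' | trail_space
```

Both print the same visible columns; the second form just ends each line with an extra space, which is why the leading-space variant is used above.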
Upvotes: 2
Reputation: 23667
Speed comparison of some of the answers:
$ perl -0777 -ne 'print $_ x 1000000' ip.txt > f1
$ du -h f1
169M f1
Times are given for two consecutive runs:
$ time perl -lane 'push @F, split //, pop @F; print "@F"' f1 > t1
real 0m34.004s
real 0m33.729s
$ time perl -lane 'print join " ",$F[0],split //,$F[1]' f1 > t2
real 0m23.291s
real 0m23.935s
$ time LC_ALL=C awk '{gsub(/./," &",$2); print $1 $2}' f1 > t3
real 0m30.834s
real 0m30.723s
$ diff -s t1 t2
Files t1 and t2 are identical
$ diff -s t1 t3
Files t1 and t3 are identical
Upvotes: 1
Reputation: 133528
Could you please try the following with GNU awk and let me know if this helps you.
awk '{num=split($2,a,""); printf "%s",$1; for(i=1;i<=num;i++) printf " %s",a[i]; print ""}' Input_file
Upvotes: 2
Reputation: 67507
To avoid the extra space that some of the other solutions leave at the end of each line, you can use this (gensub requires GNU awk):
$ awk '{print $1 gensub(/./," &","g",$2)}'
Upvotes: 3
Reputation: 5975
You can use GNU awk's gensub function:
gawk '{$2=gensub(/./, "& ", "g", $2)}1' file
Upvotes: 3
Reputation: 241898
Perl to the rescue!
perl -lane 'push @F, split //, pop @F; print "@F"'
-n reads the input line by line
-l removes newlines from input and adds them back to output
-a splits each line on whitespace into the @F array
split // splits the string into individual characters
pop removes the last element of the array (here, the second column) and returns it
push adds the elements to the end of an array (in this case, it adds individual characters to the array currently containing only the first column)
Upvotes: 5
Reputation: 49376
Try
sed -i -e 's/\(.\)/\1 /g'
That is, capture character by character, then replace the capture with itself, plus a space.
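A quick way to check what this substitution does is to run it on a short line from standard input, without -i. The sketch below does that; space_all is a made-up name, and the sample input is shortened. Note that the pattern applies to every character on the line, so the first column and the existing space are spaced out as well.

```shell
# Sketch only; space_all is a hypothetical helper name.
# Every character, including those of the first column and the
# original space, is replaced by itself followed by a space.
space_all() {
  sed 's/\(.\)/\1 /g' "$@"
}

# pipe through "cat -A" (GNU) or "od -c" to make the trailing space visible
printf '215 2221\n' | space_all
```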
Upvotes: 0