Reputation: 743
I have a tab-delimited file with three columns. Each row of the 3rd column holds a string containing 5 names separated by spaces (' '), but in some cases there is more than one space between names. I'd like to use a Unix/bash command line to print column 1, column 2, name1, name2, name3, name4, name5, all separated by tabs.
My desired output would look like this:
avov2323[tab]rogoc232[tab]Roy[tab]Don[tab]Mike[tab]Ned[tab]Lee
cdso3432[tab]fokfd543[tab]Tom[tab]Gil[tab]Rose[tab]Dan[tab]Sam
awk -F "\t" '{print $3}' file.txt
awk -F " " '{print $1}' $a
However, this command line doesn't work for me, as all the names from column 3 get crammed together in $a.
Upvotes: 1
Views: 3470
Reputation: 1941
Just for the sake of completeness, I also wrote an awk one-liner that won't touch any spaces in the first two columns. It also preserves empty columns:
awk <inputFile -F '\t' 'BEGIN{OFS="\t"} {gsub(/ +/,OFS,$3); print $1,$2,$3}'
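To check the one-liner against the data from the question, here is a minimal sketch. It assumes the first two columns are tab-separated and that the names are separated by one or more spaces; the temporary file and the `[tab]` rendering are only there for illustration.

```shell
# Sample data mirroring the question: two tab-separated IDs, then names
# separated by one or more spaces (written to a throwaway temp file).
sample=$(mktemp)
printf 'avov2323\trogoc232\tRoy  Don Mike   Ned Lee\n' >  "$sample"
printf 'cdso3432\tfokfd543\tTom Gil  Rose Dan Sam\n'   >> "$sample"

# Split only on tabs, then turn each run of spaces in column 3 into one tab.
result=$(awk -F '\t' 'BEGIN{OFS="\t"} {gsub(/ +/,OFS,$3); print $1,$2,$3}' "$sample")

# Render tabs as the literal text [tab] so the result is easy to read.
printf '%s\n' "$result" | awk '{gsub(/\t/,"[tab]")}1'
# avov2323[tab]rogoc232[tab]Roy[tab]Don[tab]Mike[tab]Ned[tab]Lee
# cdso3432[tab]fokfd543[tab]Tom[tab]Gil[tab]Rose[tab]Dan[tab]Sam

rm -f "$sample"
```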
Edit: Regarding improvement mentioned in comment - yes, it is possible to split any column, even the middle one, though a more versatile script would be necessary. It's not a oneliner however and looks quite awkward when put in one line. I'm pretty sure it still could be somewhat optimized. With formatting:
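BEGIN {
    FS = OFS = "\t";
    splitAt = 3;                      # column whose spaces become tabs
}
{
    gsub(/ +/, OFS, $splitAt);        # each run of spaces -> one tab
    line = $1;
    for (i = 2; i < splitAt; i++)     # columns before the split column
        line = line OFS $i;
    line = line OFS $splitAt;
    for (i = splitAt + 1; i <= NF; i++)   # columns after the split column
        line = line OFS $i;
    print line;
}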
And as a one-liner:
awk <inputFile 'BEGIN{FS=OFS="\t"; splitAt=3;} {gsub(/ +/,OFS,$splitAt); line=$1; for(i=2;i<splitAt;i++) line=line""OFS""$i; line=line""OFS""$splitAt; for(i=splitAt+1;i<=NF;i++) line=line""OFS""$i; print line ;}'
It could be refactored to take splitAt as a parameter to the script.
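One possible way to do that refactoring is awk's `-v` option. The sketch below is not the script above verbatim: the function name `split_column` is hypothetical, and it swaps the explicit loops for awk's behavior of rebuilding `$0` with OFS when a field is modified, which should give the same result as long as FS and OFS are both a tab.

```shell
# Hypothetical helper: split_column COLUMN [FILE...]
# Reads the named files (or stdin if none are given) and turns each run of
# spaces in the chosen tab-separated column into a single tab.
split_column() {
    col=$1
    shift
    awk -v splitAt="$col" '
        BEGIN { FS = OFS = "\t" }
        # Modifying a field makes awk rebuild $0 with OFS, so a plain print is enough.
        { gsub(/ +/, OFS, $splitAt); print }
    ' "$@"
}

# Usage: split the middle column and leave the spaces in column 3 alone.
printf 'id1\tAnn  Bob\tx y\n' | split_column 2
```

With the question's file, `split_column 3 inputFile` would play the role of the one-liner with splitAt=3.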
Upvotes: 1
Reputation: 1941
Use tr to translate:
tr <inputFile " " "\t" | tr -s "\t" >outputFile
Edit: As Glenn Jackman pointed out, it would be better to first squeeze spaces, then change remaining spaces to tabs.
tr <inputFile -s " " | tr " " "\t" >outputFile
It's still vulnerable to spaces in first two columns though.
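A small illustration of that caveat, using a made-up line whose first column contains a space ("ab cd"); the data is hypothetical and only meant to show the effect:

```shell
# Hypothetical input: column 1 is "ab cd", column 2 is "x1", column 3 holds two names.
line=$(printf 'ab cd\tx1\tRoy  Don')

# tr works on characters, not columns, so the space inside column 1 is converted too.
with_tr=$(printf '%s\n' "$line" | tr -s ' ' | tr ' ' '\t')

# Count the resulting tab-separated fields: column 1 has been split in two.
field_count=$(printf '%s\n' "$with_tr" | awk -F '\t' '{print NF}')
echo "$field_count"   # prints 5, although only 4 fields were intended
```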
Upvotes: 3
Reputation: 74615
You could use awk:
$ cat file
avov2323 rogoc232 Roy Don Mike Ned Lee
cdso3432 fokfd543 Tom Gil Rose Dan Sam
$ awk '{$1=$1}1' OFS='\t' file
avov2323 rogoc232 Roy Don Mike Ned Lee
cdso3432 fokfd543 Tom Gil Rose Dan Sam
$1=$1 just touches each record so that the new output format is applied. The 1 evaluates to true, so each line is printed. Awk treats any number of whitespace characters as the input field separator by default, so as you can see, the number of spaces between each name is not a problem.
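To see why the field has to be touched, here is a minimal comparison. It uses a dash as OFS purely to make the effect visible, and passes it with `-v` because the input comes from a pipe rather than a file; both choices are assumptions for this illustration.

```shell
line='a  b   c'

# Without touching a field, awk prints the record unchanged, so OFS is never applied.
untouched=$(printf '%s\n' "$line" | awk -v OFS='-' '1')

# Assigning $1 to itself forces awk to rebuild $0 from its fields joined by OFS.
rebuilt=$(printf '%s\n' "$line" | awk -v OFS='-' '{$1=$1}1')

printf '%s\n%s\n' "$untouched" "$rebuilt"
# a  b   c
# a-b-c
```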
To overwrite the original file, you can use a temporary file:
awk '{$1=$1}1' OFS='\t' file > tmp && mv tmp file
Upvotes: 1