Reputation: 2157
I have a tab-delimited text file with 3 columns.
In some of the columns there are single or multiple spaces that I want to remove. But I want to keep the tab separation between each column and also the newline character.
I tried
perl -lape 's/\s+//sg'
but that removes all whitespace, including the tabs.
My file looks like this
col1 col2 col3
1 test test
2 test test
3 test test
And I want
col1 col2 col3
1 test test
2 test test
3 test test
So I want to keep only the tabs that separate the columns and remove the spaces inside the columns. I hope this is clear.
Upvotes: 3
Views: 2236
Reputation: 246744
With awk, you can rewrite each line so that the fields are separated by exactly one tab character:
awk -v OFS='\t' '{$1=$1}1' file
The odd-looking $1=$1 forces awk to rewrite the current record using the Output Field Separator (tab).
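One caveat: with the default field separator, awk splits on spaces as well as tabs, so a space inside a column would become an extra column. If that matters, here is a minimal sketch that splits on tabs only and strips the spaces from each field. It assumes the columns are separated by exactly one tab; the sample row and the variable name `cleaned` are made up for illustration.

```shell
# Sample row: tab-separated, with stray spaces inside and around the fields.
# -F'\t' splits on tabs only; gsub removes every space within each field.
# Assigning to a field makes awk rebuild the record with OFS (a tab).
cleaned=$(printf '1\t te st\ttest  \n' |
  awk -F'\t' -v OFS='\t' '{ for (i = 1; i <= NF; i++) gsub(/ /, "", $i) } 1')
printf '%s\n' "$cleaned"
```

The trailing 1 is the usual awk shorthand for "print the record".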
Upvotes: 1
Reputation: 53478
If it's just spaces, you can use ' ' instead of \s.
E.g.
s/ //g;
Of course, given that you're already using -lape, and -a means 'autosplit on whitespace', you could just join the fields with tabs (keep -l so that print adds the newline back):
perl -lane 'print join("\t", @F)'
Upvotes: 4
Reputation: 289505
Just remove spaces, not \s, which also matches tabs:
sed 's/ \+//g' file
And if you want to remove the spaces only when they occur right after a tab, say:
sed 's/\t */\t/g' file
From perldoc perlretut:
\s matches a whitespace character, the set [\ \t\r\n\f] and others
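Note that \+ and \t in a sed pattern are GNU extensions. If your sed doesn't support them, something along these lines should be equivalent; it is a sketch on a made-up row, with the tab passed in through a shell variable (the names `tab`, `all_removed` and `after_tab_only` are mine):

```shell
# A literal tab character, for seds that don't understand \t.
tab=$(printf '\t')
row=$(printf '1\t  te st\t test')

# Remove every space; with the g flag there is no need for \+.
all_removed=$(printf '%s\n' "$row" | sed 's/ //g')

# Remove only the spaces that directly follow a tab.
after_tab_only=$(printf '%s\n' "$row" | sed "s/$tab */$tab/g")

printf '%s\n' "$all_removed" "$after_tab_only"
```

The second variant keeps the space inside "te st" because it doesn't come right after a tab.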
Upvotes: 7
Reputation: 61513
You can create your own character class by negating everything that is either a non-whitespace character or a tab. What is left is all whitespace except tabs:
perl -lape 's/[^\S\t]+//sg'
[ ... ] defines a character class
^ inside of [ ... ] negates this character class
\S represents everything not in \s
\t represents a tab character
Upvotes: 2