Altimus Prime

Reputation: 2327

Using Sed to remove spaces that are NOT between letters

Yesterday I discovered Sed and it's amazing. I can handle simple regular expressions and literals, but I'm not sure how to remove only the spaces that are NOT between two letters (a-zA-Z).

For example:

Input:

"Mal                        ","","Mr    ","123","  ","   Lauren Hills","Dr  ","  ","      ","        ",

Output:

"Mal","","Mr","123","","Lauren Hills","Dr","","","",

So far I've tried adapting commands that I found here, here and here.

The closest I've got is:

sed 's/ \{1,\}//g' test.csv > test.bak

but it also removes the significant spaces between words, like the space between Lauren and Hills.

Upvotes: 0

Views: 791

Answers (5)

Ed Morton

Reputation: 203807

$ sed 's/ *" */"/g' file
"Mal","","Mr","123","","Lauren Hills","Dr","","","",

Upvotes: 0

ctac_

Reputation: 2481

You can use this one too.

sed 's/" */"/g;s/ *"/"/g'

Upvotes: 1

Kaushik Nayak

Reputation: 31676

Include the " in the pattern as well:

sed -e 's/ \{1,\}"/"/g' -e 's/" \{1,\}/"/g' test.csv > test.bak

Explanation:

The -e option lets you chain more than one sed expression in a single command.

The first expression replaces one or more spaces followed by a " with a single ".

The second expression replaces a " followed by one or more spaces with a single ".

Together they remove the leading and trailing spaces inside the quotes.
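
As a quick check of the two expressions above, here is a minimal sketch that feeds the sample row from the question through the same command. The variable names `line` and `cleaned` are only for illustration.

```shell
# Sample row copied from the question
line='"Mal                        ","","Mr    ","123","  ","   Lauren Hills","Dr  ","  ","      ","        ",'

# First expression: drop spaces that sit directly before a quote.
# Second expression: drop spaces that sit directly after a quote.
cleaned=$(printf '%s\n' "$line" | sed -e 's/ \{1,\}"/"/g' -e 's/" \{1,\}/"/g')

printf '%s\n' "$cleaned"
# prints: "Mal","","Mr","123","","Lauren Hills","Dr","","","",
```

Because both expressions only match spaces next to a quote, the space inside "Lauren Hills" is never touched.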

Upvotes: 1

Barmar

Reputation: 781503

Do it in three steps. The first removes spaces when the character to the left is a letter and the character to the right is not, the next step does the opposite, and the final step removes spaces when neither neighbor is a letter. The only combination we don't remove is when both surrounding characters are letters.

sed -e 's/\([a-z]\) \{1,\}\([^a-z]\)/\1\2/ig' -e 's/\([^a-z]\) \{1,\}\([a-z]\)/\1\2/ig' -e 's/\([^a-z]\) \{1,\}\([^a-z]\)/\1\2/ig' test.csv > test.bak
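
The same three steps can also be written without the `i` flag, which is a GNU sed extension. The sketch below is a variant, not the command above: it spells the letter class out as `[A-Za-z]` and has only been checked against the sample row from the question. The variable names are illustrative.

```shell
# Sample row copied from the question
line='"Mal                        ","","Mr    ","123","  ","   Lauren Hills","Dr  ","  ","      ","        ",'

# Step 1: letter on the left, non-letter on the right.
# Step 2: non-letter on the left, letter on the right.
# Step 3: non-letters on both sides.
# Spaces with a letter on both sides match none of the three and are kept.
cleaned=$(printf '%s\n' "$line" | sed \
  -e 's/\([A-Za-z]\) \{1,\}\([^A-Za-z]\)/\1\2/g' \
  -e 's/\([^A-Za-z]\) \{1,\}\([A-Za-z]\)/\1\2/g' \
  -e 's/\([^A-Za-z]\) \{1,\}\([^A-Za-z]\)/\1\2/g')

printf '%s\n' "$cleaned"
# prints: "Mal","","Mr","123","","Lauren Hills","Dr","","","",
```

Each pattern needs a real character on both sides of the spaces, so spaces at the very start or end of a line would not be matched by any of the three steps.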

Upvotes: 1

choroba

Reputation: 241938

Easier in Perl than sed:

perl -pe 's/\B | \B//g' < input > output

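\B stands for "not at a word boundary", so a space is removed unless it has a word character (letter, digit or underscore) on both sides. In particular, it doesn't remove spaces that have letters before and after.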

Upvotes: 5
