emanuele

Reputation: 2589

Remove the string between two characters with sed

I have a file of this type:

16:00 [61]Al-Najma - Al-Rifaa [62]5.06 [63]3.55 [64]1.57 4

and I want to remove all the strings inside square brackets (brackets included), in order to obtain

16:00 Al-Najma - Al-Rifaa 5.06 3.55 1.57 4

I am trying with sed in this manner:

sed 's/\[.*]//g' file1 > file2

but I get

16:00 1.57 4

and

sed 's/\[.[1234567890]]//g' file1 > file2

does not work if the bracketed string contains more than 2 digits.

How can I do this?

Upvotes: 0

Views: 11047

Answers (6)

TLP

Reputation: 67900

Your first regex does not work because the quantifier * is greedy, meaning it matches as many characters as possible. Since . also matches brackets, it continues to match until the last closing bracket ] it can find.

So you basically have two options: Use a non-greedy quantifier or restrict the types of characters you can match. You have tried the second solution. I would go with using a negated character class instead:

sed 's/\[[^]]*\]//g'

I'm not sure if sed has non-greedy quantifiers, but perl does:

perl -lpwe 's/\[.*?\]//g'
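As a quick check of the explanation above, this sketch runs the greedy pattern and the negated character class on the sample line from the question. The variable names are only for illustration.

```shell
# Sample line taken from the question.
line='16:00 [61]Al-Najma - Al-Rifaa [62]5.06 [63]3.55 [64]1.57 4'

# Greedy: .* runs from the first "[" to the last "]" on the line.
greedy=$(printf '%s\n' "$line" | sed 's/\[.*\]//g')

# Negated class: [^]]* stops at the first "]" after each "[".
fixed=$(printf '%s\n' "$line" | sed 's/\[[^]]*\]//g')

printf '%s\n' "$greedy"   # 16:00 1.57 4
printf '%s\n' "$fixed"    # 16:00 Al-Najma - Al-Rifaa 5.06 3.55 1.57 4
```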

Upvotes: 1

potong

Reputation: 58361

This might work for you:

echo "16:00 [61]Al-Najma - Al-Rifaa [62]5.06 [63]3.55 [64]1.57 4" |
sed 's/\[[^]]*\]//g'
16:00 Al-Najma - Al-Rifaa 5.06 3.55 1.57 4

Upvotes: 0

kev

Reputation: 161604

Using awk:

$ echo '16:00 [61]Al-Najma - Al-Rifaa [62]5.06 [63]3.55 [64]1.57 4' | awk -F '\[[0-9]*\]' '$1=$1'
16:00  Al-Najma - Al-Rifaa  5.06  3.55  1.57 4
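The doubled spaces in that output appear because awk rebuilds the record with the default output separator (a single space) between fields that already end with a space. If they are unwanted, one possible variant (a sketch, not part of the answer above) deletes the markers in place with gsub:

```shell
# Sample line taken from the question.
line='16:00 [61]Al-Najma - Al-Rifaa [62]5.06 [63]3.55 [64]1.57 4'

# gsub edits the record in place, so no separator is inserted between fields.
cleaned=$(printf '%s\n' "$line" | awk '{ gsub(/\[[0-9]*\]/, ""); print }')

printf '%s\n' "$cleaned"   # 16:00 Al-Najma - Al-Rifaa 5.06 3.55 1.57 4
```

The explicit print also avoids relying on $1=$1 as the pattern, which skips any line whose first field ends up empty, for example a line that starts with a marker.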

Upvotes: 0

Birei

Reputation: 36252

You already got the sed answer, so I will add another one using awk:

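awk '
  BEGIN { 
    # Split each line on every [...] group.
    FS = "\\[[^]]*\\]"
  } 
  { 
    # Print the fields back to back, without the separators.
    for (i=1; i<=NF; i++) 
      printf "%s", $i 
    # End every input line, so multi-line files keep their line breaks.
    printf "\n" 
  }
' <<<"16:00 [61]Al-Najma - Al-Rifaa [62]5.06 [63]3.55 [64]1.57 4"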

Output:

16:00 Al-Najma - Al-Rifaa 5.06 3.55 1.57 4

Upvotes: 0

Jörg Beyer

Reputation: 3671

Your pattern matches exactly two characters between the brackets (any character, then one digit). Putting a star after the character class lets it match any number of digits:

sed 's/\[[0-9]*\]//g' file1 > file2

Alternative:

sed 's/\[[^]]*\]//g' file1 > file2

That means: after the opening "[", any character except "]" is accepted, as many times as it occurs (the "*"), up to the closing "]".
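A digits-only class and a negated class differ in what they remove. This small sketch shows the difference on made-up input (the sample line is hypothetical, not from the question's file):

```shell
# Made-up input: markers with one, four and zero digits, plus a non-numeric one.
line='16:00 [7]Al-Najma [1234]5.06 []3.55 [a1]1.57'

# Digits only: removes [7], [1234] and [], but leaves [a1] in place.
digits_only=$(printf '%s\n' "$line" | sed 's/\[[0-9]*\]//g')

# Anything but "]": removes every bracketed group.
any_content=$(printf '%s\n' "$line" | sed 's/\[[^]]*\]//g')

printf '%s\n' "$digits_only"   # 16:00 Al-Najma 5.06 3.55 [a1]1.57
printf '%s\n' "$any_content"   # 16:00 Al-Najma 5.06 3.55 1.57
```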

For further reading on sed: http://www.grymoire.com/Unix/Sed.html

Upvotes: 1

John3136

Reputation: 29266

Does escaping the closing ] help?

sed 's/\[.*\]//g' file1 > file2

Upvotes: 0
