user2589918
user2589918

Reputation: 566

Removing a set of letters from a string

I want to remove the vowels from the email id. Which function should I use? I am trying to find the difference between translate and replace in postgresql but didn't get the exact difference

Upvotes: 37

Views: 83514

Answers (3)

LeoRochael
LeoRochael

Reputation: 15177

TLDR;

To eliminate all vowels from an "e-mail id", the simplest expression I can think of is:

  • translate(email_id, 'aeiou', '')

In detail

To complement Bohemians's answer:

Both replace() and translate() can be used to:

  • replace characters
  • eliminate characters

And both functions accept three parameters:

  • A string being manipulated: the return string will be a (possibly) modified version of this string
  • A from string, containing something to be found in the manipulated string
  • A to string, containing something that might be present on the output string depending on the function used, the manipulated string, and the from string

The difference is that replace() can only replace whole sequences of characters, that must be found in the manipulated string in a specific order:

postgres=> select replace('foobarbaz', 'bar', 'FRED');
  replace   
------------
 fooFREDbaz
(1 row)

Even if replacing them with an empty string:

postgres=> select replace('foobarbaz', 'bar', '');
 replace 
---------
 foobaz
(1 row)

But if the characters in the from string cannot be found in that specific order inside the manipulated string, replace() returns a string identical to the manipulated one:

postgres=> select replace('foobarbaz', 'rab', '');
 replace 
---------
 foobarbaz
(1 row)

translate() on the other hand, deals not with a sub-string of characters that must be found in a specific sequence on the string being manipulated, but with sets of characters:

Each character in the manipulated string that is present in the from string is mapped to another character in the same position in the to string as it was found in the from string:

postgres=> select translate('foobarbaz', 'bar', '123');
 translate 
-----------
 foo12312z

postgres=> select translate('foobarbaz', 'rab', '123');
 translate 
-----------
 foo32132z
(1 row)

In the first translate() example above, the following mapping happened:

  • 'b' -> '1' (for both occurrences of 'b')
  • 'a' -> '2' (for both occurrences of 'a')
  • 'r' -> '3' (for the single occurrence '3')

Although translate() can be used to map characters as above, it can also be used to eliminate sets of characters. This happens if the to string is shorter than the from string:

postgres=> select translate('foobarbaz', 'rab', '1');
 translate 
-----------
 foo1z
(1 row)

postgres=> select translate('foobarbaz', 'rab', '');
 translate 
-----------
 fooz
(1 row)

In the first example above, the following mapping happened:

  • 'r' -> '1' (for the single occurrence of 'r')
  • 'b' -> eliminated (for both occurrences of 'b')
  • 'a' -> eliminated (for both occurrences of 'a')

Whereas in the second example above, all occurrences of the characters 'r', 'a' and 'b' are eliminated since the to string is empty.

So, to eliminate vowels from an email id, you can do:

  • translate(email_id, 'aeiou', '')

As long as all you care are about ascii vowels, as mentioned by Craig Ringer's answer.

Upvotes: 12

Bohemian
Bohemian

Reputation: 424983

translate() replaces a set of single characters (passed as a string) with another set of characters (also passed as a string), for example:

translate('abcdef', 'ace', 'XYZ') --> 'XbYdZf'

replace() replaces occurrences of a string of arbitrary length with another string:

replace('abcdef', 'bc', 'FOO') --> 'aFOOdef'

Upvotes: 54

Craig Ringer
Craig Ringer

Reputation: 324375

In this case you probably actually want regexp_replace.

Assuming by "vowel" you mean "Western European (English) language vowel letters" you might write:

SELECT regexp_replace('[email protected]', '[aeiou]', '', 'gi');

the gi in the fourth argument says "apply this regular expression globally to the whole input string not just to the first match, and make it case insensitive".

Remember that w and y are sometimes vowel-sounds, depending on their context, too. You won't be able to handle that with a regexp so it depends on whether or not you care for this purpose.

You're less likely to need to deal with other character sets if you're working with email addresses so a regexp might be OK for this.

In most cases mangling words with regular expressions would not be a good approach, though; for example, Russian in the Cyrillic alphabet uses A Э У О Ы Я Е Ё Ю И as vowels. Additionally, depending on the language, the same letter in the same script might or might not be a vowel! Keep reading here for more than you ever wanted to know.

Upvotes: 20

Related Questions