Valéry Stroeder

Reputation: 633

Postgres 9.0 + translate function and ascii codes

I use the translate function to make searches accent-insensitive. To speed up these queries, I've created a matching index:

CREATE INDEX person_lastname_ci_ai_si
ON person
USING btree
(translate(upper(lastname::text), '\303\200\303\201\303\202\303\203\303\204\303\205\303\206\303\207\303\210\303\211\303\212\303\213\303\214\303\215\303\216\303\217\303\221\303\222\303\223\303\224\303\225\303\226\303\230\303\231\303\232\303\233\303\234\303\235\303\237\303\240\303\241\303\242\303\243\303\244\303\245\303\246\303\247\303\250\303\251\303\252\303\253\303\254\303\255\303\256\303\257\303\261\303\262\303\263\303\264\303\265\303\266\303\270\303\271\303\272\303\273\303\274\303\275\303\277'::text, 'AAAAAAACEEEEIIIINOOOOOOUUUUYSaaaaaaaceeeeiiiinoooooouuuuyy'::text)
);

It works fine with Postgres 9.1, but it doesn't seem to work with 9.0. Postgres 9.0 seems to replace

'\303\200\303\201\303\202\303\203\303\204\303\205\303\206\303\207\303\210\303\211\303\212\303\213\303\214\303\215\303\216\303\217\303\221\303\222\303\223\303\224\303\225\303\226\303\230\303\231\303\232\303\233\303\234\303\235\303\237\303\240\303\241\303\242\303\243\303\244\303\245\303\246\303\247\303\250\303\251\303\252\303\253\303\254\303\255\303\256\303\257\303\261\303\262\303\263\303\264\303\265\303\266\303\270\303\271\303\272\303\273\303\274\303\275\303\277'

by

ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæçèéêëìíîïñòóôõöøùúûüýÿ

Then, because my code performs its searches using the ASCII codes, it doesn't use the index.

Is there a way to prevent Postgres from converting the ASCII codes to characters when creating the index?

For example:

select '\303\200\303\201\303\202\303\203\303\204\303\205\303\206\303\207\303\210\303\211\303\212\303\213\303\214\303\215\303\216\303\217\303\221\303\222\303\223\303\224\303\225\303\226\303\230\303\231\303\232\303\233\303\234\303\235\303\237\303\240\303\241\303\242\303\243\303\244\303\245\303\246\303\247\303\250\303\251\303\252\303\253\303\254\303\255\303\256\303\257\303\261\303\262\303\263\303\264\303\265\303\266\303\270\303\271\303\272\303\273\303\274\303\275\303\277'

;

Result

ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæçèéêëìíîïñòóôõöøùúûüýÿ

How can I get this result instead?

\303\200\303\201\303\202\303\203\303\204\303\205\303\206\303\207\303\210\303\211\303\212\303\213\303\214\303\215\303\216\303\217\303\221\303\222\303\223\303\224\303\225\303\226\303\230\303\231\303\232\303\233\303\234\303\235\303\237\303\240\303\241\303\242\303\243\303\244\303\245\303\246\303\247\303\250\303\251\303\252\303\253\303\254\303\255\303\256\303\257\303\261\303\262\303\263\303\264\303\265\303\266\303\270\303\271\303\272\303\273\303\274\303\275\303\277

Upvotes: 1

Views: 1200

Answers (1)

vyegorov

Reputation: 22905

Starting from version 9.1, the PostgreSQL standard_conforming_strings option defaults to on.

This means that the backslash character \ in an ordinary string constant is treated as-is and not as an escape symbol. This is done to prevent SQL injection attacks and follows the SQL standard. It is still possible to use \ to get special characters, but only within escape string constants (E'...').
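As a minimal sketch of the difference, assuming a database with UTF8 encoding and standard_conforming_strings set to on (the 9.1 default), the two literal forms can be compared directly:

```sql
-- Assumes a UTF8 database; the 9.1 default is set explicitly here.
SET standard_conforming_strings TO on;

-- Ordinary string constant: backslashes are kept as literal characters,
-- so this is the 8-character text \303\200.
SELECT '\303\200' AS plain_literal;

-- Escape string constant: \303\200 is read as two octal byte values,
-- which together form the UTF-8 encoding of À.
SELECT E'\303\200' AS escape_literal;
```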

For the pre-9.1 versions of PostgreSQL I suppose these options are possible:

  1. Change the system-wide standard_conforming_strings option to on, but this will affect the whole cluster and may give unexpected results in other areas;

  2. Change the standard_conforming_strings option on a per-user basis, using ALTER ROLE ... SET standard_conforming_strings TO on;, which may also have side effects;

  3. Use a plain SET standard_conforming_strings TO on; as the first command you issue in your session before creating the index;

  4. Double all your backslashes so that they are treated as a literal \ symbol in your CREATE INDEX ... statement.
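
Options 2 to 4 could look roughly like this on a 9.0 server. This is only a sketch: the role name app_user is hypothetical, and the short literal stands in for the full character list from the question.

```sql
-- Option 2: per-role default (app_user is a hypothetical role name);
-- it takes effect for new sessions of that role.
ALTER ROLE app_user SET standard_conforming_strings TO on;

-- Option 3: current session only; run this before CREATE INDEX.
SET standard_conforming_strings TO on;
SHOW standard_conforming_strings;
SELECT '\303\200';      -- backslashes are kept as-is

-- Option 4: leave the setting off and double every backslash.
SET standard_conforming_strings TO off;
SELECT '\\303\\200';    -- also yields the literal text \303\200
```

With the setting off, PostgreSQL may warn about nonstandard use of \\ in a string literal; writing the constant as E'\\303\\200' avoids the warning. Whichever option is used, the searches presumably need to be issued under the same interpretation of the literal as the CREATE INDEX statement, so that both expressions end up with the same constant.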

Let me know if this helps.

Upvotes: 1
