Thilak
Thilak

Reputation: 367

regexp_count in redshift/postgresql doesn't supports dynamic parameter for setting pattern value

The following prepared statement

String pattern = "[\\w\\-\\.]+@([\\w\\-]+\\.)+[\\w\\-]{2,4}";
PreparedStatement statement = conn.prepareStatement("SELECT sum(case when regexp_count(email, ?) > 0 then 1 else 0 end) AS email_1 FROM \"testschema\".\"test_table\"")
statement.setObject(1, pattern);

throwing following exception,

java.sql.SQLException: [Amazon](500310) Invalid operation: The pattern must be a valid UTF-8 literal character expression
Details: 
 -----------------------------------------------
  error:  The pattern must be a valid UTF-8 literal character expression
  code:      8001
  context:   
  query:     1976234
  location:  cg_expr_fn_builder.cpp:3542
  process:   padbmaster [pid=5571]
  -----------------------------------------------;
    at com.amazon.redshift.client.messages.inbound.ErrorResponse.toErrorException(Unknown Source)
    at com.amazon.redshift.client.PGMessagingContext.handleErrorResponse(Unknown Source)
    at com.amazon.redshift.client.PGMessagingContext.handleMessage(Unknown Source)
    at com.amazon.jdbc.communications.InboundMessagesPipeline.getNextMessageOfClass(Unknown Source)
    at com.amazon.redshift.client.PGMessagingContext.doMoveToNextClass(Unknown Source)
    at com.amazon.redshift.client.PGMessagingContext.getErrorResponse(Unknown Source)
    at com.amazon.redshift.client.PGClient.handleErrorsScenario2ForPrepareExecution(Unknown Source)
    at com.amazon.redshift.client.PGClient.handleErrorsPrepareExecute(Unknown Source)
    at com.amazon.redshift.client.PGClient.executePreparedStatement(Unknown Source)
    at com.amazon.redshift.dataengine.PGQueryExecutor.executePreparedStatement(Unknown Source)
    at com.amazon.redshift.dataengine.PGQueryExecutor.execute(Unknown Source)
    at com.amazon.jdbc.common.SPreparedStatement.executeWithParams(Unknown Source)
    at com.amazon.jdbc.common.SPreparedStatement.executeQuery(Unknown Source)
Caused by: com.amazon.support.exceptions.ErrorException: [Amazon](500310) Invalid operation: The pattern must be a valid UTF-8 literal character expression
Details: 
 -----------------------------------------------
  error:  The pattern must be a valid UTF-8 literal character expression
  code:      8001
  context:   
  query:     1976234
  location:  cg_expr_fn_builder.cpp:3542
  process:   padbmaster [pid=5571]
  -----------------------------------------------;
    ... 13 more

I believe it's not replacing ? with the dynamic param. How to overcome this issue.? I have tried (?) and $1 as a placeholder but no luck. I don't want to change the query or construct the query without dynamic parameters as it's used for other Database Drivers as well.

Upvotes: 0

Views: 782

Answers (3)

rio
rio

Reputation: 763

the best I've found is to use jinja macro in dbt and explode as column oriented your table adding a column for each pattern and then select what is true (maybe with un-pivot to filter out the un-match) ex

    SELECT
        description,

{% set patterns = dbt_utils.get_column_values(table=source('dev_schema', 'configs'), column='pattern') %}

{% for pattern in patterns %}
        
'{{ pattern }}' as pattern,
        regexp_count(description,'{{ pattern }}', 1, 'p')>0 "{{ pattern }}_case_sensitive_true__word_boundry_false",
        regexp_count(description,'{{ pattern }}', 1, 'i')>0 "{{ pattern }}_case_sensitive__false_word_boundry__false",
        regexp_count(description,'\\s{{ pattern }}\\s', 1, 'p')>0 "{{ pattern }}_case_sensitive__true_word_boundry__true",
        regexp_count(description,'\\s{{ pattern }}\\s', 1, 'i')>0 "{{ pattern }}_case_sensitive__false_word_boundry__true"
         {% if not loop.last %},{% endif %}
{% endfor %}

    FROM {{ ref('bigtable') }} j
   

quite a HACK but it works when pattern is a few thousands ofc :)

Upvotes: 0

Mark Rotteveel
Mark Rotteveel

Reputation: 108961

The error says it must be a literal character expression. In another words, it cannot be a parameter, because a parameter is not a literal character expression.

I would suggest filing a feature request with Amazon to get this changed. As a workaround, you can use a literal in your query text.

Upvotes: 0

Tim Biegeleisen
Tim Biegeleisen

Reputation: 520958

Try binding the regex literal as a string:

String pattern = "[\\w\\-\\.]+@([\\w\\-]+\\.)+[\\w\\-]{2,4}";
PreparedStatement statement = conn.prepareStatement("SELECT SUM(CASE WHEN REGEXP_COUNT(email, ?) > 0 THEN 1 ELSE 0 END) AS email_1 FROM \"testschema\".\"test_table\"");
statement.setString(1, pattern);

Upvotes: 0

Related Questions