Dan

Reputation: 31

Snowflake: Sqlalchemy ORM update variant field

I have an issue updating a Python model that represents a table in Snowflake using SQLAlchemy. One of the columns is a VARIANT field, and when I attempt to update the model and save it back to the database, Snowflake complains that the value is not a VARIANT. I've tried passing in the data as a dictionary as well as a JSON string; neither works.

The model property is defined as:

data = db.Column(snowsql.VARIANT)

where snowsql is an alias for the snowflake-sqlalchemy package. The package is installed, but it appears to provide only type dialects and not much else.
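
For reference, the setup looks roughly like this (the import form, table name and id column below are simplified placeholders; only the data column matches the real model):

# Rough sketch of the setup; the import form, table name and id column
# are placeholders, only the data column matches the real model.
from snowflake import sqlalchemy as snowsql

class Blah(db.Model):
    __tablename__ = 'blah'
    id = db.Column(db.Integer, primary_key=True)
    data = db.Column(snowsql.VARIANT)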

Does anyone have any experience doing this?

Update:

The data that I am adding is a Python list of dictionaries:

new_data['data'] = [{'component': None, 'display_name': 'blah', 'key': None, 'row': {'display_name': 'blah', 'type': 'blah'}, 'sub_type': None, 'type': 'blah'}]

Doing a json.dumps when setting the VARIANT property:

new_blah = Blah(
  data = json.dumps(new_data['data'])
)

Results in:

Expression type does not match column data type, expecting VARIANT but got VARCHAR(8905) for column DATA

The json produced by the dumps:

"[{\"component\":null,\"display_name\":\"blah\",\"key\":null,\"row\":{\"display_name\":\"blah\",\"type\":\"blah\"},\"sub_type\":null,\"type\":\"blah\"}]"

Alternatively, attempting to assign the list without doing a json.dumps:

sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 252004: None: Failed processing pyformat-parameters; 'dict' object has no attribute 'replace'

Upvotes: 3

Views: 2666

Answers (1)

Elias Mi

Reputation: 711

As far as I can tell, what you're trying is not currently possible. I was having the same issue, and even though I dug quite a bit into SQLAlchemy Custom Types, I couldn't solve it.

The issue is the following:

  1. To insert into a VARIANT column, you need to use the Snowflake function PARSE_JSON.
  2. To use that function, you need to be in the context of a SELECT clause, i.e. you can't use it with the INSERT ... VALUES format (see Snowflake Community).

I'm not aware of any way to force SQLAlchemy to create the subquery that would be required.

In case this changes in the future, here's how far I got:

from sqlalchemy import func
from sqlalchemy.types import TypeDecorator
from snowflake.sqlalchemy import VARIANT
import json


class SnowflakeJSON(TypeDecorator):
    impl = VARIANT

    def process_bind_param(self, value, dialect):
        # Serialize the Python object to a JSON string before binding
        return json.dumps(value)

    def process_result_value(self, value, dialect):
        # Parse the JSON string returned by Snowflake back into a Python object
        return json.loads(value)

    def bind_expression(self, bindparam):
        # Wrap the bound parameter in Snowflake's PARSE_JSON at the SQL level
        return func.PARSE_JSON(bindparam, type_=self)

    def copy(self, **kw):
        return SnowflakeJSON()

This generates a query like:

INSERT INTO my_table (key, data) 
VALUES ('001', PARSE_JSON('{"test": "hello world"}'));

However, as per the Snowflake Community entry above, the VALUES format does not work; you need the following format instead:

INSERT INTO my_table (key, data) 
SELECT '001', PARSE_JSON('{"test": "hello world"}');
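
That working form can still be issued from SQLAlchemy as a plain textual statement, bypassing the ORM for the insert. A minimal sketch (not part of the original answer; it assumes an existing engine bound to the Snowflake database and the my_table layout from the examples above):

# Not from the original answer: a sketch of running the working
# INSERT ... SELECT form directly, with PARSE_JSON applied server-side.
# Assumes an existing SQLAlchemy `engine` connected to Snowflake.
import json
from sqlalchemy import text

stmt = text(
    "INSERT INTO my_table (key, data) "
    "SELECT :key, PARSE_JSON(:data)"
)

with engine.begin() as conn:
    conn.execute(stmt, {"key": "001", "data": json.dumps({"test": "hello world"})})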

As a fallback, you can always use a column type which stores your JSON objects as VARCHAR:

import json
from sqlalchemy import String
from sqlalchemy.types import TypeDecorator


class SnowflakeJSON(TypeDecorator):
    impl = String

    def process_bind_param(self, value, dialect):
        # Serialize to a JSON string before storing it as VARCHAR
        return json.dumps(value)

    def process_result_value(self, value, dialect):
        # Parse the stored JSON string back into a Python object
        return json.loads(value)

    def copy(self, **kw):
        return SnowflakeJSON()
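
For completeness, here is roughly how the fallback type would be attached to a model in the Flask-SQLAlchemy style used in the question; the model and column names below are illustrative, matching the my_table (key, data) layout from the SQL examples:

# Hedged sketch of attaching the fallback type to a model; names are
# illustrative, not taken from the question.
class MyTable(db.Model):
    __tablename__ = 'my_table'
    key = db.Column(db.String, primary_key=True)
    data = db.Column(SnowflakeJSON())  # stored as VARCHAR, (de)serialized by the type above

row = MyTable(key='001', data={'test': 'hello world'})  # no manual json.dumps needed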

With that, you're obviously losing out on the benefits of storing your JSON string as a VARIANT data type. However, you can parse this at a later stage, even in a query:

SELECT PARSE_JSON(text)['test'] FROM my_table;

Upvotes: 2
