oscilatingcretin

Reputation: 10919

Teradata SQLA: Row size or Sort Key size overflow

When selecting all columns from a table consisting of 86 columns in SQLA, I always get the error Row size or Sort Key size overflow. The only way I've found to avoid this error is to trim down the number of columns in the select, but that is a workaround, not a solution. There has to be a way to select all columns from this table in a single select statement.
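A minimal sketch of how one might gauge how wide the rows can get, assuming access to the DBC.ColumnsV dictionary view (database and table names as in the DDL shared later in the question; ColumnLength is, to my understanding, the declared maximum in bytes):

-- Sum the declared maximum column lengths for the table. UNICODE
-- columns count two bytes per character in this figure.
SELECT COUNT(*) AS column_cnt,
       SUM(ColumnLength) AS declared_bytes
FROM   DBC.ColumnsV
WHERE  DatabaseName = 'mydb'
AND    TableName = 'mytable';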

Bounty

I am adding this bounty because I cannot hack my way past this issue any longer. There has to be a solution. Right now, I am selecting from a table with Unicode columns, and I am assuming this is what causes the row size to exceed capacity. When I remove Session Character Set=UTF8 from my connection string, I instead get the error The string contains an untranslatable character. I am using the .NET Data Provider 14.0.0.1. Is there a way to increase the size limit?
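For reference, a hedged example of the connection-string fragment involved; only the Session Character Set key is taken from the question, and the other keys and values are placeholders:

Data Source=my_tdpid;User Id=my_user;Password=my_password;Session Character Set=UTF16;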

Update

Rob, you never cease to impress! Your suggestion of using UTF16 works. It even works in SQLA after I update my ODBC config. I think my problem all along was my lack of understanding of ASCII, Latin, UTF8, and UTF16.

We also have an 80-column table that consists of all Latin columns, a few of which are `varchar(1000)`. I get the same error in SQLA when selecting from it in UTF8 and UTF16, but I can select from it just fine after switching my ODBC config to the ASCII or Latin character set.

Rob, can you provide insight into what's happening here? My theory is that, because the data is in the Latin set, using UTF8 or UTF16 forces a conversion to a wider encoding, which blows past the row size limit, especially for the varchar(1000) columns. If I use Latin as my session character set, no conversion is done and I get the strings in their native encoding. As for the issue in question, does UTF8 fail because the encoding cannot be "downgraded"?
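A small, hedged illustration of the width difference on the server side (OCTET_LENGTH returns the byte count of a string's internal representation; the session character set then governs how many bytes per character are reserved when rows are returned to the client):

-- Same text measured in each server character set:
-- LATIN stores 1 byte per character, UNICODE stores 2.
SELECT OCTET_LENGTH(CAST('abc' AS VARCHAR(10) CHARACTER SET LATIN))   AS latin_bytes,   -- 3
       OCTET_LENGTH(CAST('abc' AS VARCHAR(10) CHARACTER SET UNICODE)) AS unicode_bytes; -- 6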

Per request, here is the DDL of the table in question:

CREATE MULTISET TABLE mydb.mytable ,NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO
     (
      FIELD1 VARCHAR(214) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
      FIELD2 VARCHAR(30) CHARACTER SET UNICODE CASESPECIFIC,
      FIELD3 VARCHAR(60) CHARACTER SET UNICODE CASESPECIFIC NOT NULL,
      FIELD4 VARCHAR(4000) CHARACTER SET UNICODE CASESPECIFIC,
      FIELD5 VARCHAR(900) CHARACTER SET UNICODE CASESPECIFIC,
      FIELD6 VARCHAR(900) CHARACTER SET UNICODE CASESPECIFIC,
      FIELD7 VARCHAR(900) CHARACTER SET UNICODE CASESPECIFIC,
      FIELD8 VARCHAR(900) CHARACTER SET UNICODE CASESPECIFIC,
      FIELD9 VARCHAR(900) CHARACTER SET UNICODE CASESPECIFIC,
      FIELD10 VARCHAR(900) CHARACTER SET UNICODE CASESPECIFIC,
      FIELD11 VARCHAR(3600) CHARACTER SET UNICODE CASESPECIFIC,
      FIELD12 VARCHAR(3600) CHARACTER SET UNICODE CASESPECIFIC,
      FIELD13 VARCHAR(3600) CHARACTER SET UNICODE CASESPECIFIC,
      FIELD14 VARCHAR(3600) CHARACTER SET UNICODE CASESPECIFIC)
PRIMARY INDEX ( FIELD1 );
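Back-of-the-envelope math against this DDL, using the per-character export sizes cited in the answer below and assuming Teradata's roughly 64 KB response-row ceiling (exact limits vary by release):

-- UNICODE capacity in the DDL: 30 + 60 + 4000 + 6*900 + 4*3600 = 23890 characters.
SELECT 23890 * 3 AS utf8_export_bytes,   -- 71670: past a ~64 KB response row
       23890 * 2 AS utf16_export_bytes;  -- 47780: fits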

Upvotes: 1

Views: 6703

Answers (1)

Rob Paller

Reputation: 7786

Without seeing your table definition, have you considered using UTF16 instead of UTF8 for your SESSION CHARSET?

Some more research on your error message turned up this post, which suggests that UTF16 may afford you the ability to return records that UTF8 otherwise will not.

Edit: If you recall from the link I shared above, for a given VARCHAR(n) the bytes required are as follows:

  • LATIN: n bytes
  • UTF8: n*3 bytes
  • UTF16: n*2 bytes

This would mean that a VARCHAR(4000) UNICODE field in a UTF8 session should require 12KB. If you have to deal with UNICODE data consistently, it may be to your advantage to change your default session character set to UTF16 and leave it there. In my experience I have not had to work with UNICODE data, so I couldn't tell you what pitfalls changing your character set may introduce for LATIN data elsewhere in your database(s).
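One way to confirm which character set a session actually logged on with (it is fixed at logon and cannot be changed mid-session) is the standard HELP SESSION statement:

-- Reports the current session character set among other session settings.
HELP SESSION;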

Hope this helps.

Upvotes: 3
