Reputation: 36166
I have an SSAS tabular mode cube that reads data from an Actian Matrix database using ODBC. The project processes fine when I use a data set with 1 million rows, but when I try a bigger one (300 million rows), processing runs for around 15 minutes and fails with the message:
The operation failed because the source database does not exist, the source table does not exist, or because you do not have access to the data source.
More Details:
OLE DB or ODBC error: [ParAccel][ODBC Driver][PADB]57014:ERROR: Query (25459) cancelled on user's request
DETAIL: Query (25459) cancelled on user's request
; 57014.
An error occurred while processing the partition 'XXXX' in table 'YYYY'.
The current operation was cancelled because another operation in the transaction failed.
The message says that the database doesn't exist, but that makes no sense because processing works perfectly fine in the first case (and the only difference between the two is a WHERE clause that limits the number of rows).
I'm using a server with 96 GB of free RAM, and I can see all of it being consumed while processing runs. Once it is fully consumed, the process runs for a few extra seconds and then fails. I also know for a fact that the 300 million row dataset, exported to a CSV file, is 36 GB in its raw format, so it should fit fully in memory even without any compression.
I can also guarantee that the query runs fine on its own against the source database, so the "Query (25459) cancelled on user's request" message doesn't make much sense either.
Does anyone have any idea what may be going on?
Upvotes: 1
Views: 915
Reputation: 1
The memory consumed by what is derived from the input rows (the resulting cube) cannot be estimated from the byte size of the input. It is a function of the Cartesian product of the distinct values of the cube's dimensions: with dimensions D1 … Dn, up to (|D1|+1) × … × (|Dn|+1) aggregate cells can be materialized, the +1 in each factor accounting for the NULL "all" member.
If you were building a cube from 2 input rows with 2 dimensions (State, City) and 2 measures (Population and a record count), the input would be:

State|City|Population
---------------------
NY|New York|8406000
CA|Los Angeles|3884000

and the resulting aggregations (NULL marking the "all" member) would be:

State|City|Population|Number of records
---------------------------------------
NULL|NULL|12290000|2
NY|NULL|8406000|1
NY|New York|8406000|1
CA|NULL|3884000|1
CA|Los Angeles|3884000|1
NULL|Los Angeles|3884000|1
NULL|New York|8406000|1
You can't expect the output generated while the input rows are processed to be equivalent in size to the input. And if the ODBC driver keeps the entire input in memory before it lets you read it, you would have to account for both the input and the output residing in memory until cube generation completes.
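To make the blow-up concrete, here is a minimal Python sketch of CUBE-style aggregation (an illustration of the idea, not of SSAS's actual storage engine): for every subset of the dimensions, group the input rows and accumulate the measures. The two input rows above already produce the seven aggregate rows shown.

    # Minimal sketch of CUBE-style aggregation: for every subset of the
    # dimensions, group the input rows and accumulate the measures.
    from itertools import combinations

    # The two input rows from the example above.
    rows = [
        {"State": "NY", "City": "New York",    "Population": 8406000},
        {"State": "CA", "City": "Los Angeles", "Population": 3884000},
    ]
    dimensions = ["State", "City"]

    cells = {}
    for r in range(len(dimensions) + 1):      # () = grand total ... all dims
        for dims in combinations(dimensions, r):
            for row in rows:
                # Dimension value where grouped, None (the NULL "all" member) elsewhere.
                key = tuple(row[d] if d in dims else None for d in dimensions)
                agg = cells.setdefault(key, {"Population": 0, "Records": 0})
                agg["Population"] += row["Population"]
                agg["Records"] += 1

    print(len(cells))   # 7 aggregate cells from only 2 input rows
    for key, agg in sorted(cells.items(), key=str):
        print(key, agg)

Each additional dimension multiplies the number of potential cells by (its distinct count + 1), which is how the memory footprint can dwarf the raw byte size of the input.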
This answer is much clearer on the subject: How to calculate the likely size of an OLAP cube
Upvotes: 0