Reputation: 943
I'm trying to deploy a new data publisher CAR file. I looked at the APIM_LAST_ACCESS_TIME_SCRIPT.xml Spark script (used by API Manager) and didn't understand the difference between the two temporary tables it creates: API_LAST_ACCESS_TIME_SUMMARY_FINAL and APILastAccessSummaryData
Upvotes: 2
Views: 303
Reputation: 344
The two Spark temporary tables represent two different underlying tables (possibly in different datasources), where one of them acts as the source for Spark and the other acts as the destination.
To illustrate this better, have a look at the simplified script in question:
CREATE TEMPORARY TABLE APILastAccessSummaryData USING CarbonJDBC OPTIONS (dataSource "WSO2AM_STATS_DB", tableName "API_LAST_ACCESS_TIME_SUMMARY", ...);
CREATE TEMPORARY TABLE API_LAST_ACCESS_TIME_SUMMARY_FINAL USING CarbonAnalytics OPTIONS (tableName "API_LAST_ACCESS_TIME_SUMMARY", ...);
INSERT INTO TABLE APILastAccessSummaryData SELECT ... FROM API_LAST_ACCESS_TIME_SUMMARY_FINAL;
As you can see, we're first creating a temporary table in Spark with the name APILastAccessSummaryData, which represents an actual relational DB table named API_LAST_ACCESS_TIME_SUMMARY in the WSO2AM_STATS_DB datasource. Note the USING CarbonJDBC clause, which directly maps a JDBC table into Spark. Such tables (and their rows) are not encoded, and can be read by the user.
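For illustration, a fully spelled-out CarbonJDBC definition might look something like the following. The schema and primaryKeys options are part of the general CarbonJDBC syntax, but the column list here is an assumption for the sake of the example; check the actual script for the real field names:

CREATE TEMPORARY TABLE APILastAccessSummaryData USING CarbonJDBC OPTIONS (
    dataSource "WSO2AM_STATS_DB",
    tableName "API_LAST_ACCESS_TIME_SUMMARY",
    -- illustrative column list; the real script defines its own fields
    schema "tenantDomain STRING, apiPublisher STRING, api STRING, userId STRING, accessTime LONG",
    primaryKeys "tenantDomain, apiPublisher, api"
);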
Second, we're creating another Spark temporary table with the name API_LAST_ACCESS_TIME_SUMMARY_FINAL. Here, however, we're using the CarbonAnalytics analytics provider, which means that this table will not be a vanilla JDBC table, but an encoded table similar to the one from your previous question.
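Again for illustration, a CarbonAnalytics-backed definition follows the same pattern, with the provider handling the encoding transparently (as above, the schema and column names below are assumptions for the example):

CREATE TEMPORARY TABLE API_LAST_ACCESS_TIME_SUMMARY_FINAL USING CarbonAnalytics OPTIONS (
    tableName "API_LAST_ACCESS_TIME_SUMMARY",
    -- illustrative column list; rows live in the encoded Analytics data layer
    schema "tenantDomain STRING, apiPublisher STRING, api STRING, userId STRING, accessTime LONG"
);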
Now, from the third statement, you can see that we're reading (SELECT) a number of fields from the second table, API_LAST_ACCESS_TIME_SUMMARY_FINAL, and inserting them (INSERT INTO) into the first, APILastAccessSummaryData. This represents the Spark summarisation process.
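Spelled out with the illustrative columns from above (again an assumption; the real script selects its own field list), that summarisation step would read:

INSERT INTO TABLE APILastAccessSummaryData
SELECT tenantDomain, apiPublisher, api, userId, accessTime
FROM API_LAST_ACCESS_TIME_SUMMARY_FINAL;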
For more details on the differences between the CarbonAnalytics and CarbonJDBC analytics providers, or on how Spark handles such tables in general, have a look at the documentation page for Spark Query Language.
Upvotes: 2