Jason
Jason

Reputation: 2076

Tables created in Snappy shell do not show up in JDBC or Pulse

SnappyData v.0-5

The issue I am having is that my JDBC Connection's Table metadata and Pulse Web App do not see the table I created below.

I create a table in SnappyData using the shell and a csv file.

Data is here (roads.csv):

"roadId","name"
"1","Road 1"
"2","Road 2"
"3","Road 3"
"4","Road 4"
"5","Road 5"
"6","Road 6"
"7","Road 7"
"8","Road 8"
"9","Road 9"
"10","Road 10"

==========================================================

snappy>  CREATE TABLE STAGING_ROADS
(road_id string, name string)
USING com.databricks.spark.csv
OPTIONS(path '/home/ubuntu/data/example/roads.csv', header 'true');


snappy> select * from STAGING_ROADS

Returns 10 rows.

I have a SnappyData JDBC connection (DBVisualizer & SquirrelSQL show same). I cannot see that table in the "TABLES" list from metadata. However, if I do a "select * from STAGING_ROADS".

Returns 10 rows with CLOBs, which btw are completely unusable.

road_id  |      name
=====================
CLOB        CLOB
CLOB        CLOB
CLOB        CLOB
CLOB        CLOB
CLOB        CLOB
CLOB        CLOB
CLOB        CLOB
CLOB        CLOB
CLOB        CLOB
CLOB        CLOB

Second, the Pulse Web App does not register that I create the table when I did it from the snappy> shell. However, if I run a CREATE TABLE command from the JDBC client, it shows up there fine.

Am I doing something incorrectly? How can I get metadata about the tables I create in snappy> shell to show up in JDBC and Pulse as well?

Upvotes: 1

Views: 148

Answers (2)

Sumedh
Sumedh

Reputation: 383

The issue I am having is that my JDBC Connection's Table metadata and Pulse Web App do not see the table I created below.

This is a known issue (https://jira.snappydata.io/browse/SNAP-303). The JDBC metadata shows only the items in the store and not the external table. While the metadata issue is being tracked, Pulse webapp will not be able to see such external tables since it is designed to monitor the snappydata store.

A note: the "CREATE TABLE" DDL has been changed to "CREATE EXTERNAL TABLE" (https://github.com/SnappyDataInc/snappydata/pull/311) for sources outside of store to make things clearer.

How can I get metadata about the tables I create in snappy> shell to show up in JDBC and Pulse as well?

It will show up for internal SnappyData sources: column and row tables. For other providers in USING, they will not show up as mentioned.

CSV tables are usually useful for only loading data into column or row tables as in the example provided by @jagsr.

Upvotes: 2

jagsr
jagsr

Reputation: 535

Didn't think creating a table using SQL where Spark.csv is the data source has been tested. Here is a related JIRA - https://jira.snappydata.io/browse/SNAP-416.

We have been suggesting folks to use a Spark Job to load the data in parallel. You can do this using the spark-shell also.

        stagingRoadsDataFrame = snappyContext.read
        .format("com.databricks.spark.csv") 
        .option("header", "true") // Use first line of all files as header
        .option("inferSchema", "true") // Automatically infer data types
        .load(someFile)
// Save Dataframe as a Row table
      stagingRoadsDatFrame.write.format("row").options(props).saveAsTable("staging_roads")

That said, could you try (perhaps this might work)-

 CREATE TABLE STAGING_ROADS (road_id varchar(100), name varchar(500))

Note that there is no 'String' as a data type in SQL. By default, with no knowledge of the max length we convert this to a CLOB. We are working to resolve this issue too.

Upvotes: 1

Related Questions