Reputation: 163
I just started using Spark SQL to load data from an H2 database. Here is what I did, following the Spark SQL documentation:
>>> sqlContext = SQLContext(sc)
>>> df = sqlContext.load(source="jdbc",driver="org.h2.Driver", url="jdbc:h2:~/test", dbtable="RAWVECTOR")
But it failed with errors; I think the problem is that the username and password are not specified in the call.
These are the parameters from the Spark SQL 1.3.1 documentation:
url
    The JDBC URL to connect to.
dbtable
    The JDBC table that should be read. Note that anything that is valid in a FROM clause of a SQL query can be used. For example, instead of a full table you could also use a subquery in parentheses.
driver
    The class name of the JDBC driver needed to connect to this URL. This class will be loaded on the master and workers before running any JDBC commands, to allow the driver to register itself with the JDBC subsystem.
partitionColumn, lowerBound, upperBound, numPartitions
    These options must all be specified if any of them is specified. They describe how to partition the table when reading in parallel from multiple workers. partitionColumn must be a numeric column from the table in question.
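To illustrate the last group of options, a partitioned read could be sketched with an options dict like the one below. The column name "ID" and the bounds are my assumptions, not taken from the question's schema; partitionColumn must be a numeric column of the table being read.

```python
# Sketch of options for a partitioned JDBC load in Spark SQL 1.3.x.
# "ID", the bounds, and numPartitions are placeholder values.
options = {
    "source": "jdbc",
    "driver": "org.h2.Driver",
    "url": "jdbc:h2:tcp://localhost/~/test",
    "dbtable": "RAWVECTOR",
    "partitionColumn": "ID",   # numeric column to split on
    "lowerBound": "1",         # minimum value of partitionColumn
    "upperBound": "100000",    # maximum value of partitionColumn
    "numPartitions": "4",      # number of parallel reads
}

# With a live SparkContext, the call would be:
# df = sqlContext.load(**options)
```

Spark splits the partitionColumn range [lowerBound, upperBound] into numPartitions slices, and each worker reads one slice.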
But I couldn't find any clue about how to pass the database username and password to sqlContext.load. Has anyone seen a similar case or have any clues?
Thanks.
Upvotes: 3
Views: 9942
Reputation: 163
I figured it out. Just do:
df = sqlContext.load(
    source="jdbc", driver="org.h2.Driver",
    url="jdbc:h2:tcp://localhost/~/test?user=sa&password=1234",
    dbtable="RAWVECTOR"
)
And when you create the database, use the same pattern:
conn = DriverManager.getConnection(
    "jdbc:h2:tcp://localhost/~/" + dbName + "?user=sa&password=1234",
    null, null
);
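To keep the two URLs in sync, the pattern can be captured in a small helper. This is just a sketch; the function name is mine, not part of any Spark or H2 API.

```python
# Hypothetical helper that builds an H2 JDBC URL with embedded
# credentials, matching the user=...&password=... pattern used above.
def h2_url(db_name, user, password, host="localhost"):
    return "jdbc:h2:tcp://{0}/~/{1}?user={2}&password={3}".format(
        host, db_name, user, password)

url = h2_url("test", "sa", "1234")
# url == "jdbc:h2:tcp://localhost/~/test?user=sa&password=1234"
```

The same string can then be passed as the url argument to sqlContext.load.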
And here is a blog about how to use the API.
Upvotes: 5