Reputation: 111
I was trying to load data from an Oracle database using the Spark data source API.
Since I need to load data by query, I used the query below which I put together from some examples online:
Map<String, String> options = new HashMap<>();
options.put("driver", MYSQL_DRIVER);
options.put("user", MYSQL_USERNAME);
options.put("password", MYSQL_PWD);
options.put("url", MYSQL_CONNECTION_URL);
options.put("dbtable", "(select emp_no, emp_id from employees) as employees_data");
options.put("lowerBound", "10001");
options.put("upperBound", "499999");
options.put("numPartitions", "10");
DataFrame jdbcDF = sqlContext.load("jdbc", options);
This gets an exception:
Exception in thread "main" java.sql.SQLSyntaxErrorException: ORA-00933: SQL command not properly ended
I doubt that we can't give "as employees_data" for an Oracle query, so what am I doing wrong?
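As an aside, Spark's JDBC source also expects a partitionColumn whenever lowerBound, upperBound and numPartitions are set. A minimal, compilable sketch of the complete option map follows; the driver class, URL and the choice of emp_no as the partition column are assumptions, not taken from the question:

```java
import java.util.HashMap;
import java.util.Map;

public class PartitionedJdbcOptions {
    // Assembles the JDBC options from the question, plus the
    // partitionColumn Spark requires once bounds are given.
    // "emp_no" as partition column and the URL are assumptions.
    static Map<String, String> build() {
        Map<String, String> options = new HashMap<>();
        options.put("driver", "oracle.jdbc.OracleDriver");
        options.put("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL"); // placeholder
        // Note: the subquery alias is written without AS, which
        // Oracle rejects before table aliases.
        options.put("dbtable", "(select emp_no, emp_id from employees) employees_data");
        options.put("partitionColumn", "emp_no");
        options.put("lowerBound", "10001");
        options.put("upperBound", "499999");
        options.put("numPartitions", "10");
        return options;
    }
}
```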
Upvotes: 1
Views: 2505
Reputation: 11
Try this...
Map<String, String> oracleOptions = new HashMap<>();
oracleOptions.put("driver", "oracle.jdbc.OracleDriver");
oracleOptions.put("url", "jdbc:oracle:thin:username/password@//hostName/instanceName");
oracleOptions.put("dbtable", "tableName");
DataFrame dataFrame = hContext.read().format("jdbc").options(oracleOptions).load().select("col1", "col2" /* , ... */);
Where hContext is a HiveContext instance. If you need a selection (a WHERE condition), chain it as follows:
DataFrame dataFrame = hContext.read().format("jdbc").options(oracleOptions).load().select("col1", "col2" /* , ... */).where("expr");
Upvotes: 0
Reputation: 191245
I doubt that we can't give "as employees_data" for an Oracle query
You may doubt it, but you can't use AS for a table alias in Oracle. You can for column aliases, where it is optional, but it is not allowed for table aliases. You can see that in the syntax diagram.
Assuming Spark doesn't mind the alias itself, you can just remove the AS:
options.put("dbtable", "(select emp_no, emp_id from employees) employees_data");
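For completeness, a compilable sketch of the corrected option map; the driver class name and connection URL here are placeholders, not values from the question:

```java
import java.util.HashMap;
import java.util.Map;

public class OracleJdbcOptions {
    // Builds the Spark JDBC options with an Oracle-friendly
    // subquery alias: the alias follows the subquery directly,
    // with no AS keyword in between.
    static Map<String, String> build() {
        Map<String, String> options = new HashMap<>();
        options.put("driver", "oracle.jdbc.OracleDriver");
        options.put("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL"); // placeholder
        options.put("dbtable", "(select emp_no, emp_id from employees) employees_data");
        return options;
    }
}
```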
Upvotes: 5