Reputation: 245
I have a HIVE-table with a column name similar to: column_"COLUMN_NAME"
My original query is as follows.
spark.sql("SELECT from_unixtime(unix_timestamp(substr(time, 1, 23), 'dd.MM.yyyy HH:mm:ss.SSS')) AS timestamp, cast(regexp_replace(column_"COLUMN_NAME"",',','.') as float) AS Column FROM table")
This does, evidently, not work. I've tried various ways to escape the quotation-marks in the column name, but neither backslash nor backticks solves the issue.
Any ideas?
Upvotes: 4
Views: 15474
Reputation: 43504
You have two options here, but in both cases you need to wrap the column name containing the double quote in backticks.
data = [
('01.01.2019 12:34:56.78910', '123,456')
]
df = spark.createDataFrame(data, ['time', 'column_"COLUMN_NAME"'])
df.show()
#+-------------------------+--------------------+
#|time |column_"COLUMN_NAME"|
#+-------------------------+--------------------+
#|01.01.2019 12:34:56.78910|123,456 |
#+-------------------------+--------------------+
# register this as a temp table
df.createOrReplaceTempView("table")
query = """SELECT
from_unixtime(unix_timestamp(substr(time, 1, 23), 'dd.MM.yyyy HH:mm:ss.SSS')) AS timestamp,
cast(regexp_replace(`column_"COLUMN_NAME"`,',','.') as float) AS Column
FROM table"""
spark.sql(query).show()
#+-------------------+-------+
#| timestamp| Column|
#+-------------------+-------+
#|2019-01-01 12:34:56|123.456|
#+-------------------+-------+
query = "SELECT from_unixtime(unix_timestamp(substr(time, 1, 23), 'dd.MM.yyyy HH:mm:ss.SSS')) AS timestamp, cast(regexp_replace(`column_\"COLUMN_NAME\"`,',','.') as float) AS Column FROM table"
spark.sql(query).show()
#Same as above
Upvotes: 2
Reputation: 1528
Try this:
df.show()
+----+--------------------+
|col1|column_"COLUMN_NAME"|
+----+--------------------+
| 1| 123|
| 2| 245|
+----+--------------------+
from pyspark.sql import HiveContext
sqlCtx= HiveContext(sc)
df.registerTempTable("table")
sqlCtx= HiveContext(sc)
qry = """select col1,`column_"COLUMN_NAME"` from table"""
sqlCtx.sql(qry).show()
Output:
+----+--------------------+
|col1|column_"COLUMN_NAME"|
+----+--------------------+
| 1| 123|
| 2| 245|
+----+--------------------+
Upvotes: 0