Reputation: 2075
When I create a table using SQL in Spark, for example:
sql('CREATE TABLE example AS SELECT a, b FROM c')
How can I pull that table into the Python namespace (I can't think of a better term) so that I can update it? Let's say I want to replace NaN values in the table like so:
import pyspark.sql.functions as F

table = sql('SELECT * FROM example')
for column in table.columns:
    table = table.withColumn(column, F.when(F.isnan(F.col(column)), None).otherwise(F.col(column)))
Does this operation update the original example table created with SQL? If I were to run sql('SELECT * FROM example').show(), would I see the updated results? When the original CREATE TABLE example ... SQL runs, is example automatically added to the Python namespace?
Upvotes: 0
Views: 111
Reputation: 5124
The sql function returns a new DataFrame, so the table is not modified. If you want to write a DataFrame's contents into a table created in Spark, do it like this:
table.write.mode("append").saveAsTable("example")
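
As a usage note, mode("append") adds the DataFrame's rows on top of whatever is already in example. If the goal is instead to replace the table's contents with the cleaned-up data, overwrite mode is the usual alternative; a minimal sketch (the target name example_clean is hypothetical, chosen so you don't overwrite the same table you are reading from, which some Spark versions reject):

# Write the cleaned DataFrame to a new managed table, replacing it if it already exists.
# Overwriting `example` itself while still reading from it can fail with
# "Cannot overwrite a path that is also being read from" in some Spark versions.
table.write.mode("overwrite").saveAsTable("example_clean")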
But what you are doing actually changes the schema of the table; in that case, create a new table from a temporary view:
table.createOrReplaceTempView("mytempTable")
sql("create table example2 as select * from mytempTable")
Upvotes: 1