Reputation: 69
I am reading some example code from the PySpark documentation:
https://spark.apache.org/docs/latest/api/python/pyspark.sql.html#pyspark.sql.SQLContext
One of the examples creates a DataFrame like this:
from pyspark.sql.functions import add_months
# `spark` is an existing SparkSession
df = spark.createDataFrame([('2015-04-08',)], ['dt'])
df.select(add_months(df.dt, 1).alias('next_month')).collect()
# [Row(next_month=datetime.date(2015, 5, 8))]
I am wondering why there must be a comma after '2015-04-08' when there is only one column. I suspect it has something to do with the tuple type, but I would like to learn more.
Upvotes: 2
Views: 1776
Reputation: 4089
A single-element tuple needs a trailing comma (',') to distinguish it from a parenthesized expression such as (1), where the parentheses only group the value. The examples below should make this clearer.
Parenthesized expression:
a = (1)
type(a)
#int
Tuple with a single element:
b = (1,)
type(b)
#tuple
You can define a zero-element tuple with empty parentheses:
zero_element_tuple = ()
type(zero_element_tuple)
#tuple
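Only a single-element tuple requires the trailing comma to distinguish it from a parenthesized expression; a tuple with multiple elements does not need a comma at the end. In your example, each row passed to createDataFrame is a tuple, so ('2015-04-08',) is a row with one column, whereas ('2015-04-08') would be just a string.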
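To tie this back to the question, here is a minimal sketch in plain Python (no Spark needed) of what the list of rows contains with and without the trailing comma. The variable names are made up for illustration and are not from the documentation.

```python
# Without the trailing comma the parentheses only group the value,
# so the list holds a plain string.
rows_without_comma = [('2015-04-08')]

# With the trailing comma the list holds a one-element tuple,
# i.e. a row with a single column.
rows_with_comma = [('2015-04-08',)]

print(type(rows_without_comma[0]))  # <class 'str'>
print(type(rows_with_comma[0]))     # <class 'tuple'>
print(len(rows_with_comma[0]))      # 1

# It is the comma, not the parentheses, that creates the tuple.
c = 1,
print(type(c))  # <class 'tuple'>

# Multi-element tuples do not need a trailing comma, though one is allowed.
print((1, 2) == (1, 2,))  # True
```

The same reasoning applies to any API that expects a sequence of tuples: a one-field record has to be written with the trailing comma.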
Upvotes: 1