GeorgeOfTheRF

Reputation: 8854

How to create spark dataframe with column name which contains dot/period?

I have data in a list and want to convert it to a spark dataframe with one of the column names containing a "."

I wrote the code below, which ran without any errors.

input_data = [('retail', '2017-01-03T13:21:00', 134),
                     ('retail', '2017-01-03T13:21:00', 100)]
rdd_schema = StructType([StructField('business', StringType(), True), \
                         StructField('date', StringType(), True), \
                         StructField("`US.sales`", FloatType(), True)])
input_mock_df = spark.createDataFrame(input_mock_rdd_map, rdd_schema)

The code below returns the column names:

input_mock_df.columns

But any operation on this dataframe gives an error, for example:

input_mock_df.count()

How do I make a valid spark dataframe which contains a "."?

Upvotes: 1

Views: 9784

Answers (1)

Ankit Kumar Namdeo

Reputation: 1464

I ran the code below:

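from pyspark.sql.types import StructType, StructField, StringType, IntegerType

input_data = [('retail', '2017-01-03T13:21:00', 134),
              ('retail', '2017-01-03T13:21:00', 100)]
# Plain column name without backticks; IntegerType matches the int values in the data
rdd_schema = StructType([StructField('business', StringType(), True),
                         StructField('date', StringType(), True),
                         StructField("US.sales", IntegerType(), True)])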

input_mock_df = sqlContext.createDataFrame(input_data, rdd_schema)

input_mock_df.count()

and it works fine, returning a count of 2. Please try it and reply.
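One thing to watch for once the dataframe exists: when a column is referenced by name, Spark normally reads the dot as struct field access, so the backticks belong in the reference and not in the schema. Here is a minimal sketch of that idea. `quote_column` is a hypothetical helper (not part of PySpark), and the commented-out calls assume the `input_mock_df` built above.

```python
def quote_column(name: str) -> str:
    """Wrap a column name in backticks so Spark treats it as one identifier.

    Hypothetical helper, not part of PySpark. A backtick inside the name
    is doubled, which is how Spark SQL escapes it in a quoted identifier.
    """
    return "`" + name.replace("`", "``") + "`"


# The schema keeps the plain name; only references to the column are quoted.
sales_ref = quote_column("US.sales")
print(sales_ref)

# Assuming input_mock_df from the code above exists, it would be used like this:
# input_mock_df.select(sales_ref).show()
# input_mock_df.selectExpr(sales_ref + " * 2 AS doubled").show()
```

Keeping the schema name plain and quoting only at the point of use means `input_mock_df.columns` still returns `US.sales` without stray backticks.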

Upvotes: 1
