Alberto Bonsanto

Reputation: 18042

Is it possible to instantiate a DataFrame with a unicode column?

I am trying to create a DataFrame with a column that stores unicode data rather than standard Python strings, because my language uses accented letters such as ñ, á and é.

I tried the following.

x = sqlContext.createDataFrame([u"A", u"B", u"C"], ["letters"])

It raised the following exception.

TypeError: Can not infer schema for type: <type 'unicode'>

I then read the data types documentation and didn't find a compatible type. Does anyone know whether this is possible?

Upvotes: 1

Views: 1271

Answers (1)

zero323

Reputation: 330413

The problem is how you provide the elements, not the unicode data. Even if you have only a single column, every element should be of a supported row type such as Row, list or tuple:

df = sqlContext.createDataFrame([(u"A", ), (u"B", ), (u"C", )], ["letters"])
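If the values already sit in a flat list, a small helper can do the wrapping. This is only a sketch: `to_rows` is a hypothetical name, not part of Spark, and the final call is left as a comment because it assumes a live `sqlContext` like the one in the question.

```python
# -*- coding: utf-8 -*-

def to_rows(values):
    """Wrap each scalar in a one-element tuple so it forms a one-column row.

    Hypothetical helper for illustration; not part of the Spark API.
    """
    return [(value,) for value in values]


letters = [u"ñ", u"á", u"é"]
rows = to_rows(letters)

# Assuming a live SQLContext bound to `sqlContext`, as in the question:
# df = sqlContext.createDataFrame(rows, ["letters"])
```

Each element of `rows` is now a tuple of length one, which is the shape used in the call above.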

Upvotes: 1
