Douglas M

Reputation: 1126

numpy to spark error: TypeError: Can not infer schema for type: <class 'numpy.float64'>

When I try to convert a numpy array into a Spark DataFrame, I get the error "Can not infer schema for type: <class 'numpy.float64'>". The same thing happens with numpy.int64 arrays.

Example:

import numpy
df = spark.createDataFrame(numpy.arange(10.))

TypeError: Can not infer schema for type: <class 'numpy.float64'>

Upvotes: 0

Views: 5026

Answers (2)

blackbishop

Reputation: 32670

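Without pandas, convert each value to a native Python float and wrap it in a one-element tuple so that Spark can infer the schema: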

import numpy
df = spark.createDataFrame([(float(i),) for i in numpy.arange(10.)])
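
For int64 or 2-D arrays, the same idea can be written with numpy's `tolist()`, which returns native Python numbers instead of numpy scalars. A minimal sketch follows; the helper name `numpy_to_rows` and the column name "value" are made up for illustration, and the commented-out Spark call assumes a SparkSession named `spark` already exists.

```python
import numpy


def numpy_to_rows(arr):
    """Turn a 1-D or 2-D numpy array into a list of tuples of native Python values."""
    arr = numpy.asarray(arr)
    if arr.ndim == 1:
        # One column: wrap each scalar in a one-element tuple.
        return [(value,) for value in arr.tolist()]
    if arr.ndim == 2:
        # One tuple per row; tolist() has already converted the scalars.
        return [tuple(row) for row in arr.tolist()]
    raise ValueError("expected a 1-D or 2-D array")


rows = numpy_to_rows(numpy.arange(10.))

# Assumes an existing SparkSession called `spark`:
# df = spark.createDataFrame(rows, ["value"])
```

Because the rows hold plain float or int values, Spark should no longer see numpy.float64 or numpy.int64 when inferring the schema.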

Upvotes: 1

Douglas M

Reputation: 1126

A quick conversion to a pandas DataFrame works nicely:

import pandas
import numpy
df = spark.createDataFrame(pandas.DataFrame(numpy.arange(10.)))
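
One small variation, offered as a sketch: building the pandas DataFrame from a dict gives the column an explicit name rather than the default integer label 0. The name "value" is only an example, and the commented-out Spark call again assumes an existing SparkSession named `spark`.

```python
import numpy
import pandas

# Name the column explicitly; pandas would otherwise label it with the integer 0.
pdf = pandas.DataFrame({"value": numpy.arange(10.)})

# Assumes an existing SparkSession called `spark`:
# df = spark.createDataFrame(pdf)
```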

Upvotes: 0
