Allianz1

Reputation: 31

Training a Keras model on Spark using a 3D-shaped Spark DataFrame

I am trying to write a distributed Keras model in which both the data and the model itself can run on Spark in parallel. The problem is that I am working with 3D image data as input, and I am not sure how to build a Spark DataFrame from my 3D NumPy array so that I can train my Keras model with SystemML. I am using a pretrained Xception Keras model, so each input needs to have the shape (300, 300, 3).

I am following the example from this link:

from sklearn import datasets, neighbors
from pyspark.sql import DataFrame, SQLContext
import systemml as sml
import pandas as pd
import os, imp
# sc is the SparkContext provided by the pyspark shell
sqlCtx = SQLContext(sc)
digits = datasets.load_digits()
X_digits = digits.data
y_digits = digits.target + 1
n_samples = len(X_digits)
# Slice indices must be integers, so round the 90% cut-off down
n_train = int(.9 * n_samples)
# Split the data into training/testing sets and convert to PySpark DataFrame
X_df = sqlCtx.createDataFrame(pd.DataFrame(X_digits[:n_train]))
y_df = sqlCtx.createDataFrame(pd.DataFrame(y_digits[:n_train]))
ml = sml.MLContext(sc)
# Get the path of MultiLogReg.dml
scriptPath = os.path.join(imp.find_module("systemml")[1], 'systemml-java', 'scripts', 'algorithms', 'MultiLogReg.dml')
# Bind the DataFrames to the script inputs and request the coefficient matrix
script = sml.dml(scriptPath).input(X=X_df, Y_vec=y_df).output("B_out")
beta = ml.execute(script).get('B_out').toNumPy()

The problem is that when I use a 3D NumPy array, I can't convert it to a Spark DataFrame: pandas asks me for a 2D array.

>>> import numpy as np
>>> l = np.array([[[1,2],[3,4],[5,6]],[[1,2],[3,4],[5,6]]])
>>> l.shape
(2, 3, 2)
>>> X_df = sqlCtx.createDataFrame(pd.DataFrame(l))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\KMOB\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py", line 424, in __init__
    copy=copy)
  File "C:\Users\KMOB\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\construction.py", line 146, in init_ndarray
    values = prep_ndarray(values, copy=copy)
  File "C:\Users\KMOB\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\construction.py", line 249, in prep_ndarray
    raise ValueError('Must pass 2-d input')
ValueError: Must pass 2-d input

Does anyone have any solution for this?
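
One workaround I am considering, as a sketch only and not verified against SystemML: flatten each sample into a single row so that pandas receives a 2D array, and remember the per-sample shape so the array can be reshaped back later. The names `flat`, `sample_shape` and `restored` are mine and only for illustration. The Spark call is left commented out because it needs a running session.

```python
import numpy as np
import pandas as pd

# Same toy array as in the failing example: 2 samples, each of shape (3, 2).
l = np.array([[[1, 2], [3, 4], [5, 6]], [[1, 2], [3, 4], [5, 6]]])

# Keep the per-sample shape so the flattening can be undone later.
n_samples = l.shape[0]
sample_shape = l.shape[1:]

# Flatten every sample into one row: (2, 3, 2) -> (2, 6).
flat = l.reshape(n_samples, -1)

# pandas accepts the 2-D array, so no "Must pass 2-d input" error is raised here.
pdf = pd.DataFrame(flat)
# X_df = sqlCtx.createDataFrame(pdf)  # assumes the sqlCtx from the example above

# Reshaping the flat rows recovers the original 3-D layout.
restored = pdf.values.reshape((n_samples,) + sample_shape)
```

For my real data this would mean one row of 300 * 300 * 3 = 270000 columns per image. I have not checked whether SystemML expects a particular channel ordering inside the flattened row, so that part is still an open question.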

Upvotes: 1

Views: 210

Answers (0)
