Rahul rajan

Reputation: 1266

Error while writing a large DataFrame of 60K rows to CSV in pandas

I am trying to build a prediction model, and I am doing it in two parts:

  1. Preprocessing the data in a Python notebook (.ipynb) and saving the preprocessed data to a CSV file.
  2. Loading this preprocessed file in the Step 1 Model Prediction (.ipynb) file.

Preprocessing File

# Save the preprocessed DataFrame to CSV
train.to_csv('C:/Users/Documents/Tesfile_Preprocessed.csv')

Error 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\sparse\array.py in __getitem__(self, key)
    417             return self._get_val_at(key)
    418         elif isinstance(key, tuple):
--> 419             data_slice = self.values[key]
    420         else:
    421             if isinstance(key, SparseArray):



IndexError: too many indices for array

Prediction Model File

X_test=pd.read_csv('C:/Users/Documents/Tesfile_Preprocessed.csv')
predicted_dtree = fit_decision.predict(X_test)

How can I solve this issue?

Upvotes: 1

Views: 63

Answers (1)

Alvaro Silvino

Reputation: 9753

A better approach is to use to_pickle:

train.to_pickle('C:/Users/Documents/Tesfile_Preprocessed.pickle')
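As a minimal sketch of that round trip, assuming a small made-up DataFrame in place of your real `train` and a temporary directory in place of your actual path, the file can be read back with `pd.read_pickle` and should match the original, dtypes included:

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the preprocessed `train` DataFrame.
train = pd.DataFrame({"feature_a": [0.5, 1.5, 2.5], "label": [0, 1, 0]})

# A temporary directory is used only so the sketch runs anywhere;
# substitute your own path in practice.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "Tesfile_Preprocessed.pickle")
    train.to_pickle(path)            # write the DataFrame as a pickle
    restored = pd.read_pickle(path)  # read it back in the other notebook

# The round trip keeps values, index and dtypes intact.
print(restored.equals(train))
```

In the prediction notebook you would then call `pd.read_pickle` instead of `pd.read_csv`. Only unpickle files that you created yourself.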

'Too many indices' means you've given too many index values. You've given 2 values because you're expecting data to be a 2D array. NumPy is complaining because data is not 2D (it's either 1D or None).

I recommend checking the array's dimensions before indexing it:

self.values.shape

or

len(self.values) 
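To illustrate the check, here is a small sketch with plain NumPy. The array `values`, the key `(0, 1)` and the helper `safe_get` are all made up for the example and are not part of pandas; the helper only handles integer and tuple-of-integer keys:

```python
import numpy as np

# Hypothetical 1-D array standing in for `self.values`.
values = np.array([1.0, 2.0, 3.0])
key = (0, 1)  # two indices, which only makes sense for a 2-D (or higher) array

print(values.shape, values.ndim)


def safe_get(arr, key):
    """Return arr[key], or None when key has more indices than arr has dimensions."""
    n_indices = len(key) if isinstance(key, tuple) else 1
    if n_indices > arr.ndim:
        return None
    return arr[key]


# Indexing the 1-D array with two indices reproduces the error.
try:
    values[key]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
print(safe_get(values, key))
```

The same key works fine once the array really has two dimensions, for example `values.reshape(1, 3)`.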

This is NOT related to pandas itself; you are trying to access a non-existent index on your array.

Please try changing the separator (sep) during the export:

train.to_csv('C:/Users/Documents/Tesfile_Preprocessed.csv', sep='\t')
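If you do change the separator, the reading side has to use the same one. A rough sketch, assuming a made-up two-column DataFrame and an in-memory buffer instead of your real file; `index=False` is my addition so that the row index does not come back as an extra column in `X_test`:

```python
import io

import pandas as pd

# Hypothetical stand-in for the preprocessed `train` DataFrame.
train = pd.DataFrame({"feature_a": [0.5, 1.5], "feature_b": [10, 20]})

# Write tab-separated text; an in-memory buffer replaces the file on disk.
buffer = io.StringIO()
train.to_csv(buffer, sep="\t", index=False)

# Read it back with the same separator.
buffer.seek(0)
X_test = pd.read_csv(buffer, sep="\t")

print(list(X_test.columns))
```

Without `index=False` (or `index_col=0` when reading), the default export adds an unnamed index column, and the model would see one feature more than it was trained on.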

Upvotes: 1
