Reputation: 147
When I try to create a DataFrame from two columns, pids and SalePrice, I get the error "Exception: Data must be 1-dimensional". I think the error occurs because the two data series have different formats, as shown below. How can I make them the same format?
ksubmission = pd.DataFrame({'Id':pids,'SalePrice':predictions_kaggle})
Exception: Data must be 1-dimensional
pids.shape
(1459,)
predictions_kaggle.shape
(1459, 1)
predictions_kaggle looks like this:
array([[115901.20520943],
[144313.70246636],
[165320.94012928],
...,
[155759.14767572],
[111175.64223766],
[249104.99042467]])
while pids looks like this:
0 1461
1 1462
2 1463
3 1464
4 1465
...
1454 2915
1455 2916
1456 2917
1457 2918
1458 2919
Name: Id, Length: 1459, dtype: int64
Upvotes: 1
Views: 139
Reputation: 864
The problem here is that your predictions_kaggle array is not a 1-D array but a 2-D one. The shape of a 1-D array has the form (n,), but yours is (n, 1), which means each row of your array is a single value wrapped in its own one-element array. A quick fix is to flatten the array, which turns it into a 1-D array:
ksubmission = pd.DataFrame({'Id':pids,'SalePrice':predictions_kaggle.flatten()})
Hope this helps.
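For anyone who wants to try this without the Kaggle data, here is a minimal sketch of the same fix. The three-row pids and predictions_kaggle below are invented stand-ins for the real data, not values from the question; flatten(), ravel(), and column indexing are shown side by side because all three should yield a shape of (n,).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's data (values made up for illustration).
pids = pd.Series([1461, 1462, 1463], name="Id")
predictions_kaggle = np.array([[115901.2], [144313.7], [165320.9]])

# The 2-D shape (n, 1) is what triggers "Data must be 1-dimensional".
print(predictions_kaggle.shape)

# Three ways to get a 1-D array of shape (n,):
flat = predictions_kaggle.flatten()     # always returns a copy
raveled = predictions_kaggle.ravel()    # returns a view when possible
first_col = predictions_kaggle[:, 0]    # selects the only column

# Any of the three can be used as a DataFrame column.
ksubmission = pd.DataFrame({"Id": pids, "SalePrice": flat})
print(ksubmission)
```

flatten() is the safest choice if you plan to modify the result, since it never shares memory with the original array.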
Upvotes: 1
Reputation: 7224
I think you need to do this if the lengths are the same:
import pandas as pd
import numpy as np
pd.DataFrame(predictions_kaggle, index=pids).reset_index().rename(columns={'index': 'Id', 0:'SalePrice'})
or
pd.DataFrame({'Id':pids,'SalePrice':np.ndarray.flatten(predictions_kaggle)})
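As a rough check that both variants give the same table, here is a small sketch on made-up data. The three-row pids and predictions_kaggle are hypothetical stand-ins, not the real Kaggle values.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's data (values made up for illustration).
pids = pd.Series([1461, 1462, 1463], name="Id")
predictions_kaggle = np.array([[115901.2], [144313.7], [165320.9]])

# Variant 1: pass the (n, 1) array directly, use pids as the index,
# then move the index back into a regular column.
via_index = (
    pd.DataFrame(predictions_kaggle, index=pids)
    .reset_index()
    .rename(columns={"index": "Id", 0: "SalePrice"})
)

# Variant 2: flatten the array to 1-D and build the frame from a dict.
via_flatten = pd.DataFrame(
    {"Id": pids, "SalePrice": np.ndarray.flatten(predictions_kaggle)}
)

print(via_index)
print(via_flatten)
```

The first variant works because a 2-D array with one column is a valid input for a single-column DataFrame; only the dict form requires each value to be 1-D.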
Upvotes: 1