Luckasino

Reputation: 424

Reshape arrays for DataFrame

I have three arrays, each with shape (6, 1), and I would like to create a DataFrame with three columns holding the values from these arrays. However, I am stuck on the reshaping. I've tried several approaches. What am I missing?

code snippet:

array_1
Out:
array([[-1.05960895],
   [-1.02044895],
   [-1.14015499],
   [-1.4261115 ],
   [-1.86607347],
   [-1.02244409]])

array_2
Out:
array([[50.21621],
   [50.21565],
   [50.21692],
   [50.21636],
   [50.21763],
   [50.21707]])

array_3
Out:
array([[15.33107],
   [15.3293 ],
   [15.3309 ],
   [15.32913],
   [15.33073],
   [15.32896]])

arr = np.array([array_1, array_2, array_3]).reshape(3, 6)
df = pd.DataFrame(data = arr, columns=['Sigma', 'x', 'y'])

ValueError: Shape of passed values is (3, 6), indices imply (3, 3)

Upvotes: 0

Views: 493

Answers (3)

AcaNg

Reputation: 706

Keep in mind that np.reshape() does not transpose your data. It refills the new shape in row-major order, so reshape(6, 3) spreads values from the same array across different columns:

>>> arr = np.array([array_1, array_2, array_3]).reshape(6,3)
>>> pd.DataFrame(data = arr, columns=['Sigma', 'x', 'y'])
       Sigma          x          y
0  -1.059609  -1.020449  -1.140155
1  -1.426111  -1.866073  -1.022444
2  50.216210  50.215650  50.216920
3  50.216360  50.217630  50.217070
4  15.331070  15.329300  15.330900
5  15.329130  15.330730  15.328960

If you want to keep each array in its own column, you can use numpy.hstack:

>>> pd.DataFrame(data=np.hstack((array_1,array_2,array_3)), columns=['Sigma', 'x', 'y'])
      Sigma         x         y
0 -1.059609  50.21621  15.33107
1 -1.020449  50.21565  15.32930
2 -1.140155  50.21692  15.33090
3 -1.426111  50.21636  15.32913
4 -1.866073  50.21763  15.33073
5 -1.022444  50.21707  15.32896
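
To make the difference easier to see, here is a minimal sketch with small made-up arrays (a, b and c are hypothetical stand-ins for array_1, array_2 and array_3, chosen so that the source of each value is obvious):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the three (6, 1) arrays in the question.
a = np.arange(1, 7, dtype=float).reshape(6, 1)     # 1..6
b = np.arange(11, 17, dtype=float).reshape(6, 1)   # 11..16
c = np.arange(21, 27, dtype=float).reshape(6, 1)   # 21..26

# reshape(6, 3) refills the 18 values in row-major order, so each row
# takes three consecutive values from the same source array.
scrambled = np.array([a, b, c]).reshape(6, 3)

# hstack places the (6, 1) arrays side by side, one array per column.
stacked = np.hstack((a, b, c))

df = pd.DataFrame(stacked, columns=['Sigma', 'x', 'y'])
```

With these inputs the first row of scrambled holds three values from a, while the first row of stacked holds one value from each of a, b and c.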

Upvotes: 2

Niv Dudovitch

Reputation: 1658

Full example + solution:

Create data:

import numpy as np
import pandas as pd

array_1 = np.array([[-1.05960895],
   [-1.02044895],
   [-1.14015499],
   [-1.4261115 ],
   [-1.86607347],
   [-1.02244409]])
array_1 = [item for sublist in array_1 for item in sublist]

array_2 = np.array([[50.21621],
   [50.21565],
   [50.21692],
   [50.21636],
   [50.21763],
   [50.21707]])
array_2 = [item for sublist in array_2 for item in sublist]


array_3 = np.array([[15.33107],
   [15.3293 ],
   [15.3309 ],
   [15.32913],
   [15.33073],
   [15.32896]])
array_3 = [item for sublist in array_3 for item in sublist]


# Solution:
data = {'Sigma':array_1, 'x':array_2, 'y':array_3}
df = pd.DataFrame(data = data, columns=['Sigma', 'x', 'y'])
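
As a possible variation on the same idea, the list comprehensions can probably be swapped for ndarray.ravel(), which flattens each (n, 1) array to shape (n,). This is only a sketch with shortened, made-up values rather than the data from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical (3, 1) stand-ins for the arrays in the question.
array_1 = np.array([[-1.0], [-2.0], [-3.0]])
array_2 = np.array([[50.1], [50.2], [50.3]])
array_3 = np.array([[15.1], [15.2], [15.3]])

# ravel() flattens each (n, 1) array to shape (n,), which is the
# one-dimensional form a DataFrame column expects.
df = pd.DataFrame({
    'Sigma': array_1.ravel(),
    'x': array_2.ravel(),
    'y': array_3.ravel(),
})
```

Because the dict keys become the column names, the columns argument is not needed here.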


Upvotes: 1

Amith Lakkakula

Reputation: 516

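reshape takes (n_rows, n_cols), and the DataFrame needs 6 rows and 3 columns, so the (3, 6) array from the question has to be transposed. Reshaping straight to (6, 3) runs without an error, but it refills the values in row-major order, so each column ends up mixing values from different arrays.

# Assuming each array is a column in the DataFrame.
arr = np.array([array_1, array_2, array_3]).reshape(3, 6).T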
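
For reference, the ValueError in the question appears to come from passing a 3-row, 6-column array together with only three column names. A minimal sketch of the shapes involved, using placeholder values instead of the real data:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins with the same (6, 1) shape as in the question.
array_1 = np.zeros((6, 1))
array_2 = np.ones((6, 1))
array_3 = np.full((6, 1), 2.0)

stacked = np.array([array_1, array_2, array_3])
shape = stacked.shape            # (3, 6, 1)

rows = stacked.reshape(3, 6)     # one row per source array

# Three column names do not fit an array with six columns.
try:
    pd.DataFrame(rows, columns=['Sigma', 'x', 'y'])
    error = None
except ValueError as exc:
    error = str(exc)

# Transposing gives 6 rows and 3 columns, one column per source array.
df = pd.DataFrame(rows.T, columns=['Sigma', 'x', 'y'])
```

The transposed frame has one source array per column, which is the layout the question asks for.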

Upvotes: 0
