Reputation: 693
I am new to Python and have loaded a large CSV file into a pandas DataFrame. However, I cannot find a way to create a 2D array for each row of the DataFrame, where each row of the new NumPy array corresponds to a fixed-size block of values. For example, in my code:
import pandas as pd
import numpy as np
data = pd.read_csv("categorization/dataAll10Overfit.csv",header=None)
#print(data)
rec = data.iloc[:,0:3968] # i rows x 3968 columns (columns 0 to 3967)
There are 3968 values in each row of the DataFrame, and I would like to create a 124x32 NumPy array so that each block of 124 values becomes a row in the 2D array. I know C#, and there I would fill the new array using a for loop, but I guess there should be a one-line function in Python to split all the data of a DataFrame row into a new NumPy array. If this question is a duplicate, please refer me to the other post. Thanks in advance.
Upvotes: 2
Views: 2158
Reputation: 886
If you want all 2D arrays within one 3D array you can do:
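# rec is the 3968-column slice from the question (124 * 32 = 3968)
arr = np.zeros((rec.shape[0], 124, 32))
# idx is used as a position, so this assumes the default 0..n-1 index
for idx, row in rec.iterrows():
    arr[idx] = np.asarray(row).reshape(124, 32)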
Or as a one-liner list of arrays:
arr = [np.asarray(row).reshape(124, 32) for _, row in rec.iterrows()]
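Since the question asks for a one-line function, the loop can probably be skipped altogether by reshaping the whole table in a single call. This is only a sketch: the small synthetic DataFrame below stands in for the CSV, and it assumes every row holds exactly 3968 numeric values.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the CSV: 5 rows x 3968 columns (hypothetical data)
data = pd.DataFrame(np.arange(5 * 3968).reshape(5, 3968))

# Reshape every row at once; -1 lets NumPy infer the number of DataFrame rows
arr = data.to_numpy().reshape(-1, 124, 32)

# The per-row loop, kept here only to check that both give the same result
looped = np.stack([np.asarray(row).reshape(124, 32) for _, row in data.iterrows()])

print(arr.shape)
print(np.array_equal(arr, looped))
```

Here arr[i] is the 124x32 array built from row i of the DataFrame, so no Python-level loop is needed.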
Upvotes: 1
Reputation: 38982
I assume you don't want to replace the array in place.
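records = []
for i in range(len(data)):
    # A Series has no reshape(); convert the first 3968 values to a NumPy array first
    records.append(data.iloc[i, :3968].to_numpy().reshape(124, 32))
# One object column holding a 124x32 array per original row
nested_record = pd.DataFrame({'record': records})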
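A note on the reshape(124, 32) used above: NumPy fills the result row by row, so each row of the 124x32 array holds 32 consecutive values. If each block of 124 consecutive values should end up together, as the wording of the question suggests, the shape would have to be (32, 124) instead, possibly followed by a transpose. A tiny sketch with made-up sizes (12 values instead of 3968) shows the difference:

```python
import numpy as np

# Stand-in for one 3968-value row, shrunk to 12 values (hypothetical size)
row = np.arange(12)

a = row.reshape(4, 3)    # like reshape(124, 32): each row is 3 consecutive values
b = row.reshape(3, 4)    # like reshape(32, 124): each row is 4 consecutive values
c = row.reshape(3, 4).T  # 4x3 again, but each column is a block of 4 consecutive values

print(a)
print(b)
print(c)
```

Which of the three layouts is the right one depends on how the values were written to the CSV, so it is worth checking on a single row before converting everything.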
Upvotes: 1