Pablo Gonzalez

Reputation: 693

python - 2D numpy array from a pandas dataframe row with delimited range

I am new to Python and have loaded a large dataset from a CSV file into a pandas DataFrame. However, I cannot find a way to create a 2D array for each row of the DataFrame, where each row of the new NumPy array corresponds to a range of X values. For example, in my code:

import pandas as pd
import numpy as np

data = pd.read_csv("categorization/dataAll10Overfit.csv",header=None)
#print(data)
rec = data.iloc[:, 0:3968]  # all rows, first 3968 columns (column indices 0..3967)

There are 3968 values in each row of the dataframe, and I would like to create a 124x32 numpy array so that each block of 124 values becomes a row in the 2D array. I know C#, and there I would fill the new array with a for loop, but I guess there should be a one-line function in Python to split all the data from a dataframe row into a new numpy array. If this question is a duplicate, please point me to the other post. Thanks in advance.
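To make the intended layout concrete, here is a toy sketch (the 12-value row is made up for illustration). NumPy's `reshape` fills the result row by row, so a (124, 32) shape yields 124 rows of 32 consecutive values each, while rows made of 124 consecutive values would need a (32, 124) shape.

```python
import numpy as np

# Toy row with 12 values instead of 3968
row = np.arange(12)

# reshape fills row by row (C order): 3 rows, each a consecutive block of 4 values
blocks_of_4 = row.reshape(3, 4)

# The analogous shapes for a 3968-value row
as_124x32 = np.arange(3968).reshape(124, 32)  # 124 rows, blocks of 32 values
as_32x124 = np.arange(3968).reshape(32, 124)  # 32 rows, blocks of 124 values
```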

Upvotes: 2

Views: 2158

Answers (2)

Nyps

Reputation: 886

If you want all 2D arrays within one 3D array you can do:

arr = np.zeros((data.shape[0], 124, 32))

for idx, row in data.iterrows():
    arr[idx] = np.asarray(row).reshape(124, 32)

Or as a one-liner list of arrays:

arr = [np.asarray(row).reshape(124, 32) for idx, row in data.iterrows()]
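If the explicit loop is not needed, the same 3D array can probably be built in one vectorized step by reshaping the underlying NumPy array. This is a minimal sketch, assuming every row holds exactly 3968 values; the small random frame `demo` is a hypothetical stand-in for the real CSV data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV data: 5 rows x 3968 columns
demo = pd.DataFrame(np.random.rand(5, 3968))

# -1 lets NumPy infer the number of dataframe rows;
# each row of 3968 values becomes one 124x32 block
arr3d = demo.to_numpy().reshape(-1, 124, 32)

# For comparison: reshaping each row individually gives the same blocks
per_row = [np.asarray(row).reshape(124, 32) for _, row in demo.iterrows()]
```

Here `arr3d[i]` is the 124x32 array for row `i` of the dataframe. For a large file this avoids the Python-level loop over `iterrows()`.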

Upvotes: 1

Oluwafemi Sule

Reputation: 38982

I assume you don't want to replace the array in place.

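# Build one 124x32 array per row of `data` (each row must hold exactly 3968 values)
records = [data.iloc[i].to_numpy().reshape(124, 32) for i in range(len(data))]

# Store the arrays in a single object column named 'record'
nested_record = pd.DataFrame({'record': records})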

Upvotes: 1
