Reputation: 89
Suppose I have a data frame df of length 5800000, which is a concatenation of 100 files where each file has 58000 rows. I also have an array fv of shape (100, 10, 58000), which I want to add to the data frame as 10 new columns. df has two columns, but only its length, i.e. df.shape[0], matters here.
list_ = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
# fv has shape (number of files, number of names, rows per file)
fv = np.zeros((len(data), len(list_), int(df.shape[0] / len(files))))

def add_fv_to_dataframe(_data, list_):
    for index in range(len(_data)):
        for name_index, name_in_list in enumerate(list_):
            # calculate something
            calcs_ = _data[index]
            fv[index, name_index, :] = calcs_
# add the calculated values to the dataframe
df['fv_{}'.format(name_in_list)] = pd.Series(fv.reshape(-1, (10,1)), index=df.index)
I would like my final data frame to have the form:
df[0] | df[1] | fv_a | fv_b | fv_c | fv_d | fv_e | fv_f | fv_g | fv_h | fv_i | fv_j |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
: | : | : | : | : | : | : | : | : | : | : | : |
5800000 | 5800000 | 5800000 | 5800000 | 5800000 | 5800000 | 5800000 | 5800000 | 5800000 | 5800000 | 5800000 | 5800000 |
Upvotes: 1
Views: 71
Reputation: 93191
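Use np.swapaxes to move the column axis to the front, so that each slice holds one column's values for all 100 files, then flatten each slice:

for i, data in enumerate(np.swapaxes(fv, 0, 1)):
    df[f"fv_{i}"] = np.ravel(data)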
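
As a rough check of the axis swap, here is a minimal sketch on a toy array. The sizes (3 files, 2 names, 4 rows per file), the fill values, and the variable names `names` and `flat` are assumptions made for illustration; they are not from the question. It names the columns from the list (fv_a, fv_b) instead of by position, and also shows a transpose-and-reshape variant that builds all the columns in one step.

```python
import numpy as np
import pandas as pd

# Hypothetical small sizes standing in for 100 files x 10 names x 58000 rows.
n_files, n_rows = 3, 4
names = ['a', 'b']

# Two-column frame whose rows are the files concatenated in order.
df = pd.DataFrame({0: np.arange(n_files * n_rows),
                   1: np.arange(n_files * n_rows)})

# fv[file, name, row], filled with recognisable values:
# 100 * file + 10 * name_index + row.
f, k, r = np.meshgrid(np.arange(n_files), np.arange(len(names)),
                      np.arange(n_rows), indexing='ij')
fv = 100 * f + 10 * k + r

# Loop version: after the swap each slice has shape (n_files, n_rows),
# and ravel lays it out file by file, matching the row order of df.
for name, data in zip(names, np.swapaxes(fv, 0, 1)):
    df[f"fv_{name}"] = np.ravel(data)

# One-step variant: put the name axis last, then merge the file and row axes.
flat = fv.transpose(0, 2, 1).reshape(-1, len(names))
```

Row 5 of the toy frame corresponds to file 1, row 1, so `df["fv_b"].iloc[5]` should hold the value stored at `fv[1, 1, 1]`. The `flat` array has one column per name in the same row order and could be wrapped in a `pd.DataFrame` and joined to `df` in a single assignment.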
Upvotes: 1