Reputation: 3299
When converting multiple NumPy ndarrays to a DataFrame as in the code below
import numpy as np
import pandas as pd
ls_a = ['TA', 'BAT', 'T']
xxx = ['xx', 'cc']
feature_no = len(ls_a)
windows_no = len(xxx)
sub_iti = np.repeat([['s1']], (feature_no * windows_no), axis=0).reshape(-1, 1)
tw = np.repeat([xxx], feature_no, axis=1).reshape(-1, 1)
col_iti = np.repeat([ls_a], windows_no, axis=0).reshape(-1, 1)
df = pd.DataFrame({'sub_iti': sub_iti, 'tw': tw, 'col_iti': col_iti})
pandas raises the following error:
ValueError: If using all scalar values, you must pass an index
Based on another post, I passed the index argument as below:
df = pd.DataFrame(
    {'sub_iti': sub_iti,
     'tw': tw,
     'col_iti': col_iti},
    index=range(0, 3 * 2))
However, this raises a different error:
Exception: Data must be 1-dimensional
May I know how to address this issue?
Upvotes: 1
Views: 7996
Reputation: 150785
All of sub_iti, tw, and col_iti are 2D NumPy arrays, each of shape (6, 1). However, when you do:
df = pd.DataFrame({'sub_iti': sub_iti,
                   'tw': tw,
                   'col_iti': col_iti})
pandas expects each value to be a 1D NumPy array or list, since each one becomes a single column of the DataFrame. You can try:
df = pd.DataFrame({'sub_iti': sub_iti.tolist(),
                   'tw': tw.tolist(),
                   'col_iti': col_iti.tolist()})
Output:
  sub_iti    tw col_iti
0    [s1]  [xx]    [TA]
1    [s1]  [xx]   [BAT]
2    [s1]  [xx]     [T]
3    [s1]  [cc]    [TA]
4    [s1]  [cc]   [BAT]
5    [s1]  [cc]     [T]
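To see why each cell ends up holding a one-element list, it can help to inspect the array directly. The snippet below is a minimal sketch that rebuilds only sub_iti from the question (with the repeat count 6 written out) and checks its shape and what tolist() and ravel() return:

```python
import numpy as np

# Same construction as in the question: 6 rows, 1 column.
sub_iti = np.repeat([['s1']], 6, axis=0).reshape(-1, 1)

print(sub_iti.shape)   # (6, 1): a 2D array with a single column
print(sub_iti.ndim)    # 2

# tolist() on a 2D array gives a list of rows, and each row is itself a list,
# so every DataFrame cell receives a one-element list.
nested = sub_iti.tolist()
print(nested[0])       # ['s1']

# ravel() flattens the array to 1D, which is what a column needs.
flat = sub_iti.ravel()
print(flat.shape)      # (6,)
```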
However, I think you should avoid the lists inside each cell and use ravel() instead of tolist():
df = pd.DataFrame({'sub_iti': sub_iti.ravel(),
                   'tw': tw.ravel(),
                   'col_iti': col_iti.ravel()})
Output:
  sub_iti  tw col_iti
0      s1  xx      TA
1      s1  xx     BAT
2      s1  xx       T
3      s1  cc      TA
4      s1  cc     BAT
5      s1  cc       T
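As a possible alternative (a sketch, not part of the original code), the columns could be built as 1D arrays from the start, so no reshape or ravel() is needed. This assumes the intended layout is the one shown in the output above: np.repeat on a flat list repeats each element in place, and np.tile repeats the whole list.

```python
import numpy as np
import pandas as pd

ls_a = ['TA', 'BAT', 'T']
xxx = ['xx', 'cc']
feature_no = len(ls_a)
windows_no = len(xxx)

# Each array below is 1D with feature_no * windows_no = 6 elements.
sub_iti = np.repeat('s1', feature_no * windows_no)  # s1 s1 s1 s1 s1 s1
tw = np.repeat(xxx, feature_no)                     # xx xx xx cc cc cc
col_iti = np.tile(ls_a, windows_no)                 # TA BAT T TA BAT T

df = pd.DataFrame({'sub_iti': sub_iti, 'tw': tw, 'col_iti': col_iti})
print(df)
```

Because every value passed to the dict is already 1D, pandas accepts it without an explicit index.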
Upvotes: 1