Reputation: 1855
I have a set of data associated with Npts points. Some of the data are scalar values, such as color; some are multi-dimensional, such as 3d position. I am trying to bundle this data into a pandas data structure, and I get a variety of error messages depending on how I try to do it.
Here's some mock data:
import numpy as np
import pandas
Npts = 100
pos = np.random.uniform(0, 250, Npts*3).reshape(Npts, 3)
colors = np.random.uniform(-1, 1, Npts)
Using a dictionary as input, the color data alone bundles up into a DataFrame just fine:
df_colors = pandas.DataFrame({'colors':colors})
But the position information does not:
df_pos = pandas.DataFrame({'pos':pos})
This returns the following unhelpful error message:
ValueError: If using all scalar values, you must pass an index
And what I really want to do is bundle both position and color information together:
df_agg = pandas.DataFrame({'pos':pos, 'colors':colors})
But this does not work, and returns the following equally cryptic error:
Exception: Data must be 1-dimensional
Surely it is possible to bundle multi-dimensional data with pandas, as well as data with mixed dimension. Does anyone know the API for this behavior?
Upvotes: 1
Views: 3849
Reputation: 4089
The problem is that pos has dimensions of (100,3). To turn it into a column, you need an array of dimensions (100,).
One option is to create an individual column for each of the dimensions:
df_agg = pandas.DataFrame({'posX':pos[:,0], 'posY':pos[:,1], 'posZ':pos[:,2], 'colors':colors})
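The same per-dimension layout can also be built without slicing each column by hand. This is only a sketch, assuming the mock data from the question; the name pos_back is made up for illustration. It passes the 2-D array directly with a list of column names, then attaches the colors as an extra column:

```python
import numpy as np
import pandas

# Mock data as defined in the question.
Npts = 100
pos = np.random.uniform(0, 250, Npts * 3).reshape(Npts, 3)
colors = np.random.uniform(-1, 1, Npts)

# A 2-D array maps onto one column per dimension when column names are given.
df_agg = pandas.DataFrame(pos, columns=['posX', 'posY', 'posZ'])

# The 1-D color data is attached as a fourth column.
df_agg['colors'] = colors

# Selecting the three position columns recovers an (Npts, 3) array.
pos_back = df_agg[['posX', 'posY', 'posZ']].to_numpy()
```

Keeping one numeric column per coordinate means the usual vectorized column operations still apply to the positions.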
Another option is to cast each coordinate into a 3-tuple:
posTuple = tuple(map(tuple,pos))
df_aggV2 = pandas.DataFrame({'pos':posTuple, 'colors':colors})
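One thing to keep in mind with the tuple approach is that the resulting column holds Python objects rather than numbers, so it has to be converted back before doing array math. A minimal sketch, again assuming the mock data from the question (pos_back is a made-up name, and a list of tuples is used here in place of the outer tuple):

```python
import numpy as np
import pandas

# Mock data as defined in the question.
Npts = 100
pos = np.random.uniform(0, 250, Npts * 3).reshape(Npts, 3)
colors = np.random.uniform(-1, 1, Npts)

# One 3-tuple per row, stored in a single object-dtype column.
df_aggV2 = pandas.DataFrame({'pos': list(map(tuple, pos)), 'colors': colors})

# Convert the column of tuples back into an (Npts, 3) float array.
pos_back = np.array(df_aggV2['pos'].tolist())
```

If the positions are going to be used numerically most of the time, the separate-columns layout is likely the more convenient of the two.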
Upvotes: 1