Andre Guerra

Reputation: 1177

Why is the .as_matrix() call causing an error while computing the cross product?

Can anyone shed some light on why this way of building the arrays causes an error in numpy.cross()?

Assume dfAnalysis is a pandas DataFrame with columns x_rel, y_rel and z_rel, all holding float values.

When I extract two rows from it as in the snippet below and then call np.cross(A, B)...

A = dfAnalysis.iloc[0][['x_rel','y_rel','z_rel']].as_matrix()
B = dfAnalysis.iloc[1][['x_rel','y_rel','z_rel']].as_matrix()

I get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-f153b94e791d> in <module>()
      7 B = dfAnalysis.iloc[1][['x_rel','y_rel','z_rel']].as_matrix()
      8 
----> 9 np.cross(A,B)

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/numpy/core/numeric.py in cross(a, b, axisa, axisb, axisc, axis)
   1819             cp0 -= tmp
   1820             multiply(a2, b0, out=cp1)
-> 1821             multiply(a0, b2, out=tmp)
   1822             cp1 -= tmp
   1823             multiply(a0, b1, out=cp2)

TypeError: ufunc 'multiply' output (typecode 'O') could not be coerced to provided output parameter (typecode 'd') according to the casting rule ''same_kind''

If the two arrays are built like this instead, the error goes away, but I don't understand why. Can someone explain?

A = np.array([dfAnalysis.iloc[0]['x_rel'],
              dfAnalysis.iloc[0]['y_rel'],
              dfAnalysis.iloc[0]['z_rel']])
B = np.array([dfAnalysis.iloc[1]['x_rel'],
              dfAnalysis.iloc[1]['y_rel'],
              dfAnalysis.iloc[1]['z_rel']])

np.cross(A,B)


Upvotes: 0

Views: 730

Answers (2)

Warren Weckesser

Reputation: 114821

The pandas code is creating A and B as numpy arrays with data type object instead of arrays of floating point values:

In [168]: A = df.iloc[0][['x_rel', 'y_rel', 'z_rel']].as_matrix()

In [169]: A
Out[169]: array([213.86051031592066, 127.52721826173365, 14.120000000000005], dtype=object)

A numpy array of type object is an array that contains arbitrary python objects. In this case, the objects are themselves floating point values, so the arrays mostly look and act like arrays of floats. However, many numpy functions, including cross, cannot handle object arrays. One way to fix this is to convert the data type of the arrays to numpy.float64 using the astype() method:

In [170]: a = A.astype(np.float64)

In [171]: a
Out[171]: array([ 213.86051032,  127.52721826,   14.12      ])

In [172]: b = B.astype(np.float64)

In [173]: b
Out[173]: array([ 213.70062319,  127.21119974,   14.12      ])

In [174]: np.cross(a, b)
Out[174]: array([  4.46218149,  -2.25760625, -47.19392108])

An alternative is to create an array using just the relevant columns:

In [193]: rel = df[['x_rel', 'y_rel', 'z_rel']].as_matrix()

In [194]: rel.dtype
Out[194]: dtype('float64')

In [195]: np.cross(rel[0], rel[1])
Out[195]: array([  4.46218149,  -2.25760625, -47.19392108])
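A rough sketch of why the two approaches differ: selecting a row first makes pandas put every column of that row into a single Series, and if the frame has any non-float column, that Series gets dtype object. That the original data has such a column is an assumption here, and the `label` column below is hypothetical. Selecting the float columns first avoids the problem. Note that as_matrix() was deprecated in pandas 0.23 and removed in 1.0, so this sketch uses to_numpy() instead.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data: one non-numeric column
# ("label" is an assumption) next to the three float columns.
df = pd.DataFrame({
    "label": ["p0", "p1"],
    "x_rel": [1.0, 0.0],
    "y_rel": [0.0, 1.0],
    "z_rel": [0.0, 0.0],
})
cols = ["x_rel", "y_rel", "z_rel"]

# Row first: iloc[0] has to hold a string and floats in one Series,
# so the Series (and the array built from it) has dtype object.
row_first = df.iloc[0][cols].to_numpy()

# Columns first: all three columns are float64, so the 2-D array is too.
rel = df[cols].to_numpy()

print(row_first.dtype)            # object
print(rel.dtype)                  # float64
print(np.cross(rel[0], rel[1]))   # cross product of the first two rows
```

If the row-first form is more convenient, appending .astype(np.float64) to the result gives the same float array.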

Upvotes: 1

Renato Girardi Gasoto

Reputation: 11

Calling as_matrix() on a row taken directly from the imported CSV gives you an array with dtype object.

>>> A = dfAnalysis.iloc[0][['x_rel','y_rel','z_rel']].as_matrix() # extract entry as numpy array
>>> B = dfAnalysis.iloc[1][['x_rel','y_rel','z_rel']].as_matrix()
>>> A
array([213.86051031592066, 127.52721826173365, 14.120000000000005], dtype=object)

Change your lines as shown here to convert the values to float64:

>>> A = pd.to_numeric(dfAnalysis.iloc[0][['x_rel','y_rel','z_rel']]).as_matrix()
>>> B = pd.to_numeric(dfAnalysis.iloc[1][['x_rel','y_rel','z_rel']]).as_matrix()
>>> B
array([ 213.70062319,  127.21119974,   14.12      ])
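On pandas 1.0 and later, where as_matrix() no longer exists, the same conversion can be sketched with to_numpy(). The Series below is a hypothetical stand-in for dfAnalysis.iloc[0][['x_rel','y_rel','z_rel']], with made-up values:

```python
import numpy as np
import pandas as pd

# Hypothetical object-dtype row, standing in for one row of dfAnalysis.
row = pd.Series([1.0, 2.0, 3.0],
                index=["x_rel", "y_rel", "z_rel"],
                dtype=object)

# pd.to_numeric converts the object Series to a numeric dtype first.
A = pd.to_numeric(row).to_numpy()

# Series.astype is an equivalent route when every value is known to be a float.
A_alt = row.astype(np.float64).to_numpy()

print(row.dtype)    # object
print(A.dtype)      # float64
print(A_alt.dtype)  # float64
```

Either array can then be passed to np.cross() like any other float array.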

Upvotes: 1
