dataframe to np.array - IndexError: tuple index out of range

Question

I am trying to convert the pandas DataFrame below,

pd.DataFrame({'PE': [115.45, 8], 'PE FY1': [11, 12], 'EV/Sales':[0.4, 1.9], 'EV/EBIT':[16, 9.8],
                'EV/EBITDA': [10.8, 7.5]})

to an np.array, but nested within multiple square brackets, as shown below. So far I have been unsuccessful. I am unsure of the correct name for this structure, but I need it in this form for scikit-learn. The array below works for what I am trying to do; it is just a matter of getting there.

q = np.array([[[115.45,11.00,0.40,16.00,10.80]], [[8.00,12.00,1.90, 9.80,7.50]]])

Whatever I try, I end up with either normal brackets in the wrong place or IndexError: tuple index out of range when I run it through the regressor and tree interpreter, as below:

Latest_feature_values = pd.DataFrame({'PE': [115.45, 8], 'PE FY1': [11, 12], 'EV/Sales':[0.4, 1.9], 'EV/EBIT':[16, 9.8],
                'EV/EBITDA': [10.8, 7.5]})

Latest_feature_values = np.array(Latest_feature_values.values)

cs95 · Accepted Answer

v = df.values  # df is the DataFrame from the question
v

array([[ 115.45,   11.  ,    0.4 ,   16.  ,   10.8 ],
       [   8.  ,   12.  ,    1.9 ,    9.8 ,    7.5 ]])

If by multiple brackets you mean that you want to add one dimension (so as to get an output of shape (2, 1, 5)), you have a few options:

np.expand_dims(v, 1)

Or,

v[:, np.newaxis]

Or,

v[:, None]

Or (limited to 2D arrays, needs changes for N-D arrays),

i, j = v.shape
v.reshape((i, 1, j))

array([[[ 115.45,   11.  ,    0.4 ,   16.  ,   10.8 ]],

       [[   8.  ,   12.  ,    1.9 ,    9.8 ,    7.5 ]]])
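As a quick sanity check, here is a minimal self-contained sketch that builds the DataFrame from the question and confirms that the four options above give the same (2, 1, 5) array as the hand-written q. It uses df.to_numpy(), which returns the same data as df.values on recent pandas versions. The helper insert_axis_1 is a hypothetical name, not part of NumPy; it is only one possible way to adapt the reshape approach to N-D input.

```python
import numpy as np
import pandas as pd

# DataFrame from the question
df = pd.DataFrame({'PE': [115.45, 8], 'PE FY1': [11, 12], 'EV/Sales': [0.4, 1.9],
                   'EV/EBIT': [16, 9.8], 'EV/EBITDA': [10.8, 7.5]})

v = df.to_numpy()  # same data as df.values; shape (2, 5)

# The four options from the answer, each inserting a new axis at position 1
candidates = {
    "expand_dims": np.expand_dims(v, 1),
    "newaxis": v[:, np.newaxis],
    "None": v[:, None],
    "reshape": v.reshape((v.shape[0], 1, v.shape[1])),
}

# Target array written by hand in the question
q = np.array([[[115.45, 11.00, 0.40, 16.00, 10.80]],
              [[8.00, 12.00, 1.90, 9.80, 7.50]]])

for name, arr in candidates.items():
    assert arr.shape == (2, 1, 5), name
    assert np.array_equal(arr, q), name


def insert_axis_1(a):
    """Hypothetical helper: reshape-based axis insertion for any array with ndim >= 1."""
    return a.reshape(a.shape[:1] + (1,) + a.shape[1:])
```

For example, insert_axis_1(np.zeros((2, 3, 4))) has shape (2, 1, 3, 4). In practice np.expand_dims or indexing with None already handles N-D input, so the helper is only there to show what the reshape variant would need.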
