JustInTime

Reputation: 2776

Aggregating data in pandas

I am exploring the pandas library and got stuck on a case that I run into frequently and that I expect pandas has a solution for. Given the following code:

In [63]: d1 = np.random.rand(3,3)
In [63]: d2 = np.random.rand(3,3)

In [64]: s1 = pandas.Series(d1, index=[['a1']*d1.shape[0],
                                       [4]*d1.shape[0],
                                       range(d1.shape[0])])

Out[64]:a1  4  0    [ 0.00881133  0.71344668  0.03611378]
               1    [ 0.37328776  0.63195947  0.23000941]
               2    [ 0.68466443  0.85891677  0.31740809]

In [65]: s2 = pandas.Series(d2, index=[['a2']*d2.shape[0],
                                       [5]*d2.shape[0],
                                       range(d2.shape[0])])
Out[65]:a2  5  0    [ 0.00881133  0.71344668  0.03611378]
               1    [ 0.37328776  0.63195947  0.23000941]
               2    [ 0.68466443  0.85891677  0.31740809]

s = s1.append(s2)

a1  4  0    [ 0.00881133  0.71344668  0.03611378]
       1    [ 0.37328776  0.63195947  0.23000941]
       2    [ 0.68466443  0.85891677  0.31740809]
a2  5  0    [ 0.00881133  0.71344668  0.03611378]
       1    [ 0.37328776  0.63195947  0.23000941]
       2    [ 0.68466443  0.85891677  0.31740809]

How can I obtain a list of all the data matrices alone, without their labels?

Upvotes: 0

Views: 484

Answers (2)

ariddell

Reputation: 3423

s.values will do the trick.

From the documentation: DataFrame.values: "Convert the frame to its Numpy-array matrix representation."

I think you mean pandas.DataFrame above (not Series). Series.values exists as well.
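A minimal sketch of that suggestion, assuming the data really belongs in a DataFrame with the same three-level index as in the question. The helper name `labelled_frame` and the seeded random generator are made up for illustration, and `pd.concat` stands in for `append`, which was removed in pandas 2.0:

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is repeatable (an assumption, not in the question).
rng = np.random.default_rng(0)
d1 = rng.random((3, 3))
d2 = rng.random((3, 3))


def labelled_frame(data, name, number):
    """Hypothetical helper: wrap a 2-D array in a DataFrame with a 3-level index."""
    n = data.shape[0]
    index = pd.MultiIndex.from_arrays([[name] * n, [number] * n, list(range(n))])
    return pd.DataFrame(data, index=index)


df = pd.concat([labelled_frame(d1, "a1", 4), labelled_frame(d2, "a2", 5)])

# All rows as one plain NumPy array, labels dropped.
values = df.values

# One matrix per top-level label ('a1', 'a2'), again without labels.
matrices = [group.values for _, group in df.groupby(level=0)]
```

Here `values` is a single 6x3 array, while `matrices` is a list with one 3x3 array per top-level label, which is probably closer to "a list of all the data matrices".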

Upvotes: 2

Jason Strimpel

Reputation: 15476

I'm getting errors running your code. However, to convert a pandas Series to a numpy array, use the pandas.Series.values attribute. Wes's documentation is very well done. Spend some time reviewing...
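For what it's worth, here is a rough sketch of how a Series like the one printed in the question could be built and then unpacked. It assumes each element of the Series is one row of the matrix (passing `list(d1)` rather than the 2-D array itself, since a Series holds one-dimensional data); the helper name `row_series` and the seeded generator are made up for illustration:

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is repeatable (an assumption, not in the question).
rng = np.random.default_rng(0)
d1 = rng.random((3, 3))
d2 = rng.random((3, 3))


def row_series(data, name, number):
    """Hypothetical helper: a Series whose elements are the rows of a 2-D array."""
    n = data.shape[0]
    index = pd.MultiIndex.from_arrays([[name] * n, [number] * n, list(range(n))])
    return pd.Series(list(data), index=index)


s = pd.concat([row_series(d1, "a1", 4), row_series(d2, "a2", 5)])

# .values drops the labels; here it is an object array holding six row arrays.
rows = s.values

# Stack the rows back into one ordinary 2-D float array.
stacked = np.vstack(s.tolist())
```

`rows` still keeps each row as a separate array, so `np.vstack` is one way to get back to a regular 6x3 matrix.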

Upvotes: 1
