Combine Rows in Pandas

Question

I have a pandas DataFrame like this

         100  200  300
283.1   0.01 0.02 0.40
284.1   0.02 0.03 0.42
285.1   0.05 0.01 0.8
286.1   0.06 0.02 0.9

I need to combine a certain number of consecutive rows and calculate the average value for each column and a new index as the average of the indices I used, in order to obtain something like this:

         100   200  300
283.6   0.015 0.025 0.41  
285.6   0.055 0.015 0.85

Is there a way to to this with pandas?

Zero · Accepted Answer

Yes -- you could do this in Pandas. Here's one way to do it.

Let's say out, our initial dataframe df is like

   index   100   200   300
0  283.1  0.01  0.02  0.40
1  284.1  0.02  0.03  0.42
2  285.1  0.05  0.01  0.80
3  286.1  0.06  0.02  0.90

Now, calculate the length of dataframe

N = len(df.index)
N
4

We create a grp column -- to be used for aggregation,

where for 2 rows aggregation use [x ]*2 and for n-consecutive rows use [x]*n

df['grp'] = list(itertools.chain.from_iterable([x]*2 for x in range(0, N/2)))
df
   index   100   200   300  grp
0  283.1  0.01  0.02  0.40    0
1  284.1  0.02  0.03  0.42    0
2  285.1  0.05  0.01  0.80    1
3  286.1  0.06  0.02  0.90    1

Now, get the means by grouping the grp column like this --

df.groupby('grp').mean()
     index    100    200   300
grp
0    283.6  0.015  0.025  0.41
1    285.6  0.055  0.015  0.85

Combine Rows in Pandas

Answers (2)

Related Questions