Reputation: 469
I have a pandas DataFrame like this
100 200 300
283.1 0.01 0.02 0.40
284.1 0.02 0.03 0.42
285.1 0.05 0.01 0.8
286.1 0.06 0.02 0.9
I need to combine a certain number of consecutive rows and calculate the average value for each column and a new index as the average of the indices I used, in order to obtain something like this:
100 200 300
283.6 0.015 0.025 0.41
285.6 0.055 0.015 0.85
Is there a way to to this with pandas?
Upvotes: 2
Views: 2441
Reputation: 2330
A simple way:
>>> print df
index 100 200 300
0 283.1 0.01 0.02 0.40
1 284.1 0.02 0.03 0.42
2 285.1 0.05 0.01 0.80
3 286.1 0.06 0.02 0.90
break the DataFrame
up into the portions that you want and find the mean
of the relevant columns:
>>> pieces = [df[:2].mean(), df[2:].mean()]
then put the pieces back together using concat:
>>> avgdf = pd.concat(pieces, axis=1).transpose()
index 100 200 300
0 283.6 0.015 0.025 0.41
1 285.6 0.055 0.015 0.85
Alternatively, you can recombine the data with a list comprehension [i for i in pieces]
or a generator expression:
>>> z = (i for i in pieces)
and use this to create your new DataFrame
:
>>> avgdf = pd.DataFrame(z)
Finally, to set the index:
>>> avgdf.set_index('index', inplace=True)
>>> print avgdf
100 200 300
index
283.6 0.015 0.025 0.41
285.6 0.055 0.015 0.85
Upvotes: 0
Reputation: 77027
Yes -- you could do this in Pandas. Here's one way to do it.
Let's say out, our initial dataframe df
is like
index 100 200 300
0 283.1 0.01 0.02 0.40
1 284.1 0.02 0.03 0.42
2 285.1 0.05 0.01 0.80
3 286.1 0.06 0.02 0.90
Now, calculate the length of dataframe
N = len(df.index)
N
4
We create a grp
column -- to be used for aggregation,
where for 2 rows
aggregation use [x ]*2
and for n-consecutive rows
use [x]*n
df['grp'] = list(itertools.chain.from_iterable([x]*2 for x in range(0, N/2)))
df
index 100 200 300 grp
0 283.1 0.01 0.02 0.40 0
1 284.1 0.02 0.03 0.42 0
2 285.1 0.05 0.01 0.80 1
3 286.1 0.06 0.02 0.90 1
Now, get the means by grouping the grp
column like this --
df.groupby('grp').mean()
index 100 200 300
grp
0 283.6 0.015 0.025 0.41
1 285.6 0.055 0.015 0.85
Upvotes: 3