Reputation: 87
I have the following pandas dataframe df with 2 columns, which looks like:
0 0
1 22
2 34
3 21
4 21
5 92
I would like to integrate the area under this curve, taking the first column as the x-axis and the second column as the y-axis. I have tried doing this using the integrate module from scipy (from scipy import integrate), applied as follows, as I have seen in examples online:
print(df.integrate)
However, it seems the integrate function does not work. I'm receiving the error:
Dataframe object has no attribute integrate
How would I go about this?
Thank you
Upvotes: 0
Views: 3693
Reputation: 655
Try this
import pandas as pd
import numpy as np
def integrate(x, y):
    # Trapezoidal rule over the sampled (x, y) points
    area = np.trapz(y=y, x=x)
    return area
df = pd.DataFrame({'x':[0, 1, 2, 3, 4, 4, 5],'y':[0, 1, 3, 3, 5, 6, 7]})
x = df.x.values
y = df.y.values
print(integrate(x, y))
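For reference, the trapezoidal rule used above can also be written out by hand, which may help show what is being computed. This is only a minimal sketch: the helper name trapezoid_area is made up for illustration, and the data are the values from the question.

```python
import numpy as np

def trapezoid_area(x, y):
    """Sum the areas of the trapezoids between consecutive samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    widths = np.diff(x)                    # spacing between neighbouring x values
    mean_heights = (y[:-1] + y[1:]) / 2.0  # average of neighbouring y values
    return float(np.sum(widths * mean_heights))

# Values taken from the question's dataframe
x = [0, 1, 2, 3, 4, 5]
y = [0, 22, 34, 21, 21, 92]
print(trapezoid_area(x, y))  # 144.0
```

Each pair of neighbouring points contributes one trapezoid, so unevenly spaced x values are handled as well.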
Upvotes: 0
Reputation: 36608
You want numerical integration given a fixed sample of data. The Scipy package lists a handful of methods to do this: https://docs.scipy.org/doc/scipy/reference/integrate.html#integrating-functions-given-fixed-samples
For your data, the trapezoidal rule is probably the most straightforward. You provide the y and x values to the function. You did not post the column names of your data frame, so I am using the 0-index column for the x values and the 1-index column for the y values:
from scipy.integrate import trapz
trapz(df.iloc[:, 1], df.iloc[:, 0])
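If the trapz import fails on a newer SciPy installation, the same function should be available under the name trapezoid; as far as I know, the older trapz alias has been removed in recent releases. Below is a sketch with the data from the question, where the column names "x" and "y" are assumptions, since the question does not give them.

```python
import pandas as pd
from scipy.integrate import trapezoid

# Data from the question; the column names "x" and "y" are assumed.
df = pd.DataFrame({"x": [0, 1, 2, 3, 4, 5], "y": [0, 22, 34, 21, 21, 92]})

# Positional selection: column 1 holds the y values, column 0 the x values.
area = trapezoid(df.iloc[:, 1].to_numpy(), x=df.iloc[:, 0].to_numpy())
print(area)  # 144.0
```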
Upvotes: 1
Reputation: 105
Since integrate is a SciPy module, not a pandas DataFrame method, you need to import the integration functions from it and call them on your data:
from scipy.integrate import trapz, simps
print(trapz(*args))
https://docs.scipy.org/doc/scipy/reference/tutorial/integrate.html
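To make the *args placeholder concrete, here is a hedged sketch that passes the question's values to both rules. It assumes a SciPy version where the functions are named trapezoid and simpson (the newer names for trapz and simps), and it passes x as a keyword argument.

```python
import numpy as np
from scipy.integrate import simpson, trapezoid

# Values from the question: first column as x, second column as y.
x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = np.array([0, 22, 34, 21, 21, 92], dtype=float)

trap_area = trapezoid(y, x=x)  # straight lines between neighbouring samples
simp_area = simpson(y, x=x)    # Simpson's rule (piecewise quadratic fit)
print(trap_area, simp_area)
```

The two results will generally differ slightly, because Simpson's rule fits curves through the samples instead of straight lines.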
Upvotes: 0