thereiswaldo

Reputation: 87

Python integration of Pandas dataframe

I have the following pandas dataframe df with 2 columns, which looks like:

0  0
1  22
2  34
3  21
4  21
5  92

I would like to compute the area under this curve, taking the first column as the x-axis and the second column as the y-axis. I tried the integrate module from SciPy (from scipy import integrate) and applied it as follows, based on examples I have seen online:

print(df.integrate)

However, it seems the integrate function does not work. I'm receiving the error:

AttributeError: 'DataFrame' object has no attribute 'integrate'

How would I go about this?

Thank you

Upvotes: 0

Views: 3693

Answers (3)

Rishin Rahim

Reputation: 655

Try this

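import pandas as pd
import numpy as np

def integrate(x, y):
    # Trapezoidal rule over the sampled points.
    # Note: np.trapz was renamed to np.trapezoid in NumPy 2.0.
    area = np.trapz(y=y, x=x)
    return area

# Example data; replace with your own columns
df = pd.DataFrame({'x': [0, 1, 2, 3, 4, 4, 5], 'y': [0, 1, 3, 3, 5, 6, 7]})
x = df.x.values
y = df.y.values
print(integrate(x, y))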

Upvotes: 0

James

Reputation: 36608

You want numerical integration over a fixed sample of data. The SciPy package lists a handful of methods for this: https://docs.scipy.org/doc/scipy/reference/integrate.html#integrating-functions-given-fixed-samples

For your data, the trapezoidal rule is probably the most straightforward. You pass the y values and then the x values to the function. You did not post the column names of your data frame, so I am using the column at position 0 for the x values and the column at position 1 for the y values.

# scipy.integrate.trapz was renamed to trapezoid and removed in SciPy 1.14
from scipy.integrate import trapezoid

trapezoid(df.iloc[:, 1], df.iloc[:, 0])
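
To make it clearer what the trapezoidal rule computes, here is a minimal pure-Python sketch that does not rely on SciPy. The function name trapezoid_area is hypothetical, and the x and y lists are the values from the question, assuming the first column is x and the second is y.

```python
def trapezoid_area(x, y):
    """Area under the piecewise-linear curve through the points (x[i], y[i])."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    area = 0.0
    for i in range(1, len(x)):
        width = x[i] - x[i - 1]              # horizontal distance between samples
        mean_height = (y[i] + y[i - 1]) / 2  # average of the two endpoint heights
        area += width * mean_height
    return area


# Values from the question (assumed: first column is x, second is y)
x = [0, 1, 2, 3, 4, 5]
y = [0, 22, 34, 21, 21, 92]
print(trapezoid_area(x, y))  # 144.0
```

Each interval contributes one trapezoid, so the result is exact for straight-line segments between the samples and only an approximation of whatever smooth curve the data may represent.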

Upvotes: 1

drew_psy

Reputation: 105

integrate is a SciPy module, not a pandas DataFrame method, so you need to import a function from it and pass your data to that function:

from scipy.integrate import trapezoid, simpson  # formerly trapz and simps
print(trapezoid(df.iloc[:, 1], x=df.iloc[:, 0]))

https://docs.scipy.org/doc/scipy/reference/tutorial/integrate.html
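
Because pandas has no integrate method of its own, one alternative is to write the trapezoidal sum directly with pandas operations. The sketch below is only an illustration: the column names "x" and "y" are assumed (the question does not give any), and the values are the ones from the question.

```python
import pandas as pd

# Data from the question; the column names "x" and "y" are assumed.
df = pd.DataFrame({"x": [0, 1, 2, 3, 4, 5], "y": [0, 22, 34, 21, 21, 92]})

# Width of each interval and the mean height of its two endpoints.
widths = df["x"].diff()
mean_heights = (df["y"] + df["y"].shift()) / 2

# The first row has no left neighbour, so its product is NaN and sum() skips it.
area = (widths * mean_heights).sum()
print(area)  # 144.0
```

This should give the same value as the trapezoidal functions in NumPy and SciPy for this data, and it keeps everything in pandas if you would rather not add another dependency.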

Upvotes: 0
