halfmpty
halfmpty

Reputation: 3

Python Riemann Sum does not yield zero for equal positive and negative areas

I wrote a program to approximate an integral using a Riemann sum and graph it using matplotlib in Python. For functions with equal areas above and below the x-axis, the resulting area should be zero, but my program outputs a very small number instead.

The following code graphs the odd function f(x) = x^3 from -1 to 1, so the area should be zero. My code instead approximates it to be 1.68065561477562 e^-15.

What is causing this? Is it a rounding error in delta_x, x, or y? I know I could just round the value to zero, but I'm wondering if there is another problem or way to solve this.

I have tried using the Decimal.decimal class for delta_x, but I just got an even smaller number.

The Python code:

import matplotlib.pyplot as plt
import numpy as np

# Approximates and graphs integral using Riemann Sum


# example function: f(x) = x^3
def f_x(x):
    return x**3

# integration range from a to b with n rectangles
a, b, n = -1, 1, 1000

# calculate delta x, list of x-values, list of y-values, and approximate area under curve
delta_x = (b - a) / n

x = np.arange(a, b+delta_x, delta_x)

y = [f_x(i) for i in x]

area = sum(y) * delta_x

# graph using matplotlib
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x, y)
ax.bar(x, y, delta_x, alpha=.5)
plt.title('a={}, b={}, n={}'.format(a, b, n))
plt.xlabel('A = {}'.format(area))
plt.show()

Upvotes: 0

Views: 1483

Answers (3)

ImportanceOfBeingErnest
ImportanceOfBeingErnest

Reputation: 339250

You need to be aware that what you are computing is not a Riemann integral in the original sense. You are dividing the interval into n bins, but then sum over n+1 bins (here n = 1000 but len(x) == 1001). So the result may be close to what you expect, but it is certainly not a good way to get there.

Using the Riemann sum you would divide your interval into n bins, and then sum over the values of those n bins. You have the choice whether to compute the left Riemann sum, the right Riemann sum, or possibly taking the midpoints.

import numpy as np

def f_x(x):
    return x**3

# integration range from a to b with n rectangles
a, b, n = -1, 1, 1000

delta_x = (b - a) / float(n)

x_left = np.arange(a, b, delta_x)
x_right = np.arange(a+delta_x, b+delta_x, delta_x)
x_mid = np.arange(a+delta_x/2., b+delta_x/2., delta_x)

print len(x_left),  len(x_right), len(x_mid)          ### 1000 1000 1000


area_left = f_x(x_left).sum() * delta_x
area_right = f_x(x_right).sum() * delta_x
area_mid = f_x(x_mid).sum() * delta_x

print area_left     # -0.002
print area_right    #  0.002
print area_mid      # 1.81898940355e-15

While the midpoint sum already gives a good result, for symmetric functions it would be a good idea to choose n even, and take the average of the left and right sum,

print 0.5*(area_right+area_left)   # 1.76204537072e-15

This is equally close to 0.

Now it is worthwhile noting that numpy.arange produces some errors by itself. A better choice would be using numpy.linspace

x_left = np.linspace(a, b-delta_x, n)
x_right = np.linspace(a+delta_x, b, n)
x_mid = np.linspace(a+delta_x/2., b-delta_x/2., n)

yielding

print area_left     # -0.002
print area_right    #  0.002
print area_mid      # 8.52651282912e-17
print 0.5*(area_right+area_left)   # 5.68121938382e-17

5.68121938382e-17 is pretty close to 0. The reason why it is not entirely 0 is indeed floating point inaccuracies.

The famous example of that would be 0.1 + 0.2 - 0.3 which results in 5.5e-17 instead of 0. This is to show that this simply operation introduces the same error of the order of 1e-17 as the Riemann integration.

Upvotes: 1

eugenhu
eugenhu

Reputation: 1238

Yes, this is just due to floating point inaccuracy.

For constructing the partition, np.linspace() may be more appropriate since np.arange() may or may not include the endpoint depending on how it's rounded when using non-integer step sizes.

From numpy.arange() documentation:

When using a non-integer step, such as 0.1, the results will often not be consistent. It is better to use linspace for these cases.

Upvotes: 1

MattS
MattS

Reputation: 138

Your code seems to be valid to me. It sums the y returned by f_x, and accounts for the approximation error of having only 1000 subintervals by adding delta_x to the second argument of arange. Unfortunately, I think that roundoff error is at play here.

Upvotes: 0

Related Questions