Reputation: 5791
I convert an oscilloscope dataset with millions of values into a pandas DataFrame. The next step is to plot it, but on my fairly powerful machine Matplotlib needs ~50 seconds to plot the DataFrame.
import pandas as pd
import matplotlib.pyplot as plt
import readTrc
datX, datY, m = readTrc.readTrc('C220180104_ch2_UHF00000.trc')
srx, sry = pd.Series(datX), pd.Series(datY)
df = pd.concat([srx, sry], axis = 1)
df.set_index(0, inplace = True)
df.plot(grid = 1)
plt.show()
I then found out that Matplotlib can be made faster on large datasets by switching to the 'Agg' backend.
import matplotlib
matplotlib.use('Agg')
import pandas as pd
import matplotlib.pyplot as plt
import readTrc
datX, datY, m = readTrc.readTrc('C220180104_ch2_UHF00000.trc')
srx, sry = pd.Series(datX), pd.Series(datY)
df = pd.concat([srx, sry], axis = 1)
df.set_index(0, inplace = True)
df.plot(grid = 1)
plt.show()
Rendering the plot now takes only ~5 seconds (a big improvement), but unfortunately no plot is shown. Is this method not compatible with pandas?
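For context on the behaviour described above: Agg is a non-interactive Matplotlib backend that renders into an in-memory buffer and opens no window, so `plt.show()` has nothing to display regardless of whether pandas is involved. A minimal sketch of writing the figure to a file instead, assuming a synthetic signal in place of the `.trc` data (the column name `signal` and the output file name `trace_agg.png` are hypothetical):

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: renders to memory, opens no window

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the oscilloscope trace (assumption: the real
# values would come from readTrc.readTrc as in the snippets above).
n_points = 200_000
dat_x = np.linspace(0.0, 1.0, n_points)
dat_y = np.sin(2 * np.pi * 50 * dat_x)

# x values become the index, as with df.set_index(0) in the question.
df = pd.DataFrame({"signal": dat_y}, index=dat_x)

# pandas draws through Matplotlib and returns the Axes object.
ax = df.plot(grid=True)
fig = ax.get_figure()

# With Agg there is no window, so save the rendered figure to disk.
out_path = os.path.join(tempfile.gettempdir(), "trace_agg.png")  # hypothetical path
fig.savefig(out_path, dpi=100)
plt.close(fig)
```

The saved PNG can then be opened in any image viewer. If an interactive window is needed, an interactive backend has to be selected instead of Agg.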
Upvotes: 0
Views: 7330
Reputation: 51
You can use Plotly and Lenspy (which was built to solve this exact problem). Here is an example of how you can plot 10 million points on a scatter plot. This plot runs super fast on my 2016 MacBook.
import numpy as np
import plotly.graph_objects as go
from lenspy import DynamicPlot
# First, let's create a very large figure
x = np.arange(1, 11, 1e-6)
y = 1e-2*np.sin(1e3*x) + np.sin(x) + 1e-3*np.sin(1e10*x)
fig = go.Figure(data=[go.Scattergl(x=x, y=y)])
fig.update_layout(title=f"{len(x):,} Data Points.")
# Use DynamicPlot.show to view the plot
plot = DynamicPlot(fig)
plot.show()
# Plot will be available in the browser at http://127.0.0.1:8050/
For your use case (again, I cannot test this since I don’t have access to your dataset):
import pandas as pd
import readTrc
from lenspy import DynamicPlot
import plotly.graph_objects as go
datX, datY, m = readTrc.readTrc('C220180104_ch2_UHF00000.trc')
srx, sry = pd.Series(datX), pd.Series(datY)
fig = go.Figure(data=[go.Scattergl(x=srx, y=sry)])
fig.update_layout(title=f"{len(srx):,} Data Points.")
# Use DynamicPlot.show to view the plot
plot = DynamicPlot(fig)
plot.show()
Disclaimer: I am the creator of Lenspy
Upvotes: 1