Jaffer Wilson

Reputation: 7273

Not getting the proper graph comparison using Python

I am trying to compare two CSV files and find their points of intersection. I am plotting both as graphs to understand them better.
But one graph appears heavily flattened compared to the other.
See the following:

Here is the data: trade-volume.csv
Here is the real graph:
image 1

Here is the data: miners-revenue.csv
Here is the real graph:
image 2

Here is the program I wrote for comparison:

import pandas as pd
import matplotlib.pyplot as plt


dat2 = pd.read_csv("trade-volume.csv", parse_dates=['time'])
dat3 = pd.read_csv("miners-revenue.csv", parse_dates=['time'])


# Days elapsed since the first row; .dt.days also works on pandas 2.x,
# where .astype('timedelta64[D]') raises a TypeError.
dat2['timeDiff'] = (dat2['time'] - dat2['time'][0]).dt.days
dat3['timeDiff'] = (dat3['time'] - dat3['time'][0]).dt.days

fig, ax = plt.subplots()

ax.plot(dat2['timeDiff'], dat2['Value'])
ax.plot(dat3['timeDiff'], dat3['Value'])

plt.show()

I got the following output:
image 3

As you can see, the orange curve sits so close to zero that I cannot read its values. I would like to overlay the two graphs so that I can compare them.

Please help me achieve this with my existing code, changing as little of it as possible.

Upvotes: 0

Views: 39

Answers (2)

DavidG

Reputation: 25362

The problem comes down to your y axis. One dataset has a maximum of about 60,000,000 while the other has a maximum of about 6,000,000,000, a factor of 100 apart. Plotting both on the same linear axis makes the smaller one look like a flat line, even though it is not if you zoom in.
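
To put numbers on this, here is a small sketch using only the two approximate maxima quoted above (the variable names are made up for illustration). It shows how little of a shared linear axis the smaller series can occupy:

```python
small_max = 60_000_000       # approximate maximum quoted above
large_max = 6_000_000_000    # approximate maximum quoted above

# How many times larger the big series is at its peak.
ratio = large_max / small_max

# On a shared linear y axis running from 0 to large_max, the smaller
# series can reach at most this fraction of the axis height.
height_fraction = small_max / large_max

print(f"scale ratio: {ratio:.0f}x")
print(f"smaller series uses at most {height_fraction:.0%} of the axis height")
```

Matplotlib adds a small margin around the data by default, so the real fraction on screen is slightly smaller still.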

A possible solution is to use a second y axis (you can change the color of the lines using the color= argument in ax.plot()):

import pandas as pd
import matplotlib.pyplot as plt

dat2 = pd.read_csv("trade-volume.csv", parse_dates=['time'])
dat3 = pd.read_csv("miners-revenue.csv", parse_dates=['time'])

# Days elapsed since the first row; .dt.days also works on pandas 2.x,
# where .astype('timedelta64[D]') raises a TypeError.
dat2['timeDiff'] = (dat2['time'] - dat2['time'][0]).dt.days
dat3['timeDiff'] = (dat3['time'] - dat3['time'][0]).dt.days

fig, ax = plt.subplots()

ax.plot(dat2['timeDiff'], dat2['Value'], color="blue")

ax2=ax.twinx()
ax2.plot(dat3['timeDiff'], dat3['Value'], color="red")

plt.show()
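
With two y axes it is easy to lose track of which scale belongs to which line. One way to handle that, sketched below, is to color each axis like its line and build a single legend. The helper name plot_twin_axes and the synthetic series are made up for illustration; they stand in for the two CSV files, which are not reproduced here.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_twin_axes(left, right, left_label, right_label):
    """Plot two series over a shared x axis with independent y axes."""
    fig, ax_left = plt.subplots()
    ax_right = ax_left.twinx()  # shares x with ax_left, has its own y scale

    (line_left,) = ax_left.plot(left.index, left.values,
                                color="blue", label=left_label)
    (line_right,) = ax_right.plot(right.index, right.values,
                                  color="red", label=right_label)

    # Color each y axis like its line so the matching scale is obvious.
    ax_left.set_ylabel(left_label, color="blue")
    ax_right.set_ylabel(right_label, color="red")
    ax_left.tick_params(axis="y", colors="blue")
    ax_right.tick_params(axis="y", colors="red")

    # One legend covering the lines from both axes.
    ax_left.legend(handles=[line_left, line_right], loc="upper left")
    return fig, ax_left, ax_right


# Synthetic stand-ins for the two CSV files (made-up numbers).
days = np.arange(365)
trade_volume = pd.Series(6e9 * (0.5 + 0.5 * np.sin(days / 40.0)), index=days)
miners_revenue = pd.Series(6e7 * (0.5 + 0.5 * np.cos(days / 55.0)), index=days)

fig, ax_left, ax_right = plot_twin_axes(
    trade_volume, miners_revenue, "trade volume", "miners revenue"
)
```

Call plt.show() afterwards to display the figure. Keep in mind that where the two lines cross on such a plot depends on the two axis ranges, so the crossings are a property of the drawing rather than of the data.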

Upvotes: 3

ImportanceOfBeingErnest

Reputation: 339170

The two datasets live on very different scales. You can normalize both, here by dividing each by its own maximum, in order to compare them.

import pandas as pd
import matplotlib.pyplot as plt


dat2 = pd.read_csv("trade-volume.csv", parse_dates=['time'])
dat3 = pd.read_csv("miners-revenue.csv", parse_dates=['time'])


# Days elapsed since the first row; .dt.days also works on pandas 2.x,
# where .astype('timedelta64[D]') raises a TypeError.
dat2['timeDiff'] = (dat2['time'] - dat2['time'][0]).dt.days
dat3['timeDiff'] = (dat3['time'] - dat3['time'][0]).dt.days

fig, ax = plt.subplots()

ax.plot(dat2['timeDiff'], dat2['Value']/dat2['Value'].values.max())
ax.plot(dat3['timeDiff'], dat3['Value']/dat3['Value'].values.max())

plt.show()
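
Since the question asks about points of intersection, here is a minimal sketch of how crossings of the two normalized curves could be located. It assumes both series are sampled at the same time points (otherwise they would need to be aligned first, for example by merging on the time column). The helper names normalize_by_max and crossing_positions and the sample numbers are made up for illustration.

```python
import numpy as np
import pandas as pd


def normalize_by_max(values):
    """Scale a series so that its largest value becomes 1."""
    return values / values.max()


def crossing_positions(a, b):
    """Return positions i where a - b changes sign between i and i + 1.

    Points where the curves merely touch (difference exactly 0) are
    not reported by this simple check.
    """
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    sign = np.sign(diff)
    return np.nonzero(sign[:-1] * sign[1:] < 0)[0]


# Made-up sample values on very different scales, same time points.
large = pd.Series([1.5e9, 3.0e9, 4.5e9, 6.0e9])
small = pd.Series([6.0e7, 4.5e7, 3.0e7, 1.5e7])

crossings = crossing_positions(normalize_by_max(large),
                               normalize_by_max(small))
print(crossings)  # the curves cross between positions 1 and 2
```

The crossings found this way depend on the chosen normalization. Dividing by the maximum is one option among several, and a different scaling would move or remove the intersection points.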

Upvotes: 3
