Reputation: 49
I have an excel file and I wanna plot a specific range of rows in a specific column I could get the access to the column and rows that I wanted but whenever I run my code this error message turns up
x and y must have same first dimension, but have shapes (23,) and (1,) the first row of the consumption sheet is :
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import math
start_date = "2017-07-24 00:00:00"
end_date = "2019-03-09 23:00:00"
data_generation2 = pd.read_excel(r"C:\\Users\\user\\Desktop\\Master\\Thesis\\Tarek\\Parent.xlsx", index_col="Timestamp")
data_consumption2 = pd.read_excel(r"C:\\Users\\user\\Desktop\\Master\\Thesis\\Tarek\\Parent.xlsx", index_col="Timestamp", sheetname="Consumption")
data_generation = data_generation2.loc[start_date:end_date]
data_consumption = data_consumption2.loc[start_date:end_date]
data_consumption2["Timestamp"] =
pd.to_datetime(data_consumption2["Timestamp"], format="%d/%m/%y %H%M:%S")
df_to_plot = data_consumption2.loc[(data_consumption2.Timestamp >= start_date) & (data_consumption2.Timestamp < end_date)]
df_to_plot.set_index(df_to_plot.Timestamp)
df_to_plot = df_to_plot[df_to_plot.columns[1:]]
I wanna the output to be the plot of the first 720 rows in the column 2053G
with respect to the time column.
Upvotes: 0
Views: 169
Reputation: 5502
I will try to give you some tips.
Once you have read your data, you should transform the date
column (string) to a datetime
format. The to_datetime
function do the job (doc).
Then, you need to filter by date
. As you have done, you can use loc
to do that.
Now, we want to plot the dataframe. Pandas module provides a plot
method (doc). Let's try to use it. By default, it will plot all columns. In our case, we want to plot consumption over date. One way to do is to transform the date
columns as an index. The set_index
method offer you this possibily (doc).
Once it's done, you are ready to make the plot !
Here is an exemple assuming having the following dataframe:
import pandas as pd
import matplotlib.pyplot as plt
start_date = "2019-01-01 00:00:00"
end_date = "2019-01-02 00:00:00"
data_generation2 = pd.read_excel("your_path")
# transform date column (type string) to datetime type
data_generation2["Date"] = pd.to_datetime(data_generation2["Date"], format="%d/%m/%y %H%M:%S")
print(data_generation2)
# Date House1 House2 House3 House4 ... House17 House18 House19 House20
# 0 2019-01-01 00: 00: 00 5.087228 6.416912 0.997723 4.094832 ... 0.462354 5.105688 7.504992 4.688377
# 1 2019-01-01 00: 01: 00 5.343098 3.205145 8.874018 0.567703 ... 5.593759 3.584948 5.503817 7.027469
# 2 2019-01-01 00: 02: 00 5.894551 6.411896 3.208524 6.582306 ... 9.149066 5.485656 0.820700 6.391030
# 3 2019-01-01 00: 03: 00 5.674840 6.558950 3.043355 6.759310 ... 2.270176 3.513524 1.833294 9.889432
# 4 2019-01-01 00: 04: 00 9.022593 4.088398 6.082084 8.095797 ... 9.270887 4.947194 2.505969 7.396191
# 5 2019-01-01 00: 05: 00 9.217690 1.950666 5.684551 7.782337 ... 6.990156 1.430377 2.269080 7.343031
# ... ...
# 715 2019-01-30 01: 09: 00 6.892563 0.553806 0.744839 3.151984 ... 9.418145 2.575782 0.801205 4.290434
# 716 2019-01-30 02: 00: 00 0.894091 8.872986 8.445927 2.781972 ... 0.832707 4.264321 6.972225 4.155494
# 717 2019-01-30 02: 01: 00 0.079396 5.270916 0.452875 5.248232 ... 5.331906 4.026245 6.598595 1.637325
# 718 2019-01-30 02: 02: 00 5.287946 4.745940 8.479177 7.713943 ... 0.642650 9.735159 2.499371 7.824247
# 719 2019-01-30 02: 03: 00 1.376087 6.262784 5.621571 4.937615 ... 0.801079 5.506654 6.878042 2.430062
# [720 rows x 23 columns]
# Filter according the date interval
df_to_plot = data_generation2.loc[(data_generation2.Date >= start_date) & (data_generation2.Date < end_date)]
# Transform the columns of date as index
df_to_plot.set_index(df_to_plot.Date)
df_to_plot = df_to_plot[df_to_plot.columns[1:]] # Remove date columns
print(df_to_plot)
# [720 rows x 21 columns]
# House1 House2 House3 ... House17 House18 House19 House20
# 0 5.087228 6.416912 0.997723 ... 0.462354 5.105688 7.504992 4.688377
# 1 5.343098 3.205145 8.874018 ... 5.593759 3.584948 5.503817 7.027469
# 2 5.894551 6.411896 3.208524 ... 9.149066 5.485656 0.820700 6.391030
# 3 5.674840 6.558950 3.043355 ... 2.270176 3.513524 1.833294 9.889432
# 4 9.022593 4.088398 6.082084 ... 9.270887 4.947194 2.505969 7.396191
# 5 9.217690 1.950666 5.684551 ... 6.990156 1.430377 2.269080 7.343031
# ...
# 20 8.908322 7.452523 9.046442 ... 9.786092 6.370266 5.635464 8.560888
# 21 7.666228 1.294614 2.034965 ... 0.091431 3.210088 9.038868 1.492585
# 22 8.266986 6.209395 9.689687 ... 6.500988 6.133358 2.237199 7.289748
# 23 2.687560 6.852061 5.136214 ... 0.815944 8.267767 0.244232 7.705181
# [24 rows x 20 columns]
# create a figure (2 x 2)
fig, axes = plt.subplots(nrows=2, ncols=2)
# Daily chart
ax = df_to_plot.plot(ax=axes[0, 0], title="Daily consumption", legend=False)
ax.set_xlabel("Days")
ax.set_ylabel("Consumption")
# Cumulative chart
ax = df_to_plot.sort_index().cumsum().plot(ax=axes[0, 1], title="Cumuative consumption")
ax.set_xlabel("Days")
ax.set_ylabel("Cumulative consumption")
ax.legend(ncol=3)
# Histogram chart house 1
ax = df_to_plot.House1.plot(ax=axes[1, 0], kind="bar",
title="House 1 daily consumption", rot=1)
ax.set_xlabel("Days")
ax.set_ylabel("Consumption")
# Pie chart
df_to_plot.sum().sort_values().plot(
ax=axes[1, 1], kind="pie", title="Pie chart consumption")
# Main title
fig.suptitle("From " + start_date +' to ' +end_date + ' summary')
plt.show()
The main task you have to do is to rebuilt your dataframe. I can not to it for you. Once it's done, the previous code can give you some ideas.
Upvotes: 1