Reputation: 115
So i have the following (i'm working with pandas btw.) when i print the head. As you can see i already get back the first entry of every day.
first
Timestamp
2021-03-11 596319.8125
2021-03-12 596349.5625
2021-03-13 602205.1875
2021-03-14 609445.5625
2021-03-15 609479.6250
And now i want to get the yield from every day, which means i have to subtract the value (in this case Energy = first) of day 1 from day 2, and than the value of day 2 from day 3 and so on and so forth.
Here's my code so far, i know this only can be a noob problem but i'm very new to programming so i'm thankful for every answer.
import os
import glob
import pandas as pd
# set working directory
os.chdir("Path to CSVs")
# find all csv files in the folder
# use glob pattern matching -> extension = 'csv'
# save result in list -> all_filenames
extension = 'csv'
all_filenames = [i for i in glob.glob('*.{}'.format(extension))]
# print(all_filenames)
# combine all files in the list
combined_csv = pd.concat([pd.read_csv(f, sep=';') for f in all_filenames])
# Format CSV
# Transform Timestamp column into datetime
combined_csv['Timestamp'] = pd.to_datetime(combined_csv.Timestamp)
# Read out first entry of every day of every month
combined_csv = combined_csv.resample('D', on='Timestamp')['HtmDht_Energy'].agg(['first'])
# To get the yield of day i have to subtract day 2 HtmDht_Energy - day 1 HtmDht_Energy
#
#
#
#
print(combined_csv.head())
# export as excel file
# combined_csv.to_excel('energy_data.xlsx', index=False)
After i tried the solution from below with the following code:
import os
import glob
import pandas as pd
# set working directory
os.chdir("C:/Users/winklerm/OneDrive - SOLID Solar Energy Systems GmbH/Desktop/CSVs")
# find all csv files in the folder
# use glob pattern matching -> extension = 'csv'
# save result in list -> all_filenames
extension = 'csv'
all_filenames = [i for i in glob.glob('*.{}'.format(extension))]
# print(all_filenames)
# combine all files in the list
combined_csv = pd.concat([pd.read_csv(f, sep=';') for f in all_filenames])
# Format CSV
# Transform Timestamp column into datetime
combined_csv['Timestamp'] = pd.to_datetime(combined_csv.Timestamp)
# Read out first entry of every day of every month
combined_csv = combined_csv.resample('D', on='Timestamp')['HtmDht_Energy'].agg(['first'])
# To get the yield of day i have to subtract day 2 HtmDht_Energy - day 1 HtmDht_Energy
combined_csv.assign(tagesertrag=combined_csv["first"]-combined_csv["first"].shift())
#
#
#
print(combined_csv.head())
# export as excel file
# combined_csv.to_excel('energy_data.xlsx', index=False)
But my print still looks like this:
first
Timestamp
2021-03-11 596319.8125
2021-03-12 596349.5625
2021-03-13 602205.1875
2021-03-14 609445.5625
2021-03-15 609479.6250
Process finished with exit code 0
Upvotes: 0
Views: 59
Reputation: 31226
This is a simple case of subtracting shift()
value
df = pd.read_csv(io.StringIO(""" Timestamp first
2021-03-11 596319.8125
2021-03-12 596349.5625
2021-03-13 602205.1875
2021-03-14 609445.5625
2021-03-15 609479.6250"""), sep="\s+")
df.assign(yieldc=df["first"]-df["first"].shift())
Timestamp | first | yieldc | |
---|---|---|---|
0 | 2021-03-11 | 596320 | nan |
1 | 2021-03-12 | 596350 | 29.75 |
2 | 2021-03-13 | 602205 | 5855.62 |
3 | 2021-03-14 | 609446 | 7240.38 |
4 | 2021-03-15 | 609480 | 34.0625 |
Upvotes: 1