How to iterate over every data entry and subtract each one from another? [Python]

Question

So i have the following (i'm working with pandas btw.) when i print the head. As you can see i already get back the first entry of every day.

                     first
Timestamp              
2021-03-11  596319.8125
2021-03-12  596349.5625
2021-03-13  602205.1875
2021-03-14  609445.5625
2021-03-15  609479.6250

And now i want to get the yield from every day, which means i have to subtract the value (in this case Energy = first) of day 1 from day 2, and than the value of day 2 from day 3 and so on and so forth.

Here's my code so far, i know this only can be a noob problem but i'm very new to programming so i'm thankful for every answer.

  import os
    import glob
    import pandas as pd
    
    # set working directory
    os.chdir("Path to CSVs")
    
    # find all csv files in the folder
    # use glob pattern matching -> extension = 'csv'
    # save result in list -> all_filenames
    extension = 'csv'
    all_filenames = [i for i in glob.glob('*.{}'.format(extension))]
    # print(all_filenames)
    
    # combine all files in the list
    combined_csv = pd.concat([pd.read_csv(f, sep=';') for f in all_filenames])
    # Format CSV
    # Transform Timestamp column into datetime
    combined_csv['Timestamp'] = pd.to_datetime(combined_csv.Timestamp)
    # Read out first entry of every day of every month
    combined_csv = combined_csv.resample('D', on='Timestamp')['HtmDht_Energy'].agg(['first'])
    # To get the yield of day i have to subtract day 2 HtmDht_Energy - day 1 HtmDht_Energy
    #
    #
    #
    #
    print(combined_csv.head())
    # export as excel file
    # combined_csv.to_excel('energy_data.xlsx', index=False)

After i tried the solution from below with the following code:

 import os
    import glob
    import pandas as pd
    
    # set working directory
    os.chdir("C:/Users/winklerm/OneDrive - SOLID Solar Energy Systems GmbH/Desktop/CSVs")
    
    # find all csv files in the folder
    # use glob pattern matching -> extension = 'csv'
    # save result in list -> all_filenames
    extension = 'csv'
    all_filenames = [i for i in glob.glob('*.{}'.format(extension))]
    # print(all_filenames)
    
    # combine all files in the list
    combined_csv = pd.concat([pd.read_csv(f, sep=';') for f in all_filenames])
    # Format CSV
    # Transform Timestamp column into datetime
    combined_csv['Timestamp'] = pd.to_datetime(combined_csv.Timestamp)
    # Read out first entry of every day of every month
    combined_csv = combined_csv.resample('D', on='Timestamp')['HtmDht_Energy'].agg(['first'])
    # To get the yield of day i have to subtract day 2 HtmDht_Energy - day 1 HtmDht_Energy
    combined_csv.assign(tagesertrag=combined_csv["first"]-combined_csv["first"].shift())
    #
    #
    #
    print(combined_csv.head())
    # export as excel file
    # combined_csv.to_excel('energy_data.xlsx', index=False)

But my print still looks like this:

                  first
Timestamp              
2021-03-11  596319.8125
2021-03-12  596349.5625
2021-03-13  602205.1875
2021-03-14  609445.5625
2021-03-15  609479.6250

Process finished with exit code 0

Rob Raymond · Accepted Answer

This is a simple case of subtracting shift() value

df = pd.read_csv(io.StringIO(""" Timestamp                    first
2021-03-11  596319.8125
2021-03-12  596349.5625
2021-03-13  602205.1875
2021-03-14  609445.5625
2021-03-15  609479.6250"""), sep="\s+")

df.assign(yieldc=df["first"]-df["first"].shift())

	Timestamp	first	yieldc
0	2021-03-11	596320	nan
1	2021-03-12	596350	29.75
2	2021-03-13	602205	5855.62
3	2021-03-14	609446	7240.38
4	2021-03-15	609480	34.0625

How to iterate over every data entry and subtract each one from another? [Python]

Answers (1)

Related Questions