I have a dataframe, myDF, one column of which I wish to set to zero using a combination of conditions from other columns and indexing with a second dataframe, criteriaDF.
myDF.head():
DateTime GrossPowerMW USDateTime_string DateTime_timestamp \
0 01/01/1998 00:00 17.804 01/01/1998 00:00 1998-01-01 00:00:00
1 01/01/1998 01:00 18.751 01/01/1998 01:00 1998-01-01 01:00:00
2 01/01/1998 02:00 20.501 01/01/1998 02:00 1998-01-01 02:00:00
3 01/01/1998 03:00 22.222 01/01/1998 03:00 1998-01-01 03:00:00
4 01/01/1998 04:00 24.437 01/01/1998 04:00 1998-01-01 04:00:00
Month Day Hour GrossPowerMW_Shutdown
0 1 3 0 17.804
1 1 3 1 18.751
2 1 3 2 20.501
3 1 3 3 22.222
4 1 3 4 24.437
criteriaDF:
STARTTIME ENDTIME
Month
1 9.0 12.0
2 9.0 14.0
3 9.0 14.0
4 9.0 14.0
5 9.0 13.0
6 9.0 14.0
7 9.0 13.0
8 9.0 12.0
9 9.0 14.0
10 9.0 13.0
11 9.0 13.0
12 9.0 11.0
myDF is then run through the following for loop:
month = 1
for month in range(1, 13):
    shutdown_hours = range(int(criteriaDF.iloc[month]['STARTTIME']), int(criteriaDF.iloc[month]['ENDTIME']))
    myDF.loc[(myDF["Month"].isin([month])) & (myDF["Hour"].isin(shutdown_hours)) & (myDF["Day"].isin(shutdown_days)), "GrossPowerMW_Shutdown"] *= 0
    month = month + 1
This gives the below error:
Traceback (most recent call last):
File "", line 1, in runfile('myscript.py', wdir='C:myscript')
File "C:\ProgramData\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 880, in runfile execfile(filename, namespace)
File "C:\ProgramData\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 87, in execfile exec(compile(scripttext, filename, 'exec'), glob, loc)
File "myscript.py", line 111, in gross_yield, curtailed_yield, shutdown_loss, df_testing = calculate_loss(input_file, input_shutdownbymonth, shutdown_days) #Returning df for testing/interrogation only. Delete once finished.
File "myscript.py", line 79, in calculate_loss shutdown_hours = range(int(criteriaDF.iloc[month]['STARTTIME']), int(criteriaDF.iloc[month]['ENDTIME']))
File "C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 1328, in __getitem__ return self._getitem_axis(key, axis=0)
File "C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 1749, in _getitem_axis self._is_valid_integer(key, axis)
File "C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 1638, in _is_valid_integer raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
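For reference, the same error can be reproduced in isolation. The sketch below is only an illustration: `criteria` is a hypothetical stand-in for criteriaDF, built on the assumption that its index holds the labels 1 to 12 (as the printout above suggests). It contrasts positional lookup with `.iloc` against label lookup with `.loc`.

```python
import pandas as pd

# Hypothetical stand-in for criteriaDF: 12 rows, index labelled 1..12 ("Month").
criteria = pd.DataFrame(
    {
        "STARTTIME": [9.0] * 12,
        "ENDTIME": [12.0, 14.0, 14.0, 14.0, 13.0, 14.0,
                    13.0, 12.0, 14.0, 13.0, 13.0, 11.0],
    },
    index=pd.Index(range(1, 13), name="Month"),
)

# .iloc is positional: with 12 rows the valid positions are 0..11,
# so position 12 does not exist and an IndexError is raised.
try:
    criteria.iloc[12]
    iloc_error = None
except IndexError as exc:
    iloc_error = str(exc)

# .loc is label-based: label 12 is the last row (December).
december_end = criteria.loc[12, "ENDTIME"]

# .iloc[1] returns the second row, whose label is 2 (February), not January.
second_row_label = criteria.iloc[1].name
```

If that assumption holds, it would also mean that `range(0, 12)` with `.iloc` runs without error but pairs each month number with the following month's hours.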
However, the script works if I set
month = 0
for month in range(0, 12):
However, this does not match my dataframe's 'Month' column, which runs from 1 to 12, not 0 to 11.
To confirm, my understanding is that
range (1, 13)
returns
[1,2,3,4,5,6,7,8,9,10,11,12].
I have also tried manually running the body of the for loop line by line with month = 12. So I am uncertain why using month in range(1, 13) is not working, noting that 12 is the highest integer in the list range(1, 13).
What is the error in my code or my approach?
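One possible reading, offered as a sketch rather than a confirmed answer: if criteriaDF is indexed by the month labels 1 to 12, looking rows up by label with `.loc` keeps the loop on `range(1, 13)`. The data below is made up for illustration. `my_df`, `criteria` and the value of `shutdown_days` are hypothetical stand-ins, since `shutdown_days` is not shown in the question.

```python
import pandas as pd

# Hypothetical stand-in for criteriaDF, indexed by month labels 1..12.
criteria = pd.DataFrame(
    {
        "STARTTIME": [9.0] * 12,
        "ENDTIME": [12.0, 14.0, 14.0, 14.0, 13.0, 14.0,
                    13.0, 12.0, 14.0, 13.0, 13.0, 11.0],
    },
    index=pd.Index(range(1, 13), name="Month"),
)

# Hypothetical stand-in for myDF with only the columns the mask needs.
my_df = pd.DataFrame(
    {
        "Month": [1, 1, 12, 12],
        "Day": [3, 3, 3, 3],
        "Hour": [9, 13, 10, 11],
        "GrossPowerMW_Shutdown": [17.804, 18.751, 20.501, 22.222],
    }
)

shutdown_days = [3]  # assumed value; not given in the question

for month in range(1, 13):
    # Label-based lookup: month 1 reads the row labelled 1, and so on.
    start = int(criteria.loc[month, "STARTTIME"])
    end = int(criteria.loc[month, "ENDTIME"])
    shutdown_hours = list(range(start, end))  # end hour is excluded

    mask = (
        (my_df["Month"] == month)
        & my_df["Hour"].isin(shutdown_hours)
        & my_df["Day"].isin(shutdown_days)
    )
    my_df.loc[mask, "GrossPowerMW_Shutdown"] = 0
```

The loop variable is supplied by `range`, so the separate `month = 1` and `month = month + 1` statements are not needed in this version.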