Reputation: 134
I'm learning Python, along with pandas and some other data science tools. While doing the exercises from a book, I wrote the code below in IPython, but I get an error message when the block is executed:
for i in range(len(df1)):
    if (df1['Temperature'][i]-df1['Temperature'][i-1]) > 0.1:
        print(df1['Temperature'][i])
Traceback (most recent call last):
File "<ipython-input-140-9f31dd23b324>", line 2, in <module>
if (df1['Temperature'][i]-df1['Temperature'][i-1]) > 0.1:
File "D:\Programas\Anaconda\lib\site-packages\pandas\core\series.py", line 766, in __getitem__
result = self.index.get_value(self, key)
File "D:\Programas\Anaconda\lib\site-packages\pandas\core\indexes\base.py", line 3103, in get_value
tz=getattr(series.dtype, 'tz', None))
File "pandas\_libs\index.pyx", line 106, in pandas._libs.index.IndexEngine.get_value
File "pandas\_libs\index.pyx", line 114, in pandas._libs.index.IndexEngine.get_value
File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 964, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: -1
Here df1 is a DataFrame and Temperature is one of its columns. The code is intended to compare each pair of consecutive values in that column and print the temperature whenever the difference between them exceeds 0.1. What am I doing wrong?
Upvotes: 0
Views: 65
Reputation: 57105
As a rule, you should not use loops like that in Pandas. Pandas works best when your code is vectorized:
big_difference = (df1["Temperature"] - df1["Temperature"].shift(1)) > 0.1
print(df1[big_difference]["Temperature"])
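As a self-contained sketch of the vectorized approach, the snippet below builds a small made-up DataFrame (the temperature values are illustrative assumptions, not the asker's data) and uses `Series.diff()`, which subtracts the previous row from each row, the same comparison the loop in the question attempts. The first row has no predecessor, so its difference is NaN and it is never selected.

```python
import pandas as pd

# Hypothetical sample data; the real df1 comes from the asker's book exercise.
df1 = pd.DataFrame({"Temperature": [20.0, 20.05, 20.3, 20.2, 20.5]})

# diff() computes row[i] - row[i-1]; the first row becomes NaN.
rise = df1["Temperature"].diff()

# NaN > 0.1 evaluates to False, so the first row is excluded automatically.
big_difference = rise > 0.1

# Select the temperatures that rose by more than 0.1 from the previous row.
result = df1.loc[big_difference, "Temperature"]
print(result)
```

If the direction of the change does not matter, `rise.abs() > 0.1` would select drops as well as rises.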
Upvotes: 1
Reputation: 18218
In the statement below:
if (df1['Temperature'][i]-df1['Temperature'][i-1]) > 0.1:
when i is 0, i-1 evaluates to -1, so df1['Temperature'][i-1] looks up the label -1. No such label exists in the index, which is what the error message KeyError: -1 is telling you.
One way to fix this is to change the range so that i starts from 1. Since the loop already looks at i-1, index 0 is not skipped. You can try:
for i in range(1, len(df1)):
Note: you mentioned comparing consecutive rows, so you may want to use the absolute value of the difference if you do not care whether the temperature is increasing or decreasing.
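Putting the pieces together, a minimal sketch of the corrected loop might look like this. The sample values are made up for illustration, and `.iloc` is used here (an addition, not part of the original suggestion) so that the lookup is explicitly positional and keeps working even if the index is not 0, 1, 2, and so on. The `abs()` call is optional and reflects the note about ignoring direction.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df1.
df1 = pd.DataFrame({"Temperature": [20.0, 20.05, 20.3, 20.1, 20.15]})
temps = df1["Temperature"]

jumps = []
# Start at 1 so that i - 1 is never -1.
for i in range(1, len(temps)):
    # .iloc selects by position; abs() catches both rises and drops.
    if abs(temps.iloc[i] - temps.iloc[i - 1]) > 0.1:
        jumps.append(temps.iloc[i])
        print(temps.iloc[i])
```

Dropping `abs()` restores the original behavior of reporting only increases.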
Upvotes: 1