Reputation: 967
My dataframe can be reproduced like this:
import pandas as pd
link = 'https://raw.githubusercontent.com/timothylombard/RSB/master/RSBdata.csv'
df = pd.read_csv(link)
df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace=True)
The picture of this data frame looks like this-
What I want to do
Then I want to compare and report differences in column values between the change row and prior to the change row. Looking at the picture, I want to compare data indexed on 2018-01-13 with 2017-01-10
So far, I've been able to use .shift to add a new column
df['PriorConnections'] = df['Connections'].shift()
Then add another column with values of difference as in:
df['Connections_Diff'] = df['Connections'] - df['PriorConnections']
I am also able to identify the change row by using-
cr = df.loc[df.Connections_Diff > 0]
df.loc[cr]
How do I find the row before df.loc[cr] ?
Upvotes: 1
Views: 537
Reputation: 1092
Could you check this way?
>> df = pd.DataFrame({'Col1': [10, 20, 10, 15, 15],
'Col2': [13, 23, 18, 33, 48],
'Col3': [17, 27, 22, 37, 52]})
>> series_to_check = df['Col1']
>> [(i, i-1) for i in range(1,len(series_to_check)-1) if series_to_check[i]!=series_to_check[i-1] ]
>> [(1, 0), (2, 1), (3, 2)]
>> # returns a list of tuples [(`changed_row_index`, `previous_row_index`)]
It essentially checks each item in the Series with its previous data.
Upvotes: 2
Reputation: 468
You can create a list with all indexes of the changed_rows.
So, you can do:
list_changed_rows = []
for i in range(1, df.shape[0]):
if df.iloc[i,2] != df.iloc[i-1,2]:
list_changed_rows.append(i)
To analyse them, you can just do:
for i in list_changed_rows:
row_before = df.iloc[[i-1]]
row_changed = df.iloc[[i]]
# code
Upvotes: 1