user4687733


Python - Iterate through a pandas DataFrame with iterrows and conditionally update a datetime variable

I'm a Python novice and was wondering if anyone could help me.

I want to iterate through a datetime column in a pandas DataFrame and, on each iteration, update a variable with the most recent time seen so far. Let's assume this is my data:

    Time
06:12:50
06:13:51
06:13:51
06:13:50
06:14:51
06:14:49

For my result, I want it to look something like this:

RecentTime:
   06:12:50
   06:13:51
   06:13:51
   06:13:51
   06:14:51
   06:14:51

I think the code should look something like this, but I have had trouble with it and can't figure out why. This is my code:

RecentTime = [] # Store list of most recent time for each row
Index: None       # Create empty variable
# Loop through 
for index, row in data.iterrows():
    index = row['Time']   # Save value as index
    if index >= row['Time']: # If time is greater than current row
        index = row['Time']
        RecentTime.append(index) # Append most recent variable into list
    else:
        continue

For some reason, this is my result:

RecentTime
  06:12:50
  06:13:51
  06:13:51
  06:13:50
  06:14:51
  06:14:49

Upvotes: 1

Views: 2425

Answers (1)

Michael

Reputation: 13914

Every time through the loop you overwrite the variable index with the current row's time before checking the inequality, so

if index >= row['Time']:

is always True, and the value you append is always the current row's time. Based on the pattern in your desired result, where no time is earlier than the one in the previous row, I think you're looking for something more like this:

RecentTime = []   # Most recent (running maximum) time for each row
priortime = None  # Largest time seen so far
# Loop through the rows in order
for index, row in data.iterrows():
    currenttime = row['Time']
    if priortime is None:  # First row: nothing to compare against yet
        priortime = currenttime

    if priortime > currenttime:  # If prior time is greater than current row
        currenttime = priortime  # Carry the larger, earlier-seen time forward

    priortime = currenttime
    RecentTime.append(currenttime)
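
The loop above is a running maximum over the Time column. As an alternative sketch (not part of the original answer), pandas' Series.cummax should give the same result without iterrows. It assumes the Time column holds zero-padded HH:MM:SS strings, as in the question's sample data; the small frame below is rebuilt only for illustration.

```python
import pandas as pd

# Sample data copied from the question (assumed to be HH:MM:SS strings).
data = pd.DataFrame(
    {"Time": ["06:12:50", "06:13:51", "06:13:51", "06:13:50", "06:14:51", "06:14:49"]}
)

# Parse the strings so they compare as times rather than as text.
parsed = pd.to_datetime(data["Time"], format="%H:%M:%S")

# cummax() keeps, for each row, the largest value seen so far.
data["RecentTime"] = parsed.cummax().dt.strftime("%H:%M:%S")

print(data)
```

If the column already has a datetime dtype, the to_datetime and strftime steps can be dropped and cummax called on the column directly.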

Finally, the line Index: None does not assign anything. On Python versions before 3.6 it raises SyntaxError: invalid syntax; on Python 3.6 and later it is parsed as a bare variable annotation, so the name Index is never created. To assign a value to a variable, use Index = None. The lower-case name index is already used in the loop for the DataFrame's index value, so even though a capitalized Index would not conflict with it, you should pick a more distinct name.
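
To illustrate the difference between Index: None and Index = None, here is a minimal sketch, assuming Python 3.6 or later (on older versions the first line is a SyntaxError). The flag name was_assigned is only for illustration.

```python
Index: None  # Valid on Python 3.6+, but only an annotation: no value is bound

try:
    Index  # Looking the name up fails because nothing was assigned
    was_assigned = True
except NameError:
    was_assigned = False

print(was_assigned)

Index = None  # A real assignment: the name now exists and holds None
print(Index is None)
```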

Upvotes: 1
