Python: Create Time Series Dummy Variable based on DateTime List

Question

I have two dataframes. my_index contains minute data for further analysis, with timestamps in my_index['TIME'] in the format yyyy-mm-dd hh:mm:ss (100,000 rows in total). The other dataframe, release_plain, contains specific datetimes (same time format) within the timespan of the first one (70 rows). Both datetime columns are stored as strings.

Now I want to match the dates in release_plain against those in my_index and, where there is a match, write a 1 in a new column my_index['Dummy'] for a range of 5 minutes before and after the match (so eleven 1s in total).

What I have so far:

release_plain = pd.read_csv(infile)
my_index = pd.read_csv(index_file)

datetime = release_plain['Date'].astype(str) + ' ' + release_plain['Time'].astype(str)
list_datetime = list(datetime)


for date_of_interest in list_datetime:
    if my_index.loc[my_index['TIME']==date_of_interest]:
        my_index['Dummy'] == 1
    else:
        my_index['Dummy'] == 0

But this returns:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Moreover, as far as I can tell, this would only create a 1 for the specific datetime, not for the range of 5 minutes before and after the event.

vash_the_stampede · Accepted Answer

if my_index.loc[my_index['TIME']==date_of_interest]

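The comparison inside the brackets is not a single True or False: my_index['TIME']==date_of_interest evaluates to a boolean Series, and passing it to .loc selects the matching rows, so the result is a DataFrame. An if statement cannot reduce a DataFrame to one truth value, which is why you get the ValueError. To test whether any row matches, perhaps you meant this:

if (my_index['TIME'] == date_of_interest).any():

Also note that my_index['Dummy'] == 1 in the loop body is a comparison, not an assignment, so it never writes anything to the column.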
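For the full task, including the 5 minutes before and after each event, a loop over the rows is probably not needed. The following is only a minimal sketch under some assumptions: the small sample frames are hypothetical stand-ins for the two CSV files, the column names Date, Time and TIME are taken from the question, and every timestamp in my_index is assumed to fall exactly on a whole minute.

```python
import pandas as pd

# Hypothetical sample data standing in for the two CSV files in the question.
my_index = pd.DataFrame({
    "TIME": pd.date_range("2018-01-01 09:00:00", periods=30, freq="min")
              .strftime("%Y-%m-%d %H:%M:%S")
})
release_plain = pd.DataFrame({"Date": ["2018-01-01"], "Time": ["09:10:00"]})

# Parse the string columns into real timestamps so they can be shifted.
index_times = pd.to_datetime(my_index["TIME"])
release_times = pd.to_datetime(
    release_plain["Date"].astype(str) + " " + release_plain["Time"].astype(str)
)

# Offsets from -5 to +5 minutes: 11 timestamps per release.
offsets = pd.to_timedelta(list(range(-5, 6)), unit="min")
window = [t + off for t in release_times for off in offsets]

# 1 where the row's timestamp lies in any release window, otherwise 0.
my_index["Dummy"] = index_times.isin(window).astype(int)
```

With the sample data above, the rows from 09:05:00 through 09:15:00 are flagged. If the real timestamps are not aligned to whole minutes, an exact isin match would miss them, and a comparison against the interval bounds would be needed instead.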
