pratyada
pratyada

Reputation: 79

Convert time column in pandas from float to actual time value

PROBLEM Statement #1 (EASY) I wanted to convert the time column of my dataframe to actual time value like "12:01:45" hh:mm:ss

Have tried : df_new["time_1"] = pd.to_datetime(df_new.TIME) This has given me a new row - but its showing the date value not the time. :(

then I tried df_new['Time_1'] = pd.to_datetime(df_new['TIME'], format='%H%M').dt.time But output is : ValueError: time data '0' does not match format '%H%M' (match)

expected >> time_1 in hh:mm:ss

sample data :

    PRIMARY_KEY     DATE    TIME        LATITUDE    LONGITUDE   ALTITUDE    DATE_1  time_1
0   b04acedfaa0a    43900   0.499964    43.738003   -79.815340  221.0   2020-03-10  1970-01-01
1   efc27d128dac    43900   0.500140    43.736450   -79.813030  221.0   2020-03-10  1970-01-01
2   19b344a8ef88    43900   0.500293    43.730568   -79.821304  216.0   2020-03-10  1970-01-01
3   0c67c3eefb73    43900   0.500431    43.742660   -79.831710  235.0   2020-03-10  1970-01-01

PROBLEM Statement #2 (COMPLEX & Challenging) After converting to actual time, I want to perform some action.

Input will be a primary key value (say 19b344a8ef88)

Operations, I want to make the pairs of primary keys which satisfies two condition :

a) near 50 mtrs distance (by using LAT & LONG) and

b) At nearly same time, say 10 mins of time difference

Output that I am expecting is in the form of :

Primary Key     Instances
b04acedfaa0a.   5
efc27d128dac.   3

means total 5 times b04acedfaa0a primary key was 50 mtrs near and at the same time with 19b344a8ef88.

Hope I am able to explain myself...........

Upvotes: 1

Views: 529

Answers (2)

Zolzaya Luvsandorj
Zolzaya Luvsandorj

Reputation: 635

For your first question, quick fix is to format your data in csv before you import. Before you load your csv into python, you could try changing 'DATE' column to 'Short date' and 'TIME' column to 'Time' in excel and save the csv (rename it so that you don't overwrite your original file) and then try importing in python?

Probably there's a clever way to do this in python.

For your second question, you may be able to get what you are after if you follow these steps:

  1. Join dataframes using 'Primary key' using pd.merge()
  2. Filter joined dataset to satisfy your extra conditions using df.query() (i.e. near 50 mtrs distance & max 10 mins of time difference)
  3. Get counts of each primary key to get your instances using df['Primary key'].value_counts()

Upvotes: 1

Andrew Gwilliam
Andrew Gwilliam

Reputation: 111

When you give to_datetime() a plain old integer or float value, it will see that value as the number of seconds since "the UNIX epoch", or 1 January 1970. That's why the values in time_1 are all 1970-01-01. You can read more about it here if you're interested.

To solve your problem, it looks like there are two thing you'll need to address.

First, be sure that the value you're storing in TIME is the right time from the Unix epoch or find another way to store the time.

Once that's worked out, you're storing a Timestamp in time_1, so you can get the time part of it by using something like df_new["time_1"].dt.time, which should give you something in the form "hh:mm:ss". You can store those values in another column.

Upvotes: 1

Related Questions