dektorpan

Reputation: 51

Python Pandas combining timestamp columns and fillna in read_csv

I'm reading a csv file with Pandas. The format is:

Date        Time        x1      x2     x3     x4    x5
3/7/2012    11:09:22    13.5    2.3    0.4    7.3   6.4
                        12.6    3.4    9.0    3.0   7.0
                        3.6     4.4    8.0    6.0   5.0
                        10.6    3.5    1.0    3.0   8.0
...
3/7/2012    11:09:23    10.5    23.2   0.3    7.8   4.4
                        11.6    13.4   19.0   13.0  17.0
...

As you can see, not every row has a timestamp. Every row without a timestamp is from the same 1-second interval as the closest row above it that does have a timestamp.

I am trying to do three things:

1. Combine the Date and Time columns into a single timestamp column.
2. Convert that column to units of seconds.
3. Fill the empty cells with the appropriate timestamp.

The desired end result is an array with the timestamp, in seconds, at each row.

I am not sure how to quickly convert the timestamps into units of seconds, other than to do a slow for loop and use the Python built-in time.mktime method.
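
For reference, the row-by-row version I am trying to avoid looks roughly like this (just a sketch, assuming the Date and Time columns are already in a DataFrame df as strings):

import time

# Slow: parse each Date/Time pair one row at a time and convert it to
# seconds since the Unix epoch with the built-in time.mktime.
seconds = []
for date_str, time_str in zip(df['Date'], df['Time']):
    parts = time.strptime('{} {}'.format(date_str, time_str), '%m/%d/%Y %H:%M:%S')
    seconds.append(time.mktime(parts))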

The other problem comes when I fill in the missing timestamp values: the cells in the Date and Time columns which did not have a timestamp each get a "nan" value, and when merged they produce a cell containing the string "nan nan". When I then use the fillna() method, it does not interpret "nan nan" as missing.
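
To illustrate (a quick check, not part of my actual script): pandas only treats real NaN values as missing, so the merged string survives fillna().

import pandas as pd

# The merged cell holds the literal string 'nan nan', which is not NaN.
pd.isnull('nan nan')   # False, so fillna() leaves it alone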

I am using the following code to get the problem result (not including the part of trying to convert to seconds):

import pandas as pd
df = pd.read_csv('file.csv', delimiter=',',
                 parse_dates={'CorrectTime': [0, 1]},
                 usecols=[0, 1, 2, 4, 6],
                 names=['Date', 'Time', 'x1', 'x3', 'x5'])
df.fillna(method='ffill', axis=0, inplace=True)

Thanks for your help.

Upvotes: 2

Views: 1385

Answers (1)

Paul H

Reputation: 68186

Assuming you want seconds since Jan 1, 1900...

import pandas
from io import StringIO
import datetime
data = StringIO("""\
Date,Time,x1,x2,x3,x4,x5
3/7/2012,11:09:22,13.5,2.3,0.4,7.3,6.4
,,12.6,3.4,9.0,3.0,7.0
,,3.6,4.4,8.0,6.0,5.0
,,10.6,3.5,1.0,3.0,8.0
3/7/2012,11:09:23,10.5,23.2,0.3,7.8,4.4
,,11.6,13.4,19.0,13.0,17.0
""")

# Parse the Date column while reading, then forward-fill the blank
# Date/Time cells so every row carries its group's timestamp.
df = pandas.read_csv(data, parse_dates=['Date']).fillna(method='ffill')

def dealwithdates(row):
    # Rebuild the full datetime from the parsed date plus the time string.
    datestring = row['Date'].strftime('%Y-%m-%d')
    dtstring = '{} {}'.format(datestring, row['Time'])
    date = datetime.datetime.strptime(dtstring, '%Y-%m-%d %H:%M:%S')

    # Seconds elapsed since the reference date (Jan 1, 1900).
    refdate = datetime.datetime(1900, 1, 1)
    return (date - refdate).total_seconds()

df['ordinal'] = df.apply(dealwithdates, axis=1)
print(df)

        Date      Time    x1    x2    x3    x4    x5     ordinal
0 2012-03-07  11:09:22  13.5   2.3   0.4   7.3   6.4  3540107362
1 2012-03-07  11:09:22  12.6   3.4   9.0   3.0   7.0  3540107362
2 2012-03-07  11:09:22   3.6   4.4   8.0   6.0   5.0  3540107362
3 2012-03-07  11:09:22  10.6   3.5   1.0   3.0   8.0  3540107362
4 2012-03-07  11:09:23  10.5  23.2   0.3   7.8   4.4  3540107363
5 2012-03-07  11:09:23  11.6  13.4  19.0  13.0  17.0  3540107363
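
If you want to avoid the per-row apply, a vectorized variant along these lines should produce the same ordinal column (a sketch: it reads the CSV without parse_dates so the two columns can be forward-filled and concatenated as strings before parsing):

import pandas
from io import StringIO

data = StringIO("""\
Date,Time,x1,x2,x3,x4,x5
3/7/2012,11:09:22,13.5,2.3,0.4,7.3,6.4
,,12.6,3.4,9.0,3.0,7.0
3/7/2012,11:09:23,10.5,23.2,0.3,7.8,4.4
,,11.6,13.4,19.0,13.0,17.0
""")

# Keep Date and Time as plain strings, forward-fill the blanks, then parse
# the combined column in one vectorized call.
df = pandas.read_csv(data).fillna(method='ffill')
stamps = pandas.to_datetime(df['Date'] + ' ' + df['Time'])
df['ordinal'] = (stamps - pandas.Timestamp('1900-01-01')).dt.total_seconds()
print(df[['Date', 'Time', 'ordinal']])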

Upvotes: 2
