Sergio Espejo
Sergio Espejo

Reputation: 85

How to tell the hours elapsed in a csv file?

I want a column that tells me the time elapsed since he first row (5/1/2002 at 6:00AM), to the last (11/20/2006 at 2:00PM). How can I create an extra column that tells me the hours passed since the beginning of 5/1/2002? Here is my dataframe:

         Date  Time (HHMM)         Site  Plot  Replicate  Temperature  \
1      5/1/2002          600  Barre Woods    21          7          4.5
2      5/1/2002          600  Barre Woods    31          9          6.5
3      5/1/2002          600  Barre Woods    10          2          5.3
4      5/1/2002          600  Barre Woods     2          1          4.0
5      5/1/2002          600  Barre Woods    13          4          5.5
6      5/1/2002          600  Barre Woods    11          3          5.0
7      5/1/2002          600  Barre Woods    28          8          5.0
8      5/1/2002          600  Barre Woods    18          6          4.5
9      5/1/2002         1400  Barre Woods     2          1         10.3
10     5/1/2002         1400  Barre Woods    31          9          9.0
11     5/1/2002         1400  Barre Woods    13          4         11.0
12     5/1/2002         1400  Barre Woods    18          6          6.5
13     5/1/2002         1400  Barre Woods    11          3         10.3
14     5/1/2002         1400  Barre Woods    10          2         10.5
15     5/1/2002         1400  Barre Woods    28          8         10.3
16     5/1/2002         1400  Barre Woods    21          7         10.5
17     5/1/2002         1400  Barre Woods    16          5         10.3
18    5/22/2002          600  Barre Woods    13          4          6.5
19    5/22/2002          600  Barre Woods    18          6          5.8
20    5/22/2002          600  Barre Woods     2          1          5.5
...          ...          ...          ...   ...        ...          ...
710  11/20/2006         1400  Barre Woods    31          9          7.4

Upvotes: 1

Views: 181

Answers (1)

Laurent LAPORTE
Laurent LAPORTE

Reputation: 22952

Simple:

  • read the file,
  • parse the date and time,
  • calculate the delta with the first date/time,
  • write the result.

Here is an implementation using file-like objects for the demo:

import datetime
import io

data = """\
         Date  Time (HHMM)         Site  Plot  Replicate  Temperature
1      5/1/2002          600  Barre Woods    21          7          4.5
2      5/1/2002          600  Barre Woods    31          9          6.5
3      5/1/2002          600  Barre Woods    10          2          5.3
710  11/20/2006         1400  Barre Woods    31          9          7.4
"""

date_fmt = "%m/%d/%Y %H%M"
fisrt_date = None

with io.StringIO(data) as src_file, io.StringIO(data) as dst_file:

    # copy the header
    dst_file.write(next(src_file))

    for line in src_file:
        parts = line.strip().split()
        if not parts:
            dst_file.write(line)
            continue

        timestamp = parts[1] + " " + parts[2]
        curr_date = datetime.datetime.strptime(timestamp, date_fmt)
        if fisrt_date is None:
            fisrt_date = curr_date
        delta = curr_date - fisrt_date
        dst_file.write(line.strip() + "  " + str(delta) + "\n")

    print(dst_file.getvalue())

You get:

       Date  Time (HHMM)         Site  Plot  Replicate  Temperature
1      5/1/2002          600  Barre Woods    21          7          4.5  0:00:00
2      5/1/2002          600  Barre Woods    31          9          6.5  0:00:00
3      5/1/2002          600  Barre Woods    10          2          5.3  0:00:00
710  11/20/2006         1400  Barre Woods    31          9          7.4  1664 days, 8:00:00

Of course, you can change the format of the delta values.

Upvotes: 1

Related Questions