I have a Polars data frame in the following format:
shape: (500_259, 2)
┌───────────┬──────────┐
│ ms_of_day ┆ date │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═══════════╪══════════╡
│ 0 ┆ 20240729 │
│ 300000 ┆ 20240729 │
│ 600000 ┆ 20240729 │
│ 900000 ┆ 20240729 │
│ 1200000 ┆ 20240729 │
│ … ┆ … │
│ 85200000 ┆ 20240729 │
│ 85500000 ┆ 20240729 │
│ 85800000 ┆ 20240729 │
│ 86100000 ┆ 20240729 │
│ 86400000 ┆ 20240729 │
└───────────┴──────────┘
ms_of_day represents the milliseconds since midnight, so 300000 = 12:05 AM, 600000 = 12:10 AM, etc. The date column is in '%Y%m%d' format. The date column is straightforward enough to convert to a datetime column with...
df = df.with_columns(
    new_date=pl.col('date').cast(pl.String).str.to_datetime('%Y%m%d')
)
shape: (500_259, 3)
┌───────────┬──────────┬─────────────────────┐
│ ms_of_day ┆ date ┆ new_date │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ datetime[μs] │
╞═══════════╪══════════╪═════════════════════╡
│ 0 ┆ 20240729 ┆ 2024-07-29 00:00:00 │
│ 300000 ┆ 20240729 ┆ 2024-07-29 00:00:00 │
│ 600000 ┆ 20240729 ┆ 2024-07-29 00:00:00 │
│ 900000 ┆ 20240729 ┆ 2024-07-29 00:00:00 │
│ 1200000 ┆ 20240729 ┆ 2024-07-29 00:00:00 │
│ … ┆ … ┆ … │
│ 85200000 ┆ 20240729 ┆ 2024-07-29 00:00:00 │
│ 85500000 ┆ 20240729 ┆ 2024-07-29 00:00:00 │
│ 85800000 ┆ 20240729 ┆ 2024-07-29 00:00:00 │
│ 86100000 ┆ 20240729 ┆ 2024-07-29 00:00:00 │
│ 86400000 ┆ 20240729 ┆ 2024-07-29 00:00:00 │
└───────────┴──────────┴─────────────────────┘
but I'm struggling with how to incorporate the ms_of_day column into the new_date column so that the time is also represented correctly. For example, the second row should be 2024-07-29 00:05:00, the third row 2024-07-29 00:10:00, etc.