Reputation: 1117
In Polars we can select columns by their type -- see e.g. How to select columns by data type in Polars?
However, I cannot figure out how to select datetime
columns. From Polars' tutorial we have this example:
import polars as pl
import datetime at dt
df_trades = pl.DataFrame(
{
"time": [
dt.datetime(2020, 1, 1, 9, 1, 0),
dt.datetime(2020, 1, 1, 9, 1, 0),
dt.datetime(2020, 1, 1, 9, 3, 0),
dt.datetime(2020, 1, 1, 9, 6, 0),
],
"stock": ["A", "B", "B", "C"],
"trade": [101, 299, 301, 500],
}
)
The column types are printed like this:
┌─────────────────────┬───────┬───────┐
│ time ┆ stock ┆ trade │
│ --- ┆ --- ┆ --- │
│ datetime[μs] ┆ str ┆ i64 │
For the stock
column we can use
df_trades.select(pl.col(pl.String))
But what is the type for the time
column? Using dt.datetime
instead of pl.String
gives an error:
TypeError: argument 'name': 'type' object cannot be converted to 'PyString'
Upvotes: 2
Views: 1243
Reputation: 14670
You can select polars columns by datatypes as you did with pl.String
, but you have to pass a polars
datatype to make that work, not a python datetime.datetime
value/object.
Below we select all datetime datatypes that have "us"
resolution by passing the pl.Datetime("us")
datatype.
import polars as pl
from datetime import datetime
df_trades = pl.DataFrame(
{
"time1": [
datetime(2020, 1, 1, 9, 1, 0),
datetime(2020, 1, 1, 9, 1, 0),
datetime(2020, 1, 1, 9, 3, 0),
datetime(2020, 1, 1, 9, 6, 0),
],
"time2": [
datetime(2020, 1, 1, 9, 1, 0),
datetime(2020, 1, 1, 9, 1, 0),
datetime(2020, 1, 1, 9, 3, 0),
datetime(2020, 1, 1, 9, 6, 0),
],
"stock": ["A", "B", "B", "C"],
"trade": [101, 299, 301, 500],
}
)
df_trades.select(
pl.col(pl.Datetime("us"))
)
shape: (4, 2)
┌─────────────────────┬─────────────────────┐
│ time1 ┆ time2 │
│ --- ┆ --- │
│ datetime[μs] ┆ datetime[μs] │
╞═════════════════════╪═════════════════════╡
│ 2020-01-01 09:01:00 ┆ 2020-01-01 09:01:00 │
│ 2020-01-01 09:01:00 ┆ 2020-01-01 09:01:00 │
│ 2020-01-01 09:03:00 ┆ 2020-01-01 09:03:00 │
│ 2020-01-01 09:06:00 ┆ 2020-01-01 09:06:00 │
└─────────────────────┴─────────────────────┘
Upvotes: 1
Reputation: 24582
Starting from Polars 0.18.1 you can use polars.selectors.datetime
selector to select only datetime columns.
>>> import polars as pl
>>> import polars.selectors as cs
>>> import datetime as dt
>>>
>>> df_trades = pl.DataFrame(
... {
... "time": [
... dt.datetime(2020, 1, 1, 9, 1, 0),
... dt.datetime(2020, 1, 1, 9, 1, 0),
... dt.datetime(2020, 1, 1, 9, 3, 0),
... dt.datetime(2020, 1, 1, 9, 6, 0),
... ],
... "stock": ["A", "B", "B", "C"],
... "trade": [101, 299, 301, 500],
... }
... )
>>>
>>> print(df_trades.select(cs.datetime()))
shape: (4, 1)
┌─────────────────────┐
│ time │
│ --- │
│ datetime[μs] │
╞═════════════════════╡
│ 2020-01-01 09:01:00 │
│ 2020-01-01 09:01:00 │
│ 2020-01-01 09:03:00 │
│ 2020-01-01 09:06:00 │
└─────────────────────┘
Upvotes: 1