Reputation: 10974
I have a pandas.Series of 5-digit integers. The first 3 digits are days from an epoch, and the last 2 are half-hours. I want to split the integer series so that I have two Series, one with the first 3 digits and one with the last 2 digits.
Here is one way to do it, that requires two type conversions:
import numpy.random as npr
import pandas as pd
# high is exclusive, so 100000 covers every 5-digit value
days_hours = pd.Series(npr.randint(low=10000, high=100000, size=1000))
days = days_hours.astype('str').str.slice(start=0, stop=3).astype('int64')
hours = days_hours.astype('str').str.slice(start=3, stop=5).astype('int64')
This is very time-consuming, given that my Series average 25e6 rows each (there are 6 such Series). Is there a way that I can avoid the type conversions?
I tried an alternate solution which involved applying a lambda function to each element of the Series, but that took longer.
Upvotes: 0
Views: 1555
Reputation: 37053
It will be much quicker to do these operations arithmetically using integer division and the modulo operator:
days = days_hours // 100
hours = days_hours % 100
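As a rough check, the sketch below compares the arithmetic split against the string-slicing version from the question on random sample data. The seeded generator and the names days_str and hours_str are only illustrative assumptions, and the comparison relies on every value being a non-negative 5-digit integer, as the question describes.

```python
import numpy as np
import pandas as pd

# Sample data: 1000 random 5-digit integers (seed chosen arbitrarily)
rng = np.random.default_rng(0)
days_hours = pd.Series(rng.integers(low=10_000, high=100_000, size=1_000))

# String-based split from the question (two type conversions per Series)
days_str = days_hours.astype(str).str.slice(0, 3).astype("int64")
hours_str = days_hours.astype(str).str.slice(3, 5).astype("int64")

# Arithmetic split: integer division drops the last two digits,
# modulo keeps only the last two digits
days = days_hours // 100
hours = days_hours % 100

# Both approaches should agree element-wise on 5-digit inputs
assert (days == days_str).all()
assert (hours == hours_str).all()
```

Note that the two approaches only agree when every value has exactly 5 digits. A shorter value such as 999 would still split arithmetically into 9 and 99, whereas slicing its string form would behave differently.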
Upvotes: 4