Reputation: 101
I have a pandas DataFrame with multiple columns containing timestamp values in string format. The values look like '0D_0H_1M_35S_0MS'. I would like to convert all the values in these columns to milliseconds.
Upvotes: 1
Views: 152
Reputation: 862481
First, use Series.replace to rewrite the strings into a format that to_timedelta can parse. Then convert the result to seconds with Series.dt.total_seconds and multiply by 1000 to get milliseconds. Wrap these steps in a custom function and run it on each column with DataFrame.apply:
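import pandas as pd

cols = ['date','date1']

def parse(x):
    # Rewrite e.g. '0D_0H_1M_35S_0MS' as '0 Days 0:1:35.0' so to_timedelta can parse it
    s = x.replace({'D_':' Days ','H_':':','M_':':','S_':'.','MS':''}, regex=True)
    # Timedelta -> seconds (float) -> milliseconds; round before casting to avoid float truncation
    return pd.to_timedelta(s).dt.total_seconds().mul(1000).round().astype(int)

df[cols] = df[cols].apply(parse)
print (df)
     date   date1
0   95000   95000
1  155000  155000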
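To make the string rewriting step easier to follow, here is a small sketch that shows the intermediate strings for a single Series. The sample values and the names `s`, `mapping`, `rewritten` and `ms` are only for illustration. It also shows a caveat worth checking against your data: because the MS part ends up after a decimal point, it is read as a fraction of a second, so the result is only exact when the MS part is 0 or zero-padded to three digits.

```python
import pandas as pd

# Illustrative sample values; the second one has a non-zero, unpadded MS part
s = pd.Series(['0D_0H_1M_35S_0MS', '0D_0H_1M_35S_5MS'])

mapping = {'D_': ' Days ', 'H_': ':', 'M_': ':', 'S_': '.', 'MS': ''}
rewritten = s.replace(mapping, regex=True)
print(rewritten.tolist())  # ['0 Days 0:1:35.0', '0 Days 0:1:35.5']

ms = pd.to_timedelta(rewritten).dt.total_seconds().mul(1000).round().astype('int64')
# '35.5' is parsed as 35.5 seconds, so 5MS becomes 500 ms here
print(ms.tolist())  # [95000, 95500]
```

If your MS values are not zero-padded, the digit-extraction approach is likely the safer choice.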
If the format never changes, you can instead extract the digits with Series.str.extractall, reshape the result with Series.unstack, convert it to integers, multiply by a Series of unit factors, and finally sum the values per row.
This solution was inspired by a since-deleted answer.
import pandas as pd

df = pd.DataFrame({'date':['0D_0H_1M_35S_0MS','0D_0H_2M_35S_0MS'],
                   'date1':['0D_0H_1M_35S_0MS','0D_0H_2M_35S_0MS']})
cols = ['date','date1']

def parse(x):
    # Milliseconds per day, hour, minute, second and millisecond
    a = pd.Series([86400000, 3600000, 60000, 1000, 1])
    # One column per number found; scale each by its unit factor and sum per row
    return x.str.extractall(r'(\d+)')[0].unstack().astype(int).mul(a).sum(axis=1)

df[cols] = df[cols].apply(parse)
print (df)
     date   date1
0   95000   95000
1  155000  155000
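If you would rather fail loudly on malformed values than silently mis-parse them, one possible variation is to validate each string with a regular expression and do the arithmetic in plain Python. This is only a sketch: `to_ms`, `PATTERN` and `FACTORS` are hypothetical names, and it assumes every value follows the exact 'D_H_M_S_MS' layout from the question.

```python
import re
import pandas as pd

# Hypothetical helper names; the pattern assumes the exact layout from the question
PATTERN = re.compile(r'(\d+)D_(\d+)H_(\d+)M_(\d+)S_(\d+)MS')
# Milliseconds per day, hour, minute, second and millisecond
FACTORS = (86_400_000, 3_600_000, 60_000, 1_000, 1)

def to_ms(value):
    """Convert one 'xD_xH_xM_xS_xMS' string to integer milliseconds."""
    match = PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f'unexpected duration format: {value!r}')
    return sum(int(part) * factor for part, factor in zip(match.groups(), FACTORS))

df = pd.DataFrame({'date': ['0D_0H_1M_35S_0MS', '0D_0H_2M_35S_0MS'],
                   'date1': ['0D_0H_1M_35S_0MS', '1D_2H_3M_4S_5MS']})
cols = ['date', 'date1']

# Apply the scalar function element by element in each selected column
df[cols] = df[cols].apply(lambda col: col.map(to_ms))
print(df)
```

This is slower than the vectorized versions on large frames, but it reads the MS part as whole milliseconds regardless of padding and raises on any value that does not match the expected layout.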
Upvotes: 3