Shubham R
Shubham R

Reputation: 7644

Split dataframe column into different columns based on condition

i have a df

id   store_name   open_timings
1      abc        9 TO 1 -4 TO 11
2      bcd        8 TO 11 - 1 TO 3
3      xyz        8 TO 2
4      qwe        9 TO 2-4 TO 7

I want to reformat this df, such that, i create 3 more columns start_time,end_time,closed_time.

i.e

id   store_name   start_time   end_time    closed_time
1      abc         9             11           1-4
2      bcd         8              3           11-1
3      xyz         8              2           None
4      qwe         9              7           2-4

if possible can i convert this columns in 24 hr format,something like this.

id   store_name   start_time   end_time    closed_time
1      abc         9:00          11:00         13:00-16:00 
2      bcd         8:00          15:00         11:00 -13.00
3      xyz         8:00          14:00         nan
4      qwe         9:00          19:00         14:00-16:00 

The second part is optional,

Is there any way to get this?

i have a way to split the column open_timings into 2 by

df['temp1'], df['temp'] = df['open_timings'].str.split('-', 1).str

But cant get past that.

Upvotes: 2

Views: 927

Answers (1)

Bharath M Shetty
Bharath M Shetty

Reputation: 30605

You can do str.split i.e

ndf = df['open_timings'].str.split('TO',expand=True).rename(columns={0:"start_time", 1:'closed_time' , 2:'end_time'})
af = pd.concat([df,ndf],1)
  id store_name      open_timings start_time closed_time end_time
0   1        abc   9 TO 1 -4 TO 11         9        1 -4        11
1   2        bcd  8 TO 11 - 1 TO 3         8      11 - 1         3
2   3        xyz            8 TO 2         8           2     None
3   4        qwe     9 TO 2-4 TO 7         9         2-4         7

To drop the columns you can do af = af.drop(['open_timings'],1)

closed_time has some ambiguity but for start_time and end_time we can use

for i in cols:   
    if i == 'start_time':
        hr = 0
    else:
        hr = 12
    af[i] = pd.to_datetime(af[i], format='%H')
    af[i] = af[i].apply(lambda x : np.where(pd.isnull(x),np.nan,"{}:{}".format(x.hour+hr,x.minute)))
    id store_name start_time closed_time end_time
0   1        abc        9:0       1 -4      23:0
1   2        bcd        8:0     11 - 1      15:0
2   3        xyz        8:0           2      nan
3   4        qwe        9:0        2-4      19:0

Upvotes: 4

Related Questions