IsaacLevon

Reputation: 2580

concatenating a new column and renaming it

I'm trying to generate a new column from an existing one (an hour column derived from the time column). The problem is that after concatenating, the new column has the same name ("time") as the original. Moreover, when I try to rename one of them, the other is renamed as well.

Why is that?

Here's the code:

df['time'] = pd.to_datetime(df['time'])
hour_col = pd.Series(df['time']).copy()
hour_col = hour_col.apply(lambda t: t.hour)
df = pd.concat([df, hour_col], axis=1)

Name change:

df = df.rename(columns={ df.columns[3]: "hour" })

Upvotes: 1

Views: 30

Answers (1)

jezrael

Reputation: 863166

Use dt.hour:

df['time'] = pd.to_datetime(df['time'])
df['hour'] = df['time'].dt.hour

But if you really want to keep your approach and only rename the column, converting to a Series is not necessary, because each column of a DataFrame is already a Series once selected (check with print(type(df['time']))). Rename the Series before concatenating:

df = pd.DataFrame({'time':['10:20:30','20:03:04']})

df['time'] = pd.to_datetime(df['time'])
hour_col = df['time'].rename('hour')
hour_col = hour_col.apply(lambda t: t.hour)
df = pd.concat([df, hour_col], axis=1)
print (df)
                 time  hour
0 2018-11-24 10:20:30    10
1 2018-11-24 20:03:04    20
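
As for why the concatenated column ends up named "time": a Series keeps its name through apply, and pd.concat uses that name as the column label. A minimal sketch of this, assuming made-up sample timestamps (the variable names here are only illustrative):

```python
import pandas as pd

# Sample data is an assumption for illustration only
df = pd.DataFrame(
    {"time": pd.to_datetime(["2018-11-24 10:20:30", "2018-11-24 20:03:04"])}
)

# Selecting a column already returns a Series named after that column
col = df["time"]
is_series = isinstance(col, pd.Series)

# apply keeps the Series name, so concat adds a second column called "time"
hours_unnamed = col.apply(lambda t: t.hour)
dup = pd.concat([df, hours_unnamed], axis=1)

# Renaming the Series first gives the new column its own label
hours_named = col.rename("hour").apply(lambda t: t.hour)
out = pd.concat([df, hours_named], axis=1)

print(list(dup.columns))
print(list(out.columns))
```

So renaming the Series before the concat avoids the duplicate label in the first place.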

If you want to rename a column by position (e.g. because the column names are duplicated and rename changes both of them, as @Gla Avineri mentioned in a comment), use:

df = pd.DataFrame({'time':['10:20:30','20:03:04'],
                   'a':[2,3],
                   'b':[-4,5]})

df['time'] = pd.to_datetime(df['time'])
hour_col = df['time']
hour_col = hour_col.apply(lambda t: t.hour)
df = pd.concat([df, hour_col], axis=1)

# convert to a list, because the columns Index is immutable
cols = df.columns.tolist()
# set the 4th value (position 3)
cols[3] = 'hour'
# assign the modified list back
df.columns = cols

print (df)
                 time  a  b  hour
0 2018-11-24 10:20:30  2 -4    10
1 2018-11-24 20:03:04  3  5    20
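
To see why rename changes both names when the label is duplicated, here is a small sketch, assuming the same sample data as above (the names renamed and fixed are only illustrative). rename maps labels rather than positions, so every column labelled "time" matches:

```python
import pandas as pd

# Sample data is an assumption for illustration only
df = pd.DataFrame({
    "time": pd.to_datetime(["2018-11-24 10:20:30", "2018-11-24 20:03:04"]),
    "a": [2, 3],
    "b": [-4, 5],
})

# The extracted hours are still named "time", so concat duplicates the label
hour_col = df["time"].dt.hour
df = pd.concat([df, hour_col], axis=1)  # columns: time, a, b, time

# df.columns[3] is the label "time", which matches BOTH columns
renamed = df.rename(columns={df.columns[3]: "hour"})

# Replacing the label by position touches only the 4th column
cols = df.columns.tolist()
cols[3] = "hour"
fixed = df.copy()
fixed.columns = cols

print(list(renamed.columns))
print(list(fixed.columns))
```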

Upvotes: 2
