Change float values into integer values and then concatenate in pandas dataframe

Question

I have a dataframe named "sample" with three columns, "birthDay", "birthMonth" and "birthYear", all containing float values.

I want to add a new column, "dateOfBirth", whose entries combine the three values in integer format.

I tried sample["dateOfBirth"] = sample["birthDay"].map(str) + "/" + sample["birthMonth"].map(str) + "/" + sample["birthYear"].map(str), but the results came out as "11.0/3.0/1988.0" and "4.0/20.0/2001.0".

I would appreciate your help.

piRSquared · Accepted Answer

setup

import numpy as np
import pandas as pd

sample = pd.DataFrame([
        [3., 11., 1988.],
        [20., 4., 2001.],
    ], columns=['birthDay', 'birthMonth', 'birthYear'])

option 1
make dateOfBirth a series of Timestamps

# dictionary map to rename to canonical date names
# enables convenient conversion using pd.to_datetime
m = dict(birthDay='Day', birthMonth='Month', birthYear='Year')
sample['dateOfBirth'] = pd.to_datetime(sample.rename(columns=m))

sample
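The rename is only there so that pd.to_datetime sees columns called year, month and day. As a minimal sketch of the same idea without renaming, a dict of the three columns can be passed instead. The layout below is illustrative, and it assumes the float columns hold whole numbers.

```python
import pandas as pd

sample = pd.DataFrame([
        [3., 11., 1988.],
        [20., 4., 2001.],
    ], columns=['birthDay', 'birthMonth', 'birthYear'])

# pd.to_datetime can assemble dates from a mapping keyed by year/month/day
sample['dateOfBirth'] = pd.to_datetime(dict(
    year=sample['birthYear'],
    month=sample['birthMonth'],
    day=sample['birthDay'],
))

# the new column has a datetime dtype, not object/str
print(sample.dtypes)
```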

option 2
If you insist on a string
use the dt accessor with strftime

# dictionary map to rename to canonical date names
# enables convenient conversion using pd.to_datetime
m = dict(birthDay='Day', birthMonth='Month', birthYear='Year')

sample['dateOfBirth'] = pd.to_datetime(sample.rename(columns=m)) \
                          .dt.strftime('%-m/%-d/%Y')

sample
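Note that the '%-m' and '%-d' directives (no zero padding) are a platform extension of strftime and are not available everywhere, Windows being the usual exception. A possible portable sketch, assuming the same month/day/year layout is wanted, builds the string from the dt.month, dt.day and dt.year components instead:

```python
import pandas as pd

sample = pd.DataFrame([
        [3., 11., 1988.],
        [20., 4., 2001.],
    ], columns=['birthDay', 'birthMonth', 'birthYear'])

m = dict(birthDay='Day', birthMonth='Month', birthYear='Year')
dates = pd.to_datetime(sample.rename(columns=m))

# the dt components are integers, so str() gives no padding and no '.0'
sample['dateOfBirth'] = (
    dates.dt.month.astype(str) + '/'
    + dates.dt.day.astype(str) + '/'
    + dates.dt.year.astype(str)
)

print(sample)
```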

option 3
If you really want to reconstruct from the values
using apply

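# bound format method; '0.0f' prints each float with no decimal places
f = '{birthMonth:0.0f}/{birthDay:0.0f}/{birthYear:0.0f}'.format
# unpack each row's column values as keyword arguments to f
sample['dateOfBirth'] = sample.apply(lambda x: f(**x), axis=1)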
sample
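For comparison, here is a sketch of what the question literally asks for: cast the floats to integers, then to strings, and concatenate. This is not one of the options above, just a vectorized alternative to apply. It assumes there are no missing values, since astype(int) raises on NaN.

```python
import pandas as pd

sample = pd.DataFrame([
        [3., 11., 1988.],
        [20., 4., 2001.],
    ], columns=['birthDay', 'birthMonth', 'birthYear'])

# float -> int drops the '.0'; int -> str allows concatenation with '/'
parts = sample[['birthMonth', 'birthDay', 'birthYear']].astype(int).astype(str)

sample['dateOfBirth'] = (
    parts['birthMonth'] + '/' + parts['birthDay'] + '/' + parts['birthYear']
)

print(sample)
```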

nulls
In the event that one or more of the date columns has a missing value:
Options 1 and 2 don't require any changes and are the recommended options anyway.
If you want to construct the string from the floats, use a boolean mask with loc so that only complete rows are assigned.

sample = pd.DataFrame([
        [3., 11., 1988.],
        [20., 4., 2001.],
        [20., np.nan, 2001.],
    ], columns=['birthDay', 'birthMonth', 'birthYear'])

sample

f = '{birthMonth:0.0f}/{birthDay:0.0f}/{birthYear:0.0f}'.format
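# True only for rows where all three date parts are present
mask = sample[['birthDay', 'birthMonth', 'birthYear']].notnull().all(axis=1)
# assign to complete rows only; rows with a missing part stay NaN
sample.loc[mask, 'dateOfBirth'] = sample.apply(lambda x: f(**x), axis=1)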
sample
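The same masking idea can be sketched without apply: select the complete rows, cast them to integers and strings, and let index alignment leave the incomplete rows as NaN. The names cols, complete and joined are illustrative only.

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame([
        [3., 11., 1988.],
        [20., 4., 2001.],
        [20., np.nan, 2001.],
    ], columns=['birthDay', 'birthMonth', 'birthYear'])

cols = ['birthMonth', 'birthDay', 'birthYear']
mask = sample[cols].notna().all(axis=1)

# only complete rows can be cast to int safely
complete = sample.loc[mask, cols].astype(int).astype(str)
joined = complete['birthMonth'] + '/' + complete['birthDay'] + '/' + complete['birthYear']

# assignment aligns on the index; rows missing from joined become NaN
sample['dateOfBirth'] = joined

print(sample)
```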

timing
compared on the given sample and on the given sample times 10,000
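A rough way to reproduce such a comparison is sketched below with the standard timeit module. The helper names option_1 and option_3 are hypothetical wrappers around the code above, the repeat count of 3 is arbitrary, and the numbers will vary by machine and pandas version.

```python
import timeit

import pandas as pd

small = pd.DataFrame([
        [3., 11., 1988.],
        [20., 4., 2001.],
    ], columns=['birthDay', 'birthMonth', 'birthYear'])

# the given sample repeated 10,000 times
big = pd.concat([small] * 10_000, ignore_index=True)

m = dict(birthDay='Day', birthMonth='Month', birthYear='Year')
f = '{birthMonth:0.0f}/{birthDay:0.0f}/{birthYear:0.0f}'.format


def option_1(df):
    # Timestamps via pd.to_datetime
    return pd.to_datetime(df.rename(columns=m))


def option_3(df):
    # strings via row-wise apply
    return df.apply(lambda x: f(**x), axis=1)


for name, df in [('small', small), ('big', big)]:
    for func in (option_1, option_3):
        seconds = timeit.timeit(lambda: func(df), number=3) / 3
        print(f'{name:>5} {func.__name__}: {seconds:.4f} s per call')
```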
