Samovian

Reputation: 215

String Formatting using many pandas columns to create a new one

I would like to create a new column in a pandas DataFrame, just as I would with a Python f-string or the format function. Here is an example:

import pandas as pd

df = pd.DataFrame({"str": ["a", "b", "c", "d", "e"],
                   "int": [1, 2, 3, 4, 5]})

print(df)

  str  int
0   a    1
1   b    2
2   c    3
3   d    4
4   e    5

I would like to obtain:

  str  int concat
0   a    1   a-01
1   b    2   b-02
2   c    3   c-03
3   d    4   d-04
4   e    5   e-05

So something like:

concat = f"{str}-{int:02d}"

but applied directly to the elements of the pandas columns. I imagine the solution uses pandas map, apply, or agg, but nothing I tried worked.

Many thanks for your help.

Upvotes: 12

Views: 7096

Answers (4)

sammywemmy

Reputation: 28699

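You can use pandas' string concatenation method, Series.str.cat, after zero-padding the integers with Series.str.zfill so that values of two or more digits are also handled correctly:

df['concat'] = df['str'].str.cat(df['int'].astype(str).str.zfill(2), sep='-')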

  str  int concat
0   a    1   a-01
1   b    2   b-02
2   c    3   c-03
3   d    4   d-04
4   e    5   e-05

Upvotes: 1

Samovian

Reputation: 215

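I also just discovered that positional indexing works on each row passed to apply (using .iloc, since plain x[0] on a label-indexed row is deprecated in recent pandas versions):

df["concat"] = df.apply(lambda x: f"{x.iloc[0]}-{x.iloc[1]:02d}", axis=1)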

print(df)

  str  int concat
0   a    1   a-01
1   b    2   b-02
2   c    3   c-03
3   d    4   d-04
4   e    5   e-05

It looks very sleek.

Upvotes: 2

Dani Mesejo

Reputation: 61910

You could use a list comprehension to build the concat column:

import pandas as pd

df = pd.DataFrame({"str": ["a", "b", "c", "d", "e"],
                   "int": [1, 2, 3, 4, 5]})

df['concat'] = [f"{s}-{i:02d}" for s, i in df[['str', 'int']].values]

print(df)

Output

  str  int concat
0   a    1   a-01
1   b    2   b-02
2   c    3   c-03
3   d    4   d-04
4   e    5   e-05
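
The same comprehension can be driven by a format template, so that it reads much like the f-string in the question. This is only a minimal sketch: format_rows is a hypothetical helper (not part of pandas), and it assumes every placeholder in the template matches a column name.

```python
import pandas as pd


def format_rows(frame, template):
    """Hypothetical helper: apply a str.format template to each row.

    Placeholders in the template are looked up by column name.
    """
    # to_dict("records") yields one {column: value} dict per row.
    return [template.format(**row) for row in frame.to_dict("records")]


df = pd.DataFrame({"str": ["a", "b", "c", "d", "e"],
                   "int": [1, 2, 3, 4, 5]})

df["concat"] = format_rows(df, "{str}-{int:02d}")
print(df)
```

Building one dict per row is convenient rather than fast, so for large frames the plain comprehension above is probably the better choice.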

Upvotes: 4

jezrael

Reputation: 862801

Use a list comprehension with f-strings:

df['concat'] = [f"{a}-{b:02d}" for a, b in zip(df['str'], df['int'])]

Or it is possible to use apply:

df['concat'] = df.apply(lambda x: f"{x['str']}-{x['int']:02d}", axis=1)

Or use the solution from the comments with Series.str.zfill:

df["concat"] = df["str"] + "-" + df["int"].astype(str).str.zfill(2)

print(df)
  str  int concat
0   a    1   a-01
1   b    2   b-02
2   c    3   c-03
3   d    4   d-04
4   e    5   e-05
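
If a format spec other than plain zero padding is needed, the column-wise expression can be kept by formatting the integer column with Series.map. This is a sketch under the assumption that the column holds integers with no missing values; the sample values 40 and 500 are made up here to show widths beyond one digit.

```python
import pandas as pd

df = pd.DataFrame({"str": ["a", "b", "c", "d", "e"],
                   "int": [1, 2, 3, 40, 500]})  # made-up sample values

# "{:02d}".format is called once per element; any format spec can be used.
df["concat"] = df["str"] + "-" + df["int"].map("{:02d}".format)
print(df)
```

The format spec sets a minimum width only, so values wider than two digits are kept whole rather than truncated.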

Upvotes: 15
