Transforming multiple columns in a data frame at once

Question

I have some data that I'm trying to clean up. That involves modifying some columns, combining other columns into new ones, etc. I am wondering whether there is a succinct way to do this in pandas, or whether each operation needs to be a separate line of code. Here is an example:

import pandas as pd

ex_df = pd.DataFrame(data={"a": [1, 2, 3, 4], "b": ["a-b", "c-d", "e-f", "g-h"]})

Say I want to do three things: create a new column c holding the first letter of each entry in b, transform b by removing the "-", and create another column d that concatenates the first letter of b with the entry in a from the same row. Right now I have to do something like this:

ex_df["b"] = ex_df["b"].map(lambda x: "".join(x.split(sep="-")))
ex_df["c"] = ex_df["b"].map(lambda x: x[0])
ex_df["d"] = ex_df.apply(func=lambda s: s["c"] + str(s["a"]), axis=1)
ex_df
#   a   b   c   d
#0  1   ab  a   a1
#1  2   cd  c   c2
#2  3   ef  e   e3
#3  4   gh  g   g4

Coming from an R data.table background (which would combine all these operations into a single statement), I'm wondering how things are done in pandas.

llllllllll · Accepted Answer

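You can use DataFrame.assign, which adds or replaces several columns in one call and returns a new DataFrame (ex_df itself is left unchanged):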

In [12]: ex_df.assign(
    ...:     b=ex_df.b.str.replace('-', ''),
    ...:     c=ex_df.b.str[0],
    ...:     d=ex_df.b.str[0] + ex_df.a.astype(str)
    ...: )
Out[12]: 
   a   b  c   d
0  1  ab  a  a1
1  2  cd  c  c2
2  3  ef  e  e3
3  4  gh  g  g4
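
If you would rather have later columns build on earlier ones, as in the original three-step version, assign also accepts callables. The following is a minimal sketch, assuming pandas 0.23 or later on Python 3.6+, where the keyword arguments are evaluated in order and each callable receives the frame as updated so far. The name result is only illustrative.

```python
import pandas as pd

ex_df = pd.DataFrame({"a": [1, 2, 3, 4], "b": ["a-b", "c-d", "e-f", "g-h"]})

# Each lambda receives the DataFrame as it stands at that point in the call,
# so "c" sees the already-cleaned "b", and "d" can use the new "c".
result = ex_df.assign(
    b=lambda df: df["b"].str.replace("-", "", regex=False),
    c=lambda df: df["b"].str[0],
    d=lambda df: df["c"] + df["a"].astype(str),
)

print(result)
```

Because assign returns a copy, reassign it (ex_df = ex_df.assign(...)) if you want to keep the result under the original name.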
