Anand Siddharth
Anand Siddharth

Reputation: 977

How to merge 2 series in pandas where nulls

In a Dataframe/table I have data something like this

+----+-------+--------+
| id | name1 | name2  |
+----+-------+--------+
|  0 | John  |        |
|  1 |       | Nathan |
|  2 | Andy  |        |
+----+-------+--------+

I want to merge them to an another column where nulls or empties gets filled either of the name column something like this

+----+-------+--------+--------------+
| id | name1 | name2  | merged_names |
+----+-------+--------+--------------+
|  0 | John  |        | John         |
|  1 |       | Nathan | Nathan       |
|  2 | Andy  |        | Andy         |
+----+-------+--------+--------------+

How can I do this in pandas Dataframe

Upvotes: 2

Views: 47

Answers (2)

jpp
jpp

Reputation: 164713

You can use pd.Series.combine_first:

df = pd.DataFrame({'id': [0, 1, 2],
                   'name1': ['John', np.nan, 'Andy'],
                   'name2': [np.nan, 'Nathan', np.nan]})

df['merged'] = df['name1'].combine_first(df['name2'])

print(df)

   id name1   name2  merged
0   0  John     NaN    John
1   1   NaN  Nathan  Nathan
2   2  Andy     NaN    Andy

If your values are empty strings instead of NaN, you can convert to NaN first via:

df = df.replace('', np.nan)

Upvotes: 2

Joe
Joe

Reputation: 12417

If the empty cells are actually empty strings, you can do so:

df['merged_names'] = df1['name1'] + df2['name2']

If the empty cells are np.nan, you can use replace in this way:

df['merged_names'] = df1['name1'].replace(np.nan, '') + df2['name2'].replace(np.nan, '')

Upvotes: 2

Related Questions