Reputation: 1529
I have a pandas dataframe like this:
index  integer_2_x  integer_2_y
0            49348          NaN
1            26005          NaN
2                5          NaN
3              NaN           26
4            26129          NaN
5              129          NaN
6              NaN           26
7              NaN           17
8            60657          NaN
9            17031          NaN
I want to make a third column that takes the numeric value from whichever of the first two columns is not NaN, so that it looks like the table below. How do I do this?
index  integer_2_z
0            49348
1            26005
2                5
3               26
4            26129
5              129
6               26
7               17
8            60657
9            17031
Upvotes: 0
Views: 133
Reputation: 21888
Maybe you can simply use the fillna function.
import numpy as np
import pandas as pd

# Creating the DataFrame
df = pd.DataFrame({'integer_2_x': [49348, 26005, 5, np.nan, 26129, 129, np.nan, np.nan, 60657, 17031],
                   'integer_2_y': [np.nan, np.nan, np.nan, 26, np.nan, np.nan, 26, 17, np.nan, np.nan]})
# Using fillna to fill a new column
df['integer_2_z'] = df['integer_2_x'].fillna(df['integer_2_y'])
# Printing the result below; you can also drop the x and y columns if they are no longer required
print(df)
   integer_2_x  integer_2_y  integer_2_z
0        49348          NaN        49348
1        26005          NaN        26005
2            5          NaN            5
3          NaN           26           26
4        26129          NaN        26129
5          129          NaN          129
6          NaN           26           26
7          NaN           17           17
8        60657          NaN        60657
9        17031          NaN        17031
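If the x and y columns are no longer needed, the follow-up step might look roughly like the sketch below. It assumes every row has a value in at least one of the two columns, so the cast back to integers is safe. The name `result` is only illustrative and the data is a shortened copy of the question's.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'integer_2_x': [49348, 26005, 5, np.nan, 26129],
    'integer_2_y': [np.nan, np.nan, np.nan, 26, np.nan],
})

# Fill the gaps in x with the values from y
df['integer_2_z'] = df['integer_2_x'].fillna(df['integer_2_y'])

# Drop the source columns and cast the merged column back to integers
# (astype(int) raises if any NaN is left, so this assumes none remain)
result = df.drop(columns=['integer_2_x', 'integer_2_y']).astype({'integer_2_z': int})
print(result)
```

The cast is optional: the NaN values force the original columns to float, so without it the merged column stays float as well.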
Upvotes: 0
Reputation: 1345
I used the combine method described at http://pandas.pydata.org/pandas-docs/stable/basics.html#general-dataframe-combine
import pandas as pd
import numpy as np
df = pd.read_csv("data", sep=r"\s+")  # cut and pasted your data into 'data' file; columns are whitespace-separated
df["integer_2_z"] = df["integer_2_x"].combine(df["integer_2_y"], lambda x, y: np.where(pd.isnull(x), y, x))
Output
   index  integer_2_x  integer_2_y  integer_2_z
0      0        49348          NaN        49348
1      1        26005          NaN        26005
2      2            5          NaN            5
3      3          NaN           26           26
4      4        26129          NaN        26129
5      5          129          NaN          129
6      6          NaN           26           26
7      7          NaN           17           17
8      8        60657          NaN        60657
9      9        17031          NaN        17031
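For this particular case pandas also has `Series.combine_first`, which keeps the caller's values and falls back to the other Series wherever the caller is NaN. It is a different method from the `combine` call above, not a variant of it. A minimal sketch, with a few rows typed in by hand rather than read from the 'data' file:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "integer_2_x": [49348, np.nan, 26129, np.nan],
    "integer_2_y": [np.nan, 26, np.nan, 17],
})

# Keep x where it is present, otherwise take the value from y
df["integer_2_z"] = df["integer_2_x"].combine_first(df["integer_2_y"])
print(df)
```

This avoids calling a Python lambda once per element, which may matter on larger frames.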
Upvotes: 0
Reputation: 24742
One way is to use the update function.
import pandas as pd
import numpy as np
# some artificial data
# ========================
df = pd.DataFrame({'X':[10,20,np.nan,40,np.nan], 'Y':[np.nan,np.nan,30,np.nan,50]})
print(df)
     X    Y
0   10  NaN
1   20  NaN
2  NaN   30
3   40  NaN
4  NaN   50
# processing
# =======================
z = df['X'].copy()
# update works in place: it overwrites z with every non-NaN value from column Y
z.update(df['Y'])
df['Z'] = z
print(df)
     X    Y   Z
0   10  NaN  10
1   20  NaN  20
2  NaN   30  30
3   40  NaN  40
4  NaN   50  50
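One thing worth checking before relying on update: it overwrites the target wherever the other Series has a non-NaN value, not only where the target is missing. In the data above the two columns never overlap, so the result is the same as with fillna. A small sketch with a deliberately overlapping row (made-up values) shows where the two differ:

```python
import numpy as np
import pandas as pd

x = pd.Series([10, np.nan, 30])
y = pd.Series([np.nan, 20, 99])  # the last row has a value in both Series

# fillna only touches the positions where x is NaN
filled = x.fillna(y)

# update overwrites every position where y is not NaN
updated = x.copy()
updated.update(y)

print(filled.tolist())   # [10.0, 20.0, 30.0]
print(updated.tolist())  # [10.0, 20.0, 99.0]
```

If both columns can hold a value in the same row, the choice between the two decides which column wins.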
Upvotes: 1