T3J45

Reputation: 761

Adding a row from a dataframe into another by matching columns with NaN values in row pandas python

The Scenario:

I have two dataframes, fc0 and yc0. fc0 is a cluster, and yc0 is another dataframe that needs to be merged into fc0.

The Nature of data is as follows:

fc0

     uid         1         2         3         4         5         6
234  235  4.000000  4.074464  4.128026  3.973045  3.921663  4.024864
235  236  3.524208  3.125669  3.652112  3.626923  3.524318  3.650589
236  237  4.174080  4.226267  4.200133  4.150983  4.124157  4.200052

yc0

iid  uid    1    2    5    6    9    15
0    944  5.0  3.0  4.0  3.0  3.0  5.0 

The Twist

fc0 has 1682 columns, while yc0 has only a few hundred values. I need the yc0 row to go into fc0, with its values placed under the matching columns.

In my haste to resolve it, I even tried yc0.reset_index(inplace=True), but that didn't really help.

Expected Output

     uid         1         2         3         4         5         6  
234  235  4.000000  4.074464  4.128026  3.973045  3.921663  4.024864   
235  236  3.524208  3.125669  3.652112  3.626923  3.524318  3.650589   
236  237  4.174080  4.226267  4.200133  4.150983  4.124157  4.200052
     944       5.0       3.0       NaN       NaN       4.0       3.0

References

Link1 Tried this, but ended up with NaN values inserted in the first 16 columns and the rest of the data shifted over by that many columns.

Link2 Couldn't match the column keys; besides, I tried it for a row.

Link3 Merging doesn't match the columns in it.

Link4 Concatenation doesn't work that way.

Link5 Same issues with Join.

EDIT 1

fc0.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 235 entries, 234 to 468
Columns: 1683 entries, uid to 1682
dtypes: float64(1682), int64(1)
memory usage: 3.0 MB

and

yc0.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1 entries, 0 to 0
Columns: 336 entries, uid to 1007
dtypes: float64(335), int64(1)
memory usage: 2.7 KB

Upvotes: 1

Views: 1501

Answers (1)

Scott Boston

Reputation: 153460

Here's an MCVE. Does this small sample show the behavior you are expecting?

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.random.randint(0,100,(5,4)), columns=list('ABCE'))

    A   B   C   E
0  81  57  54  88
1  63  63  74  10
2  13  89  88  66
3  90  81   3  31
4  66  93  55   4

df2 = pd.DataFrame(np.random.randint(0,100,(5,4)), columns=list('BCDE'))

    B   C   D   E
0  93  48  62  25
1  24  97  52  88
2  53  50  21  13
3  81  27   7  81
4  10  21  77  19

df_out = pd.concat([df1,df2])
print(df_out)

Output:

      A   B   C     D   E
0  81.0  57  54   NaN  88
1  63.0  63  74   NaN  10
2  13.0  89  88   NaN  66
3  90.0  81   3   NaN  31
4  66.0  93  55   NaN   4
0   NaN  93  48  62.0  25
1   NaN  24  97  52.0  88
2   NaN  53  50  21.0  13
3   NaN  81  27   7.0  81
4   NaN  10  21  77.0  19
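
Applied to frames shaped like the question's fc0 and yc0, the same idea might look like the sketch below. It uses only the handful of values shown in the question and assumes the column labels are integers in both frames. Because this cut-down fc0 has no columns 9 and 15, the row is first aligned to fc0's columns with reindex, which is an extra step not in the example above.

```python
import pandas as pd

# Subset of the question's data; integer column labels are an assumption
fc0 = pd.DataFrame(
    {
        "uid": [235, 236, 237],
        1: [4.000000, 3.524208, 4.174080],
        2: [4.074464, 3.125669, 4.226267],
        3: [4.128026, 3.652112, 4.200133],
        4: [3.973045, 3.626923, 4.150983],
        5: [3.921663, 3.524318, 4.124157],
        6: [4.024864, 3.650589, 4.200052],
    },
    index=[234, 235, 236],
)
yc0 = pd.DataFrame(
    {"uid": [944], 1: [5.0], 2: [3.0], 5: [4.0], 6: [3.0], 9: [3.0], 15: [5.0]}
)

# Align the single row to fc0's columns: columns missing from yc0 become NaN,
# and columns fc0 does not have (9 and 15 in this subset) are dropped.
row = yc0.reindex(columns=fc0.columns)

# Stack the frames; values are matched by column label, not by position
out = pd.concat([fc0, row])
print(out)
```

Note that concat matches on the labels themselves, so a column named '1' (string) in one frame and 1 (integer) in the other would be treated as two different columns and produce NaN on both sides.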

Upvotes: 3
