dany
dany

Reputation: 359

Pandas append DataFrame2 ROW to DataFrame1 ROW

I want to append rows from second DataFrame (df2) to first DataFrame (df1) depending whether in df1 column "isValid" is [T]rue.

I know how to iterate over df1 column and search for True values, but don't know how to easily append rows from second DataFrame. Originally my data have around 1000 lines and 40 columns, so I need to do operations automatically.

import pandas

df1 = pandas.read_csv('df1.csv', sep=';')
df2 = pandas.read_csv('df2.csv', sep=';')

print(df1.to_string(), '\n')
print(df2.to_string(), '\n')

columnSeriesObj = df1.iloc[:, 2]
n = 0
k = 0
for i in columnSeriesObj:
    if i == "T":
        print("True in row number", k)
        # APPEND n ROW from df2 to k ROW from df1
        n += 1
    k += 1

print('\n', df1.to_string())

Here are some test values:

df1.csv

DataA;DataB;isValid
1568;1104;F
1224;1213;F
1676;1246;F
1279;1489;T
1437;1890;T
1705;1007;F
1075;1720;F
1361;1983;F
1966;1751;F
1938;1564;F
1894;1684;F
1189;1803;F
1275;1138;F
1085;1748;T
1337;1775;T
1719;1975;F
1045;1187;F
1426;1757;F
1410;1363;F
1405;1025;F
1699;1873;F
1777;1464;F
1925;1310;T

df2.csv

Nr;X;Y;Z;A ;B;C
1;195;319;18;qwe;hjk;wsx
2;268;284;23;rty;zxc;edc
3;285;277;36;uio;vbn;rfv
4;143;369;34;asd;mlp;tgb
5;290;247;16;fgh;qaz;yhn

I want to df1 after appending look like this (screenshot from Excel):

enter image description here

Thank you for any suggestions! :D

Upvotes: 2

Views: 81

Answers (3)

Shubham Sharma
Shubham Sharma

Reputation: 71689

You can filter the index values in df1 where the column isValid equals T, then update the index of df2 with the filtered index values from df1 finally join it with df1:

m = df1['isValid'].eq('T')
idx = m[m].index[:len(df2)]
df1.join(df2.set_index(idx)).fillna('')

    DataA  DataB isValid Nr    X    Y   Z   A     B    C
0    1568   1104       F                                
1    1224   1213       F                                
2    1676   1246       F                                
3    1279   1489       T  1  195  319  18  qwe  hjk  wsx
4    1437   1890       T  2  268  284  23  rty  zxc  edc
5    1705   1007       F                                
6    1075   1720       F                                
7    1361   1983       F                                
8    1966   1751       F                                
9    1938   1564       F                                
10   1894   1684       F                                
11   1189   1803       F                                
12   1275   1138       F                                
13   1085   1748       T  3  285  277  36  uio  vbn  rfv
14   1337   1775       T  4  143  369  34  asd  mlp  tgb
15   1719   1975       F                                
16   1045   1187       F                                
17   1426   1757       F                                
18   1410   1363       F                                
19   1405   1025       F                                
20   1699   1873       F                                
21   1777   1464       F                                
22   1925   1310       T  5  290  247  16  fgh  qaz  yhn

Upvotes: 1

ValeKnappich
ValeKnappich

Reputation: 202

I suggest the following:

I created some dummy data, similar to yours:

import pandas as pd
import random

df = pd.DataFrame({"a": list(range(20)), "b": [random.choice(("T", "F")) for _ in range(20)]})

df2 = pd.DataFrame({"value1": list(range(5)), "nr": list(range(5))})

First you create a new column in the first dataframe that holds the incrementing ID ("Nr"). To do so, use the count generator from itertools.

from itertools import count

counter = count(start=1)
df["id"] = df.apply(lambda row: next(counter) if row["b"] == "T" else None, axis=1)

After that you can perform a join with the merge method.

df.merge(df2, left_on="id", right_on="nr", how="outer")

Upvotes: 1

raphael
raphael

Reputation: 2970

How about something like this:
(e.g. first find the overlapping index-values and then join the dataframes)

import pandas as pd
import numpy as np

df1 = pd.read_csv("df1.csv", sep=';')
df2 = pd.read_csv(r"df2.csv", sep=';')

# find intersecting indices
useidx = np.intersect1d(df2.index, 
                        df1[df1.isValid == 'T'].index)

# join relevant values
df_joined = df1.join(df2.loc[useidx])

df_joined then looks like this:

>>>      DataA  DataB isValid   Nr      X      Y     Z   A     B    C
>>> 0    1568   1104       F  NaN    NaN    NaN   NaN  NaN  NaN  NaN
>>> 1    1224   1213       F  NaN    NaN    NaN   NaN  NaN  NaN  NaN
>>> 2    1676   1246       F  NaN    NaN    NaN   NaN  NaN  NaN  NaN
>>> 3    1279   1489       T  4.0  143.0  369.0  34.0  asd  mlp  tgb
>>> 4    1437   1890       T  5.0  290.0  247.0  16.0  fgh  qaz  yhn
>>> 5    1705   1007       F  NaN    NaN    NaN   NaN  NaN  NaN  NaN
>>> 6    1075   1720       F  NaN    NaN    NaN   NaN  NaN  NaN  NaN
>>> 7    1361   1983       F  NaN    NaN    NaN   NaN  NaN  NaN  NaN

Upvotes: 0

Related Questions