hat20

Reputation: 13

How do I find all the children of an id in Python?

The input is a CSV file with three columns: fid, fname and parent id. These are the fid and parent id columns from my input file:

fid - Parent ID
id 1 - -1
id 2 - -1
id 3 - 1
id 4 - 8
id 5 - -1
id 6 - 2
id 7 - 2
id 8 - 1
id 9 - 4
id 10 - 13
id 11 - 9
id 12 - 3
id 13 - 5
id 14 - -1
id 15 - 12
id 16 - 12
id 17 - 12

This is what I've written:

import pandas as pd

data = pd.read_csv("data.csv")
df = pd.DataFrame(data)
Col=df.aggregate(lambda x: [x.tolist()], axis=0).map(lambda x:x[0])
fid=Col[0]
fname=Col[1]
pid=Col[2]
imf=[]
temp=[]
for i in range(len(pid)):
    j=0
    while j<len(pid):
        if j!=i:
            if pid[j]==fid[i]:
                temp.append(fid[j])
            if pid[j] in temp:
                temp.append(fid[j])
        j+=1
    if not temp:
        imf.append(None)
    else:
        temp = list(set(temp))
        imf.append(temp)
        temp=[]

Now, the output that I need is this:

id 1 - 3,8,12,15,16,17,4,9,11
id 2 - 6,7
id 3 - 12,15,16,17
id 4 - 9,11
id 5 - 13,10
id 6 - None
id 7 - None
id 8 - 4,9,11
id 9 - 11
id 10 - None
id 11 - None
id 12 - 15,16,17
id 13 - 10
id 14 - None
id 15 - None
id 16 - None
id 17 - None

The output I'm getting:

id 1 - 3,8,12,15,16,17
id 2 - 6,7
id 3 - 12,15,16,17
id 4 - 9,11
id 5 - 13
id 6 - None
id 7 - None
id 8 - 4,9,11
id 9 - 11
id 10 - None
id 11 - None
id 12 - 15,16,17
id 13 - 10
id 14 - None
id 15 - None
id 16 - None
id 17 - None

As you can see, ids 1 and 5 are not getting their complete values. I would like to know what changes I should make to the code to get the output I require. Alternative methods are also welcome.

Upvotes: 1

Views: 200

Answers (1)

orlp

Reputation: 117711

If this is your dataframe:

>>> df
    fid  parent
0     1      -1
1     2      -1
2     3       1
3     4       8
4     5      -1
5     6       2
6     7       2
7     8       1
8     9       4
9    10      13
10   11       9
11   12       3
12   13       5
13   14      -1
14   15      12
15   16      12
16   17      12

I believe this does what you want:

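import networkx as nx
import pandas as pd

# Build a directed graph with one parent -> child edge per row.
G = nx.from_pandas_edgelist(df, source="parent", target="fid", create_using=nx.DiGraph)
# nx.descendants returns every node reachable from fid; an empty set becomes None.
descendants = [(fid, sorted(nx.descendants(G, fid)) or None)
               for fid in df["fid"]]
result = pd.DataFrame(descendants, columns=["fid", "descendants"])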

The result:

>>> result
    fid                       descendants
0     1  [3, 4, 8, 9, 11, 12, 15, 16, 17]
1     2                            [6, 7]
2     3                  [12, 15, 16, 17]
3     4                           [9, 11]
4     5                          [10, 13]
5     6                              None
6     7                              None
7     8                        [4, 9, 11]
8     9                              [11]
9    10                              None
10   11                              None
11   12                      [15, 16, 17]
12   13                              [10]
13   14                              None
14   15                              None
15   16                              None
16   17                              None
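
As for why the loop in the question falls short: it scans the rows only once per id, so a row is picked up only if its parent is already in temp when the scan reaches it. For id 1, row 4 (parent 8) comes before row 8, so 4 is skipped, and 9 and 11 are skipped with it; the same happens to 10 under id 5. If you would rather not depend on networkx, here is a minimal sketch of the same idea in plain Python. It assumes the data forms a tree (no cycles); the names rows, children and all_descendants are my own, not from the question.

```python
from collections import defaultdict

# (fid, parent) pairs copied from the question; -1 marks a root.
rows = [(1, -1), (2, -1), (3, 1), (4, 8), (5, -1), (6, 2), (7, 2), (8, 1),
        (9, 4), (10, 13), (11, 9), (12, 3), (13, 5), (14, -1),
        (15, 12), (16, 12), (17, 12)]

# Map each parent id to the list of its direct children.
children = defaultdict(list)
for fid, parent in rows:
    children[parent].append(fid)


def all_descendants(fid):
    """Return the sorted descendants of fid, or None if it has none."""
    found = []
    stack = list(children.get(fid, []))  # start from the direct children
    while stack:
        node = stack.pop()
        found.append(node)
        # Queue this node's own children, so row order no longer matters.
        stack.extend(children.get(node, []))
    return sorted(found) or None


result = {fid: all_descendants(fid) for fid, _ in rows}
print(result[1])  # [3, 4, 8, 9, 11, 12, 15, 16, 17]
print(result[5])  # [10, 13]
```

Because every child is looked up through the children map rather than by position in the file, the order of the rows in the CSV does not affect the result.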

Upvotes: 3
