Reputation: 13
My input is a CSV file with three columns: fid, fname and parent id. These are the fid and parent id columns from the input file:
fid - Parent ID
id 1 - -1
id 2 - -1
id 3 - 1
id 4 - 8
id 5 - -1
id 6 - 2
id 7 - 2
id 8 - 1
id 9 - 4
id 10 - 13
id 11 - 9
id 12 - 3
id 13 - 5
id 14 - -1
id 15 - 12
id 16 - 12
id 17 - 12
This is what I've written:
    data = pd.read_csv("data.csv")
    df = pd.DataFrame(data)
    Col = df.aggregate(lambda x: [x.tolist()], axis=0).map(lambda x: x[0])
    fid = Col[0]
    fname = Col[1]
    pid = Col[2]
    imf = []
    temp = []
    for i in range(len(pid)):
        j = 0
        while j < len(pid):
            if j != i:
                if pid[j] == fid[i]:
                    temp.append(fid[j])
                if pid[j] in temp:
                    temp.append(fid[j])
            j += 1
        if not temp:
            imf.append(None)
        else:
            temp = list(set(temp))
            imf.append(temp)
        temp = []
Now, the output that I need is this:
id 1 - 3,8,12,15,16,17,4,9,11
id 2 - 6,7
id 3 - 12,15,16,17
id 4 - 9,11
id 5 - 13,10
id 6 - None
id 7 - None
id 8 - 4,9,11
id 9 - 11
id 10 - None
id 11 - None
id 12 - 15,16,17
id 13 - 10
id 14 - None
id 15 - None
id 16 - None
id 17 - None
The output I'm getting:
id 1 - 3,8,12,15,16,17
id 2 - 6,7
id 3 - 12,15,16,17
id 4 - 9,11
id 5 - 13
id 6 - None
id 7 - None
id 8 - 4,9,11
id 9 - 11
id 10 - None
id 11 - None
id 12 - 15,16,17
id 13 - 10
id 14 - None
id 15 - None
id 16 - None
id 17 - None
As you can see, ids 1 and 5 are missing some of their descendants. What should I change in the code to get the output I need? Alternative approaches are also welcome.
Upvotes: 1
Views: 200
Reputation: 117711
If this is your dataframe:
>>> df
    fid  parent
0     1      -1
1     2      -1
2     3       1
3     4       8
4     5      -1
5     6       2
6     7       2
7     8       1
8     9       4
9    10      13
10   11       9
11   12       3
12   13       5
13   14      -1
14   15      12
15   16      12
16   17      12
I believe this does what you want:
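    import networkx as nx
    import pandas as pd

    # Build a directed graph with one edge per row, pointing from parent to child.
    G = nx.from_pandas_edgelist(df, source="parent", target="fid", create_using=nx.DiGraph)
    # nx.descendants returns every node reachable from fid; an empty set becomes None.
    descendants = [(fid, sorted(nx.descendants(G, fid)) or None)
                   for fid in df["fid"]]
    result = pd.DataFrame(descendants, columns=["fid", "descendants"])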
The result:
>>> result
    fid                       descendants
0     1  [3, 4, 8, 9, 11, 12, 15, 16, 17]
1     2                            [6, 7]
2     3                  [12, 15, 16, 17]
3     4                           [9, 11]
4     5                          [10, 13]
5     6                              None
6     7                              None
7     8                        [4, 9, 11]
8     9                              [11]
9    10                              None
10   11                              None
11   12                      [15, 16, 17]
12   13                              [10]
13   14                              None
14   15                              None
15   16                              None
16   17                              None
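As for why the original loop falls short: it scans the rows only once per id, so a descendant is missed whenever its parent shows up later in the file. For id 1, row 4 (parent 8) is checked before 8 has been added to temp, so 4, 9 and 11 never get picked up. If you would rather not depend on networkx, here is a minimal plain-Python sketch of the same idea. The function name all_descendants and the hard-coded lists are only illustrative, and it assumes the parent links contain no cycles.

```python
from collections import defaultdict


def all_descendants(fids, parents):
    """Map each fid to a sorted list of all its descendants, or None if it has none.

    Assumption: the parent links form a forest (no cycles); a cycle would loop forever.
    """
    # One pass to build a parent -> direct children lookup.
    children = defaultdict(list)
    for fid, parent in zip(fids, parents):
        children[parent].append(fid)

    result = {}
    for fid in fids:
        found = []
        # Iterative depth-first walk, so row order in the file no longer matters.
        stack = list(children.get(fid, []))
        while stack:
            node = stack.pop()
            found.append(node)
            stack.extend(children.get(node, []))
        result[fid] = sorted(found) or None
    return result


# The fid / parent id columns from the question, hard-coded for illustration.
fids = list(range(1, 18))
parents = [-1, -1, 1, 8, -1, 2, 2, 1, 4, 13, 9, 3, 5, -1, 12, 12, 12]

descendants = all_descendants(fids, parents)
print(descendants[1])  # [3, 4, 8, 9, 11, 12, 15, 16, 17]
print(descendants[5])  # [10, 13]
```

With the dataframe above you could call it as all_descendants(df["fid"].tolist(), df["parent"].tolist()).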
Upvotes: 3