Athithan
Athithan

Reputation: 13

all_simple_paths in networkx between two nodes is running long

I need to find the list of all possible path between two jobs 'Job_292' and 'Job_063'. I have a file which has the list of jobs in column1 and its successor in column2. The code is running for longer time. I guess it is going in some cycle, because even for around 15 million path the job took very lesser time.

Cycle in flow

    import networkx as nx

    G2 = nx.read_adjlist("input_ext.txt",create_using=nx.DiGraph())

    for path in nx.all_simple_paths(G2, source='Job_292', target='Job_063'):
        print(path)

Input file location https://jumpshare.com/v/hAHCbwZb8HTopRcfWsoh

Could please someone help in figuring out the issue or some alternate solution.

Thanks in advance.

Upvotes: 0

Views: 2897

Answers (2)

Adeeb Maharmah
Adeeb Maharmah

Reputation: 21

Add cutoff to your code to stop the search after n of paths found

source link

cutoff (integer, optional) – Depth to stop the search. Only paths of length <= cutoff are returned.

also separate for loop from getting as every loop python will recalculate the simple paths

for path in nx.all_simple_paths(G2, source='Job_292', target='Job_063'):
        print(path)

to be

paths = nx.all_simple_paths(G2, source='Job_292', target='Job_063')
for path in paths:
        print(path)

Upvotes: 0

Reblochon Masque
Reblochon Masque

Reputation: 36662

Cleaning the data to remove the underscore in the first column of node names works and produces a large output of paths.

Clean up phase:

re-create the data file with the underscore removed from the nodes in the first column.

raw = []
with open("cyclic_input_ext_adj.txt", encoding="utf-8") as graphfile:
    raw = graphfile.readlines()

raw = [line.strip() for line in raw]
graphdata = [line.split(' ') for line in raw]
graphdict = {g[0][:3]+g[0][4:]: g[1:] for g in graphdata}

with open("cyclic_input_ext_adj_clean.txt", 'w') as graphfile:
    for k, v in graphdict.items():
        line = k + ' ' + ' '.join(val for val in v) + '\n'
        graphfile.write(line)

the file now looks like this:

Job001 Job281
Job002 Job121 Job117 Job132
Job003 Job108 Job114
Job004 Job218
Job005 Job254
Job006 Job009 Job008 Job029 Job007 Job239 Job031
Job007 Job031 Job239
Job008 Job031 Job029 Job209
Job009 Job300
Job010 Job013
Job011 Job013
...

Search Paths phase:

create the graph, call nx.all_simple_paths with the corrected start and end nodes, on the corrected graph. Output the resulting paths in a file.

import networkx as nx

G2 = nx.read_adjlist("cyclic_input_ext_adj_clean.txt", create_using = nx.DiGraph())

with open("paths_Job292_to_Job063.txt", 'w') as result_paths:
    for path in nx.all_simple_paths(G2, source='Job292', target='Job063'):
        line = ' '.join([p for p in path]) + '\n'
        result_paths.write(line)

Output to file:

(the 10 first paths found)

    Job292 Job021 Job022 Job023 Job024 Job019 Job018 Job014 Job017 Job015 Job020 Job016 Job207 Job210 Job212 Job220 Job215 Job235 Job234 Job011 Job013 Job030 Job006 Job007 Job239 Job242 Job240 Job243 Job244 Job247 Job248 Job246 Job249 Job252 Job025 Job277 Job027 Job287 Job289 Job275 Job278 Job204 Job183 Job184 Job192 Job150 Job122 Job120 Job145 Job146 Job127 Job060 Job044 Job048 Job068 Job069 Job063
    Job292 Job021 Job022 Job023 Job024 Job019 Job018 Job014 Job017 Job015 Job020 Job016 Job207 Job210 Job212 Job220 Job215 Job235 Job234 Job011 Job013 Job030 Job006 Job007 Job239 Job242 Job240 Job243 Job244 Job247 Job248 Job246 Job249 Job252 Job025 Job277 Job027 Job287 Job289 Job275 Job278 Job204 Job183 Job184 Job192 Job150 Job122 Job120 Job145 Job146 Job127 Job392 Job393 Job395 Job396 Job404 Job400 Job070 Job044 Job048 Job068 Job069 Job063
    Job292 Job021 Job022 Job023 Job024 Job019 Job018 Job014 Job017 Job015 Job020 Job016 Job207 Job210 Job212 Job220 Job215 Job235 Job234 Job011 Job013 Job030 Job006 Job007 Job239 Job242 Job240 Job243 Job244 Job247 Job248 Job246 Job249 Job252 Job025 Job277 Job027 Job287 Job289 Job275 Job278 Job204 Job183 Job184 Job192 Job150 Job122 Job120 Job145 Job146 Job127 Job392 Job393 Job395 Job396 Job404 Job400 Job409 Job066 Job044 Job048 Job068 Job069 Job063
    Job292 Job021 Job022 Job023 Job024 Job019 Job018 Job014 Job017 Job015 Job020 Job016 Job207 Job210 Job212 Job220 Job215 Job235 Job234 Job011 Job013 Job030 Job006 Job007 Job239 Job242 Job240 Job243 Job244 Job247 Job248 Job246 Job249 Job252 Job025 Job277 Job027 Job287 Job289 Job275 Job278 Job204 Job183 Job184 Job192 Job150 Job122 Job120 Job145 Job146 Job127 Job392 Job393 Job395 Job396 Job404 Job400 Job405 Job401 Job071 Job044 Job048 Job068 Job069 Job063
    Job292 Job021 Job022 Job023 Job024 Job019 Job018 Job014 Job017 Job015 Job020 Job016 Job207 Job210 Job212 Job220 Job215 Job235 Job234 Job011 Job013 Job030 Job006 Job007 Job239 Job242 Job240 Job243 Job244 Job247 Job248 Job246 Job249 Job252 Job025 Job277 Job027 Job287 Job289 Job275 Job278 Job204 Job183 Job184 Job192 Job150 Job122 Job120 Job145 Job146 Job127 Job392 Job393 Job395 Job396 Job404 Job399 Job409 Job066 Job044 Job048 Job068 Job069 Job063
    Job292 Job021 Job022 Job023 Job024 Job019 Job018 Job014 Job017 Job015 Job020 Job016 Job207 Job210 Job212 Job220 Job215 Job235 Job234 Job011 Job013 Job030 Job006 Job007 Job239 Job242 Job240 Job243 Job244 Job247 Job248 Job246 Job249 Job252 Job025 Job277 Job027 Job287 Job289 Job275 Job278 Job204 Job183 Job184 Job192 Job150 Job122 Job120 Job145 Job146 Job127 Job392 Job393 Job395 Job396 Job404 Job399 Job065 Job044 Job048 Job068 Job069 Job063
    Job292 Job021 Job022 Job023 Job024 Job019 Job018 Job014 Job017 Job015 Job020 Job016 Job207 Job210 Job212 Job220 Job215 Job235 Job234 Job011 Job013 Job030 Job006 Job007 Job239 Job242 Job240 Job243 Job244 Job247 Job248 Job246 Job249 Job252 Job025 Job277 Job027 Job287 Job289 Job275 Job278 Job204 Job183 Job184 Job192 Job150 Job122 Job120 Job145 Job146 Job127 Job392 Job393 Job395 Job396 Job404 Job409 Job066 Job044 Job048 Job068 Job069 Job063
    Job292 Job021 Job022 Job023 Job024 Job019 Job018 Job014 Job017 Job015 Job020 Job016 Job207 Job210 Job212 Job220 Job215 Job235 Job234 Job011 Job013 Job030 Job006 Job007 Job239 Job242 Job240 Job243 Job244 Job247 Job248 Job246 Job249 Job252 Job025 Job277 Job027 Job287 Job289 Job275 Job278 Job204 Job183 Job184 Job192 Job150 Job122 Job120 Job145 Job146 Job127 Job392 Job393 Job395 Job396 Job404 Job402 Job406 Job403 Job062 Job044 Job048 Job068 Job069 Job063
    Job292 Job021 Job022 Job023 Job024 Job019 Job018 Job014 Job017 Job015 Job020 Job016 Job207 Job210 Job212 Job220 Job215 Job235 Job234 Job011 Job013 Job030 Job006 Job007 Job239 Job242 Job240 Job243 Job244 Job247 Job248 Job246 Job249 Job252 Job025 Job277 Job027 Job287 Job289 Job275 Job278 Job204 Job183 Job184 Job192 Job150 Job122 Job120 Job145 Job146 Job127 Job392 Job393 Job395 Job396 Job404 Job402 Job061 Job044 Job048 Job068 Job069 Job063
    Job292 Job021 Job022 Job023 Job024 Job019 Job018 Job014 Job017 Job015 Job020 Job016 Job207 Job210 Job212 Job220 Job215 Job235 Job234 Job011 Job013 Job030 Job006 Job007 Job239 Job242 Job240 Job243 Job244 Job247 Job248 Job246 Job249 Job252 Job025 Job277 Job027 Job287 Job289 Job275 Job278 Job204 Job183 Job184 Job192 Job150 Job122 Job120 Job145 Job146 Job127 Job392 Job393 Job395 Job396 Job404 Job402 Job409 Job066 Job044 Job048 Job068 Job069 Job063
...

I killed the process at 2992 paths found; you can find the first 1000 partial results in this pastebin.

Upvotes: 1

Related Questions