jrow

Reputation: 101

Argo Workflow to continue processing during fan-out

General question here: I'm wondering if anyone has ideas or experience with something I'm trying to achieve right now. I'm not entirely sure it's even possible in Argo Workflows...

I'm wondering if it's possible for a workflow to keep making progress before a dynamic fan-out has fully finished. By dynamic fan-out I mean that B1/B2/B3 could potentially grow to B30.

I want C1 to be able to start as soon as B1 has finished, even while B2/B3 are still processing. The B stage creates a small file, and in the C stage I need to make an API request to report that it has finished and upload said file.

And finally, D1 would have to wait for all of C1/C2/C3 through C# to finish before it completes.

Diagram of what I'm trying to achieve:

#           *
#           |
#          A1 (generates a dynamic list that can change depending on the inputs)
#           |
#        /  |  \
#       B1  B2  B3 +++ B#
#       |   |   |
#       C1  C2  C3 +++ C#
#        \  |  /
#         \ | /
#           D1

I was looking at https://github.com/argoproj/argo-workflows/blob/master/docs/enhanced-depends-logic.md but I can't wrap my head around whether it's what I need here, especially since the fan-out steps are dynamic.

It seems to me that it would bind the C stage to the entirety of the B stage and require all of B to finish first.
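
Roughly what I was picturing with a flat DAG (just a sketch, template names are mine), where my concern is that C's depends clause points at the whole fanned-out B group:

    - name: main
      dag:
        tasks:
          - name: A
            template: A
          - name: B
            template: B
            depends: "A"
            arguments:
              parameters:
                - name: item
                  value: "{{item}}"
            withParam: "{{tasks.A.outputs.parameters.items}}"
          - name: C
            template: C
            depends: "B"    # every C here waits for ALL expanded B instances, not just one
            arguments:
              parameters:
                - name: item
                  value: "{{item}}"
            withParam: "{{tasks.A.outputs.parameters.items}}"
          - name: D
            template: D
            depends: "C"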

Upvotes: 1

Views: 1184

Answers (1)

crenshaw-dev

Reputation: 8392

Something like this should work:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: fanout-   # illustrative name
spec:
  entrypoint: main
  templates:
    - name: main
      steps:
        - - name: A
            template: A
        - - name: B_C
            template: B_C
            arguments:
              parameters:
                - name: item
                  value: "{{item}}"
            withParam: "{{steps.A.outputs.parameters.items}}"  # one B_C instance per list element
        - - name: D   # starts only after every B_C instance has finished
            template: D
    - name: A
      # container or script spec here
      outputs:
        parameters:
          - name: items
            valueFrom:
              path: /tmp/items.json
    - name: B_C
      inputs:
        parameters:
          - name: item
      steps:
        - - name: B
            template: B
            arguments:
              parameters:
                - name: item
                  value: "{{inputs.parameters.item}}"
        - - name: C
            template: C
            arguments:
              artifacts:
                - name: file
                  from: "{{steps.B.outputs.artifacts.file}}"
    - name: B
      inputs:
        parameters:
          - name: item
      # container or script spec here
      outputs:
        artifacts:
          - name: file
            path: /tmp/file
    - name: C
      inputs:
        artifacts:
          - name: file
            path: /tmp/file   # where the incoming artifact is placed (path is illustrative)
      # container or script spec here
    - name: D
      # container or script spec here

Step B_C in the main template runs instances of the B_C template in parallel.
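
For instance, if the A step writes a JSON list like the following to /tmp/items.json (contents purely illustrative), withParam expands the B_C step into one parallel instance per element, each receiving that element as {{item}}:

    ["item-1", "item-2", "item-3"]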

Template B_C runs B and C in series. Once template B_C starts, it runs as quickly as possible, completely unaware of any concurrent executions of the B_C template. So C1 blocks only on B1, never on B2 or B3 or any other B#.
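
To make the B-to-C handoff concrete, here is one way the two placeholder specs could be filled in; the images, commands, upload URL, and the /tmp/file path are illustrative, not part of the original answer:

    - name: B
      inputs:
        parameters:
          - name: item
      container:
        image: alpine:3.19
        command: [sh, -c]
        args: ["echo 'payload for {{inputs.parameters.item}}' > /tmp/file"]
      outputs:
        artifacts:
          - name: file
            path: /tmp/file          # B exports this file as the "file" artifact
    - name: C
      inputs:
        artifacts:
          - name: file
            path: /tmp/file          # the artifact from the matching B lands here
      container:
        image: curlimages/curl:8.7.1
        command: [sh, -c]
        args: ["curl -fsS -X POST --data-binary @/tmp/file https://example.com/upload"]

Any container or script that produces /tmp/file for B, and any uploader for C, slots in the same way.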

Once all instances of B_C are finished, the main template finally invokes the D template.

Upvotes: 3
