Reputation: 329
I need to manipulate some text files that are not going to be published as artifacts. They will be consumed by other tasks later on in the pipeline stage.
With best practice in mind, what directory should I be using: `Build.ArtifactStagingDirectory`, `Pipeline.Workspace`, `Agent.TempDirectory`, or is there another?
I've been using `Pipeline.Workspace`, but I'm not sure this is best practice.
Upvotes: -1
Views: 153
Reputation: 1712
The build agent directory structure is:

- `/work`: work directory
  - `/1`: build directory (pipeline workspace)
    - `/s`: sources / working directory
    - `/b`: binaries directory
    - `/a`: artifact staging directory
    - `/TestResults`: test results directory
`/s` is where your version control software clones your repository. From your pipeline configuration, you can control whether this directory is wiped or merely cleaned between runs, and whether sources are downloaded at all.
`/a` is where you are suggested to store the artifacts produced by your pipeline job. This directory is wiped between build runs.
`/b` is where you are suggested to store your binaries. This directory is not wiped between build runs: the purpose is to keep your compiled assemblies there so subsequent runs only rebuild the assemblies whose sources changed since the last build.
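Assuming a standard agent layout, the mapping between these folders and the predefined pipeline variables can be shown with a small diagnostic step (a sketch; the exact paths vary per agent):

```yaml
steps:
  - bash: |
      # These predefined variables resolve to the directories described above.
      echo "Pipeline.Workspace:             $(Pipeline.Workspace)"             # .../work/1
      echo "Build.SourcesDirectory:         $(Build.SourcesDirectory)"         # .../work/1/s
      echo "Build.BinariesDirectory:        $(Build.BinariesDirectory)"        # .../work/1/b
      echo "Build.ArtifactStagingDirectory: $(Build.ArtifactStagingDirectory)" # .../work/1/a
      echo "Agent.TempDirectory:            $(Agent.TempDirectory)"
    displayName: Show well-known directories
```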
These are not meant to be used by your pipeline:
`Pipeline.Workspace` is the directory above the others (`/work/1` in the example above). It is recommended not to store files there, because you have no control over how this directory is cleaned across the lifetime of the agent, there is a risk of collision with the child directories, and some system operators like to restrict write permissions on it.
`Agent.ToolsDirectory` is the place where tasks like NuGetToolInstaller install software that is meant to have multiple versions installed side by side. The task then adds the correct directory to the PATH so subsequent tasks use the expected version of the tool. This directory is shared among pipeline definitions, so if two different pipelines need the same tool, version, and platform, it is downloaded only once for the whole agent. Using it is possible, but you must conform to the Azure Pipelines tools policy; it is intended for tool developers who want to provide installation tasks or templates.
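As an illustration of a task that populates `Agent.ToolsDirectory`, a typical tool-installer step looks like this (the version spec is an arbitrary example):

```yaml
steps:
  - task: NuGetToolInstaller@1
    inputs:
      # Installs under Agent.ToolsDirectory (once per agent for this
      # tool/version/platform), then prepends the directory to PATH.
      versionSpec: '6.x'
```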
`Agent.HomeDirectory` is the home directory of the whole agent. This is where the agent's own files are stored, including the secrets used to authenticate with the Azure DevOps server, the agent updater, the currently running binaries, and so on. **You should never write to that directory** (writing should be disallowed): it could break the agent.
`Agent.TempDirectory` is kind of an in-between case. This directory is at the same level as the work directory and is thus shared between different pipeline definitions, although it is cleaned after each job execution.
It was designed to give tasks a temporary place to work. For instance, a PowerShell task with inline code stores a temporary .ps1 file there, which is then passed to powershell.exe. Another example: when uploading `/a` as an artifact, the task honors a `.artifactignore` file; based on it, files are moved from `/a` to the temporary directory before actually being uploaded to the server.
It was also designed for you to use in scenarios where you need to pass an argument to a CLI command as a file. For instance, if you have a secret variable containing an X.509 certificate and a CLI command that wants it as a file, you can write the variable to a temporary file in `Agent.TempDirectory` and pass the filename to the CLI.
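That pattern can be sketched like this (the variable name `clientCertificate` and the `some-cli` command are placeholders, not real names from the question):

```yaml
steps:
  - bash: |
      # Write the secret to a file in the job-scoped temp directory.
      # Agent.TempDirectory is cleaned after the job, so the secret does not linger.
      CERT_FILE="$(Agent.TempDirectory)/client-cert.pem"
      printf '%s' "$CERT_PEM" > "$CERT_FILE"
      some-cli --cert "$CERT_FILE"
    env:
      CERT_PEM: $(clientCertificate)  # secret variables must be mapped into env explicitly
    displayName: Pass a secret to a CLI as a file
```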
Passing files across jobs is discouraged.
There are broadly two cases, with different solutions:
AZDO Pipelines is designed around pipeline jobs being scheduled to agent pools. An agent pool may contain multiple agents, and a pipeline may be composed of multiple jobs. The purpose is to let infrastructure operators control infrastructure cost by dispatching a varying number of agents to the pool, and to let software developers control the potential degree of parallelization of their pipelines. The actual degree of parallelization depends on how fine-grained your jobs are, how many agents are available in the pool, and how many parallelization licenses your collection has.
The consequence is that different jobs, whether in the same stage or pipeline or not, can run on different agents. This means a file produced in one job may not be available to a step of another job if the two jobs are scheduled on different agents. This is true even if the jobs have a dependency relationship, as dependencies only control job scheduling. The file simply does not exist for the second job, because the jobs run under different paths, and usually on different machines.
If you need your file in a step of a different job than the one that produced it, you MUST upload it in the job that produces it, and you MUST download it in each job that consumes it. So placing the file in `/a` so you can upload it is completely legitimate.
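A minimal sketch of that upload/download handshake between two jobs (the artifact name `text-files` is an arbitrary example):

```yaml
jobs:
  - job: Produce
    steps:
      - bash: echo "hello" > "$(Build.ArtifactStagingDirectory)/greeting.txt"
      - task: PublishPipelineArtifact@1
        inputs:
          targetPath: $(Build.ArtifactStagingDirectory)
          artifact: text-files

  - job: Consume
    dependsOn: Produce   # dependencies control scheduling only, not file availability
    steps:
      - task: DownloadPipelineArtifact@2
        inputs:
          artifact: text-files
          path: $(Pipeline.Workspace)/text-files
      - bash: cat "$(Pipeline.Workspace)/text-files/greeting.txt"
```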
If the file is produced and consumed exclusively within the same job, you don't need to upload it as an artifact, since all steps of a job are executed sequentially by a single agent in the same work directory. You then have three options for storing the file, depending on the lifecycle you want for it:
- Store the file in `/s` if it can follow the same lifecycle policy as your sources. It's the easiest approach and should be the default for any file produced by the build process;
- Store the file in `/a` if you want it cleaned after each run (because a leftover copy could cause side effects) but have configured your `/s` directory not to clean. Make sure to organize `/a` in a way that avoids collisions and still allows other artifacts to be uploaded easily (usually via subdirectories of `/a`);
- Store the file in `/b` if you know how to detect whether the file needs a rebuild and skipping the rebuild helps reduce your average build time, but you have configured your `/s` directory to clean between runs.
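The lifecycle choices above hinge on the job's workspace clean setting, which controls what is wiped before each run. A sketch:

```yaml
jobs:
  - job: Build
    workspace:
      # outputs: wipe /b and /a but keep /s
      # other accepted values: none | resources | all
      clean: outputs
    steps:
      - bash: make
```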
Upvotes: 0
Reputation: 1559
Rather than best practice, I think the choice of location depends on the objective at hand, and from the scenario you have outlined:

> I need to manipulate some text files that are not going to be published as artifacts. They will be consumed by other tasks later on in the pipeline stage
`Pipeline.Workspace` would be appropriate and the most suitable of the three.
`Pipeline.Workspace` gives you a place to store and access files through the course of a pipeline stage, and it gets cleaned up automatically once the stage finishes, so you don't need to worry about it.
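For instance (a sketch; the subdirectory name `scratch` and the file names are arbitrary):

```yaml
steps:
  - bash: |
      # Produce a manipulated text file under Pipeline.Workspace.
      mkdir -p "$(Pipeline.Workspace)/scratch"
      sed 's/foo/bar/' input.txt > "$(Pipeline.Workspace)/scratch/output.txt"
    displayName: Manipulate text file
  - bash: cat "$(Pipeline.Workspace)/scratch/output.txt"
    displayName: Later task consumes the file
```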
This is unlike `Agent.TempDirectory`, which is cleaned up after every pipeline job, so accessing something between jobs of the same stage isn't possible there.
As for `Build.ArtifactStagingDirectory`, it isn't a suitable choice for your case: it is the place where files are put with the intent of publishing them as artifacts, which clearly isn't your objective, and if any artifacts do get published in the stage, it will be difficult to maintain the separation.
Besides these, `Agent.ToolsDirectory`, `Agent.HomeDirectory` and `Agent.WorkingDirectory` serve more specific purposes, i.e. use by tasks for switching tool versions, containing the agent software, and the working of the agent, respectively.
`Build.SourcesDirectory` and `Build.BinariesDirectory` are local paths used by the agent to download the source code files and as output for compiled binaries, respectively.
So for manipulating text files, these other locations will not work.
Upvotes: 4