Issue:
When the TFENV_AUTO_INSTALL environment variable is set in a Terragrunt repository, parallel pipeline jobs try to install several different Terraform versions at the same time, which triggers a race condition.
The parallel jobs each invoke tfenv, the installations overlap, and the run fails with permission denied errors.
My code repo:
dev-account01
├── eu-west-1
│ ├── iam_roles
│ │ ├── .terraform-version
│ │ ├── main.tf
│ ├── networking
│ │ ├── .terraform-version
│ │ ├── main.tf
Each module pins a different Terraform version in its .terraform-version file, for example 1.6.2 and 1.5.5.
PS: my actual setup has many more regions, modules, and accounts.
Error Message:
/home/user/.tfenv/lib/tfenv-exec.sh: line 43: /home/user/.tfenv/versions/1.6.2/terraform: Permission denied
/home/user/.tfenv/lib/tfenv-exec.sh: line 43: exec: /home/user/.tfenv/versions/1.6.2/terraform: cannot execute: Permission denied
Reproducible Scenario:
Use TFENV_AUTO_INSTALL in a Terragrunt repo.
Expected Behavior:
TFENV_AUTO_INSTALL should handle concurrent installations gracefully or sequentially, avoiding race conditions and permission denied errors.
Alternatively, is there a way to serialize the installation of the different Terraform versions used by the modules in each account?
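One way to serialize the installs might be to pre-install every pinned version in a single step before the parallel jobs start. The following is a minimal sketch under assumptions, not a tested fix: it assumes each .terraform-version file holds a plain version number on its first line, and the function names and the script name preinstall-terraform.sh are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: install every pinned Terraform version one at a time,
# before any parallel Terragrunt jobs run.
set -eu

# Print each distinct version pinned in a .terraform-version file under $1.
list_pinned_versions() {
    find "${1:-.}" -type f -name .terraform-version \
        -exec awk 'FNR == 1 && NF { print $1 }' {} + \
        | tr -d '\r' | sort -u
}

# Install the versions sequentially so no two installs overlap.
preinstall_versions() {
    list_pinned_versions "${1:-.}" | while IFS= read -r version; do
        tfenv install "$version" </dev/null
    done
}

# Hypothetical invocation: ./preinstall-terraform.sh path/to/repo
if [ "$#" -gt 0 ]; then
    preinstall_versions "$1"
fi
```

If this ran as an early pipeline step, the later parallel jobs would find their versions already present, so TFENV_AUTO_INSTALL should have nothing left to install.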
EDIT:
example of solution:
#!/bin/bash
LOCK_FILE="/tmp/tfenv-wrapper.lock"
MAX_CONCURRENT_PROCESSES=1

# Function to acquire a lock
function acquire_lock() {
    while true; do
        # Open the lock file on file descriptor 202 and try a non-blocking lock
        exec 202>"$LOCK_FILE"
        flock -n 202 && break
        echo "Another instance of the script is already running. Waiting for it to complete."
        sleep 5
    done
}

# Function to release the lock
function release_lock() {
    # Keep the lock file in place: deleting it would let a waiting process
    # lock the old inode while a new process locks a freshly created file.
    flock -u 202
}

# Function to count running "tfenv install" processes, excluding this script
function check_tfenv_processes() {
    # -x matches the whole line, so only this script's own PID is filtered out
    pgrep -f "tfenv install" | grep -vx "$$" | wc -l
}

# Infinite loop to keep the script running
while true; do
    # Acquire the lock
    acquire_lock

    # Check the number of running processes
    num_processes=$(check_tfenv_processes)

    # If the number of running processes reaches the limit, wait
    while [ "$num_processes" -ge "$MAX_CONCURRENT_PROCESSES" ]; do
        echo "Maximum number of concurrent 'tfenv install' processes reached. Waiting for processes to complete."
        sleep 5
        num_processes=$(check_tfenv_processes)
    done

    # Your script logic goes here
    # Simulate some work
    echo "Script is running..."

    # Release the lock
    release_lock
done
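A shorter variant of the same locking idea is to take a blocking flock around each command, so waiting jobs queue up instead of polling. This is only a sketch: the lock path and the run_locked helper are assumptions for illustration, and it relies on the flock utility from util-linux being installed.

```shell
#!/usr/bin/env bash
# Hypothetical lock path; any location writable by all jobs would do.
LOCK_FILE="${LOCK_FILE:-/tmp/tfenv-install.lock}"

# Run the given command while holding an exclusive lock on LOCK_FILE.
# File descriptor 9 stays open for the lifetime of the subshell, and the
# lock is released automatically when the subshell exits.
run_locked() {
    (
        flock 9    # blocks until no other process holds the lock
        "$@"
    ) 9>"$LOCK_FILE"
}

# Example (not executed here): run_locked tfenv install 1.6.2
```

Because the lock is tied to the open file descriptor, there is no explicit unlock step and the lock file never needs to be deleted.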
Current workspaces:
atlantis-git-test-0:/$ ls -l /atlantis-data/repos/orga/infra-test/4
total 24
drwx--S--- 5 atlantis atlantis 4096 Jan 8 10:00 default
drwx--S--- 5 atlantis atlantis 4096 Jan 8 10:00 environments_eks-dev-1_09_eks
drwx--S--- 5 atlantis atlantis 4096 Jan 8 10:00 environments_eks-dev-1_11_r53_zones
drwx--S--- 5 atlantis atlantis 4096 Jan 8 10:00 environments_eks-dev-1_13_irsa
drwx--S--- 5 atlantis atlantis 4096 Jan 8 10:00 environments_eks-dev-1_15_vault
drwx--S--- 5 atlantis atlantis 4096 Jan 8 10:00 environments_eks-staging-1_11_r53_zones