Reputation: 44363
I am starting an Amazon Linux instance (ami-fb8e9292) using the web console, pasting data into the user data box to run a script upon startup. If I use the example given by Amazon to start a web server, it works. But when I run my own script (also a #!/bin/bash script), it does not get run.
If I look in /var/log/cloud-init.log, it gives no useful information on the topic:
May 22 21:06:12 cloud-init[1286]: util.py[DEBUG]: Running command ['/var/lib/cloud/instance/scripts/part-001'] with allowed return codes [0] (shell=True, capture=False)
May 22 21:06:16 cloud-init[1286]: util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/part-001 [2]
May 22 21:06:16 cloud-init[1286]: util.py[DEBUG]: Failed running /var/lib/cloud/instance/scripts/part-001 [2]
Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/cloudinit/util.py", line 637, in runparts
subp([exe_path], capture=False, shell=True)
File "/usr/lib/python2.6/site-packages/cloudinit/util.py", line 1528, in subp
cmd=args)
ProcessExecutionError: Unexpected error while running command.
Command: ['/var/lib/cloud/instance/scripts/part-001']
Exit code: 2
Reason: -
Stdout: ''
Stderr: ''
If I ssh into the instance, run sudo su, and execute the shell script directly:
/var/lib/cloud/instance/scripts/part-001
then it runs fine. Also, it works if I emulate the way cloud-init runs it:
python
>>> import cloudinit.util
>>> cloudinit.util.runparts("/var/lib/cloud/instance/scripts/")
Using either of those methods, if I intentionally introduce errors into the script then it produces error messages. How can I debug the selective absence of useful debugging output?
Upvotes: 27
Views: 36156
Reputation: 316
I've been through this, and in my case it was also an issue with spaces before the shebang #!/bin/bash.
I spun up an instance through Python code, using boto3.
import boto3

ec2 = boto3.resource('ec2', region_name='eu-south-1')
instance = ec2.create_instances(
    ImageId=AMI_IMAGE_ID,  # boto3 expects ImageId, not image
    InstanceType=INSTANCE_TYPE,
    ...
    UserData=USER_DATA_SCRIPT
    ...
)
where the definition of USER_DATA_SCRIPT was:
USER_DATA_SCRIPT = """
#!/bin/bash
apt update -y
apt upgrade -y
...
"""
The string starts with whitespace (the newline right after the opening triple quotes), so the shebang is not the first thing in the user data. This caused the script to fail without further details in /var/log/cloud-init-output.log.
Changing it into:
USER_DATA_SCRIPT = """#!/bin/bash
apt update -y
apt upgrade -y
...
"""
solved the issue.
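One way to guard against this in the launching code is to normalize the string before handing it to boto3. The sketch below is only an illustration: normalize_user_data is a hypothetical helper name, not part of boto3, and it assumes the user data is a shell script rather than another cloud-init format such as #cloud-config.

```python
import textwrap


def normalize_user_data(script: str) -> str:
    """Strip common indentation and leading whitespace from a user-data script.

    Hypothetical helper: raises ValueError if the result does not begin
    with a shebang, so the mistake is caught before the instance launches.
    """
    # Remove indentation shared by all lines, then any leading blank lines/spaces.
    cleaned = textwrap.dedent(script).lstrip()
    if not cleaned.startswith("#!"):
        raise ValueError("user data must begin with a shebang such as #!/bin/bash")
    return cleaned
```

It could then be used as UserData=normalize_user_data(USER_DATA_SCRIPT) in the create_instances call.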
Upvotes: 3
Reputation: 17
In my case cloud-init could not start the script because the user data must start with #!/bin/bash, with no whitespace in front of it! Nice AWS bug, lots of time spent troubleshooting :)
Upvotes: 0
Reputation: 6915
Hope it will reduce the debugging time for someone.
I didn't have any explicit error messages in my /var/log/cloud-init-output.log, just this:
2021-04-07 10:36:57,748 - cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)
2021-04-07 10:36:57,748 - util.py[WARNING]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_scripts_user.py'>) failed
After some investigation, I realized that the cause was a typo in the shebang line: #!?bin/bash instead of #!/bin/bash.
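Since cloud-init gives so little detail here, a quick local check of the first line before launching can catch this kind of typo. The following is a minimal sketch, not anything cloud-init itself provides; shebang_problems is a hypothetical name, and the only rule it applies is that the interpreter must be an absolute POSIX path.

```python
import posixpath


def shebang_problems(script: str) -> list:
    """Return a list of human-readable problems with a script's first line."""
    problems = []
    first_line = script.split("\n", 1)[0]
    if not first_line.startswith("#!"):
        # Covers leading blank lines or spaces as well as a missing shebang.
        problems.append("first line does not start with '#!'")
        return problems
    parts = first_line[2:].strip().split(None, 1)
    if not parts:
        problems.append("no interpreter after '#!'")
        return problems
    interpreter = parts[0]
    if not posixpath.isabs(interpreter):
        problems.append("interpreter %r is not an absolute path" % interpreter)
    return problems
```

An empty list means the first line passed these basic checks; it does not prove the interpreter exists on the instance.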
Upvotes: 5
Reputation: 22198
Instead of /var/log/cloud-init.log, consider searching for keywords like "Failed", "ERROR", "WARNING" or "/var/lib/cloud/instance/scripts/" inside /var/log/cloud-init-output.log, which in most cases contains very clear error messages.
For example, running a bad command will produce the following error in /var/log/cloud-init-output.log:
/var/lib/cloud/instance/scripts/part-001: line 10: vncpasswd: command not found
cp: cannot stat '/lib/systemd/system/[email protected]': No such file or directory
sed: can't read /etc/systemd/system/[email protected]: No such file or directory
Failed to execute operation: No such file or directory
Failed to start vncserver@:1.service: Unit not found.
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
Cleaning repos: amzn2-core amzn2extra-docker amzn2extra-epel
And at the end of /var/log/cloud-init.log you'll receive a quite general error message:
Aug 31 15:14:00 cloud-init[3532]: util.py[DEBUG]: Failed running /var/lib/cloud/instance/scripts/part-001 [1]
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/cloudinit/util.py", line 910, in runparts
subp(prefix + [exe_path], capture=False, shell=True)
File "/usr/lib/python2.7/site-packages/cloudinit/util.py", line 2105, in subp
cmd=args)
ProcessExecutionError: Unexpected error while running command.
Command: ['/var/lib/cloud/instance/scripts/part-001']
Exit code: 1
Reason: -
Stdout: -
Stderr: -
cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)
(*) Try to grep just the relevant error message with:
grep -C 10 '<search-keyword>' /var/log/cloud-init-output.log
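If you would rather search for several keywords in one pass, the same idea can be sketched in Python. This is an illustrative equivalent of grep -C, not a cloud-init tool; grep_with_context is a hypothetical name and the keyword list is just the one suggested above.

```python
def grep_with_context(lines, keywords, context=10):
    """Return (line_number, surrounding_lines) for each line containing a keyword.

    Line numbers are 1-based; each block holds up to `context` lines
    before and after the matching line.
    """
    hits = []
    for i, line in enumerate(lines):
        if any(keyword in line for keyword in keywords):
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            hits.append((i + 1, lines[start:end]))
    return hits


# Example usage on an instance (path taken from the answer above):
# with open("/var/log/cloud-init-output.log") as log:
#     lines = log.read().splitlines()
# for number, block in grep_with_context(lines, ["Failed", "ERROR", "WARNING"]):
#     print("--- match at line", number)
#     print("\n".join(block))
```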
Upvotes: 13
Reputation: 341
I had a similar issue and was able to get around it. I realized that the environment variable EC2_HOME would not be set up under sudo. I was doing a bunch of stuff in my configset that uses the AWS CLI, and for that to work EC2_HOME needs to be set. So I removed sudo everywhere in my configset and UserData. Earlier, when I was hitting the issue, my UserData looked like:
"UserData" : { "Fn::Base64" : { "Fn::Join" : ["", [
"#!/bin/bash\n",
"sudo yum update -y aws-cfn-bootstrap\n",
"# Install the files and packages and run the commands from the metadata\n",
"sudo /opt/aws/bin/cfn-init -v --access-key ", { "Ref" : "IAMUserAccessKey" }, " --secret-key ", { "Ref" : "SecretAccessKey" },
" --stack ", { "Ref" : "AWS::StackName" },
" --resource NAT2 ",
" --configsets config ",
" --region ", { "Ref" : "AWS::Region" }, "\n"
]]}}
My UserData after the changes looked like:
"UserData" : { "Fn::Base64" : { "Fn::Join" : ["", [
"#!/bin/bash -xe\n",
"yum update -y aws-cfn-bootstrap\n",
"# Install the files and packages and run the commands from the metadata\n",
"/opt/aws/bin/cfn-init -v --access-key ", { "Ref" : "IAMUserAccessKey" }, " --secret-key ", { "Ref" : "SecretAccessKey" },
" --stack ", { "Ref" : "AWS::StackName" },
" --resource NAT2 ",
" --configsets config ",
" --region ", { "Ref" : "AWS::Region" }, "\n"
]]}}
Similarly, I removed all the sudo calls I was making in my configsets.
Upvotes: 4
Reputation: 27610
I'm not sure if this is going to be the case for everyone, but I was having this issue and was able to fix it by changing my first line from this:
#!/bin/bash -e -v
to just this:
#!/bin/bash
Of course, now my script is failing and I have no idea how far it's getting, but at least I got past it not running at all. :)
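A possible explanation, offered as an assumption rather than a confirmed diagnosis: on Linux, everything after the interpreter path on a shebang line is passed to the interpreter as one single argument, so bash would receive the string "-e -v" instead of two separate options. A common workaround is to keep the shebang bare and move the options onto a set line. The helper below (hypothetical name move_shebang_options) sketches that rewrite for a script held as a Python string; it assumes the options are ones that set accepts, such as -e, -v or -x.

```python
def move_shebang_options(script: str) -> str:
    """Move options from the shebang line onto a separate `set` line.

    Scripts without a shebang, or with a shebang that has no options,
    are returned unchanged.
    """
    first_line, _, rest = script.partition("\n")
    if not first_line.startswith("#!"):
        return script
    parts = first_line[2:].split(None, 1)
    if len(parts) < 2:
        return script  # bare shebang, nothing to move
    interpreter, options = parts[0], parts[1].strip()
    return "#!" + interpreter + "\n" + "set " + options + "\n" + rest
```

Keeping -v (or -x) on the set line also preserves the tracing output, which helps show how far the script gets before failing.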
Upvotes: 8