jjanes

Reputation: 44363

User data script fails without giving a reason

I am starting an Amazon Linux instance (ami-fb8e9292) using the web console, pasting data into the user data box to run a script upon startup. If I use the example given by Amazon to start a web server, it works. But when I use my own script (also a #!/bin/bash script), it does not get run.

If I look in /var/log/cloud-init.log, it gives no useful information on the topic:

May 22 21:06:12 cloud-init[1286]: util.py[DEBUG]: Running command ['/var/lib/cloud/instance/scripts/part-001'] with allowed return codes [0] (shell=True, capture=False)
May 22 21:06:16 cloud-init[1286]: util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/part-001 [2]
May 22 21:06:16 cloud-init[1286]: util.py[DEBUG]: Failed running /var/lib/cloud/instance/scripts/part-001 [2]
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/cloudinit/util.py", line 637, in runparts
    subp([exe_path], capture=False, shell=True)
  File "/usr/lib/python2.6/site-packages/cloudinit/util.py", line 1528, in subp
    cmd=args)
ProcessExecutionError: Unexpected error while running command.
Command: ['/var/lib/cloud/instance/scripts/part-001']
Exit code: 2
Reason: -
Stdout: ''
Stderr: ''

If I ssh into the instance and sudo su and execute the shell script directly:

/var/lib/cloud/instance/scripts/part-001

then it runs fine. Also, it works if I emulate the way cloud-init runs it:

python
>>> import cloudinit.util
>>> cloudinit.util.runparts("/var/lib/cloud/instance/scripts/")

Using either of those methods, if I intentionally introduce errors into the script then it produces error messages. How can I debug the selective absence of useful debugging output?

Upvotes: 27

Views: 36156

Answers (6)

Guglielmo Celata

Reputation: 316

I've been through this, and in my case it was also an issue with whitespace before the shebang #!/bin/bash.

I spun up an instance through python code, using boto3.

import boto3

ec2 = boto3.resource('ec2', region_name='eu-south-1')
instance = ec2.create_instances(
    ImageId=AMI_IMAGE_ID,  # the keyword is ImageId, not image
    InstanceType=INSTANCE_TYPE,
    ...
    UserData=USER_DATA_SCRIPT,
    ...
)

where the definition of USER_DATA_SCRIPT was:

USER_DATA_SCRIPT = """
#!/bin/bash
apt update -y
apt upgrade -y
...
"""

This contained leading whitespace (a newline before the shebang), which caused the script to fail without further details in /var/log/cloud-init-output.log.

Changing it into:

USER_DATA_SCRIPT = """#!/bin/bash
apt update -y
apt upgrade -y
...
"""

solved the issue.
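If the script is easier to read as an indented triple-quoted string, the leading whitespace can be stripped in code before passing it to boto3. A minimal sketch, assuming a hypothetical helper named clean_user_data (not part of boto3) and assuming the interpreter is given as an absolute path:

```python
import textwrap


def clean_user_data(script: str) -> str:
    """Dedent the script and drop leading whitespace so '#!' comes first."""
    cleaned = textwrap.dedent(script).lstrip()
    # Reject anything that does not begin with a shebang and an absolute path.
    if not cleaned.startswith("#!/"):
        raise ValueError("user data must start with a shebang such as #!/bin/bash")
    return cleaned


# The leading newline and indentation are removed before use.
USER_DATA_SCRIPT = clean_user_data("""
    #!/bin/bash
    apt update -y
    apt upgrade -y
""")
```

The check only looks at the first characters, so it will not catch errors later in the script.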

Upvotes: 3

Dejan Vasiljevic

Reputation: 17

In my case cloud-init could not start the script because the user data must start with

#!/bin/bash

without any whitespace in front of it! Nice AWS bug, a lot of time spent troubleshooting :)

Upvotes: 0

taras

Reputation: 6915

Hope it will reduce the debugging time for someone. I didn't have any explicit error messages in my /var/log/cloud-init-output.log, just this:

2021-04-07 10:36:57,748 - cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)
2021-04-07 10:36:57,748 - util.py[WARNING]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_scripts_user.py'>) failed

After some investigation, I've realized that the cause was a typo in the shebang string: #!?bin/bash instead of #!/bin/bash.

Upvotes: 5

Rotem jackoby

Reputation: 22198

Instead of /var/log/cloud-init.log, consider searching for keywords like "Failed", "ERROR", "WARNING" or "/var/lib/cloud/instance/scripts/" inside /var/log/cloud-init-output.log, which in most cases contains very clear error messages.

For example - running a bad command will produce the following error in /var/log/cloud-init-output.log:

/var/lib/cloud/instance/scripts/part-001: line 10: vncpasswd: command not found
cp: cannot stat '/lib/systemd/system/[email protected]': No such file or directory
sed: can't read /etc/systemd/system/[email protected]: No such file or directory
Failed to execute operation: No such file or directory
Failed to start vncserver@:1.service: Unit not found.
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
Cleaning repos: amzn2-core amzn2extra-docker amzn2extra-epel

And at the end of /var/log/cloud-init.log you'll get a quite general error message:

Aug 31 15:14:00 cloud-init[3532]: util.py[DEBUG]: Failed running /var/lib/cloud/instance/scripts/part-001 [1]
    Traceback (most recent call last):
      File "/usr/lib/python2.7/site-packages/cloudinit/util.py", line 910, in runparts
        subp(prefix + [exe_path], capture=False, shell=True)
      File "/usr/lib/python2.7/site-packages/cloudinit/util.py", line 2105, in subp
        cmd=args)
    ProcessExecutionError: Unexpected error while running command.
    Command: ['/var/lib/cloud/instance/scripts/part-001']
    Exit code: 1
    Reason: -
    Stdout: -
    Stderr: -
    cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)

(*) Try to grep just the relevant error message with:

grep -C 10 '<search-keyword>' cloud-init-output.log
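To search for all of the keywords above in one pass, the same idea can be sketched in Python. This is only an illustration: find_matches is a hypothetical helper, and the keyword list is the one suggested in this answer.

```python
KEYWORDS = ("Failed", "ERROR", "WARNING", "/var/lib/cloud/instance/scripts/")


def find_matches(lines, keywords=KEYWORDS, context=2):
    """Return (line_number, text) pairs for matching lines plus context lines."""
    wanted = set()
    for i, line in enumerate(lines):
        if any(keyword in line for keyword in keywords):
            # Keep `context` lines before and after each hit, like grep -C.
            start = max(0, i - context)
            stop = min(len(lines), i + context + 1)
            wanted.update(range(start, stop))
    return [(i + 1, lines[i]) for i in sorted(wanted)]


def print_matches(path="/var/log/cloud-init-output.log", context=10):
    """Print matching lines from the log file with 1-based line numbers."""
    with open(path, errors="replace") as handle:
        lines = handle.read().splitlines()
    for number, text in find_matches(lines, context=context):
        print(f"{number}: {text}")
```

Calling print_matches() on the instance (as root, if the log is not world-readable) prints every hit with its surrounding lines.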

Upvotes: 13

suripoori

Reputation: 341

I had a similar issue and was able to get around it. I realized that the environment variable EC2_HOME was not set when running under sudo. My configset does a lot of work with the AWS CLI, and for that to work EC2_HOME needs to be set. So I went in and removed sudo everywhere in my configset and UserData. Earlier, when I was hitting the issue, my UserData looked like:

"UserData"       : { "Fn::Base64" : { "Fn::Join" : ["", [
                                "#!/bin/bash\n",
                                "sudo yum update -y aws-cfn-bootstrap\n",

                                "# Install the files and packages and run the commands from the metadata\n",
                                "sudo /opt/aws/bin/cfn-init -v --access-key ", { "Ref" : "IAMUserAccessKey" }, " --secret-key ", { "Ref" : "SecretAccessKey" },  
                                "         --stack ", { "Ref" : "AWS::StackName" },
                                "         --resource NAT2 ",
                                "         --configsets config ",
                                "         --region ", { "Ref" : "AWS::Region" }, "\n"
                        ]]}}

My UserData after the changes looked like:

"UserData"       : { "Fn::Base64" : { "Fn::Join" : ["", [
                                "#!/bin/bash -xe\n",
                                "yum update -y aws-cfn-bootstrap\n",

                                "# Install the files and packages and run the commands from the metadata\n",
                                "/opt/aws/bin/cfn-init -v --access-key ", { "Ref" : "IAMUserAccessKey" }, " --secret-key ", { "Ref" : "SecretAccessKey" },  
                                "         --stack ", { "Ref" : "AWS::StackName" },
                                "         --resource NAT2 ",
                                "         --configsets config ",
                                "         --region ", { "Ref" : "AWS::Region" }, "\n"
                        ]]}}

Similarly, I removed all the sudo calls from my configsets.

Upvotes: 4

jonny five

Reputation: 27610

I'm not sure if this is going to be the case for everyone, but I was having this issue and was able to fix it by changing my first line from this:

#!/bin/bash -e -v

to just this:

#!/bin/bash

Of course, now my script is failing and I have no idea how far it's getting, but at least I got past it not running at all. :)
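A possible explanation, offered as a guess rather than a confirmed diagnosis: on Linux, everything after the interpreter path on a shebang line is handed to the interpreter as a single argument, so bash receives "-e -v" as one option string and rejects it. Combining the flags (#!/bin/bash -ev) or moving them to a set line avoids that. A minimal Python sketch of the second approach, where move_shebang_options is a hypothetical helper and the options are assumed to be valid flags for bash's set builtin:

```python
def move_shebang_options(script: str) -> str:
    """Rewrite '#!/bin/bash -e -v' as '#!/bin/bash' followed by 'set -e -v'."""
    first, _, rest = script.partition("\n")
    if not first.startswith("#!"):
        return script  # no shebang, nothing to rewrite
    parts = first[2:].split()
    if len(parts) <= 2:
        return script  # zero or one option is passed through safely
    interpreter, options = parts[0], parts[1:]
    # Assumption: every option is something `set` understands (e.g. -e, -v, -x).
    return f"#!{interpreter}\nset {' '.join(options)}\n{rest}"


print(move_shebang_options("#!/bin/bash -e -v\necho hello\n"))
```

With the flags on a set line, the script still stops on the first error and echoes each line as it is read.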

Upvotes: 8
