Reputation: 6030
When I run chef-zero via AWS userdata, the run always fails. However, if I SSH into the machine and manually execute the same commands, it works as expected. This is the output that I get:
Chef: 11.12.8
[2014-06-11T12:40:34+00:00] INFO: Auto-discovered chef repository at /opt/chef-zero
[2014-06-11T12:40:34+00:00] INFO: Starting chef-zero on port 8889 with repository at repository at /opt/chef-zero
One version per cookbook
[2014-06-11T12:40:34+00:00] INFO: Forking chef instance to converge...
[2014-06-11T12:40:35+00:00] DEBUG: Fork successful. Waiting for new chef pid: 1530
[2014-06-11T12:40:35+00:00] DEBUG: Forked instance now converging
[2014-06-11T12:40:35+00:00] ERROR: undefined method `[]' for nil:NilClass
[2014-06-11T12:40:35+00:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)
The userdata that I set when launching the EC2 instance in AWS includes the following:
curl -L https://www.opscode.com/chef/install.sh | bash
mkdir /opt/chef-zero
cd /opt/chef-zero
wget http://myserver/chef-repo.tar.gz
tar zxf chef-repo.tar.gz
INSTANCE_ID=`curl http://169.254.169.254/latest/meta-data/instance-id`
cat <<EOF > /opt/chef-zero/solo.rb
ssl_verify_mode :verify_peer
node_name "$INSTANCE_ID"
EOF
/opt/chef/bin/chef-client -v >chef-zero.log 2>&1
/opt/chef/bin/chef-client -z -l debug -c solo.rb -o 'role[someRole]' -E BUILD >> chef-zero.log 2>&1
The AMI that I'm using is a custom one that was initially provisioned using knife + knife-ec2 (that bootstrapped chef 11.6.0 from an ubuntu 13.04 public ami). The omnibus installer from userdata (curl ... | bash) is upgrading chef to 11.12.8. The original knife run included chef-client::service in its run, and the host is initially configured for use with chef-client + chef-server (i.e. there's a "validation.pem" and "client.rb" in /etc/chef - not sure if that makes a difference).
I am able to log onto the machine and execute chef-client -z -c solo.rb -o 'role[someRole]' -E BUILD as soon as the machine comes up (after waiting for files to be retrieved and the user-data chef-client to fail) and the chef run executes normally.
I have no idea why the userdata chef-client run fails with undefined method. Any ideas what's causing it?
Upvotes: 0
Views: 814
Reputation: 6030
After some further investigation, and thanks to a bit of chatting with the #chef guys on freenode, the problem was narrowed down to the environment.
When executing the script with userdata, the "HOME" variable is not set. shell.rb from the chef gem is littered with references to ENV["HOME"].
SSH:
# unset HOME
# chef-client -z -o 'role[test]'
ERROR: undefined method `[]' for nil:NilClass
# export HOME=/root
# chef-client -z -o 'role[test]'
Starting Chef Client, version ....
...
Chef Client finished, ...
If you need to execute chef-client via user data, you should manually export HOME before trying to execute chef.
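As a minimal sketch of that workaround, the userdata script could default HOME before calling chef-client. This assumes the script runs as root (so /root is the right home directory) and reuses the paths and arguments from the question; the CHEF_CLIENT variable and the existence check are only there for illustration.

```shell
#!/bin/bash
# Userdata scripts may run without HOME in the environment.
# Assumption: the script runs as root, so fall back to /root when HOME is empty or unset.
export HOME="${HOME:-/root}"

# Path taken from the question's userdata; adjust if chef is installed elsewhere.
CHEF_CLIENT=/opt/chef/bin/chef-client

# Only attempt the run if chef-client is actually installed at that path.
if [ -x "$CHEF_CLIENT" ]; then
  cd /opt/chef-zero && \
    "$CHEF_CLIENT" -z -l debug -c solo.rb -o 'role[someRole]' -E BUILD >> chef-zero.log 2>&1
fi
```

Using `${HOME:-/root}` rather than a plain assignment leaves an existing HOME untouched, so the same script behaves identically when run by hand over SSH.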
The bug has been reported at https://tickets.opscode.com/browse/CHEF-5365
Edit: I submitted a pull request, which has since been merged into master: https://github.com/opscode/chef/pull/1494
Upvotes: 1
Reputation: 286
This likely has nothing to do with chef-zero but indicates a problem in your recipe code (whatever's inside that chef-repo.tar.gz, or is driven by role[someRole]). It indicates an attempt to access a sub-element of a hash, like node['foo']['bar'], when node['foo'] is nil (undefined).
Check the stacktrace that's generated by the chef client run to narrow it down.
Upvotes: 0