Reputation: 20808
I am trying to get etcd on my CoreOS cluster setup with TLS... and having a hell of a time.
I looked at the different guides, generated both client and peer certs and keys
etcd fails to start and what I get in journalctl is the following (IPs and token obfuscated):
Dec 16 00:05:12 coreos-123.123.123.123 systemd[1]: Starting etcd2...
-- Subject: Unit etcd2.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit etcd2.service has begun starting up.
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: recognized and used environment variable ETCD_ADVERTISE_CLIENT_URLS=http://123.123.123.123:2379
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: recognized and used environment variable ETCD_CERT_FILE=/etc/ssl/etcd/etcd-client123.123.123.123.cert.pem
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: recognized and used environment variable ETCD_CLIENT_CERT_AUTH=true
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: recognized and used environment variable ETCD_DATA_DIR=/var/lib/etcd2
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: recognized and used environment variable ETCD_DISCOVERY=https://discovery.etcd.io/xxxxxxxxxxxxxxxxxxxxxxxxxxxx
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: recognized and used environment variable ETCD_INITIAL_ADVERTISE_PEER_URLS=http://123.123.123.123:2380
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: recognized and used environment variable ETCD_KEY_FILE=/etc/ssl/etcd/private/etcd-client123.123.123.123.key.pem
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: recognized and used environment variable ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379,http://0.0.0.0:4001
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: recognized and used environment variable ETCD_LISTEN_PEER_URLS=http://123.123.123.123:2380,http://123.123.123.123:7001
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: recognized and used environment variable ETCD_NAME=yyyyyyyyyyyyyyyyyyyyyyyyyy
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: recognized and used environment variable ETCD_PEER_CERT_FILE=/etc/ssl/etcd/etcd-peer123.123.123.123.cert.pem
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: recognized and used environment variable ETCD_PEER_CLIENT_CERT_AUTH=true
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: recognized and used environment variable ETCD_PEER_KEY_FILE=/etc/ssl/etcd/private/etcd-peer123.123.123.123.key.pem
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: recognized and used environment variable ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/certs/ca-chain.cert.pem
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: recognized and used environment variable ETCD_TRUSTED_CA_FILE=/etc/ssl/certs/ca-chain.cert.pem
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: etcd Version: 2.2.0
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: Git SHA: e4561dd
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: Go Version: go1.4.2
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: Go OS/Arch: linux/amd64
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: setting maximum number of CPUs to 1, total number of available CPUs is 4
Dec 16 00:05:12 coreos-123.123.123.123 etcd2[822]: the server is already initialized as member before, starting as etcd member...
Dec 16 00:05:12 coreos-123.123.123.123 systemd[1]: etcd2.service: Main process exited, code=exited, status=1/FAILURE
Dec 16 00:05:12 coreos-123.123.123.123 systemd[1]: Failed to start etcd2.
-- Subject: Unit etcd2.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit etcd2.service has failed.
--
-- The result is failed.
Dec 16 00:05:12 coreos-123.123.123.123 systemd[1]: etcd2.service: Unit entered failed state.
Dec 16 00:05:12 coreos-123.123.123.123 systemd[1]: etcd2.service: Failed with result 'exit-code'.
I have the certs and keys in the right folders. I'm pretty sure permissions are fine. The certs have extensions for clientAuth,serverAuth (for peer cert) and clientAuth(for client) as well as SAN with the node IP.
Client cert data:
Exponent: 65537 (0x10001)
X509v3 extensions:
X509v3 Basic Constraints:
CA:FALSE
Netscape Cert Type:
SSL Client, S/MIME
Netscape Comment:
OpenSSL Generated Client Certificate
X509v3 Subject Key Identifier:
X509v3 Authority Key Identifier:
keyid:
X509v3 Key Usage: critical
Digital Signature, Non Repudiation, Key Encipherment
X509v3 Extended Key Usage:
TLS Web Client Authentication, E-mail Protection
X509v3 Subject Alternative Name:
IP Address:127.0.0.1, IP Address:123.123.123.123
Peer cert data:
X509v3 extensions:
X509v3 Basic Constraints:
CA:FALSE
Netscape Cert Type:
SSL Server
Netscape Comment:
OpenSSL Generated Server Certificate
X509v3 Subject Key Identifier:
X509v3 Authority Key Identifier:
keyid:
X509v3 Key Usage: critical
Digital Signature, Key Encipherment
X509v3 Extended Key Usage:
TLS Web Server Authentication, TLS Web Client Authentication
X509v3 Subject Alternative Name:
IP Address:127.0.0.1, IP Address:123.123.123.123
what else am I missing here? there is nothing in this log to explain the failure.
My goal is to have TLS authentication for clients and peers as it is on the public cloud. PS: it worked fine without TLS. I only added the certs and the 8 TLS flags:
# client flags
trusted-ca-file: /etc/ssl/certs/ca-chain.cert.pem
cert-file: /etc/ssl/etcd/etcd-client$public_ipv4.cert.pem
key-file: /etc/ssl/etcd/private/etcd-client$public_ipv4.key.pem
client-cert-auth: true
# peer flags
peer-trusted-ca-file: /etc/ssl/certs/ca-chain.cert.pem
peer-cert-file: /etc/ssl/etcd/etcd-peer$public_ipv4.cert.pem
peer-key-file: /etc/ssl/etcd/private/etcd-peer$public_ipv4.key.pem
peer-client-cert-auth: true
The $public_ipv4 tag gets translated properly obviously since the IP shows in the logs
I just can't tell what is the problem here since the logs don't say much.
Any idea to point me in the right direction?
Thanks
Upvotes: 0
Views: 290
Reputation: 63
Due to an upstream systemd bug, journald may miss the last few log lines when its process exit. If journalctl tells you that etcd stops without fatal or panic message, you could try sudo journalctl -f -t etcd2
to get full log.
Once you have the full log it should tell you what etcd is failing on.
Upvotes: 1