Reputation: 105
I have set up a 5-node cluster on Amazon EC2 spanning multiple regions (data centers). The OpsCenter node/instance is separate from the cluster nodes. When I try to add the existing cluster through the OpsCenter web UI, it shows "Error creating cluster: Timeout while adding cluster. Please check the log for details on the problem." Then I checked opscenterd.log; it seems that OpsCenter can connect to the nodes, but there is a warning: "ProcessingError while calling CreateClusterConfController: Timeout while adding cluster. Please check the log for details on the problem."
Do you have any idea what causes this? I'm using DataStax Enterprise 4.0.2, Cassandra 2.0.6, and OpsCenter 4.1.2 on Ubuntu 12.04. I have checked the Cassandra system logs and the datastax-agent log, but there are no errors.
Is this a known issue, like the OpsCenter 4.1.0 issue "opscenterd breaking when updating definition files on platforms with Python 2.6", which was fixed in 4.1.1? http://www.datastax.com/documentation/opscenter/4.1/opsc/release_notes/opscReleaseNotes411.html
Please suggest.
==================================================================================
All ports are open in the EC2 security groups (61620, 61621, etc.). I ran telnet from OpsCenter to a host on port 61621 and from the host to OpsCenter on port 61620; both connect. Below is opscenterd.log:
2014-05-14 05:53:46+0000 [] INFO: Starting factory <opscenterd.ThriftService.NoReconnectCassandraClientFactory instance at 0x4076c68>
2014-05-14 05:53:46+0000 [] INFO: Adding new cluster 'Connect2me': {u'jmx': {u'username': u'', u'password': u'', u'port': u'7199'}, 'kerberos_client_principals': {}, 'kerberos': {}, u'agents': {}, 'kerberos_hostnames': {}, 'kerberos_services': {}, u'cassandra': {u'username': u'', u'seed_hosts': u'54.214.1.100', u'api_port': u'9160', u'password': u''}}
2014-05-14 05:53:46+0000 [] INFO: Starting new cluster services for Connect2me
2014-05-14 05:53:46+0000 [Connect2me] INFO: Starting services for cluster Connect2me
2014-05-14 05:53:46+0000 [Connect2me] INFO: Loading event plugins
2014-05-14 05:53:46+0000 [Connect2me] INFO: Loading event plugin conf /etc/opscenter/event-plugins/posturl.conf
2014-05-14 05:53:46+0000 [Connect2me] INFO: Successfully loaded event plugin conf /etc/opscenter/event-plugins/posturl.conf
2014-05-14 05:53:46+0000 [Connect2me] INFO: Loading event plugin conf /etc/opscenter/event-plugins/email.conf
2014-05-14 05:53:46+0000 [Connect2me] INFO: Successfully loaded event plugin conf /etc/opscenter/event-plugins/email.conf
2014-05-14 05:53:46+0000 [Connect2me] INFO: Done loading event plugins
2014-05-14 05:53:46+0000 [] INFO: Metric caching enabled with 50 points and 1000 metrics cached
2014-05-14 05:53:46+0000 [] INFO: Starting PushService
2014-05-14 05:53:46+0000 [Connect2me] INFO: Starting CassandraCluster service
2014-05-14 05:53:46+0000 [Connect2me] INFO: agent_config items: {'cassandra_log_location': '/var/log/cassandra/system.log', 'thrift_port': 9160, 'thrift_ssl_truststore': None, 'rollups300_ttl': 2419200, 'rollups86400_ttl': -1, 'jmx_port': 7199, 'metrics_ignored_solr_cores': '', 'api_port': '61621', 'metrics_enabled': 1, 'thrift_ssl_truststore_type': 'JKS', 'kerberos_use_ticket_cache': True, 'use_ssl': 1, 'kerberos_renew_tgt': True, 'rollups60_ttl': 604800, 'cassandra_install_location': '', 'rollups7200_ttl': 31536000, 'kerberos_debug': False, 'storage_keyspace': 'OpsCenter', 'ec2_metadata_api_host': '169.254.169.254', 'provisioning': 0, 'kerberos_use_keytab': True, 'metrics_ignored_column_families': '', 'thrift_ssl_truststore_password': None, 'metrics_ignored_keyspaces': 'system, system_traces, system_auth, dse_auth, OpsCenter'}
2014-05-14 05:53:46+0000 [] INFO: Stopping factory <opscenterd.ThriftService.NoReconnectCassandraClientFactory instance at 0x4076c68>
2014-05-14 05:53:47+0000 [Connect2me] INFO: Enterprise functionality: True
2014-05-14 05:53:48+0000 [Connect2me] INFO: Snitch: com.datastax.bdp.snitch.DseDelegateSnitch
2014-05-14 05:53:48+0000 [Connect2me] INFO: Cluster Name: Connect2me
2014-05-14 05:53:48+0000 [Connect2me] INFO: Partitioner: org.apache.cassandra.dht.RandomPartitioner
2014-05-14 05:53:50+0000 [Connect2me] INFO: Recognizing new node 54.214.1.100 ('128010234515697016761586673489854425713')
2014-05-14 05:53:50+0000 [Connect2me] INFO: Node 54.214.1.100 has multiple tokens (vnodes). Only one picked for display.
2014-05-14 05:53:50+0000 [Connect2me] INFO: Recognizing new node 54.214.1.110 ('74547314523494862953006764525852718268')
2014-05-14 05:53:50+0000 [Connect2me] INFO: Node 54.214.1.110 has multiple tokens (vnodes). Only one picked for display.
2014-05-14 05:53:50+0000 [Connect2me] INFO: Recognizing new node 54.243.203.229 ('95676355653121167189122638977297238333')
2014-05-14 05:53:50+0000 [Connect2me] INFO: Recognizing new node 54.214.1.78 ('165496574052081051366176941207447197429')
2014-05-14 05:53:50+0000 [Connect2me] INFO: Recognizing new node 54.243.201.237 ('164812453774030768973707069212224107713')
2014-05-14 05:53:56+0000 [Connect2me] INFO: Keyspaces: {'dse_security': CassandraKeyspace(name=dse_security, column_families=['tokens'], tables=[u'tokens'], attributes={'strategy_options': {'us-east': '1'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}), 'solr_admin': CassandraKeyspace(name=solr_admin, column_families=[], tables=[u'solr_resources'], attributes={'strategy_options': {}, 'replica_placement_strategy': 'org.apache.cassandra.locator.EverywhereStrategy'}), 'mykeyspace1': CassandraKeyspace(name=mykeyspace1, column_families=[], tables=[u'mysolr1', u'videos', u'lyrics', u'song'], attributes={'strategy_options': {'us-west-2': '3', 'us-east': '2'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}), 'system': CassandraKeyspace(name=system, column_families=['IndexInfo', 'NodeIdInfo', 'schema_keyspaces', 'hints'], tables=[u'peers', u'range_xfers', u'schema_keyspaces', u'schema_columns', u'IndexInfo', u'schema_triggers', u'sstable_activity', u'peer_events', u'paxos', u'batchlog', u'NodeIdInfo', u'compaction_history', u'compactions_in_progress', u'schema_columnfamilies', u'local', u'hints'], attributes={'strategy_options': {}, 'replica_placement_strategy': 'org.apache.cassandra.locator.LocalStrategy'}), 'cfs_archive': CassandraKeyspace(name=cfs_archive, column_families=['rules', 'sblocks', 'cleanup', 'inode'], tables=[u'rules', u'sblocks', u'cleanup', u'inode'], attributes={'strategy_options': {'us-east': '1'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}), 'OpsCenter': CassandraKeyspace(name=OpsCenter, column_families=['events_timeline', 'settings', 'rollups60', 'rollups86400', 'pdps', 'rollups7200', 'events', 'rollups300'], tables=[u'events_timeline', u'settings', u'rollups60', u'rollups86400', u'pdps', u'rollups7200', u'events', u'rollups300'], attributes={'strategy_options': {'replication_factor': '2'}, 'replica_placement_strategy': 
'org.apache.cassandra.locator.SimpleStrategy'}), 'system_traces': CassandraKeyspace(name=system_traces, column_families=[], tables=[u'events', u'sessions'], attributes={'strategy_options': {'replication_factor': '2'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.SimpleStrategy'}), 'HiveMetaStore': CassandraKeyspace(name=HiveMetaStore, column_families=['MetaStore'], tables=[u'MetaStore'], attributes={'strategy_options': {'replication_factor': '1'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.SimpleStrategy'}), 'cfs': CassandraKeyspace(name=cfs, column_families=['rules', 'sblocks', 'cleanup', 'inode'], tables=[u'rules', u'sblocks', u'cleanup', u'inode'], attributes={'strategy_options': {'us-east': '1'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}), 'dse_system': CassandraKeyspace(name=dse_system, column_families=[], tables=[u'job_trackers'], attributes={'strategy_options': {}, 'replica_placement_strategy': 'org.apache.cassandra.locator.EverywhereStrategy'})}
2014-05-14 05:54:06+0000 [] WARN: ProcessingError while calling CreateClusterConfController: Timeout while adding cluster. Please check the log for details on the problem.
2014-05-14 05:54:54+0000 [Connect2me] INFO: Initializing event storage.
2014-05-14 05:54:54+0000 [Connect2me] INFO: SSL agent communication is enabled. Automatic agent detection will be turned off.
2014-05-14 05:54:54+0000 [Connect2me] INFO: Attempting to load all persisted alert rules
2014-05-14 05:54:55+0000 [Connect2me] INFO: Done initializing event storage.
2014-05-14 05:54:55+0000 [Connect2me] INFO: Done loading persisted scheduled job descriptions
2014-05-14 05:54:55+0000 [Connect2me] INFO: Done loading persisted alert rules
2014-05-14 05:54:55+0000 [Connect2me] INFO: OpsCenter starting up.
Here is datastax-agent/agent.log:
INFO [qtp30763405-24] 2014-05-14 05:54:35,911 New JMX connection (127.0.0.1:7199)
INFO [qtp30763405-24] 2014-05-14 05:54:35,921 HTTP: :get /cluster/topology {:node_ip "54.214.1.100"} - 200
INFO [qtp30763405-21] 2014-05-14 05:54:35,934 New JMX connection (127.0.0.1:7199)
INFO [qtp30763405-21] 2014-05-14 05:54:35,945 HTTP: :get /cluster/topology {:node_ip "54.214.1.110"} - 200
INFO [qtp30763405-19] 2014-05-14 05:54:35,952 New JMX connection (127.0.0.1:7199)
INFO [qtp30763405-22] 2014-05-14 05:54:35,957 New JMX connection (127.0.0.1:7199)
INFO [qtp30763405-19] 2014-05-14 05:54:35,960 HTTP: :get /cluster/topology {:node_ip "54.243.203.229"} - 200
INFO [qtp30763405-22] 2014-05-14 05:54:35,972 HTTP: :get /cluster/topology {:node_ip "54.243.201.237"} - 200
I could not see any errors in either log, and I think the nodes are connecting, but I'm still getting the timeout error:
2014-05-14 05:54:06+0000 [] WARN: ProcessingError while calling CreateClusterConfController: Timeout while adding cluster. Please check the log for details on the problem.
Can anyone help me with this?
Upvotes: 2
Views: 1132
Reputation: 470
There are several things that may cause this timeout, some of which are bugs in Cassandra and some of which can be optimized on the OpsCenter side. You may be able to work around this issue by creating a cluster config file manually in /etc/opscenter/clusters/ and restarting opscenterd. For example, write the following to mycluster.conf:
[cassandra]
seed_hosts = 1.2.3.4, 2.3.4.5
It may still take ~1 minute for things to properly work for that cluster, but this will bypass the timeout check.
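If you would rather script that step, here is a minimal Python sketch that writes the same [cassandra] section. The helper name write_cluster_conf is made up for illustration, and the sketch assumes opscenterd only needs the seed_hosts entry shown above; restarting opscenterd is still a separate manual step.

```python
import configparser
from pathlib import Path


def write_cluster_conf(conf_dir, cluster_name, seed_hosts):
    """Write <conf_dir>/<cluster_name>.conf containing a [cassandra] seed_hosts entry.

    Hypothetical helper: it only reproduces the two-line config from the
    answer above and does not restart opscenterd.
    """
    parser = configparser.ConfigParser()
    parser["cassandra"] = {"seed_hosts": ", ".join(seed_hosts)}
    path = Path(conf_dir) / (cluster_name + ".conf")
    with path.open("w") as fh:
        parser.write(fh)
    return path


# Example call (needs write access to the directory, typically as root):
# write_cluster_conf("/etc/opscenter/clusters", "mycluster", ["1.2.3.4", "2.3.4.5"])
```

The file name (minus .conf) is what the example above uses as the cluster's config name, so pick it accordingly.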
Upvotes: 4
Reputation: 386
Since OpsCenter is separate from the cluster nodes, the most likely cause is a firewall (security group) issue.
Take a look at the ports listed here http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/sec/secConfFirePort.html and make sure you can telnet from opscenterd to the cluster nodes and back again for the relevant ports.
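If telnet is not handy, the same reachability check can be sketched in Python with plain TCP connects. The function names port_open and check_ports are illustrative, and the port numbers in the usage comment (9160 for Thrift, 61621 for the agent, 61620 for opscenterd) are the ones that appear in the question; adjust them to your own setup. A successful connect only shows the port is reachable, not that the service behind it is healthy.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


def check_ports(hosts, ports, timeout=3.0):
    """Map every (host, port) pair to True (reachable) or False."""
    return {(h, p): port_open(h, p, timeout) for h in hosts for p in ports}


# Example, run on the opscenterd machine against the cluster nodes:
# for (host, port), ok in sorted(check_ports(["54.214.1.100"], [9160, 61621]).items()):
#     print(host, port, "open" if ok else "CLOSED")
# Then run it on each node against the opscenterd machine with port 61620.
```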
The 4.1.0 issue you mention would log an error with a stack trace containing "ERROR: Error upacking definitions file".
Upvotes: 0