Daniel Pilch

Reputation: 2247

Cassandra OpsCenter only showing data for one datacenter

I have installed DataStax Community Cassandra across two datacenters (3 nodes per datacenter). I think the initial Cassandra configuration is correct. However, when I look at OpsCenter (monitoring an existing cluster set up via config files), I have performance metrics for all of HETZNER1 but for none of OVH1.

I have checked the datastax-agent logs, but nothing in them stands out as wrong.

Here are a couple of pictures from Opscenter v5:

[Screenshot: Opscenter Storage Capacity]

As you can see, it shows only 3 of 6 nodes (storage should be much greater than 1 TB). All agents are, however, connected.

[Screenshot: Opscenter Data Metrics] Working HETZNER1 node with data :)

[Screenshot: Opscenter Data Metrics] Broken OVH1 node with no data :(

Checking the datastax-agent logs (startup.log) on OVH1 NODE1 shows:

 INFO [main] 2014-07-25 10:36:09,475 Loading conf files: /var/lib/datastax-agent/conf/address.yaml
 INFO [main] 2014-07-25 10:36:09,512 Java vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.7.0_60
 INFO [main] 2014-07-25 10:36:09,512 DataStax Agent version: 5.0.0
 INFO [main] 2014-07-25 10:36:09,552 Default config values: {:rollups300_ttl 2419200, :settings_cf "settings", :restore_req_update_period 60, :my_channel_prefix "/agent", :poll_period 60, :kerberos_hostname nil, :storage_dc nil, :thrift_conn_timeout 10000, :thrift_max_frame_size 15728640, :rollups60_ttl 604800, :stomp_port 61620, :shorttime_interval 10, :longtime_interval 300, :private-conf-props ["initial_token" "listen_address" "broadcast_address" "rpc_address"], :thrift_port 9160, :async_retry_timeout 5, :agent-conf-group "global-cluster-agent-group", :jmx_host "127.0.0.1", :ec2_metadata_api_host "OMITTED", :metrics_enabled 1, :async_queue_size 5000, :disk_usage_update_period 60, :autodiscovery_interval 120, :rollups7200_ttl 31536000, :autodiscovery_enabled true, :thrift_ssl_truststore nil, :rollup_snapshot_period 300, :is_package true, :monitor_command "/usr/share/datastax-agent/bin/datastax_agent_monitor", :thrift_socket_timeout 5000, :cassandra_log_location "/var/log/cassandra/system.log", :config_md5 nil, :jmx_port 7199, :jmx_metrics_threadpool_size 4, :use_ssl 0, :rollups86400_ttl -1, :nodedetails_threadpool_size 3, :api_port 61621, :kerberos_service nil, :kerberos_client_principal nil, :jmx_thread_pool_size 5, :production 1, :runs_sudo 1, :stomp_interface "OMITTED", :storage_keyspace "OpsCenter", :rollup_snapshot_threshold 300, :thrift_ssl_truststore_type "JKS", :realtime_interval 5}
 INFO [main] 2014-07-25 10:36:09,553 Waiting for the config from OpsCenter
 INFO [main] 2014-07-25 10:36:09,554 Attempting to determine Cassandra's broadcast address through JMX
 INFO [main] 2014-07-25 10:36:09,554 Starting Stomp
 INFO [Initialization] 2014-07-25 10:36:09,556 New JMX connection (127.0.0.1:7199)
 INFO [main] 2014-07-25 10:36:09,557 SSL communication is disabled
 INFO [main] 2014-07-25 10:36:09,557 Creating stomp connection to OMITTED:61620
 INFO [StompConnection receiver] 2014-07-25 10:36:09,562 Reconnecting in 0s.
 INFO [StompConnection receiver] 2014-07-25 10:36:09,566 Connected to OMITTED:61620
 INFO [main] 2014-07-25 10:36:09,672 Starting Jetty server: {:port 61621, :host nil, :ssl? false, :join? false}
 INFO [StompConnection receiver] 2014-07-25 10:36:09,687 Got new config from OpsCenter: {:kerberos_use_keytab true, :rollups300_ttl 2419200, :kerberos_use_ticket_cache true, :rollups60_ttl 604800, :thrift_port 9160, :ec2_metadata_api_host "OMITTED", :metrics_enabled 1, :rollups7200_ttl 31536000, :thrift_ssl_truststore nil, :metrics_ignored_column_families "", :cassandra_log_location "/var/log/cassandra/system.log", :thrift_rpc_interface "OMITTED", :config_md5 "abfe7ce1d2750e030dada2ffb4551777", :jmx_port 7199, :provisioning 0, :use_ssl 0, :kerberos_debug false, :rollups86400_ttl -1, :api_port "61621", :storage_keyspace "OpsCenter", :kerberos_renew_tgt true, :metrics_ignored_solr_cores "", :thrift_ssl_truststore_type "JKS", :metrics_ignored_keyspaces "system, system_traces, system_auth, dse_auth, OpsCenter", :rollup_subscriptions [], :cassandra_install_location ""}
 INFO [StompConnection receiver] 2014-07-25 10:36:09,688 New JMX connection (127.0.0.1:7199)
 INFO [Jetty] 2014-07-25 10:36:09,698 Jetty server started   

agent.log:

df: `/var/named/chroot/etc/named.rfc1912.zones': Permission denied
df: `/var/named/chroot/etc/rndc.key': Permission denied
df: `/var/named/chroot/usr/lib64/bind': Permission denied
df: `/var/named/chroot/etc/named.iscdlv.key': Permission denied
df: `/var/named/chroot/etc/named.root.key': Permission denied
Filesystem     Type     1G-blocks  Used Available Use% Mounted on
rootfs         rootfs         884     2       838   1% /
/dev/root      ext3           884     2       838   1% /
devtmpfs       devtmpfs        63     1        63   1% /dev
tmpfs          tmpfs           63     0        63   0% /dev/shm

I don't think permission for the chroot partitions is really necessary...?

installer.log:

2014-07-24 19:57:38 +0200
2014-07-24 19:57:38 +0200  Installed:
2014-07-24 19:57:38 +0200  datastax-agent.noarch 0:5.0.0-1
2014-07-24 19:57:38 +0200
2014-07-24 19:57:38 +0200  Complete!
2014-07-24 19:57:38 +0200  Installing certificates from opscenterd...
2014-07-24 19:57:38 +0200  cp: cannot stat `ssl/agentKeyStore': No such file or directory
2014-07-24 19:57:38 +0200  Setting up agent node state...
2014-07-24 19:57:38 +0200  Starting new agent...
2014-07-24 19:57:38 +0200  Starting DataStax Agent datastax-agent
2014-07-24 19:57:38 +0200  Starting datastax-agent         [  OK  ]
2014-07-24 19:57:38 +0200  log4j:WARN No appenders could be found for logger (org.eclipse.jetty.util.log).
2014-07-24 19:57:38 +0200  log4j:WARN Please initialize the log4j system properly.
2014-07-24 19:57:38 +0200  log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
2014-07-24 19:57:42 +0200  Agent installation complete.

I really can't seem to figure this one out. Can anyone help?

Thanks

Upvotes: 1

Views: 1276

Answers (2)

I resolved this by fixing stomp_interface on each node.

Point each agent at OpsCenter by editing address.yaml, setting stomp_interface to the OpsCenter IP, and restarting the agent:

nano /var/lib/datastax-agent/conf/address.yaml

stomp_interface: OPSCENTER-INTERNAL-IP

service datastax-agent restart
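
To check what a node currently has configured before restarting its agent, something like the following Python sketch may help. It is only an illustration: the helper name read_stomp_interface is made up here, it handles only simple "key: value" lines rather than full YAML, and the path is the one shown in the startup log in the question, so it may differ on other installs.

```python
def read_stomp_interface(yaml_text):
    """Return the stomp_interface value from address.yaml-style text, or None.

    Hypothetical helper: handles plain "key: value" lines only, not full YAML.
    """
    for raw_line in yaml_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        if key.strip() == "stomp_interface":
            value = value.strip().strip("'\"")  # tolerate quoted values
            return value or None
    return None


if __name__ == "__main__":
    # Path taken from the agent startup log above; adjust if yours differs.
    path = "/var/lib/datastax-agent/conf/address.yaml"
    try:
        with open(path) as handle:
            print("stomp_interface:", read_stomp_interface(handle.read()))
    except OSError as exc:
        print("could not read", path, ":", exc)
```

If this prints None on a node, the agent on that node has no explicit stomp_interface set in address.yaml.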

Upvotes: 0

Daniel Pilch

Reputation: 2247

OK, so I found the culprit: the datastax-agent runs as a non-sudo user. Because the df commands were failing, data was not being sent to OpsCenter.

Changing the user in /etc/init.d/datastax-agent to a privileged user fixed the issue!
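
For anyone wanting to confirm the same symptom on their own nodes, here is a rough Python sketch that scans agent.log text for the df "Permission denied" lines quoted in the question. The function name find_df_permission_errors and the regular expression are my own illustration, based only on the log format shown above; other df versions may quote paths differently.

```python
import re

# Matches lines such as:
#   df: `/var/named/chroot/etc/rndc.key': Permission denied
# The pattern is an assumption based on the log excerpt in the question.
DF_DENIED = re.compile(r"^df: [`'‘](?P<path>.+?)['’]: Permission denied$")


def find_df_permission_errors(log_text):
    """Return the paths df could not read, in order of appearance."""
    paths = []
    for line in log_text.splitlines():
        match = DF_DENIED.match(line.strip())
        if match:
            paths.append(match.group("path"))
    return paths


if __name__ == "__main__":
    sample = (
        "df: `/var/named/chroot/etc/rndc.key': Permission denied\n"
        "Filesystem     Type     1G-blocks  Used Available Use% Mounted on\n"
        "rootfs         rootfs         884     2       838   1% /\n"
    )
    for path in find_df_permission_errors(sample):
        print("df could not read:", path)
```

An empty result after the user change suggests df is no longer failing for the agent.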

Upvotes: 4
