advait

Reputation: 6525

Datastax Enterprise: shark/spark not working on new analytics node

Background

I just added an Analytics node to my multi-datacenter cluster. I'm running DSE 4.5.1. Here's my topology:

$ dsetool ring liminex_ent
Address          DC           Rack         Workload         Status  State    Load             Effective-Ownership  VNodes                                      
172.31.22.79     Solr         rack1        Search           Up      Normal   1.31 GB          75.00%               1                                           
172.31.42.106    Solr         rack1        Search           Up      Normal   1.11 GB          58.33%               1                                           
172.31.11.202    Solr         rack1        Search           Up      Normal   1.16 GB          66.67%               1                                           
172.31.45.40     Analytics    2a           Unknown          Up      Normal   391.15 MB        100.00%              1                                           
172.31.41.76     us-west-2    2a           Unknown          Up      Normal   2.05 GB          100.00%              255                                         
172.31.50.106    us-west-2    2b           Unknown          Up      Normal   1.29 GB          0.36%                255 
172.31.8.174     us-west-2    2c           Unknown          Up      Normal   2.23 GB          99.64%               255

My liminex_ent keyspace has the following replication:

'class': 'NetworkTopologyStrategy',
'us-west-2': '2',
'Solr': '2',
'Analytics': '1'

OpsCenter recognizes the Analytics node (strangely, dsetool ring doesn't report its workload as Analytics):

[OpsCenter screenshot]

The node's /etc/default/dse has HADOOP_ENABLED=1 and SPARK_ENABLED=1

The problem

Running dse shark or dse spark on the node just hangs indefinitely. Moreover, system.log repeatedly logs the following:

INFO [main] 2014-08-22 22:13:34,580 PluginManager.java (line 223) Activating plugin: com.datastax.bdp.plugin.ExternalProcessAuthPlugin
INFO [main] 2014-08-22 22:13:34,582 PluginManager.java (line 232) No enough available nodes to start plugin com.datastax.bdp.plugin.ExternalProcessAuthPlugin. Trying once again...

I don't have enough context about DSE to understand what's going on. There seem to be a couple instances of this problem floating around, but no solutions.

I'd really appreciate some help with this. DSE has been great so far - I would love to get shark working!

Upvotes: 2

Views: 412

Answers (1)

c360ian

Reputation: 1303

TL;DR Just do this

I ended up in the same situation when bringing up a new Analytics DC with 2 nodes.

The log message above indicates that Cassandra is configured with password authentication (PasswordAuthenticator), so DSE/Spark is looking for a way to create a parallel user to keep things secure.

When a new DC is added, the replication strategy is generally changed to NetworkTopologyStrategy (NTS) only for the desired keyspaces, usually the user-owned ones. The security-related data, however, is kept in dse_security and system_auth, which also need to use NTS with a replication factor (RF=N) set for the new DC. Until the new DC holds these keyspaces, Spark will not come up. The startup sequence keeps looking for them every 5 seconds and prints this cryptic INFO message.

Solution (replace dc1 and dc2 with your own datacenter names and replication factors):

ALTER KEYSPACE "system_auth" 
    WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc1' : 3, 'dc2' : 2};

ALTER KEYSPACE "dse_security"
    WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc1' : 3, 'dc2' : 2};
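
For the topology in the question, the same statements might look like the sketch below. The datacenter names come from the dsetool ring output in the question; the replication factors (3, 3 and 1) are assumptions chosen to match the number of nodes in each DC, so adjust them as needed.

```cql
-- Sketch only: DC names are from the question's ring output; RF values are assumptions.
ALTER KEYSPACE "system_auth"
    WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'us-west-2' : 3, 'Solr' : 3, 'Analytics' : 1};

ALTER KEYSPACE "dse_security"
    WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'us-west-2' : 3, 'Solr' : 3, 'Analytics' : 1};
```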

No restart of the Analytics/Spark nodes is required. Just run:

$ nodetool repair system_auth
$ nodetool repair dse_security
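
To confirm that the new replication settings are in place, the keyspace definitions can be inspected from cqlsh. This is a sketch that assumes the Cassandra 2.0 schema tables shipped with DSE 4.5; later Cassandra versions keep this information in a different system keyspace.

```cql
-- Shows the replication strategy and per-DC options for both security keyspaces.
SELECT keyspace_name, strategy_class, strategy_options
FROM system.schema_keyspaces
WHERE keyspace_name IN ('system_auth', 'dse_security');
```

Each row should list NetworkTopologyStrategy, with the new DC present in strategy_options.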

HTH.

Upvotes: 1
