Reputation: 61
I'm trying to read data from a local Elasticsearch instance and I get the "Cannot detect ES version ... 'es.nodes.wan.only'" error, yet when I enable TRACE logging the application is clearly able to connect to Elasticsearch.
I submit the application to local Spark using elasticsearch-spark_2.11-2.4.5.jar to connect to Elasticsearch 6.2.4. With TRACE logging enabled, the output is:
```
20/05/07 10:15:47 TRACE HeaderElement: enter HeaderElement.getParameterByName(String)
20/05/07 10:15:47 TRACE CommonsHttpTransport: Rx @[192.168.50.34] [200-OK] [{
  "name" : "node-1",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "39igfUt5S4S3JYomBTZmqw",
  "version" : {
    "number" : "6.2.4",
    "build_hash" : "ccec39f",
    "build_date" : "2018-04-12T20:37:28.497551Z",
    "build_snapshot" : false,
    "lucene_version" : "7.2.1",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
]
20/05/07 10:15:47 DEBUG HttpMethodBase: re-creating response stream from byte array
20/05/07 10:15:47 DEBUG HttpMethodBase: re-creating response stream from byte array
20/05/07 10:15:47 DEBUG DataSource: Discovered Elasticsearch version [6.2.4]
20/05/07 10:15:47 TRACE CommonsHttpTransport: Closing HTTP transport to 192.168.50.34:9200
20/05/07 10:15:47 TRACE HttpConnection: enter HttpConnection.close()
20/05/07 10:15:47 TRACE HttpConnection: enter HttpConnection.closeSockedAndStreams()

Exception in thread "main" org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
    at org.elasticsearch.hadoop.rest.InitializationUtils.discoverEsVersion(InitializationUtils.java:196)
    at org.elasticsearch.spark.sql.SchemaUtils$.discoverMappingAsField(SchemaUtils.scala:76)
    at org.elasticsearch.spark.sql.SchemaUtils$.discoverMapping(SchemaUtils.scala:69)
    at org.elasticsearch.spark.sql.ElasticsearchRelation.lazySchema$lzycompute(DefaultSource.scala:112)
    at org.elasticsearch.spark.sql.ElasticsearchRelation.lazySchema(DefaultSource.scala:112)
    at org.elasticsearch.spark.sql.ElasticsearchRelation$$anonfun$schema$1.apply(DefaultSource.scala:116)
    at org.elasticsearch.spark.sql.ElasticsearchRelation$$anonfun$schema$1.apply(DefaultSource.scala:116)
    at scala.Option.getOrElse(Option.scala:121)
    at org.elasticsearch.spark.sql.ElasticsearchRelation.schema(DefaultSource.scala:116)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:403)
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
    at Elastic2Json$.main(Elastic2Json.scala:25)
```
This is the code that reads from an Elasticsearch index into a DataFrame:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("ElasticRead").getOrCreate()

val reader = spark.read.format("org.elasticsearch.spark.sql")
  .option("es.read.metadata", "false")
  .option("es.nodes.wan.only", "true")
  .option("es.port", "9200")
  .option("es.net.ssl", "false")
  .option("es.nodes", "localhost")
  .option("es.resource", "myindex/document")
  .option("es.http.retries", "3")

println("...test 1")
val df = reader.load("myindex").limit(10)
println("...test 2 Schema")
df.printSchema()
df.show()
```
Thanks
Upvotes: 3
Views: 10729
Reputation: 2206
For me, there were two things that needed to be fixed; one of them was missing `cluster:monitor/*` permissions.
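If security is enabled on the cluster, the version check the connector performs on startup queries the cluster info endpoint, so the user it authenticates as needs cluster-monitoring privileges. A minimal sketch of passing credentials from Spark, assuming a hypothetical `spark_reader` user that has been granted `cluster:monitor/*` (user name and password are placeholders):

```scala
// Sketch only: "spark_reader" / "changeme" are placeholder credentials for a
// user granted cluster:monitor/* privileges via Elasticsearch security.
val dfSecured = spark.read.format("org.elasticsearch.spark.sql")
  .option("es.nodes", "localhost")
  .option("es.port", "9200")
  .option("es.net.http.auth.user", "spark_reader")
  .option("es.net.http.auth.pass", "changeme")
  .load("myindex/document")
```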
Upvotes: 0
Reputation: 61
I solved it by changing the elasticsearch-spark library to the one that matches the specific Elasticsearch version I'm using.
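The elasticsearch-spark_2.11-2.4.5.jar from the question is from the ES-Hadoop 2.x line, which targets Elasticsearch 2.x rather than 6.2.4. As a sketch, assuming sbt and the `elasticsearch-spark-20` artifact (the line built for Spark 2.x), the matching dependency would look like this:

```scala
// build.sbt — minimal sketch: pin the connector to the cluster's version.
scalaVersion := "2.11.12"

// Resolves to elasticsearch-spark-20_2.11:6.2.4, matching the ES 6.2.4 cluster.
libraryDependencies += "org.elasticsearch" %% "elasticsearch-spark-20" % "6.2.4"
```

With the connector and cluster versions aligned, the read code from the question works unchanged.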
Upvotes: 3