Shastick

Reputation: 1258

Vespa Tutorial – Pig Failing to connect to local Vespa endpoint: URISyntaxException

While following Vespa's blog recommendation tutorial, I ran into an issue connecting to the local Vespa endpoint when calling Pig from the command line with ENDPOINT=$(hostname):8080:

ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. Pig script failed to parse: 
<file tutorial_feed_content_and_tensor_vespa.pig, line 131, column 0> pig script failed to validate:
  java.lang.IllegalArgumentException: 
  java.net.URISyntaxException: Relative path in absolute URI: localhost:8080

This is a bit frustrating for people unfamiliar with Pig following the tutorial step-by-step.

The accepted answer works for getting the correct port set. A "Problem with Handshake flying-otter.local:8080" error still shows up, but it is probably unrelated.

Edited to add, in case it is of any use: the "Problem with Handshake" error seemed to occur when the application was not activated (i.e., I had deployed it but forgot to do the next step).

Upvotes: 3

Views: 305

Answers (2)

Jo Kristian Bergum

Reputation: 3184

Correct usage is

-param ENDPOINT=$(hostname) -D vespa.feed.defaultport=8080 

I see you have gotten around it by rewiring the port, but using -Dvespa.feed.defaultport will be better for production use cases.

https://github.com/vespa-engine/vespa/pull/3576

Upvotes: 2

Shastick

Reputation: 1258

From what I understand, this fails due to Pig checking that the string is a correct URI.

The not so obvious solution (at least at first...) is simply to add http:// in front of the hostname so that it becomes a valid URI: ENDPOINT="http://localhost:8080"
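For illustration, here is a minimal Java sketch of how java.net.URI treats the two endpoint strings. It rests on an assumption that has not been checked against the Pig or Hadoop source: that somewhere along the way the endpoint is split at the colon and handed to the multi-argument URI constructor. The class and method names (EndpointUriCheck, describe, componentError) are made up for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class EndpointUriCheck {

    // Parse the whole string at once and report how java.net.URI interprets it.
    static String describe(String endpoint) {
        try {
            URI uri = new URI(endpoint);
            return "scheme=" + uri.getScheme()
                    + " host=" + uri.getHost()
                    + " port=" + uri.getPort();
        } catch (URISyntaxException e) {
            return "invalid: " + e.getMessage();
        }
    }

    // Build a URI from separate scheme and path components (assumed here to be
    // what happens to the endpoint) and return the error message, or null if
    // the URI is accepted.
    static String componentError(String scheme, String path) {
        try {
            new URI(scheme, null, path, null, null);
            return null;
        } catch (URISyntaxException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Without a scheme, "localhost" itself is read as the scheme.
        System.out.println(describe("localhost:8080"));
        // With http:// in front, host and port are recognised.
        System.out.println(describe("http://localhost:8080"));
        // A scheme plus a path that does not start with "/" is rejected.
        System.out.println(componentError("localhost", "8080"));
    }
}
```

Run as is, this should print scheme=localhost host=null port=-1 for the bare endpoint, scheme=http host=localhost port=8080 for the prefixed one, and "Relative path in absolute URI: localhost:8080" for the component-wise construction, which is the same message as in the error from the question.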

The complete call to Pig thus becomes:

pig -x local -f tutorial_feed_content_and_tensor_vespa.pig \
    -param VESPA_HADOOP_JAR=vespa-hadoop.jar \
    -param DATA_PATH=trainPosts.json \
    -param TEST_INDICES=blog-job/training_and_test_indices/testing_set_ids \
    -param BLOG_POST_FACTORS=blog-job/user_item_cf/product_features \
    -param USER_FACTORS=blog-job/user_item_cf/user_features \
    -param ENDPOINT=http://$(hostname):8080

Edit: well, after running for a (long) while, a new issue shows up. While the URI now passes validation, Pig seems to append the default port anyway:

 com.yahoo.vespa.http.client.core.communication.IOThread cycle
     INFO: Problem with Handshake localhost:8080:4080 ssl=false

At this point I just used socat to forward port 4080 to 8080, to avoid having to restart the Docker VM :/

socat tcp-listen:4080,reuseaddr,fork tcp:localhost:8080

Upvotes: 2
