marqh

Reputation: 276

Make jena and fuseki proxy aware for federated queries

Our application is built on top of Jena and Fuseki. The application uses federated SPARQL queries accessing SPARQL endpoints in the public domain.

Some of the networks the application is intended for only allow HTTP requests via an HTTP proxy, as part of their network policy.

How can Fuseki be configured so that, when it makes HTTP requests as part of a SERVICE block in a SPARQL sub-query, it uses the correct HTTP proxy?

On Linux, I have tried using a local environment variable

export http_proxy=http://myproxy.notadomain

in the shell that runs the fuseki-server process, but Fuseki does not appear to respect this environment variable.

I cannot find information in the Fuseki documentation about how this is handled.

I would like a way to run fuseki-server directly as a Linux process with the proxy configured, either in a configuration file or as a runtime parameter.

All advice gratefully received.

Upvotes: 1

Views: 785

Answers (2)

Rob Hall

Reputation: 2823

Proxy Configuration

You can use HttpOp to access or change the HttpClient that Jena uses, and then assign a client that has been configured for a proxy. Note that the version of HttpClient used by Jena is not the most up-to-date, so if you are following tutorials (such as the one I linked) you'll need to adjust slightly in order to create a client.

jena-arq-2.12.0 used by fuseki-1.1.0 depends on httpclient-4.2.6. The following code will configure ARQ to use a proxy:

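import org.apache.http.HttpHost;
import org.apache.http.conn.params.ConnRoutePNames;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.jena.riot.web.HttpOp;
final HttpHost proxy = new HttpHost("someproxy", 8080);
final DefaultHttpClient httpclient = new DefaultHttpClient();
httpclient.getParams().setParameter(ConnRoutePNames.DEFAULT_PROXY, proxy);
HttpOp.setDefaultHttpClient(httpclient);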

Fuseki Configuration

I can't seem to find a documented method for configuring this in Fuseki, so my own personal hack would be:

  • Create a class whose static initializer block sets all of the proxy configuration.
package my.fully.qualified;
public class ConfigurationClass {
   static {
      // Proxy config code
      final HttpHost proxy = new HttpHost("someproxy",8080);
      final DefaultHttpClient httpclient = new DefaultHttpClient();
      httpclient.getParams().setParameter(ConnRoutePNames.DEFAULT_PROXY,proxy);
      HttpOp.setDefaultHttpClient(httpclient);
   }
}
  • Put a jar containing that class on the classpath when executing Fuseki. This will require tweaking the fuseki script to add the jar to the classpath.
$ java -classpath '*' org.apache.jena.fuseki.FusekiCmd
  • Edit the Fuseki configuration to contain a triple of the form [] ja:loadClass "my.fully.qualified.ConfigurationClass" . This will cause Fuseki to load the class and run its static initializer, which will then change the default HttpClient used by Jena/ARQ internally. This is the same technique used by Jena internally to initialize TDB with [] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
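If you would rather not hard-code "someproxy" and 8080 in the static block, one option is to read them from the http_proxy variable the question already exports. The following is only a sketch of that idea: the class name ProxyFromEnv, the HostPort holder and the fallback port are hypothetical names I made up for illustration, and it uses only java.net.URI from the JDK. It expects a full URL with a scheme (for example http://myproxy.notadomain:3128) and rejects values without one.

```java
import java.net.URI;

// Hypothetical helper: turns a proxy URL such as the value of http_proxy
// into a host and port that could be passed to new HttpHost(host, port).
public class ProxyFromEnv {

    /** Simple holder for the parsed host and port. */
    public static final class HostPort {
        public final String host;
        public final int port;

        HostPort(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    /** Parses a proxy URL; uses defaultPort when the URL carries no port. */
    public static HostPort parse(String proxyUrl, int defaultPort) {
        URI uri = URI.create(proxyUrl.trim());
        if (uri.getHost() == null) {
            // Values without a scheme (e.g. "myproxy:8080") end up here.
            throw new IllegalArgumentException("No host in proxy URL: " + proxyUrl);
        }
        int port = uri.getPort() == -1 ? defaultPort : uri.getPort();
        return new HostPort(uri.getHost(), port);
    }

    public static void main(String[] args) {
        String env = System.getenv("http_proxy");
        if (env == null || env.isEmpty()) {
            System.out.println("http_proxy is not set");
            return;
        }
        HostPort hp = parse(env, 8080); // 8080 is an assumed fallback port
        System.out.println(hp.host + ":" + hp.port);
    }
}
```

Inside the static block, the parsed values would take the place of the literals, i.e. new HttpHost(hp.host, hp.port).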

Upvotes: 1

enridaga

Reputation: 641

Fuseki is a Java application. What I usually do is export a JAVA_OPTIONS variable with all my customizations, including the proxy host and port, for example:

export JAVA_OPTIONS="-Xmx10g -Dhttp.proxyHost=proxy.example.org -Dhttp.proxyPort=8080 -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:./log4j.properties"
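To check that the -D flags actually reach the JVM, a small standalone class can print the two properties and ask the JDK's default proxy selector which proxy it would pick. This is a rough sketch: the class name ProxyPropertyCheck and the URL http://example.org/sparql are placeholders of mine, not anything Fuseki provides. It only reports what the JDK's own java.net stack sees; whether a particular HttpClient version consults these properties is a separate question that this check does not answer.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

// Hypothetical diagnostic: run it with the same -D flags as Fuseki.
public class ProxyPropertyCheck {

    /** Returns the proxies the JVM's default selector would use for the URL. */
    public static List<Proxy> proxiesFor(String url) {
        return ProxySelector.getDefault().select(URI.create(url));
    }

    public static void main(String[] args) {
        // These are null when the flags were not passed to this JVM.
        System.out.println("http.proxyHost = " + System.getProperty("http.proxyHost"));
        System.out.println("http.proxyPort = " + System.getProperty("http.proxyPort"));

        // Placeholder endpoint; any http:// URL outside http.nonProxyHosts works.
        for (Proxy p : proxiesFor("http://example.org/sparql")) {
            System.out.println("selected: " + p);
        }
    }
}
```

Running it as java -Dhttp.proxyHost=proxy.example.org -Dhttp.proxyPort=8080 ProxyPropertyCheck should list an HTTP proxy; without the flags the selector reports a direct connection.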

Upvotes: 2
