SPARQL queries not working as expected in our Wikidata Docker WDQS (Wikidata Query Service).
We are running wikibase docker on an AWS EC2. First I will describe the three queries that are not working, and then provide details about our setup. We suspect that a setting in the docker-compose.yml file (at the end of this post) is not correct.
Query 1 - no results selected. The query is:
# return items whose favorite city (P8) is Chicago (Q7)
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P8 wd:Q7 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
The query returns “No Matching Records Found” even though we have entered an item, Q1, whose favorite city (P8) is Chicago (Q7). Note that the mouse-over in the WDQS SPARQL UI does show that P8 is favorite city and Q7 is Chicago, but the mouse-over data comes from Elasticsearch and not from the WDQS service.
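One thing that may be worth ruling out (an assumption, not something the setup below confirms) is that the triples were loaded under a different base URI than the one the wd: and wdt: prefixes expand to in the query UI. In a Wikibase RDF export these prefixes normally expand to <base>/entity/ and <base>/prop/direct/. The sketch below uses a hypothetical public host, wikibase.example.edu, together with the internal alias wikibase.svc, to illustrate that a pattern built from one base cannot match a triple stored under the other.

```python
# Hypothetical base URIs for illustration only; substitute the real ones.
PUBLIC_BASE = "https://wikibase.example.edu"
INTERNAL_BASE = "http://wikibase.svc"


def expand(prefixed: str, concept_base: str) -> str:
    """Expand a wd:/wdt: prefixed name to a full IRI under concept_base."""
    namespaces = {
        "wd": f"{concept_base}/entity/",
        "wdt": f"{concept_base}/prop/direct/",
    }
    prefix, local = prefixed.split(":", 1)
    return namespaces[prefix] + local


def pattern_matches(triple, pattern) -> bool:
    """True if every non-None slot of the pattern equals the triple's slot."""
    return all(p is None or p == t for t, p in zip(triple, pattern))


# Assumed: the store holds the statement under the internal base.
stored = tuple(expand(n, INTERNAL_BASE) for n in ("wd:Q1", "wdt:P8", "wd:Q7"))

# The UI query "?item wdt:P8 wd:Q7", expanded under the public base.
query_pattern = (None, expand("wdt:P8", PUBLIC_BASE), expand("wd:Q7", PUBLIC_BASE))

print(pattern_matches(stored, query_pattern))  # prints False
```

If the stored IRIs and the expanded prefixes share the same base, the same pattern matches, so comparing the two is a cheap first check.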
Query 2 - a simpler query that returns results, but the returned itemLabel is the Q number rather than the label text. The returned link seems correct and does point to the correct item. However, if I use a Q number that is not in our wikibase (like Q9999999999), the query still returns a link (of course the link does not work, because that Q number does not exist).
The query is:
SELECT ?item ?itemLabel
WHERE {
  VALUES ?item { wd:Q1 }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Query 3 - a simple query that appears to work, except that the links to our wikibase use the host wikibase.svc instead of a valid URL for our wikibase. Here is the query:
SELECT * WHERE { ?a ?b ?c}
In the results, I think that ‘wikibase.svc’ should instead be the URL of our wikibase server.
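To see at a glance which hosts the stored IRIs actually use, the results of the query above can be tallied by hostname. This is a minimal sketch that assumes the results were downloaded in the standard SPARQL JSON results format; the sample rows are made up for illustration and are not output from our server.

```python
from collections import Counter
from urllib.parse import urlparse


def hosts_in_bindings(results: dict) -> Counter:
    """Count the hostnames of all URI-typed values in a SPARQL JSON result."""
    hosts = Counter()
    for row in results["results"]["bindings"]:
        for cell in row.values():
            if cell.get("type") == "uri":
                host = urlparse(cell["value"]).hostname
                if host:
                    hosts[host] += 1
    return hosts


# Made-up sample in the SPARQL 1.1 JSON results layout (not real output).
sample = {
    "head": {"vars": ["a", "b", "c"]},
    "results": {"bindings": [
        {
            "a": {"type": "uri", "value": "http://wikibase.svc/entity/Q1"},
            "b": {"type": "uri", "value": "http://wikibase.svc/prop/direct/P8"},
            "c": {"type": "uri", "value": "http://wikibase.svc/entity/Q7"},
        },
        {
            "a": {"type": "uri", "value": "http://wikibase.svc/entity/Q1"},
            "b": {"type": "uri", "value": "http://www.w3.org/2000/01/rdf-schema#label"},
            "c": {"type": "literal", "value": "example item"},
        },
    ]},
}

for host, count in hosts_in_bindings(sample).most_common():
    print(host, count)
```

A tally dominated by wikibase.svc rather than the public hostname would suggest that the data was written with the internal host as its base URI.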
Our setup: we are running wikibase docker on an AWS EC2 (CentOS 8) behind an AWS ALB (Application Load Balancer, which also manages SSL).
We use xxxx.xxxx.xxxx.xxxx.edu/sparql/ to reach port 8282 on the EC2 and therefore the wdqs-frontend container (the routing to the port is managed by the ALB).
xxxx.xxxx.xxxx.xxxx.edu is routed by the ALB to port 8181 on the EC2 and therefore to the wikibase container.
For the wdqs-frontend container we have updated /etc/nginx/nginx.conf to include /etc/nginx/conf.d/sparqlpath.conf instead of default.conf.
sparqlpath.conf is shown below.
# This file, sparqlpath.conf, replaces the file (default.conf) provided by the wikibase/wdqs-frontend docker image.
# Modifications:
# Joe Troy 01/15/2021 add location /sparql and sparql_upstream because users will use [servername]/sparql to connect to the sparql frontend
# also note that this change required changes to the AWS Application Load Balancer (ALB)
upstream sparql_upstream {
    server 127.0.0.1:80;
}
server {
    listen 80;
    server_name localhost;
    location /sparql {
        # the trailing slash is key as not to look for a slash sub directory
        #include /etc/nginx/mime.types;
        proxy_pass http://sparql_upstream/;
    }
    location /proxy/wikibase {
        rewrite /proxy/wikibase/(.*) /$1 break;
        proxy_pass http://wikibase.svc:80;
    }
    location /proxy/wdqs {
        rewrite /proxy/wdqs/(.*) /$1 break;
        proxy_pass http://wdqs-proxy.svc:80;
    }
    location / {
        root /usr/share/nginx/html;
        index index.html index.htm;
    }
    error_page 500 502 503 504 /50x.html;
    location = /50x.html {
        root /usr/share/nginx/html;
    }
}
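For reference, the effect of the trailing slash in the /sparql block can be modeled as follows. This is only a sketch of nginx's documented rule for a prefix location whose proxy_pass includes a URI (the part of the request URI that matches the location is replaced by that URI). The function name is hypothetical, and how the upstream then treats a doubled slash is not modeled.

```python
def proxied_uri(request_uri: str, location_prefix: str, proxy_pass_uri: str) -> str:
    """Model nginx's URI replacement for a prefix location with a proxy_pass URI.

    The part of the request URI matching the location prefix is replaced
    by the URI part of the proxy_pass directive.
    """
    if not request_uri.startswith(location_prefix):
        raise ValueError("request does not match this location")
    return proxy_pass_uri + request_uri[len(location_prefix):]


# location /sparql { proxy_pass http://sparql_upstream/; }
for uri in ("/sparql", "/sparql/", "/sparql/index.html"):
    print(uri, "->", proxied_uri(uri, "/sparql", "/"))
```

Under this rule, /sparql is passed upstream as / while /sparql/index.html is passed as //index.html, so requests under the prefix reach the upstream with a doubled leading slash.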
And finally, below is our docker-compose.yml file:
# Wikibase with Query Service
#
# This docker-compose example can be used to pull the images from docker hub.
#
# Examples:
#
# Access Wikibase via "http://localhost:8181"
# (or "http://$(docker-machine ip):8181" if using docker-machine)
#
# Access Query Service via "http://localhost:8282"
# (or "http://$(docker-machine ip):8282" if using docker-machine)
version: '3'
services:
  wikibase:
    #image: wikibase/wikibase:1.34-bundle
    #
    #NOTES regarding the authorities.library.illinois.edu implementation
    #The wikibase image is modified so it has settings relevant to authorities.library.illinois.edu
    #if a new version of the wikibase docker image is used
    #changes to the /LocalSettings.php.template should be examined
    #any changes might need to be reflected in ./wikibase/LocalSettings.php.template in this repository
    #
    #Modified to use a Dockerfile
    build:
      context: .
      dockerfile: ./wikibase/Dockerfile
    links:
      - mysql
    ports:
      # CONFIG - Change the 8181 here to expose Wikibase & MediaWiki on a different port
      - "8181:80"
    volumes:
      - mediawiki-images-data:/var/www/html/images
      - quickstatements-data:/quickstatements/data
      - /etc/localtime:/etc/localtime:ro
    depends_on:
      - mysql
      - elasticsearch
    restart: unless-stopped
    networks:
      default:
        aliases:
          - wikibase.svc
          - xxxx.xxxx.xxxx.xxxx.edu
          # CONFIG - Add (added directly above) your real wikibase hostname here, for example wikibase-registry.wmflabs.org
    environment:
      - DB_SERVER=mysql.svc:3306
      - MW_ELASTIC_HOST=elasticsearch.svc
      - MW_ELASTIC_PORT=9200
      # CONFIG - Change the default values below
      - MW_ADMIN_NAME=WikibaseAdmin
      - MW_ADMIN_PASS=${ENV_VAR_WikibaseDockerAdminPass}
      - [email protected]
      - MW_WG_SECRET_KEY=secretkey
      # CONFIG - Change the default values below (should match mysql values in this file)
      - DB_USER=wikiuser
      - DB_PASS=${ENV_VAR_sqlpass}
      - DB_NAME=my_wiki
      - QS_PUBLIC_SCHEME_HOST_AND_PORT=http://localhost:9191
      - SMTP_PASS=${ENV_VAR_sparkpostpass}
  mysql:
    image: mariadb:10.3
    restart: unless-stopped
    volumes:
      - mediawiki-mysql-data:/var/lib/mysql
      - /etc/localtime:/etc/localtime:ro
    environment:
      MYSQL_RANDOM_ROOT_PASSWORD: 'yes'
      # CONFIG - Change the default values below (should match values passed to wikibase)
      MYSQL_DATABASE: 'my_wiki'
      MYSQL_USER: 'wikiuser'
      MYSQL_PASSWORD: '${ENV_VAR_sqlpass}'
    networks:
      default:
        aliases:
          - mysql.svc
  wdqs-frontend:
    #image: wikibase/wdqs-frontend:latest
    build:
      context: .
      dockerfile: ./wdqs-frontend/Dockerfile
    restart: unless-stopped
    ports:
      # CONFIG - Change the 8282 here to expose the Query Service UI on a different port
      - "8282:80"
    depends_on:
      - wdqs-proxy
    networks:
      default:
        aliases:
          - wdqs-frontend.svc
    environment:
      - WIKIBASE_HOST=xxxx.xxxx.xxxx.xxxx.edu
      - WDQS_HOST=wdqs-proxy.svc
    volumes:
      - /etc/localtime:/etc/localtime:ro
  wdqs:
    image: wikibase/wdqs:0.3.10
    restart: unless-stopped
    volumes:
      - query-service-data:/wdqs/data
      - /etc/localtime:/etc/localtime:ro
    tmpfs: /tmp
    command: /runBlazegraph.sh
    networks:
      default:
        aliases:
          - wdqs.svc
    environment:
      - WIKIBASE_HOST=xxxx.xxxx.xxxx.xxxx.edu
      - WIKIBASE_SCHEME=https
      - WDQS_HOST=wdqs.svc
      - WDQS_PORT=9999
    expose:
      - 9999
  wdqs-proxy:
    image: wikibase/wdqs-proxy
    restart: unless-stopped
    environment:
      - PROXY_PASS_HOST=wdqs.svc:9999
    ports:
      - "8989:80"
    depends_on:
      - wdqs
    volumes:
      - /etc/localtime:/etc/localtime:ro
    networks:
      default:
        aliases:
          - wdqs-proxy.svc
  wdqs-updater:
    image: wikibase/wdqs:0.3.10
    restart: unless-stopped
    command: /runUpdate.sh
    depends_on:
      - wdqs
      - wikibase
    networks:
      default:
        aliases:
          - wdqs-updater.svc
    environment:
      - WIKIBASE_HOST=wikibase.svc
      - WIKIBASE_SCHEME=http
      - WDQS_HOST=wdqs.svc
      - WDQS_PORT=9999
    volumes:
      - /etc/localtime:/etc/localtime:ro
  elasticsearch:
    image: wikibase/elasticsearch:6.5.4-extra
    restart: unless-stopped
    networks:
      default:
        aliases:
          - elasticsearch.svc
    environment:
      discovery.type: single-node
      ES_JAVA_OPTS: "-Xms512m -Xmx512m"
    volumes:
      - /etc/localtime:/etc/localtime:ro
  # CONFIG, in order to not load quickstatements then remove this entire section
  quickstatements:
    image: wikibase/quickstatements:latest
    ports:
      - "9191:80"
    depends_on:
      - wikibase
    volumes:
      - quickstatements-data:/quickstatements/data
      - /etc/localtime:/etc/localtime:ro
    networks:
      default:
        aliases:
          - quickstatements.svc
    environment:
      - QS_PUBLIC_SCHEME_HOST_AND_PORT=http://localhost:9191
      - WB_PUBLIC_SCHEME_HOST_AND_PORT=http://localhost:8181
      - WIKIBASE_SCHEME_AND_HOST=http://wikibase.svc
      - WB_PROPERTY_NAMESPACE=122
      - "WB_PROPERTY_PREFIX=Property:"
      - WB_ITEM_NAMESPACE=120
      - "WB_ITEM_PREFIX=Item:"
volumes:
  mediawiki-mysql-data:
  mediawiki-images-data:
  query-service-data:
  quickstatements-data:
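Since the suspicion is a docker-compose.yml setting, one quick way to compare the query-service containers is to group them by the value each one receives for a given variable. The sketch below copies the relevant environment entries from the file above into a plain dict by hand (to avoid a YAML dependency); the helper name is hypothetical. Whether the difference it reports is the cause of the three problems is an assumption that still needs checking.

```python
# Environment values transcribed by hand from the compose file above.
compose_env = {
    "wdqs-frontend": {
        "WIKIBASE_HOST": "xxxx.xxxx.xxxx.xxxx.edu",
        "WDQS_HOST": "wdqs-proxy.svc",
    },
    "wdqs": {
        "WIKIBASE_HOST": "xxxx.xxxx.xxxx.xxxx.edu",
        "WIKIBASE_SCHEME": "https",
        "WDQS_HOST": "wdqs.svc",
    },
    "wdqs-updater": {
        "WIKIBASE_HOST": "wikibase.svc",
        "WIKIBASE_SCHEME": "http",
        "WDQS_HOST": "wdqs.svc",
    },
}


def group_by_value(env_by_service: dict, key: str) -> dict:
    """Map each distinct value of `key` to the services that use it."""
    groups = {}
    for service, env in env_by_service.items():
        if key in env:
            groups.setdefault(env[key], []).append(service)
    return groups


for key in ("WIKIBASE_HOST", "WIKIBASE_SCHEME"):
    print(key, group_by_value(compose_env, key))
```

With the values above, wdqs-frontend and wdqs are grouped under the public hostname while wdqs-updater is alone under wikibase.svc, and the scheme likewise differs between wdqs (https) and wdqs-updater (http).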