From 31676aaf07499dcfb664d97132539d71a5734755 Mon Sep 17 00:00:00 2001
From: Nick Molcanov <32801560+nck-mlcnv@users.noreply.github.com>
Date: Thu, 5 Sep 2024 11:08:32 +0200
Subject: [PATCH] Update documentation

---
 docs/configuration/queries.md                 | 26 ++++++++++++++++++-
 example-suite.yml                             |  6 ++++-
 .../iguana/cc/query/handler/QueryHandler.java |  6 ++---
 3 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/docs/configuration/queries.md b/docs/configuration/queries.md
index 262f1b98c..1d3bca1de 100644
--- a/docs/configuration/queries.md
+++ b/docs/configuration/queries.md
@@ -16,7 +16,7 @@ The `queries` property is an object that contains the following properties:
 | order | no | `linear` | The order in which the queries are executed. If set to `linear` the queries will be executed in their order inside the file. If `format` is set to `folder`, queries will be sorted by their file name first. | `random` or `linear` |
 | seed | no | `0` | The seed for the random number generator that selects the queries. If multiple workers use the same query handler, their seed will be the sum of the given seed and their worker id. | `12345` |
 | lang | no | `SPARQL` | Not used for anything at the moment. | |
-
+| pattern | no | | If set, queries from `path` will be treated as pattern queries. See [Pattern Queries](#pattern-queries) for more information. | |
 ## Format
 
 ### One-per-line
@@ -93,3 +93,27 @@ tasks:
           lang: "SPARQL"
         # ... additional worker properties
 ```
+
+## Pattern Queries
+The pattern attribute has the following properties:
+- `endpoint` - the endpoint to query
+- `limit` - the maximum number of instances per query pattern
+- `caching` - if set to `true`, query instances will be stored in files
+
+Pattern queries are queries that contain placeholders.
+A query pattern is a SPARQL 1.1 Query, which can have additional variables in the regex form of
+`%%var[0-9]+%%` in the Basic Graph Pattern.
+
+An exemplary pattern:
+`SELECT * WHERE {?s %%var1%% ?o . ?o <http://exa.com> %%var2%%}`
+
+This pattern will then be converted to:
+`SELECT ?var1 ?var2 {?s ?var1 ?o . ?o <http://exa.com> ?var2}`
+
+The SELECT query will then be requested from the given SPARQL endpoint (e.g. DBpedia).
+The solutions for this query are used to instantiate the query pattern.
+The results may look like the following:
+- `SELECT * WHERE {?s <http://prop/1> ?o . ?o <http://exa.com> "123"}`
+- `SELECT * WHERE {?s <http://prop/1> ?o . ?o <http://exa.com> "12"}`
+- `SELECT * WHERE {?s <http://prop/2> ?o . ?o <http://exa.com> "1234"}`
+
diff --git a/example-suite.yml b/example-suite.yml
index 00c50eb5e..873bc73df 100644
--- a/example-suite.yml
+++ b/example-suite.yml
@@ -74,7 +74,11 @@ tasks:
         number: 16
         requestType: post query
         queries:
-          path: "./example/queries.txt"
+          path: "./example/query_pattern.txt"
+          pattern:
+            endpoint: "https://dbpedia.org/sparql"
+            limit: 1000
+            caching: false
         timeout: 180s
         completionTarget:
           duration: 1000s
diff --git a/src/main/java/org/aksw/iguana/cc/query/handler/QueryHandler.java b/src/main/java/org/aksw/iguana/cc/query/handler/QueryHandler.java
index 7eff27c66..dc043b98e 100644
--- a/src/main/java/org/aksw/iguana/cc/query/handler/QueryHandler.java
+++ b/src/main/java/org/aksw/iguana/cc/query/handler/QueryHandler.java
@@ -295,9 +295,9 @@ public Config getConfig() {
      * and will request query solutions from the given sparql endpoint (e.g DBpedia).
      * The solutions will then be instantiated into the query pattern.
      * The result may look like the following:
-     * SELECT * {?s <http://prop/1> ?o . ?o <http://exa.com> "123"}
-     * SELECT * {?s <http://prop/1> ?o . ?o <http://exa.com> "12"}
-     * SELECT * {?s <http://prop/2> ?o . ?o <http://exa.com> "1234"}
+     * SELECT * WHERE {?s <http://prop/1> ?o . ?o <http://exa.com> "123"}
+     * SELECT * WHERE {?s <http://prop/1> ?o . ?o <http://exa.com> "12"}
+     * SELECT * WHERE {?s <http://prop/2> ?o . ?o <http://exa.com> "1234"}
      */
     private static List instantiatePatternQueries(QuerySource querySource, Config.Pattern config) throws IOException {
         final var patternQueries = new InMemQueryList(querySource);
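
Reviewer note: the two steps that the new "Pattern Queries" section and the Javadoc describe (turning a pattern into a SELECT query, then substituting each solution back into the pattern) can be illustrated with plain string handling. The sketch below is only an illustration under stated assumptions, not the code in `QueryHandler.java`: the class `PatternQuerySketch` and its methods `toSelectQuery` and `instantiate` are hypothetical names, the solution values are hard-coded instead of being fetched from an endpoint, and the pattern is treated as text rather than parsed as SPARQL.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of Iguana.
public class PatternQuerySketch {
    // Placeholder syntax taken from the documentation: %%var[0-9]+%%
    private static final Pattern PLACEHOLDER = Pattern.compile("%%(var[0-9]+)%%");

    /** Builds the SELECT query that asks for bindings of all placeholders. */
    public static String toSelectQuery(String pattern) {
        int bodyStart = pattern.indexOf('{');
        if (bodyStart < 0) {
            throw new IllegalArgumentException("pattern has no graph pattern: " + pattern);
        }
        // Collect each placeholder once, in order of first appearance.
        Set<String> vars = new LinkedHashSet<>();
        Matcher m = PLACEHOLDER.matcher(pattern);
        while (m.find()) {
            vars.add("?" + m.group(1));
        }
        // Keep only the graph pattern and turn %%varN%% into ?varN.
        String body = PLACEHOLDER.matcher(pattern.substring(bodyStart)).replaceAll("?$1");
        return "SELECT " + String.join(" ", vars) + " " + body;
    }

    /** Replaces every placeholder with its value from one query solution. */
    public static String instantiate(String pattern, Map<String, String> solution) {
        Matcher m = PLACEHOLDER.matcher(pattern);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            String value = solution.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("no binding for " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(sb, Matcher.quoteReplacement(value));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String pattern = "SELECT * WHERE {?s %%var1%% ?o . ?o <http://exa.com> %%var2%%}";
        System.out.println(toSelectQuery(pattern));
        // Assumed solution row; a real run would take it from the endpoint's response.
        Map<String, String> solution = Map.of("var1", "<http://prop/1>", "var2", "\"123\"");
        System.out.println(instantiate(pattern, solution));
    }
}
```

With the exemplary pattern from the docs, `toSelectQuery` yields the converted query shown there, and `instantiate` yields the first of the listed result queries. A real implementation would also need to apply the configured `limit` and serialize RDF terms correctly.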