Lucene Searcher performance #3

Open
TortugaAttack opened this issue Jan 12, 2018 · 1 comment

Comments

@TortugaAttack
Owner

TortugaAttack commented Jan 12, 2018

Currently the searcher is pretty slow.
Rearrange the data structure so that there is no longer any need to search through 1,000,000 triples; instead, create two indexes as follows:

1: for all nodes
HashN -> s_tripleIndex1 s_tripleIndex2 ... o_tripleIndexN

2: for all triples
tripleIndexK -> quadString

where the triple indexes stored for a node are the same ones used as keys in the triples index.
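To make the layout concrete, here is a minimal in-memory sketch of the two indexes. It uses plain Java collections rather than Lucene, and every name in it (`TwoIndexSketch`, `NodePostings`, `add`) is hypothetical. It also assumes that `String.hashCode()` is an acceptable stand-in for the node hash and that the quad string is simply the four terms joined by spaces; neither assumption comes from the actual project.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TwoIndexSketch {
    /** Triple indexes of one node, grouped by the position (s, p, o) it occurs in. */
    public static final class NodePostings {
        public final List<Integer> asSubject = new ArrayList<>();
        public final List<Integer> asPredicate = new ArrayList<>();
        public final List<Integer> asObject = new ArrayList<>();
    }

    // Index 1: node hash -> triple indexes, per position.
    public final Map<Integer, NodePostings> nodeIndex = new HashMap<>();
    // Index 2: triple index -> stored quad string.
    public final Map<Integer, String> tripleIndex = new HashMap<>();

    /** Adds one quad and returns the triple index assigned to it. */
    public int add(String s, String p, String o, String graph) {
        int id = tripleIndex.size(); // sequential ids: 0, 1, 2, ...
        tripleIndex.put(id, s + " " + p + " " + o + " " + graph);
        postings(s).asSubject.add(id);
        postings(p).asPredicate.add(id);
        postings(o).asObject.add(id);
        return id;
    }

    private NodePostings postings(String node) {
        // Assumption: String.hashCode() stands in for the real node hash;
        // hash collisions are ignored in this sketch.
        return nodeIndex.computeIfAbsent(node.hashCode(), h -> new NodePostings());
    }
}
```

The point of the layout is that a node lookup yields triple indexes directly, so the triples index is only ever hit by key.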

Using an IntPoint.newSetQuery, this needs only the following searches.

First step:

  • search for the quad nodes of the BGP (around 1 ms with 2 GB RAM and 30,000,000 nodes)
  • merge the indexes (take only the triple indexes where the searched nodes occur in the required field (s, p, o), and merge all of them)

Second step:

  • search the triples index for all of those indexes with a boolean query, so maxSearch can be set dynamically (around 80 ms with 2 GB RAM and 30,000,000 triples for 100 results)
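The two steps can be sketched roughly as follows, again with plain Java collections standing in for the Lucene indexes. All names (`TwoStepSearchSketch`, `candidates`, `fetch`) are hypothetical. The sketch assumes that "merge" means intersecting the per-position index lists of a single triple pattern, which is one possible reading of the description; a BGP with several patterns would need this repeated per pattern. The map lookup in `fetch` stands in for the set query on the triples index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TwoStepSearchSketch {
    // Narrows the current candidate set by one position's triple indexes.
    private static Set<Integer> restrict(Set<Integer> current, List<Integer> postings) {
        if (postings == null) return current;               // variable position: no restriction
        if (current == null) return new TreeSet<>(postings); // first bound position
        current.retainAll(postings);                         // assumption: "merge" = intersection
        return current;
    }

    /**
     * Step 1: combine the triple indexes found for the bound nodes of one pattern.
     * Pass null for a position that is a variable.
     * Returns an empty set if all three positions are variables.
     */
    public static Set<Integer> candidates(List<Integer> s, List<Integer> p, List<Integer> o) {
        Set<Integer> result = restrict(null, s);
        result = restrict(result, p);
        result = restrict(result, o);
        return result == null ? new TreeSet<>() : result;
    }

    /** Step 2: fetch the quad strings for exactly the candidate triple indexes. */
    public static List<String> fetch(Map<Integer, String> tripleIndex, Set<Integer> ids) {
        int maxSearch = ids.size(); // known up front, so the result cap can be set dynamically
        List<String> quads = new ArrayList<>(maxSearch);
        for (int id : ids) {
            String quad = tripleIndex.get(id);
            if (quad != null) quads.add(quad);
        }
        return quads;
    }
}
```

Because the candidate set is computed before the triples index is touched, the number of hits to request is known in advance, which is what allows maxSearch to shrink for small result sets.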

Because maxSearch is set dynamically, small result sets are returned much faster (7 results take only about an eighth of the time needed for 100 results).
This is still not that good, but it is an amazing improvement over what it was before.

@TortugaAttack
Owner Author

Furthermore, using 10 clusters would reduce the size of the node and triple indexes to about a tenth, which would again lead to a small improvement.

It is very important to reduce maxSearch at all costs!
