Commit

update
eikek committed Sep 26, 2024
1 parent 052516e commit f973ae2
Showing 11 changed files with 375 additions and 118 deletions.
17 changes: 2 additions & 15 deletions .github/workflows/ci.yml
@@ -16,28 +16,15 @@ jobs:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
java: [ 17 ]
steps:
- uses: actions/[email protected]
with:
fetch-depth: 0
- uses: olafurpg/setup-scala@v14
with:
java-version: ${{ matrix.java }}
# - name: Coursier cache
# uses: coursier/cache-action@v6
- uses: cachix/install-nix-action@v27
- name: sbt ci ${{ github.ref }}
env:
README_BASE_REF: origin/${{ github.base_ref }}
run: sbt -mem 2048 ci
- name: Log in to Docker Hub
uses: docker/login-action@v2
with:
username: ${{ secrets.RENKU_DOCKER_USERNAME }}
password: ${{ secrets.RENKU_DOCKER_PASSWORD }}
- name: sbt docker:publishLocal
run: sbt -mem 2048 search-provision/Docker/publishLocal search-api/Docker/publishLocal
run: nix develop .#ci --command sbt ci
ci:
runs-on: ubuntu-latest
needs: [ci-matrix]
2 changes: 0 additions & 2 deletions .github/workflows/release.yml
@@ -1,7 +1,5 @@
name: Release
on:
push:
branches: [ main ]
release:
types: [ published ]

91 changes: 45 additions & 46 deletions README.md
@@ -1,47 +1,47 @@
<!-- -*- fill-column: 80 -*- -->
# Renku Search


This provides the renku search services for efficiently searching
across entities in the Renku platform.
This provides the renku search services for efficiently searching across
entities in the Renku platform.

The engine backing the search functionality is
[SOLR](https://solr.apache.org) (Lucene). The index is created from
data pulled from a Redis stream.
The engine backing the search functionality is [SOLR](https://solr.apache.org)
(Lucene). The index is created from data pulled from a Redis stream.

There are two services provided: [search-api](#search-api) and
[search-provision](#search-provision).

There is a detailed [development documentation](/developing.md) for getting
started.

## Search Provision

Responsible for maintaining the index. This service pulls elements
out of the Redis stream, transforms them into SOLR documents and then
updates the index. It also creates the SOLR schema and provides
endpoints to trigger re-indexing.
Responsible for maintaining the index. This service pulls elements out of the
Redis stream, transforms them into SOLR documents and then updates the index. It
also creates the SOLR schema and provides endpoints to trigger re-indexing.

This service is internal only and can be used by other services to
publish data that should be searchable and discoverable.
This service is internal only and can be used by other services to publish data
that should be searchable and discoverable.

The data in the index is received as a Redis message pulled from a
Redis stream.
The data in the index is received as a Redis message pulled from a Redis stream.

### Messages

Messages in the stream must conform to the definitions in
[renku-schema](https://github.com/SwissDataScienceCenter/renku-schema)
and are sent as binary [Avro](https://avro.apache.org/) messages.
[renku-schema](https://github.com/SwissDataScienceCenter/renku-schema) and are
sent as binary [Avro](https://avro.apache.org/) messages.

A redis message is expected to contain two keys:

- `headers`
- `payload`

Where the header denotes properties that control how a payload is
processed. The important properties are `type`, `dataContentType` and
`schemaVersion`.
Where the header denotes properties that control how a payload is processed. The
important properties are `type`, `dataContentType` and `schemaVersion`.

**type** specifies what kind of payload is transported and how it can
be decoded. There are currently these message types, each denoting a
specific payload structure:
**type** specifies what kind of payload is transported and how it can be
decoded. There are currently these message types, each denoting a specific
payload structure:

- `project.created`
- `project.updated`
@@ -61,17 +61,16 @@ specific payload structure:
- `reprovisioning.started`
- `reprovisioning.finished`

If a header contains a value different from that list, the message
cannot be processed.
If a header contains a value different from that list, the message cannot be
processed.

**dataContentType** specifies the transport encoding, where avro
supports
**dataContentType** specifies the transport encoding, where avro supports

- `application/avro+binary`
- `application/avro+json`

**schemaVersion** specifies which version of `renku-schema` messages
is sent. Search supports
**schemaVersion** specifies which version of `renku-schema` messages is sent.
Search supports

- `V1`
- `V2`
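
The header rules described above can be sketched as a small validation step. This is a minimal Python sketch and not the service's own implementation: `check_headers` and the constant names are hypothetical, and the sets hold only the values visible in this diff (the full lists are partly elided here).

```python
# Values copied from the README text shown in this diff. The message type
# list (and possibly the version list) is truncated by the diff hunks, so
# these sets are known to be incomplete.
KNOWN_TYPES = {
    "project.created",
    "project.updated",
    "reprovisioning.started",
    "reprovisioning.finished",
}
CONTENT_TYPES = {"application/avro+binary", "application/avro+json"}
SCHEMA_VERSIONS = {"V1", "V2"}


def check_headers(headers: dict) -> list:
    """Return a list of problems; an empty list means the headers look processable."""
    problems = []
    if headers.get("type") not in KNOWN_TYPES:
        problems.append(f"unsupported type: {headers.get('type')!r}")
    if headers.get("dataContentType") not in CONTENT_TYPES:
        problems.append(f"unsupported dataContentType: {headers.get('dataContentType')!r}")
    if headers.get("schemaVersion") not in SCHEMA_VERSIONS:
        problems.append(f"unsupported schemaVersion: {headers.get('schemaVersion')!r}")
    return problems
```

Collecting all problems instead of failing on the first one makes it easier to log why a message was skipped.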
@@ -86,11 +85,11 @@ There are few endpoints exposed for internal use only.
- `/ping`
- `/version`

Doing a re-index works by dropping the SOLR index completely and then
re-reading the Redis stream. The `reindex` endpoint requires a POST
request with a JSON payload. It can optionally specify a redis
message-id from where to start reading. If it is omitted, it will
start from the last known message that initiated the index.
Doing a re-index works by dropping the SOLR index completely and then re-reading
the Redis stream. The `reindex` endpoint requires a POST request with a JSON
payload. It can optionally specify a redis message-id from where to start
reading. If it is omitted, it will start from the last known message that
initiated the index.
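
As a rough illustration of such a request, the following Python sketch only builds the POST request and does not send it. The URL path `/reindex` and the JSON field name `messageId` are assumptions made for this sketch; the actual path and payload shape are not visible in this diff and should be taken from the README's own example.

```python
import json
import urllib.request


def build_reindex_request(base_url, message_id=None):
    """Build (but do not send) a POST request for the reindex endpoint.

    ASSUMPTIONS: the path "/reindex" and the field name "messageId" are
    hypothetical placeholders, not confirmed by the source.
    """
    # An empty JSON object stands for "start from the last known message".
    body = {} if message_id is None else {"messageId": message_id}
    return urllib.request.Request(
        url=f"{base_url}/reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; `base_url` would point at the internal provisioning service, for example `http://localhost:8081` with the port listed in the configuration section.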

Example:

@@ -117,16 +116,16 @@ Content-Type: application/json

### Configuration

The service is configured via environment variables. Each variable is
prefixed with `RS_` (for "renku search").
The service is configured via environment variables. Each variable is prefixed
with `RS_` (for "renku search").

```
RS_CLIENT_ID=search-provisioner
RS_HTTP_SERVER_BIND_ADDRESS=0.0.0.0
RS_HTTP_SERVER_PORT=8081
RS_HTTP_SHUTDOWN_TIMEOUT=30s
RS_LOG_LEVEL=2
RS_METRICS_UPDATE_INTERVAL=15 seconds
RS_PROVISION_HTTP_SERVER_BIND_ADDRESS=0.0.0.0
RS_PROVISION_HTTP_SERVER_PORT=8081
RS_REDIS_CONNECTION_REFRESH_INTERVAL=30 minutes
RS_REDIS_DB=
RS_REDIS_HOST=localhost
@@ -162,18 +161,18 @@ RS_SOLR_USER=admin
## Search Api

Provides http endpoints for searching the index. There is a [query
dsl](/docs/query-manual.md) for more convenient searching for renku
entities. Additionally, there is an openapi documentation generated
and a version endpoint.
dsl](/docs/query-manual.md) for more convenient searching for renku entities.
Additionally, there is an openapi documentation generated and a version
endpoint.

```
GET /api/search/query?q=<query-string>
GET /api/search/version
GET /api/search/spec.json
```
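
Because the query string travels in the `q` parameter, it has to be URL-encoded. A minimal Python sketch of building the query URL, where `search_url` is a hypothetical helper and the base URL is an assumption based on the port shown in the configuration section:

```python
from urllib.parse import urlencode


def search_url(base_url, query):
    """Return the query endpoint URL with `query` encoded into the `q` parameter."""
    return f"{base_url}/api/search/query?{urlencode({'q': query})}"


# Assumed local setup; adjust host and port to the actual deployment.
print(search_url("http://localhost:8080", "data science"))
# prints http://localhost:8080/api/search/query?q=data+science
```

Using `urlencode` rather than string concatenation keeps characters such as `:` or `&` in a query from breaking the URL.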

Here is an example of the result structure. For more details, the
openapi doc should be consulted.
Here is an example of the result structure. For more details, the openapi doc
should be consulted.

``` json
{
@@ -204,7 +203,7 @@ openapi doc should be consulted.
"lastName": "Einstein",
"score": 2.1
},
"creationDate": "2024-09-26T09: 22: 00.611173804Z",
"creationDate": "2024-09-26T13: 40: 59.727541431Z",
"keywords": [
"data",
"science"
@@ -248,18 +247,18 @@ openapi doc should be consulted.

### Configuration

The service is configured via environment variables. Each variable is
prefixed with `RS_` (for "renku search").
The service is configured via environment variables. Each variable is prefixed
with `RS_` (for "renku search").

```
RS_HTTP_SERVER_BIND_ADDRESS=0.0.0.0
RS_HTTP_SERVER_PORT=8080
RS_HTTP_SHUTDOWN_TIMEOUT=30s
RS_JWT_ALLOWED_ISSUER_URL_PATTERNS=
RS_JWT_ENABLE_SIGNATURE_CHECK=true
RS_JWT_KEYCLOAK_REQUEST_DELAY=1 minute
RS_JWT_OPENID_CONFIG_PATH=.well-known/openid-configuration
RS_LOG_LEVEL=2
RS_SEARCH_HTTP_SERVER_BIND_ADDRESS=0.0.0.0
RS_SEARCH_HTTP_SERVER_PORT=8080
RS_SOLR_CORE=search-core-test
RS_SOLR_LOG_MESSAGE_BODIES=false
RS_SOLR_PASS=
2 changes: 1 addition & 1 deletion build.sbt
@@ -30,7 +30,7 @@ releaseTagName := (ThisBuild / version).value

addCommandAlias(
"ci",
"readme/readmeCheckModification; lint; compile; Test/compile; dbTests; readme/readmeUpdate; publishLocal"
"readme/readmeCheckModification; lint; compile; Test/compile; dbTests; readme/readmeUpdate; publishLocal; search-provision/Docker/publishLocal; search-api/Docker/publishLocal"
)
addCommandAlias(
"lint",