WIP: starting with basic docs
ATTENTION :-) this changes environment variables! The vars for queue
names were not looking like env variables
eikek committed Sep 26, 2024
1 parent aa73086 commit ee55940
Showing 42 changed files with 1,345 additions and 163 deletions.
21 changes: 5 additions & 16 deletions .github/workflows/ci.yml
@@ -16,26 +16,15 @@ jobs:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
java: [ 17 ]
steps:
- uses: actions/[email protected]
with:
fetch-depth: 100
- uses: olafurpg/setup-scala@v14
with:
java-version: ${{ matrix.java }}
# - name: Coursier cache
# uses: coursier/cache-action@v6
fetch-depth: 0
- uses: cachix/install-nix-action@v27
- name: sbt ci ${{ github.ref }}
run: sbt -mem 2048 ci
- name: Log in to Docker Hub
uses: docker/login-action@v2
with:
username: ${{ secrets.RENKU_DOCKER_USERNAME }}
password: ${{ secrets.RENKU_DOCKER_PASSWORD }}
- name: sbt docker:publishLocal
run: sbt -mem 2048 search-provision/Docker/publishLocal search-api/Docker/publishLocal
env:
README_BASE_REF: origin/${{ github.base_ref }}
run: nix develop .#ci --command sbt ci
ci:
runs-on: ubuntu-latest
needs: [ci-matrix]
2 changes: 0 additions & 2 deletions .github/workflows/release.yml
@@ -1,7 +1,5 @@
name: Release
on:
push:
branches: [ main ]
release:
types: [ published ]

267 changes: 267 additions & 0 deletions README.md
@@ -0,0 +1,267 @@
<!-- -*- fill-column: 80 -*- -->
# Renku Search


This repository provides the Renku search services for efficiently searching
across entities in the Renku platform.

The engine backing the search functionality is [SOLR](https://solr.apache.org)
(Lucene). The index is created from data pulled from a Redis stream.

There are two services provided: [search-api](#search-api) and
[search-provision](#search-provision).

See the detailed [development documentation](development.md) for getting
started.

## Search Provision

Responsible for maintaining the index. This service pulls elements out of the
Redis stream, transforms them into SOLR documents and then updates the index.
It also creates the SOLR schema and provides endpoints to trigger re-indexing.

This service is internal only and can be used by other services to publish data
that should be searchable and discoverable.

The data in the index is received as a Redis message pulled from a Redis stream.

### Messages

Messages in the stream must conform to the definitions in
[renku-schema](https://github.com/SwissDataScienceCenter/renku-schema) and are
sent as binary [Avro](https://avro.apache.org/) messages.

A Redis message is expected to contain two keys:

- `headers`
- `payload`

The headers denote properties that control how the payload is processed. The
important properties are `type`, `dataContentType` and `schemaVersion`.

**type** specifies what kind of payload is transported and how it can be
decoded. There are currently these message types, each denoting a specific
payload structure:

- `project.created`
- `project.updated`
- `project.removed`
- `projectAuth.added`
- `projectAuth.updated`
- `projectAuth.removed`
- `user.added`
- `user.updated`
- `user.removed`
- `group.added`
- `group.updated`
- `group.removed`
- `memberGroup.added`
- `memberGroup.updated`
- `memberGroup.removed`
- `reprovisioning.started`
- `reprovisioning.finished`

If the `type` header contains a value that is not in this list, the message
cannot be processed.

**dataContentType** specifies the transport encoding. Avro supports:

- `application/avro+binary`
- `application/avro+json`

**schemaVersion** specifies which version of the `renku-schema` messages is
sent. Search supports:

- `V1`
- `V2`

The `V` can be omitted in the payload.
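
The header rules above can be sketched as a small check. This is only an illustration in Python, not the service's actual implementation: it assumes the headers have already been decoded into a plain dictionary, and the function names `normalize_schema_version` and `check_headers` are hypothetical.

```python
# Hypothetical sketch: validates already-decoded message headers against the
# values listed in this section. It is not the search-provision code.
KNOWN_TYPES = {
    "project.created", "project.updated", "project.removed",
    "projectAuth.added", "projectAuth.updated", "projectAuth.removed",
    "user.added", "user.updated", "user.removed",
    "group.added", "group.updated", "group.removed",
    "memberGroup.added", "memberGroup.updated", "memberGroup.removed",
    "reprovisioning.started", "reprovisioning.finished",
}
CONTENT_TYPES = {"application/avro+binary", "application/avro+json"}
SCHEMA_VERSIONS = {"V1", "V2"}


def normalize_schema_version(raw: str) -> str:
    """Return the "V"-prefixed form, since the "V" may be omitted."""
    value = raw.strip()
    return value if value.startswith("V") else "V" + value


def check_headers(headers: dict) -> list:
    """Return a list of problems; an empty list means the headers look processable."""
    problems = []
    if headers.get("type") not in KNOWN_TYPES:
        problems.append(f"unknown type: {headers.get('type')!r}")
    if headers.get("dataContentType") not in CONTENT_TYPES:
        problems.append(f"unsupported dataContentType: {headers.get('dataContentType')!r}")
    version = headers.get("schemaVersion")
    if version is None or normalize_schema_version(str(version)) not in SCHEMA_VERSIONS:
        problems.append(f"unsupported schemaVersion: {version!r}")
    return problems


example = {
    "type": "project.created",
    "dataContentType": "application/avro+binary",
    "schemaVersion": "2",  # "V" omitted
}
print(check_headers(example))
```

Returning a list of problems instead of raising on the first one makes it easier to log everything that is wrong with a rejected message.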

### Endpoints

A few endpoints are exposed for internal use only.

- `/reindex`
- `/ping`
- `/version`

A re-index works by dropping the SOLR index completely and then re-reading the
Redis stream. The `reindex` endpoint requires a POST request with a JSON
payload. The payload can optionally specify a Redis message id from which to
start reading. If it is omitted, reading starts from the last known message
that initiated the index.

Examples:

Re-index from the last known start:
```
POST /reindex
Content-Type: application/json
{
}
```

Re-index by specifying a message id to start from:
```
POST /reindex
Content-Type: application/json
{
"messageId": "22154-0"
}
```
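
As a rough client-side illustration, the two requests above could be built in Python as follows. The base URL `http://localhost:8081` is an assumption taken from the default port in the configuration listing, and `build_reindex_request` is a hypothetical helper, not part of the service.

```python
import json
import urllib.request
from typing import Optional


def build_reindex_request(base_url: str, message_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /reindex request.

    Without a message id the JSON payload is an empty object.
    """
    body = {} if message_id is None else {"messageId": message_id}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed base URL; adjust to wherever search-provision is reachable.
request = build_reindex_request("http://localhost:8081", "22154-0")
print(request.get_method(), request.full_url, request.data.decode("utf-8"))
# urllib.request.urlopen(request) would send it; omitted to avoid side effects.
```

Keeping the request construction separate from sending it makes the payload easy to inspect before a re-index is triggered, which matters because the index is dropped first.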


### Configuration

The service is configured via environment variables. Each variable is prefixed
with `RS_` (for "renku search").

```
RS_CLIENT_ID=search-provisioner
RS_HTTP_SHUTDOWN_TIMEOUT=30s
RS_LOG_LEVEL=2
RS_METRICS_UPDATE_INTERVAL=15 seconds
RS_PROVISION_HTTP_SERVER_BIND_ADDRESS=0.0.0.0
RS_PROVISION_HTTP_SERVER_PORT=8081
RS_REDIS_CONNECTION_REFRESH_INTERVAL=30 minutes
RS_REDIS_DB=
RS_REDIS_HOST=localhost
RS_REDIS_MASTER_SET=
RS_REDIS_PASSWORD=
RS_REDIS_PORT=6379
RS_REDIS_QUEUE_DATASERVICE_ALLEVENTS=
RS_REDIS_QUEUE_GROUPMEMBER_ADDED=
RS_REDIS_QUEUE_GROUPMEMBER_REMOVED=
RS_REDIS_QUEUE_GROUPMEMBER_UPDATED=
RS_REDIS_QUEUE_GROUP_ADDED=
RS_REDIS_QUEUE_GROUP_REMOVED=
RS_REDIS_QUEUE_GROUP_UPDATED=
RS_REDIS_QUEUE_PROJECTAUTH_ADDED=
RS_REDIS_QUEUE_PROJECTAUTH_REMOVED=
RS_REDIS_QUEUE_PROJECTAUTH_UPDATED=
RS_REDIS_QUEUE_PROJECT_CREATED=
RS_REDIS_QUEUE_PROJECT_REMOVED=
RS_REDIS_QUEUE_PROJECT_UPDATED=
RS_REDIS_QUEUE_USER_ADDED=
RS_REDIS_QUEUE_USER_REMOVED=
RS_REDIS_QUEUE_USER_UPDATED=
RS_REDIS_SENTINEL=
RS_RETRY_ON_ERROR_DELAY=10 seconds
RS_SOLR_CORE=search-core-test
RS_SOLR_LOG_MESSAGE_BODIES=false
RS_SOLR_PASS=
RS_SOLR_URL=http://localhost:8983
RS_SOLR_USER=admin
```
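
For tooling around the service, the variables above can be inspected with a few lines of Python. This is a sketch under assumptions: the helper names are hypothetical, the example queue name is made up, and treating an empty `RS_REDIS_QUEUE_*` value as "not configured" is a choice of this example rather than documented behaviour.

```python
from typing import Dict, Mapping

PREFIX = "RS_"
QUEUE_PREFIX = "RS_REDIS_QUEUE_"


def rs_settings(environ: Mapping[str, str]) -> Dict[str, str]:
    """Collect every RS_-prefixed variable from the given environment."""
    return {name: value for name, value in environ.items() if name.startswith(PREFIX)}


def configured_queues(settings: Mapping[str, str]) -> Dict[str, str]:
    """Map the event part of each RS_REDIS_QUEUE_* variable to its non-empty value."""
    return {
        name[len(QUEUE_PREFIX):]: value
        for name, value in settings.items()
        if name.startswith(QUEUE_PREFIX) and value
    }


example_env = {
    "RS_REDIS_HOST": "localhost",
    "RS_REDIS_QUEUE_PROJECT_CREATED": "project.created",  # hypothetical queue name
    "RS_REDIS_QUEUE_USER_ADDED": "",
    "HOME": "/root",
}
settings = rs_settings(example_env)
print(configured_queues(settings))
```

Passing `os.environ` instead of `example_env` applies the same filtering to the real process environment.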


## Search Api

Provides HTTP endpoints for searching the index. There is a [query
DSL](/docs/query-manual.md) for more convenient searching of Renku entities.
Additionally, generated OpenAPI documentation and a version endpoint are
available.

```
GET /api/search/query?q=<query-string>
GET /api/search/version
GET /api/search/spec.json
```
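
Because the query string travels in the `q` parameter, it needs URL encoding. A minimal Python sketch, assuming the service is reachable at `http://localhost:8080` (the default port from the configuration listing) and using the hypothetical helper `search_url`:

```python
from urllib.parse import urlencode


def search_url(base_url: str, query: str) -> str:
    """Build the query endpoint URL with a properly encoded `q` parameter."""
    return f"{base_url.rstrip('/')}/api/search/query?{urlencode({'q': query})}"


# Plain search words are used here; see the query manual for the full syntax.
url = search_url("http://localhost:8080", "renku data science")
print(url)
```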

Here is an example of the result structure. For more details, consult the
OpenAPI documentation.

``` json
{
  "items": [
    {
      "type": "Project",
      "id": "01HRA7AZ2Q234CDQWGA052F8MK",
      "name": "renku",
      "slug": "renku",
      "namespace": {
        "type": "Group",
        "id": "2CAF4C73F50D4514A041C9EDDB025A36",
        "name": "SDSC",
        "namespace": "SDSC",
        "description": "SDSC group",
        "score": 1.1
      },
      "repositories": [
        "https://github.com/renku"
      ],
      "visibility": "public",
      "description": "Renku project",
      "createdBy": {
        "type": "User",
        "id": "1CAF4C73F50D4514A041C9EDDB025A36",
        "namespace": "renku/renku",
        "firstName": "Albert",
        "lastName": "Einstein",
        "score": 2.1
      },
      "creationDate": "2024-09-26T13:42:57.213115111Z",
      "keywords": [
        "data",
        "science"
      ],
      "score": 1.0
    },
    {
      "type": "User",
      "id": "1CAF4C73F50D4514A041C9EDDB025A36",
      "namespace": "renku/renku",
      "firstName": "Albert",
      "lastName": "Einstein",
      "score": 2.1
    },
    {
      "type": "Group",
      "id": "2CAF4C73F50D4514A041C9EDDB025A36",
      "name": "SDSC",
      "namespace": "SDSC",
      "description": "SDSC group",
      "score": 1.1
    }
  ],
  "facets": {
    "entityType": {
      "Project": 1,
      "Group": 1,
      "User": 1
    }
  },
  "pagingInfo": {
    "page": {
      "limit": 25,
      "offset": 0
    },
    "totalResult": 3,
    "totalPages": 1
  }
}
```
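
A client might read such a response as sketched below. The sample is a trimmed copy of the structure above, and `summarize` is a hypothetical helper; field names beyond those shown in the example should be checked against the OpenAPI documentation.

```python
import json
from collections import Counter

# Trimmed copy of the example response above.
SAMPLE = """
{
  "items": [
    {"type": "Project", "id": "01HRA7AZ2Q234CDQWGA052F8MK", "name": "renku", "score": 1.0},
    {"type": "User", "id": "1CAF4C73F50D4514A041C9EDDB025A36", "firstName": "Albert", "score": 2.1},
    {"type": "Group", "id": "2CAF4C73F50D4514A041C9EDDB025A36", "name": "SDSC", "score": 1.1}
  ],
  "facets": {"entityType": {"Project": 1, "Group": 1, "User": 1}},
  "pagingInfo": {"page": {"limit": 25, "offset": 0}, "totalResult": 3, "totalPages": 1}
}
"""


def summarize(result: dict) -> dict:
    """Count the items on this page by type and pull out the paging totals."""
    items = result.get("items", [])
    paging = result["pagingInfo"]
    return {
        "types_on_page": dict(Counter(item["type"] for item in items)),
        "total_result": paging["totalResult"],
        "total_pages": paging["totalPages"],
        "limit": paging["page"]["limit"],
    }


summary = summarize(json.loads(SAMPLE))
print(summary)
```

Note that `types_on_page` only covers the items of the current page, whereas the `facets` in the response are reported by the service itself.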

### Configuration

The service is configured via environment variables. Each variable is prefixed
with `RS_` (for "renku search").

```
RS_HTTP_SHUTDOWN_TIMEOUT=30s
RS_JWT_ALLOWED_ISSUER_URL_PATTERNS=
RS_JWT_ENABLE_SIGNATURE_CHECK=true
RS_JWT_KEYCLOAK_REQUEST_DELAY=1 minute
RS_JWT_OPENID_CONFIG_PATH=.well-known/openid-configuration
RS_LOG_LEVEL=2
RS_SEARCH_HTTP_SERVER_BIND_ADDRESS=0.0.0.0
RS_SEARCH_HTTP_SERVER_PORT=8080
RS_SOLR_CORE=search-core-test
RS_SOLR_LOG_MESSAGE_BODIES=false
RS_SOLR_PASS=
RS_SOLR_URL=http://localhost:8983
RS_SOLR_USER=admin
```
32 changes: 31 additions & 1 deletion build.sbt
@@ -28,7 +28,10 @@ releaseVersionBump := sbtrelease.Version.Bump.Minor
releaseIgnoreUntrackedFiles := true
releaseTagName := (ThisBuild / version).value

addCommandAlias("ci", "; lint; compile; Test/compile; dbTests; publishLocal")
addCommandAlias(
"ci",
"readme/readmeCheckModification; lint; compile; Test/compile; dbTests; readme/readmeUpdate; publishLocal; search-provision/Docker/publishLocal; search-api/Docker/publishLocal"
)
addCommandAlias(
"lint",
"; scalafmtSbtCheck; scalafmtCheckAll; scalafixAll --check"
@@ -441,6 +444,33 @@ lazy val searchCli = project
configValues % "compile->compile;test->test"
)

lazy val readme = project
.in(file("modules/readme"))
.enablePlugins(ReadmePlugin)
.settings(commonSettings)
.settings(
name := "search-readme",
scalacOptions :=
Seq(
"-feature",
"-deprecation",
"-unchecked",
"-encoding",
"UTF-8",
"-language:higherKinds",
"-Xkind-projector:underscores"
),
readmeAdditionalFiles := Map(
// copy query manual into source tree for easier discovery on github
(searchQueryDocs / Docs / outputDirectory).value / "manual.md" -> "query-manual.md"
)
)
.dependsOn(
renkuRedisClient % "compile->compile;compile->test",
searchProvision,
searchApi
)

lazy val commonSettings = Seq(
organization := "io.renku",
publish / skip := true,