fix(genall): fix genall and use dependency injection, nicer AuthError message (#8)

* typer.Context + genall ;/

* update structure ;9

* .gitattributes, add tests, cleanup genall
skyl authored Nov 2, 2024
1 parent 6259461 commit 23c473d
Showing 28 changed files with 544 additions and 30 deletions.
3 changes: 1 addition & 2 deletions .gitattributes
@@ -1,2 +1 @@
-py/gen/* linguist-generated=true
-
+py/packages/corpora_client/* linguist-generated=true
1 change: 0 additions & 1 deletion TODO.md
@@ -9,7 +9,6 @@
 - publish to pypi
 - consider ruff
 - pr-agent only on comments?
-- SaaS versus internal, dogfood
 - add full oauth 3 leg to CLI
 - https://django-oauth-toolkit.readthedocs.io/en/latest/getting_started.html#

141 changes: 141 additions & 0 deletions md/prompts/corpora/about-structure.md
@@ -0,0 +1,141 @@
Corpora is a git repository, a body of software.

https://github.com/skyl/corpora

```txt
.
├── .corpora.yaml
├── .devcontainer
│   ├── Dockerfile
│   ├── README.md
│   ├── devcontainer.json
│   ├── entrypoint.sh
│   └── setup.sh
├── .gitattributes
├── .github
│   └── workflows
│       ├── README.md
│       ├── ci-python.yml
│       ├── pgvector17.yml
│       └── pr-agent.yml
├── .gitignore
├── .pr_agent.toml
├── .vscode
│   └── settings.json
├── CODEOWNERS
├── LICENSE
├── NOTICE
├── README.md
├── TODO.md
├── docker
│   └── Dockerfile.pgvector
├── docker-compose.yaml
├── md
│   ├── README.md
│   ├── SETUP.md
│   └── prompts
│       └── corpora
│           ├── about-structure.md
│           └── ai-summary.md
└── py
    ├── .gitignore
    ├── README.md
    ├── genall.sh
    ├── openapitools.json
    ├── packages
    │   ├── README.md
    │   ├── corpora
    │   │   ├── README.md
    │   │   ├── __init__.py
    │   │   ├── admin.py
    │   │   ├── api.py
    │   │   ├── apps.py
    │   │   ├── auth.py
    │   │   ├── docs
    │   │   │   └── json_field_metadata.md
    │   │   ├── lib
    │   │   │   ├── README.md
    │   │   │   ├── __init__.py
    │   │   │   ├── dj
    │   │   │   │   ├── __init__.py
    │   │   │   │   ├── decorators.py
    │   │   │   │   └── test_decorators.py
    │   │   │   ├── files.py
    │   │   │   └── test_files.py
    │   │   ├── migrations
    │   │   │   ├── 0001_enable_vector_extension.py
    │   │   │   ├── 0002_initial.py
    │   │   │   ├── 0003_corpus_owner.py
    │   │   │   └── __init__.py
    │   │   ├── models.py
    │   │   ├── requirements.txt
    │   │   ├── schema.py
    │   │   ├── test_api.py
    │   │   └── test_models.py
    │   ├── corpora_cli
    │   │   ├── README.md
    │   │   ├── __init__py
    │   │   ├── auth.py
    │   │   ├── commands
    │   │   │   ├── __init__.py
    │   │   │   ├── corpus.py
    │   │   │   └── file.py
    │   │   ├── config.py
    │   │   ├── constants.py
    │   │   ├── main.py
    │   │   ├── requirements.txt
    │   │   ├── test_auth.py
    │   │   └── test_config.py
    │   ├── corpora_client
    │   │   ├── README.md
    │   │   ├── __init__.py
    │   │   ├── api
    │   │   │   ├── __init__.py
    │   │   │   └── corpora_api.py
    │   │   ├── api_client.py
    │   │   ├── api_response.py
    │   │   ├── configuration.py
    │   │   ├── docs
    │   │   │   ├── CorporaApi.md
    │   │   │   ├── CorpusResponseSchema.md
    │   │   │   ├── CorpusSchema.md
    │   │   │   ├── FileResponseSchema.md
    │   │   │   └── FileSchema.md
    │   │   ├── exceptions.py
    │   │   ├── models
    │   │   │   ├── __init__.py
    │   │   │   ├── corpus_response_schema.py
    │   │   │   ├── corpus_schema.py
    │   │   │   ├── file_response_schema.py
    │   │   │   └── file_schema.py
    │   │   ├── py.typed
    │   │   ├── requirements.txt
    │   │   ├── rest.py
    │   │   ├── setup.py
    │   │   └── test-requirements.txt
    │   └── corpora_proj
    │       ├── README.md
    │       ├── __init__.py
    │       ├── asgi.py
    │       ├── manage.py
    │       ├── settings.py
    │       ├── urls.py
    │       └── wsgi.py
    ├── pyproject.toml
    ├── pytest.ini
    ├── requirements-dev.txt
    └── requirements.txt

23 directories, 99 files
```

The purpose of the Corpora project is to build tools that will help build other corpora.

Corpora will soon build itself as it builds tools to build other arbitrary repositories.

What we are working on now:
- build the perfect scalable polyglot monorepo, focusing on Python first
- utilize pgvector with Django to sync repositories to postgres+AI (starting with corpora itself)
- build a beautiful, modern, modular CLI that interacts seamlessly with the API
- run locally in our devcontainer within corpora repo first but we need to be able to publish in a variety of ways: modules to pypi, containers, gitops to our own k8s, etc.
- the software must be top-quality, perfectly tested with the latest best tools
23 changes: 23 additions & 0 deletions md/prompts/corpora/ai-summary.md
@@ -0,0 +1,23 @@
**Project Overview**
Corpora is a monorepo-driven software platform that builds tools to create, manage, and analyze repositories, starting with itself. It’s designed as a self-sustaining and evolving system, capable of bootstrapping software projects and accelerating development through automation, data insights, and AI-powered analysis.

**Key Components**
- **Polyglot Monorepo**: Architected for scalability, starting with a strong focus on Python, but with flexibility for multiple languages. The repository structure and workflows ensure maintainability, modularity, and ease of integration.
- **pgvector + Django Integration**: Corpora uses pgvector with Django to maintain repository data in PostgreSQL. This enables advanced AI-driven analysis and insights into repository structures and content.
- **Modern CLI and API**: A powerful, modular CLI reflects the API’s capabilities, offering users seamless interaction with the repository's tools. The CLI will be deployable in multiple formats (PyPI, Docker containers, Kubernetes) for versatility.
- **Devcontainer for Local Development**: Configured to ensure a consistent and efficient development environment with all dependencies, setup scripts, and configurations needed for working with Corpora.

**Current Focus**
- Building a highly scalable, polyglot monorepo
- Implementing pgvector with Django for synchronized data storage and AI-enhanced queries
- Creating a modular, intuitive CLI that aligns with the API
- Ensuring code quality and robust testing using the latest best practices in software development

**Deployment Goals**
Corpora tools are designed for multi-platform publishing, with options to distribute via PyPI, Docker, GitOps, and more, ensuring the flexibility to meet various operational needs.

**Quality Standards**
We are committed to maintaining top-notch code quality with comprehensive, automated testing, adhering to the latest best practices and tools in the industry.

**Repository Structure**
The Corpora repository is structured to support this modular and scalable vision, with specific folders for development environments, configurations, Docker setup, documentation, Python packages, and CLI tools.
6 changes: 4 additions & 2 deletions py/genall.sh
@@ -2,7 +2,7 @@
set -e

rm -rf packages/corpora_client
mkdir -p packages/corpora_client/docs
mkdir -p packages/corpora_client/docs packages/corpora_client/test
rm -rf gen/corpora_client
openapi-generator-cli generate -i http://127.0.0.1:8000/api/openapi.json \
-g python -o gen/corpora_client \
@@ -12,10 +12,12 @@ openapi-generator-cli generate -i http://127.0.0.1:8000/api/openapi.json \
# TODO: doesn't resolve in the IDE - so ... do something about that.
# pip install -e /workspace/py/gen/corpora_client
# TODO: this is fragile.
cp -r gen/corpora_client/corpora_client packages/corpora_client
cp -r gen/corpora_client/corpora_client/* packages/corpora_client
cp -r gen/corpora_client/docs/* packages/corpora_client/docs
cp -r gen/corpora_client/test/* packages/corpora_client/test
cp gen/corpora_client/README.md packages/corpora_client/README.md
cp gen/corpora_client/setup.py packages/corpora_client/setup.py
cp gen/corpora_client/requirements.txt packages/corpora_client/requirements.txt
cp gen/corpora_client/test-requirements.txt packages/corpora_client/test-requirements.txt
rm -rf gen/corpora_client
black .
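
Reviewer note: the `cp -r` change in this hunk matters because `mkdir -p` has already created `packages/corpora_client`, and copying a directory into an existing destination nests it one level down, whereas copying its contents does not. Below is a minimal Python sketch of the same sync step, assuming the generator output layout that `genall.sh` implies (`corpora_client/`, `docs/`, `test/`, plus a few top-level files). `sync_generated_client` is a hypothetical helper, not something in the repository.

```python
import shutil
import tempfile
from pathlib import Path


def sync_generated_client(gen_root: Path, package_root: Path) -> None:
    """Copy generated output into package_root without an extra nesting level.

    Hypothetical helper; the layout of gen_root is an assumption taken
    from the paths used in genall.sh.
    """
    if package_root.exists():
        shutil.rmtree(package_root)
    # The *contents* of gen_root/corpora_client become the top of package_root.
    shutil.copytree(gen_root / "corpora_client", package_root)
    for sub in ("docs", "test"):
        if (gen_root / sub).is_dir():
            shutil.copytree(gen_root / sub, package_root / sub, dirs_exist_ok=True)
    for name in ("README.md", "setup.py", "requirements.txt", "test-requirements.txt"):
        if (gen_root / name).is_file():
            shutil.copy2(gen_root / name, package_root / name)


# Build a tiny fake generator output and sync it.
with tempfile.TemporaryDirectory() as tmp:
    gen = Path(tmp) / "gen" / "corpora_client"
    (gen / "corpora_client" / "api").mkdir(parents=True)
    (gen / "corpora_client" / "api" / "corpora_api.py").write_text("# generated\n")
    (gen / "docs").mkdir()
    (gen / "docs" / "CorporaApi.md").write_text("# docs\n")
    (gen / "setup.py").write_text("# setup\n")

    pkg = Path(tmp) / "packages" / "corpora_client"
    sync_generated_client(gen, pkg)
    copied = sorted(p.relative_to(pkg).as_posix() for p in pkg.rglob("*") if p.is_file())

print(copied)
```

Skipping paths that are missing, rather than failing on them, is a design choice of the sketch; under `set -e` the shell script stops at the first failing `cp`.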
22 changes: 15 additions & 7 deletions py/packages/corpora_cli/commands/corpus.py
@@ -2,17 +2,25 @@
 from rich import print as rprint

 app = typer.Typer(help="Corpus commands")
-# TODO: Implement the corpus commands


 @app.command()
-def list():
+def init(ctx: typer.Context):
+    """Initialize a new corpus - Upload a tarball."""
+    # Access the API client from the context
+    api_client = ctx.obj["api_client"]
+    rprint("Initializing a new corpus...")
+
+    # Example usage (replace with actual API call)
+    # response = api_client.create_corpus(...)
+    # rprint(response)
+
+
+@app.command()
+def list(ctx: typer.Context):
     """List all corpora."""
-    # rprint("Listing all corpora...")
-    from corpora_cli.main import corpora_api
+    api_client = ctx.obj["api_client"]

-    corpora_list = corpora_api.corpora_api_list_corpora()
+    corpora_list = api_client.corpora_api_list_corpora()
     for corpus in corpora_list:
-        # rprint(dir(corpus))
-        # rprint(corpus.json())
         rprint(f"{corpus.name}")
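
Reviewer note: reading the client from `ctx.obj` instead of importing a module-level `corpora_api` means the command body can be exercised without a server or credentials. Here is a minimal, framework-free sketch of that idea. `list_corpora` is a hypothetical stand-in for the body of the `list` command, and the fake context only provides the `.obj` attribute the command reads. Only `corpora_api_list_corpora` and the `"api_client"` key come from the diff.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


def list_corpora(ctx):
    """Hypothetical stand-in for the `list` command body."""
    # The client is injected through the context, not imported as a global.
    api_client = ctx.obj["api_client"]
    names = [corpus.name for corpus in api_client.corpora_api_list_corpora()]
    for name in names:
        print(name)
    return names


# Fake client: returns two objects that expose a `name` attribute.
fake_client = MagicMock()
fake_client.corpora_api_list_corpora.return_value = [
    SimpleNamespace(name="corpora"),
    SimpleNamespace(name="example"),
]

# Fake context: mimics only the `obj` dict set up by the app callback.
fake_ctx = SimpleNamespace(obj={"api_client": fake_client})

result = list_corpora(fake_ctx)
```

With the previous module-level import, the same check would have triggered config loading and authentication as a side effect of importing `corpora_cli.main`.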
10 changes: 10 additions & 0 deletions py/packages/corpora_cli/constants.py
@@ -0,0 +1,10 @@
NO_AUTHENTICATION_MESSAGE = """
Currently we support Client Credential authentication.
Check the server in your corpora.yaml file and make sure the server is correct.
Then, in your CLI environment, set the following environment variables:
export CORPORA_CLIENT_ID=your_client_id
export CORPORA_CLIENT_SECRET=your_client_secret
"""
52 changes: 34 additions & 18 deletions py/packages/corpora_cli/main.py
@@ -1,31 +1,47 @@
 import typer

+from rich.console import Console
+from rich.text import Text
 import corpora_client

 from corpora_cli.commands import corpus, file
 from corpora_cli.config import load_config
 from corpora_cli.auth import AuthResolver, AuthError
+from corpora_cli.constants import NO_AUTHENTICATION_MESSAGE

 app = typer.Typer(help="Corpora CLI: Manage and process your corpora")

-# Load config for the session
-config = load_config()
-
-# Initialize AuthResolver and authenticate
-try:
-    auth_resolver = AuthResolver(config)
-    auth_token = auth_resolver.resolve_auth()
-except AuthError as e:
-    typer.echo(str(e), err=True)
-    raise typer.Exit(code=1)
-
-client_config = corpora_client.Configuration()
-client_config.host = config["server"]["base_url"]
-client_config.access_token = auth_token
-with corpora_client.ApiClient(client_config) as api_client:
-    corpora_api = corpora_client.CorporaApi(api_client)
-
-# Register commands
+def get_api_client(config) -> corpora_client.CorporaApi:
+    """
+    Initialize and authenticate API client with given config.
+    Returns an authenticated CorporaApi instance.
+    """
+    try:
+        # Initialize AuthResolver and authenticate
+        auth_resolver = AuthResolver(config)
+        auth_token = auth_resolver.resolve_auth()
+    except AuthError as e:
+        console = Console()
+        console.print(Text(str(e), style="bold red"))
+        console.print(Text(NO_AUTHENTICATION_MESSAGE, style="bold yellow"))
+        raise typer.Exit(code=1)
+
+    # Configure and return the authenticated API client
+    client_config = corpora_client.Configuration()
+    client_config.host = config["server"]["base_url"]
+    client_config.access_token = auth_token
+    return corpora_client.CorporaApi(corpora_client.ApiClient(client_config))
+
+
+@app.callback()
+def main(ctx: typer.Context):
+    """Main entry point. Sets up configuration and API client."""
+    # Load config and pass it to the context
+    config = load_config()
+    ctx.obj = {"api_client": get_api_client(config), "config": config}
+
+
+# Register commands with the app
 app.add_typer(corpus.app, name="corpus", help="Commands for managing corpora")
 app.add_typer(file.app, name="file", help="Commands for file operations")
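
Reviewer note: when `resolve_auth()` raises, `get_api_client` prints the error followed by `NO_AUTHENTICATION_MESSAGE`, which asks the user to export `CORPORA_CLIENT_ID` and `CORPORA_CLIENT_SECRET`. The diff does not show how `AuthResolver` reads those values, so the following is only a sketch of one way such a check could produce a specific error. `read_client_credentials` is hypothetical, and the `AuthError` class here is a local stand-in for `corpora_cli.auth.AuthError`.

```python
import os


class AuthError(Exception):
    """Local stand-in for corpora_cli.auth.AuthError (assumption)."""


def read_client_credentials(env=None):
    """Return (client_id, client_secret) or raise AuthError naming what is missing.

    Hypothetical helper; the real AuthResolver may work differently.
    """
    env = os.environ if env is None else env
    wanted = ("CORPORA_CLIENT_ID", "CORPORA_CLIENT_SECRET")
    # Treat unset and empty variables the same way.
    missing = [name for name in wanted if not env.get(name)]
    if missing:
        raise AuthError("Missing environment variables: " + ", ".join(missing))
    return env["CORPORA_CLIENT_ID"], env["CORPORA_CLIENT_SECRET"]


# Usage with an explicit mapping, so the example does not depend on the shell.
credentials = read_client_credentials(
    {"CORPORA_CLIENT_ID": "your_client_id", "CORPORA_CLIENT_SECRET": "your_client_secret"}
)
```

Naming the missing variable in the exception keeps the red error line specific, while the yellow hint stays generic.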

File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
Empty file.
66 changes: 66 additions & 0 deletions py/packages/corpora_client/test/test_corpora_api.py
@@ -0,0 +1,66 @@
# coding: utf-8

"""
Corpora API
API for managing and processing corpora
The version of the OpenAPI document: 0.1.0
Generated by OpenAPI Generator (https://openapi-generator.tech)
Do not edit the class manually.
"""  # noqa: E501


import unittest

from corpora_client.api.corpora_api import CorporaApi


class TestCorporaApi(unittest.TestCase):
    """CorporaApi unit test stubs"""

    def setUp(self) -> None:
        self.api = CorporaApi()

    def tearDown(self) -> None:
        pass

    def test_corpora_api_create_corpus(self) -> None:
        """Test case for corpora_api_create_corpus
        Create Corpus
        """
        pass

    def test_corpora_api_create_file(self) -> None:
        """Test case for corpora_api_create_file
        Create File
        """
        pass

    def test_corpora_api_get_corpus(self) -> None:
        """Test case for corpora_api_get_corpus
        Get Corpus
        """
        pass

    def test_corpora_api_get_file(self) -> None:
        """Test case for corpora_api_get_file
        Get File
        """
        pass

    def test_corpora_api_list_corpora(self) -> None:
        """Test case for corpora_api_list_corpora
        List Corpora
        """
        pass


if __name__ == "__main__":
    unittest.main()