An example CLI tool in Python that demonstrates how to integrate Pangea's AuthZ service into a LangChain app, applying user-based authorization to control which files a RAG workflow can access.
- Python v3.12 or greater.
- pip v24.2 or uv v0.4.18.
- A Pangea account with AuthZ enabled.
- An OpenAI API key.
- libmagic
The setup in AuthZ should look something like this:
Name | Permissions |
---|---|
engineering | read |
finance | read |
Tip: At this point you need to create two new roles, named `engineering` and `finance`, under the Roles & Access tab in the Pangea console.
Role `engineering`:

Resource type | Permissions (read) |
---|---|
engineering | ✔️ |
finance | ❌ |

Role `finance`:

Resource type | Permissions (read) |
---|---|
engineering | ❌ |
finance | ✔️ |
Subject type | Subject ID | Role/Relation |
---|---|---|
user | alice | engineering |
user | bob | finance |
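
To make the intended policy concrete, the tables above can be modeled as a small in-memory lookup. This is only an illustrative sketch: `ROLE_PERMISSIONS`, `USER_ROLES`, and `is_allowed` are hypothetical names, and the code does not call the Pangea SDK or reflect how AuthZ evaluates checks internally.

```python
# Local, in-memory model of the AuthZ setup shown in the tables above.
# Hypothetical names throughout; this is not the Pangea SDK.

# role -> resource type -> set of permitted actions
ROLE_PERMISSIONS: dict[str, dict[str, set[str]]] = {
    "engineering": {"engineering": {"read"}},
    "finance": {"finance": {"read"}},
}

# subject ID -> assigned role
USER_ROLES: dict[str, str] = {
    "alice": "engineering",
    "bob": "finance",
}


def is_allowed(user: str, action: str, resource_type: str) -> bool:
    """Return True if the user's role grants `action` on `resource_type`."""
    role = USER_ROLES.get(user)
    if role is None:
        # Unknown users get no access.
        return False
    return action in ROLE_PERMISSIONS.get(role, {}).get(resource_type, set())
```

With this model, `is_allowed("alice", "read", "engineering")` is true while `is_allowed("alice", "read", "finance")` is false, which mirrors the behavior the sample is meant to demonstrate.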
git clone https://github.com/pangeacyber/langchain-python-rag-authz.git
cd langchain-python-rag-authz
On Windows, libmagic is included via the python-magic-bin package.
On macOS, you can install it with this shell command:
brew install libmagic
If using pip:
python -m venv .venv
source .venv/bin/activate
pip install .
Or, if using uv:
uv sync
source .venv/bin/activate
The sample can then be executed with:
python -m langchain_rag_authz --user alice "How much does John Doe make?"
Usage: python -m langchain_rag_authz [OPTIONS] PROMPT

Options:
  --user TEXT              Unique username to simulate retrieval as.
                           [required]
  --authz-token SECRET     Pangea AuthZ API token. May also be set via the
                           `PANGEA_AUTHZ_TOKEN` environment variable.
                           [required]
  --pangea-domain TEXT     Pangea API domain. May also be set via the
                           `PANGEA_DOMAIN` environment variable. [default:
                           aws.us.pangea.cloud; required]
  --model TEXT             OpenAI model. [default: gpt-4o-mini; required]
  --openai-api-key SECRET  OpenAI API key. May also be set via the
                           `OPENAI_API_KEY` environment variable. [required]
  --help                   Show this message and exit.
Assuming user "alice" has permission to see engineering documents, they can query the LLM on information regarding those documents:
$ python -m langchain_rag_authz --user alice "What is the software architecture of the company?"
The company's software architecture consists of a frontend built with ReactJS,
Redux, and Axios, along with Material-UI for design components. The backend
utilizes Node.js and Express.js, with MongoDB as the database. Authentication
and authorization are managed through JSON Web Tokens (JWT) and OAuth 2.0, and
version control is handled using Git and GitHub.
But they cannot query finance information:
$ python -m langchain_rag_authz --user alice "What is the top salary in the Engineering department?"
I don't know the answer to that question, and you may not be authorized to know the answer.
And vice versa for "bob", who is in finance but not engineering.
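
One way to get the behavior shown above is to drop unauthorized documents from the retrieved set before they reach the LLM, so the model has no finance context to draw on when alice asks. The sketch below illustrates that idea under stated assumptions. It assumes each document carries a category matching an AuthZ resource type. `Doc`, `filter_authorized`, and `fake_check` are hypothetical names. In the real sample the check would be a call to Pangea AuthZ, not the local stand-in used here.

```python
from collections.abc import Callable, Iterable
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """A retrieved chunk tagged with the resource type it belongs to."""
    category: str
    text: str


# Signature of an authorization check: (user, action, resource_type) -> allowed?
Check = Callable[[str, str, str], bool]


def filter_authorized(docs: Iterable[Doc], user: str, check: Check) -> list[Doc]:
    """Keep only the documents the user is allowed to read."""
    return [doc for doc in docs if check(user, "read", doc.category)]


def fake_check(user: str, action: str, resource_type: str) -> bool:
    """Local stand-in for a remote AuthZ check (assumption, not the SDK)."""
    roles = {"alice": "engineering", "bob": "finance"}
    return action == "read" and roles.get(user) == resource_type


docs = [
    Doc("engineering", "The frontend is built with ReactJS."),
    Doc("finance", "Salary bands for the Engineering department."),
]
visible_to_alice = filter_authorized(docs, "alice", fake_check)
visible_to_bob = filter_authorized(docs, "bob", fake_check)
```

Here `visible_to_alice` contains only the engineering document and `visible_to_bob` only the finance one, so each user's prompt is answered from the context they are authorized to see.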