An AI-powered Pull Request review system that automatically analyzes code changes and provides comprehensive feedback on style, bugs, performance, best practices, and security.
- Automatic PR analysis using LangGraph and LangChain
- Smart token management for large PRs
- Language-aware code analysis
- Asynchronous processing with Celery
- Comprehensive review categories:
  - Code style and formatting
  - Potential bugs and errors
  - Performance improvements
  - Security vulnerabilities
  - Best practices recommendations
- Hosted link: http://135.232.104.57:8000
- Python >= 3.11
- Redis (for Celery)
- PostgreSQL (for result storage)
- Clone the repository:
git clone https://github.com/yourusername/pr-review-agent.git
cd pr-review-agent
- Create and activate virtual environment:
python -m venv venv
source venv/bin/activate # On Windows use: venv\Scripts\activate
- Install Poetry:
pip install poetry
- Install dependencies:
poetry install
- Copy environment example and configure:
cp .env.example .env
# Edit .env with your configuration
Create a .env file with the following configurations:
# Database
DATABASE_URL=postgresql://user:password@localhost/pr_review
# Redis
REDIS_URL=redis://localhost:6379
# OpenAI
OPENAI_API_KEY=your_api_key
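A minimal sketch of how these variables might be read and validated at startup. The `Settings` dataclass and `load_settings` function are hypothetical and are not taken from `app/core/config.py`; only the three variable names come from the configuration above.

```python
import os
from dataclasses import dataclass

# Variable names taken from the .env example above.
REQUIRED_VARS = ("DATABASE_URL", "REDIS_URL", "OPENAI_API_KEY")


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three settings listed above."""
    database_url: str
    redis_url: str
    openai_api_key: str


def load_settings(env=None) -> Settings:
    """Read required settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Treat unset and empty values the same way: both count as missing.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        openai_api_key=env["OPENAI_API_KEY"],
    )
```

Failing fast on a missing variable tends to surface configuration mistakes before the worker or API starts accepting tasks.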
- Start Redis:
redis-server
- Start Celery worker:
celery -A app.worker worker --loglevel=info
- Start the FastAPI application:
uvicorn app.main:app --host 0.0.0.0 --port 8000
Initiate a PR review (POST /analyze-pr).
Request:
{
"repo_url": "https://github.com/user/repo",
"pr_number": 123,
"github_token": "optional_token"
}
Response:
{
"task_id": "abc123",
"status": "pending"
}
Check the status of a PR review (GET /status/{task_id}).
Response:
{
"task_id": "abc123",
"status": "processing|completed|failed",
"progress": 75
}
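Because reviews run asynchronously, a client typically polls the status until the task finishes. The sketch below shows one way to do that; `wait_for_review` and the `fetch_status` callback are hypothetical names, and the interval and timeout defaults are arbitrary assumptions.

```python
import time


def wait_for_review(fetch_status, task_id, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll until the task reports "completed" or "failed".

    fetch_status is a caller-supplied function (hypothetical) that takes a
    task ID and returns the parsed status JSON shown above.
    """
    waited = 0.0
    while True:
        status = fetch_status(task_id)
        # Any other value ("pending", "processing") means the task is still running.
        if status["status"] in ("completed", "failed"):
            return status
        if waited >= timeout:
            raise TimeoutError(
                f"Task {task_id} still {status['status']} after {timeout}s"
            )
        sleep(interval)
        waited += interval
```

Passing `fetch_status` and `sleep` as arguments keeps the loop independent of any particular HTTP library and makes it easy to exercise without a running server.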
Get the results of a completed PR review (GET /results/{task_id}).
Response:
{
"task_id": "abc123",
"status": "completed",
"results": {
"files": [
{
"name": "main.py",
"issues": [
{
"type": "style",
"line": 15,
"description": "Line too long",
"suggestion": "Break line into multiple lines"
},
{
"type": "security",
"line": 45,
"description": "Potential SQL injection vulnerability",
"suggestion": "Use parameterized queries"
},
{
"type": "bug",
"line": 23,
"description": "Potential null pointer",
"suggestion": "Add null check"
}
]
}
],
"summary": {
"total_files": 1,
"total_issues": 3,
"critical_issues": 1
}
}
}
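As an illustration of consuming this payload, the following sketch counts issues per type. The function name `count_issues_by_type` is hypothetical; the field names are the ones shown in the response above.

```python
from collections import Counter


def count_issues_by_type(response):
    """Count issues per "type" across all files in a results payload."""
    counts = Counter()
    for file_entry in response["results"]["files"]:
        for issue in file_entry["issues"]:
            counts[issue["type"]] += 1
    return counts
```

For the example response above, this yields one issue each for `style`, `security`, and `bug`.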
pr-review-agent/
├── alembic/
├── app/
│ ├── core/
│ │ ├── config.py
│ │ └── logging_config.py
│ ├── celery.py
│ ├── database.py
│ ├── main.py
│ ├── models.py
│ └── pr_review_agent.py
│
├── tests/
├── .env.example
├── poetry.lock
├── pyproject.toml
├── alembic.ini
├── docker-compose.yml
├── Dockerfile
└── README.md
- LangGraph: For orchestrating the review workflow and managing state transitions.
- Celery: For handling asynchronous processing of PR reviews.
- PostgreSQL: For storing review results and maintaining task history.
- Redis: For Celery message broker and result backend.
- FastAPI: For building a fast, modern API with automatic documentation.
- Docker: For building images and deployment.
- Use a token counter to measure each file's size and determine whether the PR fits within the model's context window (system prompt + user prompt + PR content)
- When the total token count is within the limit, process the entire diff in a single prompt with surrounding context lines
- Compress the diff by keeping additions and consolidating all deletions into a list
- Sort files by token size within each language (largest files first)
- Fill the context up to a safe batch size and mark the remaining files as "other modified files"
- For the other modified files, generate short summaries of the code and search for bugs
- Include the list of deleted files if context space remains
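The compression step in the list above can be sketched as follows. This is an illustration, not the project's implementation: `compress_diff` is a hypothetical name, the input is assumed to be a single file's hunks with no `---`/`+++` file headers, and dropping context lines is an assumption of this sketch.

```python
def compress_diff(patch):
    """Return (compressed_patch, deleted_lines) for one file's hunks.

    Hunk headers and added lines stay in place; removed lines are pulled
    out into a separate list. Context lines are dropped (an assumption).
    """
    kept, deleted = [], []
    for line in patch.splitlines():
        if line.startswith("@@"):
            kept.append(line)  # hunk header, keeps line-number information
        elif line.startswith("+"):
            kept.append(line)  # added line
        elif line.startswith("-"):
            deleted.append(line[1:])  # removed line, stored without the marker
        # Anything else (context lines, "\ No newline" notes) is skipped.
    return "\n".join(kept), deleted
```

Keeping the hunk headers lets a reviewer model still map added lines back to approximate positions in the file.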
This adaptive token-aware strategy ensures we maximize the model's context window while preserving the most relevant code changes for review. Think of it like packing a suitcase - we fit the largest, most important items first (prioritized by language), then add smaller items if space permits.
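The packing idea can be sketched in a few lines. Everything here is illustrative: `ChangedFile`, `estimate_tokens`, and `pack_files` are hypothetical names, the four-characters-per-token estimate is a crude stand-in for a real token counter, and ordering languages alphabetically is a placeholder for whatever language priority the project actually uses.

```python
from dataclasses import dataclass


@dataclass
class ChangedFile:
    path: str
    language: str
    diff: str


def estimate_tokens(text):
    """Rough stand-in for a real token counter (about 4 characters per token)."""
    return max(1, len(text) // 4)


def pack_files(files, budget, count_tokens=estimate_tokens):
    """Split files into (included, others) under a token budget.

    Files are ordered by language (alphabetical, a placeholder), then
    largest first within each language. A file is included only if it
    still fits in the remaining budget; the rest become "other modified files".
    """
    ordered = sorted(files, key=lambda f: (f.language, -count_tokens(f.diff)))
    included, others, used = [], [], 0
    for f in ordered:
        cost = count_tokens(f.diff)
        if used + cost <= budget:
            included.append(f)
            used += cost
        else:
            others.append(f)
    return included, others
```

A real budget would also reserve room for the system and user prompts before any file content is added.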
- Enhanced Analysis
  - Add support for more programming languages
  - Improve diff management for large PRs
  - Add machine learning-based code smell detection
- Performance Optimization
  - Implement caching for similar code patterns
  - Add parallel processing for large PRs
  - Optimize token usage
- Integration Features
  - Add GitHub Actions integration
  - Implement GitLab support
  - Add Bitbucket support
- UI Features
  - Add web dashboard for monitoring
  - Implement real-time progress updates
  - Add custom rule configuration UI
- Security Enhancements
  - Add vulnerability database integration
  - Implement custom security rules
  - Add compliance checking
curl -X POST "http://135.232.104.57:8000/analyze-pr" \
-H "Content-Type: application/json" \
-d '{"repo_url": "https://github.com/owner/repo", "pr_number": 123, "github_token": "your_github_token_here"}'
curl -X GET "http://135.232.104.57:8000/status/{task_id}" \
-H "Content-Type: application/json"
curl -X GET "http://135.232.104.57:8000/results/{task_id}" \
-H "Content-Type: application/json"
Replace {task_id} with the actual task ID returned by the POST /analyze-pr command.
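The same calls can be made from Python using only the standard library. This is a sketch under assumptions: `BASE_URL` points at a locally running instance, and `build_analyze_request` and `send_json` are hypothetical helpers, not part of the project.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: a locally running instance


def build_analyze_request(repo_url, pr_number, github_token=None, base_url=BASE_URL):
    """Build the POST /analyze-pr request without sending it."""
    payload = {"repo_url": repo_url, "pr_number": pr_number}
    if github_token:
        # The token is optional in the request body shown earlier.
        payload["github_token"] = github_token
    return urllib.request.Request(
        f"{base_url}/analyze-pr",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_json(request, timeout=30):
    """Send a request (or URL string) and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, `send_json(build_analyze_request("https://github.com/owner/repo", 123))` would return the task payload, and `send_json(f"{BASE_URL}/status/{task_id}")` would fetch its status.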
Run the test suite:
poetry run pytest
Run with coverage:
poetry run pytest --cov=app tests/
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.