A simple web application that summarizes documents from a given URL using LangChain and Groq. Users enter a URL, the app fetches the document's content, and a language model generates a concise summary.
- URL Input: Users can enter a URL to load documents.
- Asynchronous Processing: Summarization is done asynchronously for efficient handling of large documents.
- Chunking and Collapsing: The app splits documents into chunks, summarizes each chunk, and combines the chunk summaries into a final summary, collapsing them first if they are too long.
- Environment Configuration: Uses environment variables for API key management.
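The asynchronous chunk-and-collapse flow described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the repository's actual code: `fake_summarize`, `map_summaries`, `group_by_budget`, `summarize_chunks`, and the `max_chars` budget are all hypothetical names, and the stub summarizer stands in for a real LLM call.

```python
import asyncio
from typing import Awaitable, Callable, List

Summarizer = Callable[[str], Awaitable[str]]


async def fake_summarize(text: str) -> str:
    """Hypothetical stand-in for an LLM call: keeps the first 8 words."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return " ".join(text.split()[:8])


async def map_summaries(chunks: List[str], summarize: Summarizer) -> List[str]:
    """Map step: summarize every chunk concurrently."""
    return list(await asyncio.gather(*(summarize(c) for c in chunks)))


def group_by_budget(texts: List[str], max_chars: int) -> List[List[str]]:
    """Pack consecutive texts into groups of at most max_chars characters."""
    groups: List[List[str]] = []
    current: List[str] = []
    size = 0
    for t in texts:
        if current and size + len(t) > max_chars:
            groups.append(current)
            current, size = [], 0
        current.append(t)
        size += len(t)
    if current:
        groups.append(current)
    return groups


async def summarize_chunks(
    chunks: List[str],
    summarize: Summarizer = fake_summarize,
    max_chars: int = 200,
) -> str:
    """Map each chunk to a summary, collapse while too long, then reduce."""
    if not chunks:
        return ""
    summaries = await map_summaries(chunks, summarize)
    # Collapse step: re-summarize groups until the summaries fit the budget.
    while len(summaries) > 1 and sum(len(s) for s in summaries) > max_chars:
        groups = group_by_budget(summaries, max_chars)
        if len(groups) == len(summaries):
            break  # grouping made no progress; fall through to the reduce
        merged = ["\n".join(g) for g in groups]
        summaries = await map_summaries(merged, summarize)
    # Reduce step: one final summary over whatever remains.
    return await summarize("\n".join(summaries))
```

Calling `asyncio.run(summarize_chunks(chunks))` with a list of text chunks returns a single string. In a real setup the stub would be replaced by an async call to the language model, and the budget would more likely be measured in tokens than characters.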
- Python 3.7 or later
- Streamlit
- LangChain
- Groq API
- dotenv
- Other dependencies specified in requirements.txt
- Clone the repository:
git clone https://github.com/vishnun0027/Documents-Summary-App.git
cd Documents-Summary-App
- Install the required packages:
pip install -r requirements.txt
- Create a .env file in the root directory and add your Groq API key:
GROQ_API_KEY=your_api_key_here
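As a rough sketch of how the key might be read at startup: the helper names `load_env_file` and `get_groq_api_key` below are hypothetical and not taken from the repository. The only library call assumed is `load_dotenv` from python-dotenv, which copies entries from a .env file into the process environment.

```python
import os
from typing import Mapping, Optional


def load_env_file() -> None:
    """Load variables from .env if python-dotenv is installed (optional)."""
    try:
        from dotenv import load_dotenv
    except ImportError:
        return  # fall back to variables already set in the environment
    load_dotenv()


def get_groq_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return GROQ_API_KEY from env, or raise if it is missing or a placeholder."""
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key or key == "your_api_key_here":
        raise RuntimeError("GROQ_API_KEY is not set; add it to your .env file")
    return key
```

Calling `load_env_file()` followed by `get_groq_api_key()` before building the model fails fast with a clear message, instead of surfacing later as an authentication error from the API.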
- Run the Streamlit app:
streamlit run app.py
- Open your browser and go to http://localhost:8501.
- Enter the URL of the document you want to summarize and click "Send" to generate the summary.
- app.py: The main Streamlit application file that handles user input and displays the summary.
- summarize.py: Contains the logic for loading documents, generating summaries, and collapsing them if necessary.
- loader.py: Implements the load_and_split_docs function to load documents from a URL and split them into manageable chunks.
- llm.py: Sets up the language model with the Groq API for generating summaries.
- utils.py: Contains utility functions for mapping and reducing summaries.
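To illustrate what "splitting into manageable chunks" can look like, here is a plain-Python sketch of fixed-size splitting with overlap. It is an assumption for illustration only: how load_and_split_docs actually splits text is not shown here, and `split_text`, `chunk_size`, and `overlap` are hypothetical names.

```python
from typing import List


def split_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> List[str]:
    """Split text into chunks of up to chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a boundary still appears whole in one of the two chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks
```

For example, `split_text("abcdefghij", chunk_size=4, overlap=1)` returns `["abcd", "defg", "ghij"]`. A production splitter would usually prefer paragraph or sentence boundaries over raw character offsets.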