PDF-Scraper-for-Bank-of-America-Statements

This Python script automates the extraction of crucial financial data from Bank of America statements in PDF format. It processes files in the input directory, extracting account balances, deposits, withdrawals, and daily ledger entries. The data is then organized into JSON files and saved in the output folder.

Overview

The script targets PDF statements issued by Bank of America. It reads each file, pulls out the relevant fields, and saves them in JSON format. The sections below describe what the script does and how to run it.

Script Functionality

Input and Output Paths

• The script expects the PDF files to be processed in a folder named input within the "Bank of America" directory.

• The extracted JSON files are saved in a folder named output within the same "Bank of America" directory.
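
A minimal sketch of how the input folder might be located and its PDFs listed. The function name list_input_pdfs and the base_dir parameter are assumptions made for illustration; the script itself may resolve its paths differently.

```python
from pathlib import Path

# Assumed layout: <base_dir>/Bank of America/input
ROOT_NAME = "Bank of America"


def list_input_pdfs(base_dir="."):
    """Return the PDF files found in the input folder, sorted by path."""
    input_dir = Path(base_dir) / ROOT_NAME / "input"
    if not input_dir.is_dir():
        return []
    # Compare the extension case-insensitively so "STATEMENT.PDF" is included
    return sorted(
        p for p in input_dir.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
```

Returning an empty list when the folder is missing lets a caller report "nothing to process" instead of crashing.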

Data Extraction

• The script extracts various financial details, including account information, balances, deposits, withdrawals, and daily ledger balances.
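
As an illustration of this kind of extraction, the sketch below pulls labelled amounts out of statement text with regular expressions. It assumes the text has already been read from the PDF (the script uses fitz for that step). The labels, the line layout, and the name parse_summary are hypothetical, so the patterns would need adjusting to match real statements.

```python
import re

# Hypothetical line layout: "<label> ... $1,234.56" or "... -$1,234.56".
AMOUNT = r"(-?)\$?([\d,]+\.\d{2})"
FIELDS = {
    "beginning_balance": re.compile(r"Beginning balance.*?" + AMOUNT),
    "deposits": re.compile(r"Deposits and other credits.*?" + AMOUNT),
    "withdrawals": re.compile(r"Withdrawals and other debits.*?" + AMOUNT),
    "ending_balance": re.compile(r"Ending balance.*?" + AMOUNT),
}


def parse_summary(text):
    """Return the labelled amounts found in the text as floats."""
    result = {}
    for key, pattern in FIELDS.items():
        match = pattern.search(text)
        if match:
            sign, digits = match.groups()
            value = float(digits.replace(",", ""))
            result[key] = -value if sign else value
    return result
```

Fields whose label is not found are simply left out of the result, which makes a missing section easy to detect.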

Data Structuring

• The extracted information is organized into a structured JSON format, making it easy to access and analyze.
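
The exact JSON layout is not documented here, so the following is only a sketch of one plausible structure. The key names, the function build_record, and all values are placeholders chosen for illustration.

```python
import json


def build_record(account, summary, ledger):
    """Combine extracted pieces into one JSON-serialisable dictionary."""
    return {
        "account": account,
        "summary": summary,
        # One entry per day: {"date": ..., "balance": ...}
        "daily_ledger_balances": [
            {"date": date, "balance": balance} for date, balance in ledger
        ],
    }


record = build_record(
    {"number": "XXXX-0000"},                       # placeholder value
    {"deposits": 500.25, "withdrawals": -250.75},  # placeholder values
    [("06/01", 1000.0), ("06/02", 1249.5)],        # placeholder values
)
print(json.dumps(record, indent=2))
```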

File Naming

• Each resulting JSON file is named after the PDF file it was extracted from.
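
One way to derive such a name, assuming the output keeps the PDF's base name and takes a .json extension. The helper json_path_for is hypothetical.

```python
from pathlib import Path


def json_path_for(pdf_path, output_dir):
    """Return the output path that keeps the PDF's base name."""
    # Using stem keeps dots inside the name, e.g. "stmt.2023.pdf" -> "stmt.2023.json"
    return Path(output_dir) / (Path(pdf_path).stem + ".json")
```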

How to Use

Setting Up the Environment

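• Ensure the required Python modules are available (os, fitz, time, pandas, re). Only fitz and pandas need to be installed: fitz is provided by the PyMuPDF package, while os, time, and re are part of the Python standard library.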
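
A quick way to check the environment before running anything is to ask Python which of these modules it can locate. This is a sketch; the helper name missing_modules is made up for illustration.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules listed above; "fitz" is the import name used by PyMuPDF
print(missing_modules(["os", "fitz", "time", "pandas", "re"]))
```

An empty list means every module can be imported.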

Folder Structure

• Create a directory named Bank of America.

• Within this directory, create sub-directories named input and output.
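
The folders can also be created from Python rather than by hand. A small sketch, assuming they should live under the current working directory unless another base_dir is given; create_folders is a hypothetical helper.

```python
from pathlib import Path


def create_folders(base_dir="."):
    """Create Bank of America/input and Bank of America/output if missing."""
    root = Path(base_dir) / "Bank of America"
    created = []
    for name in ("input", "output"):
        folder = root / name
        # parents=True also creates "Bank of America"; exist_ok avoids errors on reruns
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```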

Placing PDF Files

• Put the PDF files you want to process in the input directory.

Running the Script

• Execute the script. It will process all PDF files in the input directory.

Output Files

• Once the script finishes processing, you will find corresponding JSON files in the output directory.

Customization (Optional)

• If you need to adjust any parameters or functionalities, refer to the comments within the script for guidance.

Important Notes

• Ensure Python Environment:

Make sure you have Python installed with the required packages before running the script.

• File Compatibility:

This script is designed for Bank of America statements in PDF format. Ensure the PDFs follow the expected format for accurate extraction.

Disclaimer

This script is provided as is and may require modification based on specific use cases. Use it responsibly and verify the results for critical applications.
