This Python script automates the extraction of crucial financial data from Bank of America statements in PDF format. It processes files in the input directory, extracting account balances, deposits, withdrawals, and daily ledger entries. The data is then organized into JSON files and saved in the output folder.
The script targets PDF statements issued by Bank of America: it reads each file and saves the relevant data in JSON format. Below is an overview of the script's functionality:
• The script expects the PDF files in a folder named input within the "Bank of America" directory.
• The extracted JSON files are saved in a folder named output within the same "Bank of America" directory.
• The script extracts various financial details, including account information, balances, deposits, withdrawals, and daily ledger balances.
• The extracted information is organized into a structured JSON format, making it easy to access and analyze.
• The resulting JSON files are named the same as the original PDF files.
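The input-to-output flow described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name process_folder and the extract callback are hypothetical, and it assumes that "named the same" means the same base name with a .json extension.

```python
import json
from pathlib import Path
from typing import Callable


def process_folder(input_dir: Path, output_dir: Path,
                   extract: Callable[[Path], dict]) -> list[Path]:
    """Run `extract` on every PDF in input_dir and write one JSON file per PDF."""
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf_path in sorted(input_dir.glob("*.pdf")):
        # `extract` stands in for the real parsing step (hypothetical callback).
        data = extract(pdf_path)
        # Assumption: the JSON file reuses the PDF's base name.
        json_path = output_dir / f"{pdf_path.stem}.json"
        json_path.write_text(json.dumps(data, indent=2), encoding="utf-8")
        written.append(json_path)
    return written
```

In the real script the extraction step would read the statement text (for example with fitz) and build the dictionary of balances, deposits, withdrawals, and daily ledger entries; here it is left as a parameter so the file handling can be read on its own.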
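• Ensure you have the necessary Python packages installed: fitz (provided by the PyMuPDF package) and pandas. The os, time, and re modules are part of the Python standard library and need no installation.
• Create a directory named Bank of America.
• Within this directory, create sub-directories named input and output.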
• Put the PDF files you want to process in the input directory.
• Execute the script. It will process all PDF files in the input directory.
• Once the script finishes processing, you will find corresponding JSON files in the output directory.
• If you need to adjust any parameters or functionalities, refer to the comments within the script for guidance.
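The folder setup from the steps above can also be done from Python. The sketch below is only an illustration: prepare_workspace and pending_pdfs are hypothetical helper names, and the check for finished files assumes each JSON file shares its PDF's base name.

```python
from pathlib import Path


def prepare_workspace(base: Path) -> tuple[Path, Path]:
    """Create base/input and base/output if they do not exist yet."""
    input_dir = base / "input"
    output_dir = base / "output"
    input_dir.mkdir(parents=True, exist_ok=True)
    output_dir.mkdir(parents=True, exist_ok=True)
    return input_dir, output_dir


def pending_pdfs(input_dir: Path, output_dir: Path) -> list[Path]:
    """List PDFs in input_dir that have no same-named JSON file in output_dir."""
    return [
        pdf for pdf in sorted(input_dir.glob("*.pdf"))
        if not (output_dir / f"{pdf.stem}.json").exists()
    ]
```

Calling prepare_workspace(Path("Bank of America")) creates both sub-directories, and pending_pdfs on the returned pair shows which statements still lack a JSON result after a run.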
Make sure you have Python installed with the required packages before running the script.
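One way to check the third-party packages before running the script is sketched below. The helper name missing_packages is hypothetical; the mapping reflects that the fitz module is installed through the PyMuPDF package.

```python
import importlib.util

# Import name -> name to install with pip. fitz is the import name of PyMuPDF.
REQUIRED = {"fitz": "PyMuPDF", "pandas": "pandas"}


def missing_packages(required: dict = REQUIRED) -> list[str]:
    """Return the pip names of required modules that cannot be found."""
    return [
        pip_name for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]
```

If missing_packages() returns a non-empty list, those packages can be installed with pip, for example pip install PyMuPDF pandas.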
This script is designed for Bank of America statements in PDF format. Ensure the PDFs follow the expected format for accurate extraction.
This script is provided as is and may require modification based on specific use cases. Use it responsibly and verify the results for critical applications.
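As one possible way to verify results, the extracted totals can be reconciled against each other. The sketch below is an assumption-laden illustration: the function and parameter names are hypothetical, the script's actual JSON keys may differ, and the check only holds if the withdrawals list covers every debit on the statement (checks and fees included).

```python
from decimal import Decimal


def balances_reconcile(beginning: str, deposits: list[str],
                       withdrawals: list[str], ending: str) -> bool:
    """Check that beginning + deposits - withdrawals equals the ending balance."""
    # Amounts are passed as strings and parsed with Decimal to avoid
    # floating-point rounding errors on currency values.
    total_in = sum((Decimal(d) for d in deposits), Decimal("0"))
    total_out = sum((Decimal(w) for w in withdrawals), Decimal("0"))
    return Decimal(beginning) + total_in - total_out == Decimal(ending)
```

For example, a beginning balance of "1000.00" with deposits ["250.50", "100.00"] and withdrawals ["75.25"] reconciles with an ending balance of "1275.25". A mismatch suggests that an entry was missed or misread and the statement should be checked by hand.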