A collection of Python scripts to detect and remove duplicate files across your Google Drive account
The script is based on the Google Drive API Python quickstart example (https://developers.google.com/drive/api/v3/quickstart/python).
The script scans your Drive account for duplicate files, comparing the MD5 checksum of each file. Files in the Google Drive Trash are ignored.
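The checksum comparison described above can be sketched as follows. This is a minimal illustration, not the code in gdrivededup.py: the function name find_duplicates is hypothetical, and it assumes each file is a metadata dictionary with the id, name, and md5Checksum fields that the Drive API v3 returns for binary files.

```python
from collections import defaultdict


def find_duplicates(files):
    """Group file metadata dicts by md5Checksum and keep groups of 2 or more.

    `files` is assumed to be an iterable of dicts such as those returned by
    the Drive API v3 files.list call, for example when queried with
    q="trashed = false" and fields="nextPageToken, files(id, name, md5Checksum)".
    """
    by_checksum = defaultdict(list)
    for f in files:
        checksum = f.get("md5Checksum")
        if checksum is None:
            # Folders and Google-native files (Docs, Sheets, ...) carry no
            # md5Checksum, so they cannot be compared this way.
            continue
        by_checksum[checksum].append(f)
    # Only checksums shared by more than one file are duplicates.
    return {md5: group for md5, group in by_checksum.items() if len(group) > 1}


if __name__ == "__main__":
    sample = [
        {"id": "1", "name": "a.txt", "md5Checksum": "abc"},
        {"id": "2", "name": "a (copy).txt", "md5Checksum": "abc"},
        {"id": "3", "name": "b.txt", "md5Checksum": "def"},
        {"id": "4", "name": "notes (Google Doc)"},
    ]
    for md5, group in find_duplicates(sample).items():
        print(md5, [f["name"] for f in group])
```

Grouping by checksum in a dictionary keeps the comparison to a single pass over the file list, instead of comparing every file against every other file.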
- Python 2.6 or greater
- Required libraries installed (as listed in requirements.txt)
- Create a credentials.json file in the project folder (see step 1 at https://developers.google.com/drive/api/v3/quickstart/python)
- Create a config.json file in the project folder (based on cofig.example.json)
- Run: python .\gdrivededup.py
- On the first execution the script will ask you to log in; the token will be saved as token.pickle (keep this file secure, and delete it when it is no longer needed)
- On success, the script prints a list of dictionaries with information about each duplicated file (nothing is printed if there are no duplicates)
Potential Improvements for this repo:
- save the list as a CSV file
- document the Python file
- package into an executable file
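As a possible starting point for the CSV improvement, the list of dictionaries could be written out with the standard csv module. This is only a sketch: the function name save_duplicates_csv and the column names id, name, and md5Checksum are assumptions, since the exact keys the script outputs are not documented here.

```python
import csv


def save_duplicates_csv(duplicates, path, fieldnames=("id", "name", "md5Checksum")):
    """Write a list of file metadata dicts to a CSV file; return the row count.

    The default column names are an assumption; pass `fieldnames` to match
    the keys the script actually produces.
    """
    with open(path, "w", newline="", encoding="utf-8") as fh:
        # extrasaction="ignore" drops keys not listed in fieldnames;
        # missing keys are written as empty cells.
        writer = csv.DictWriter(fh, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(duplicates)
    return len(duplicates)
```

A call such as save_duplicates_csv(duplicates, "duplicates.csv") would then produce one row per duplicated file, with a header row first.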