All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.
- Provide `DownloadImagePipeline` class in place of `download_image_pipeline` object.
- Log uncaught exceptions
- Store `repr(e)` rather than `e` itself in the run report, where `e` is an exception.
- Increased test coverage
- Errors from trying to write to the same outpath in multiple threads
- Use image URL in README that is accessible on PyPI.
- Rename library "wildebeest"
- Use GitHub Actions for CI/CD.
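
The `repr(e)` entry above can be illustrated with a minimal sketch. It does not use the library's real API: `process`, `run`, and the report layout are hypothetical names, and `None` as the "no error" value is an assumption made for illustration. The point it shows is that a string such as `repr(e)` serializes cleanly, whereas an exception object does not.

```python
import json


def process(path):
    # Hypothetical processing step that always fails, for illustration only.
    raise ValueError(f"cannot process {path}")


def run(paths):
    """Build a toy run report mapping each path to its error, if any."""
    report = {}
    for path in paths:
        try:
            process(path)
            report[path] = {"error": None}  # assumption: None means "no error"
        except Exception as e:
            # Store the string form, not the exception object itself.
            report[path] = {"error": repr(e)}
    return report


report = run(["a.jpg"])
# The report contains only plain strings, so it can be dumped as JSON.
serialized = json.dumps(report)
```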
==== Below is Creevey development ====
- Warning about upcoming name change
- Use `pip-tools`.
- Function rotate() that takes an image and an angle and outputs the rotated image.
- warning about upcoming rename of library
- Function to flip an image horizontally.
- Function to flip an image vertically.
- Pipelines now take a function to decide whether to skip a file based on its input and output paths, rather than just providing the option to skip files whose outpath points to a local file. "skipped_existing" field of run report has been renamed to "skipped" accordingly.
- Run report is now stored in `<Pipeline object>.run_report_` rather than being returned.
- All exceptions that inherit from `Exception` that arise during file processing are now handled by default.
- Rather than noting whether an exception was handled in processing a particular file in an "exception_handled" field, the run report now contains either the exception object itself or `np.nan` in a field called "error."
- "time_finished" field in run reports now uses human-readable timestamps.
- `Pipeline.run()` method has been removed; now pipelines can only be called directly.
- `Pipeline.ops` is now `None` by default.
- Moved most of the README content to readthedocs, improved the examples, and added docstrings.
- `log_dict` is now a Pipeline attribute.
- `pipelines/core.py` has been renamed to `pipelines/pipelines.py`
- Example text scraping application.
- Have `flake8` check for docstrings in library functions and classes.
- Pipelines can now be called directly rather than through a `.run()` method; `.run()` still exists as an alias for backwards compatibility.
- `write_image` writes to a tempfile in a ".tmp" subdirectory in the output directory rather than in an arbitrary location to avoid issues when writing to a mounted volume.
- `write_image` writes to a tempfile and then renames it to ensure that it does not create partial image files if writing is interrupted.
- Downloads time out after 5s by default if no response is received.
- `record_dhash` function
- `trim_padding` function
- `centercrop` can now be used within a `CustomReportingPipeline`.
- Bug in error response when loading an image with unusual errors
- provide one script to run pre-merge checks
- refine PR checklist
- typo in README
- typo in `skip_existing=True` warning message
- Sort actual and expected DataFrames by column name in `test_custom_reporting_pipeline` to avoid uninteresting nondeterministic failures.
- Centercrop function into ops/image.py
- Revert reading version number from CHANGELOG.md in setup.py and _version.py
- Read version number from CHANGELOG.md in setup.py and _version.py
- Instead of logging a warning for every file skipped, warn once up front and log with level DEBUG.
- Function to generate unique filenames from input paths, outdir, extension
- Ability to keep original extension when using `join_outdir_filename_extension`
- Load color channels from URL in RGB order
- Added empty `__init__.py` files, the absence of which seems to be causing problems when installing from PyPI.
- Converted some `Path` objects to strings where required in Python 3.6.
- Handled string path inputs in `replace_dir`
- Added simple unit tests for `path_funcs` module
- Added hosted docs for readthedocs.
- `Pipeline` runs return "run reports"
- `CustomReportingPipeline` class allows custom run report fields
- Redesigned library around `Pipeline` abstraction
- Switched to using one requests session in each download thread.
- Added boto3 to setup.py requirements.
- Added dataset module
- Elevated `group_train_test_split` to a public function in its own module.
- Use `black` to format code.
- Address flake8 complaints.
- Used black to format code.
- Remove download instructions from README now that a simple pip install should work.
- Added functionality to save_response_content_as_png() to allow resize on download
- Fill in some missing type hints.
- Modify Jenkins job to avoid trying to push old versions to PyPI.
- Catch `FileExistsError` in case in which another thread creates directory after we check for it.
- Configure PyPI Markdown rendering.
- Functions for downloading, resizing, and creating imagenet-style symlinks.
- Basic docs.
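
Two of the entries above describe file-writing safeguards: writing to a tempfile in a ".tmp" subdirectory and then renaming it, and catching `FileExistsError` when another thread creates a directory after the existence check. The sketch below shows one way these ideas might fit together. It is not the library's implementation: `ensure_dir` and `write_bytes_atomically` are hypothetical names, and it writes raw bytes rather than images.

```python
import os
import tempfile
from pathlib import Path


def ensure_dir(path: Path) -> None:
    """Create a directory, tolerating a concurrent creator (hypothetical helper)."""
    if not path.is_dir():
        try:
            os.makedirs(path)
        except FileExistsError:
            # Another thread created the directory between the check and makedirs.
            pass


def write_bytes_atomically(data: bytes, outpath: Path) -> None:
    """Write to a tempfile under <outdir>/.tmp, then rename it into place."""
    outpath = Path(outpath)
    tmp_dir = outpath.parent / ".tmp"
    ensure_dir(tmp_dir)
    # Keeping the tempfile next to the destination keeps the rename on one
    # filesystem, which matters when the output directory is a mounted volume.
    fd, tmp_name = tempfile.mkstemp(dir=tmp_dir, suffix=outpath.suffix)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # The destination only ever appears as a complete file.
        os.replace(tmp_name, outpath)
    except BaseException:
        # Clean up the partial tempfile if writing or renaming failed.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
```

If writing is interrupted, only the tempfile under ".tmp" is affected, so no partial file is left at the final output path.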