Version 0.11.0 (22 Dec 2017)
- The submodule 'standardise' is renamed. The new name is 'preprocessing'.
The submodule 'standardise' will be deprecated in a future version.
- Deprecation errors were not visible for many users. In this version, the
errors are more visible.
- Improved and new logs for indexing, comparing and classification.
- Faster comparing of string variables. Thanks Joel Becker.
- Changes make it possible to pickle Compare and Index objects. This makes it
easier to run code in parallel. Tests were added to ensure that pickling
remains possible.
- Important change. MultiIndex objects with many record pairs were previously
split into pieces to lower memory usage. In this version, this automatic
splitting is removed. Please split the data yourself.
- Integer indexing. A blog post on this will follow.
- The metrics submodule has changed heavily. This breaks compatibility with the
previous version.
- repr() and str() now return informative descriptions of index and compare
objects.
- It is possible to use abbreviations for string similarity methods. For example
'jw' for the Jaro-Winkler method.
- The FEBRL dataset loaders can now return the true links as a
pandas.MultiIndex for each FEBRL dataset. This option is disabled by default.
See the FEBRL datasets for details.
- Fix issue with automatic recognition of license on GitHub.
- Various small improvements.
Note: In the next release, the Pairs class will be removed. Migrate now.
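
Since automatic splitting of large MultiIndex objects is removed in this version, the splitting has to be done by hand. The following is a minimal sketch of one way to do that with plain pandas slicing. The helper name `split_pairs`, the `chunk_size` parameter and the example pairs are hypothetical and not part of the package.

```python
import pandas as pd


def split_pairs(pairs, chunk_size):
    """Yield consecutive slices of a MultiIndex, each with at most
    chunk_size record pairs. Hypothetical helper, not a package function."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(pairs), chunk_size):
        # Slicing a MultiIndex returns a MultiIndex with the same names.
        yield pairs[start:start + chunk_size]


# Made-up candidate pairs: every combination of 4 x 3 record ids.
pairs = pd.MultiIndex.from_product(
    [range(4), range(3)], names=["id_a", "id_b"]
)

# Split the 12 pairs into pieces of at most 5 pairs each.
chunks = list(split_pairs(pairs, chunk_size=5))

for chunk in chunks:
    print(len(chunk))
```

Each piece could then be passed to the comparison step in turn, with the resulting frames combined afterwards, for example with pandas.concat. A suitable chunk size depends on the available memory and the number of compared variables.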