v4.7.0

rzo1 released this 25 Oct 08:28

Breaking Changes

Robots

  • Replaces homebrew robots.txt code with crawler-commons

Normalization

  • Replaces homebrew URL normalization with crawler-commons

You now need to pass a BasicURLNormalizer to both the PageFetcher and the CrawlController, e.g.:

BasicURLNormalizer normalizer = BasicURLNormalizer.newBuilder()
        .idnNormalization(BasicURLNormalizer.IdnNormalization.NONE)
        .build();

Please note that this BasicURLNormalizer supports IDN normalization, which is configured through IdnNormalization (set to NONE in the example above).
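
For readers unfamiliar with the term, IDN normalization is about converting internationalized host names between their Unicode and ASCII (Punycode) forms. The following is a minimal sketch that uses only the JDK's java.net.IDN class. It is not the crawler-commons implementation, and the class name IdnSketch and the host name are hypothetical, chosen only for illustration.

```java
import java.net.IDN;

// Illustration only: shows what converting a host name between its
// Unicode and Punycode forms looks like with the JDK. This is NOT how
// crawler-commons implements its normalizer.
public class IdnSketch {

    // Convert an internationalized host name to its ASCII (Punycode) form.
    static String toAscii(String host) {
        return IDN.toASCII(host);
    }

    // Convert a Punycode host name back to its Unicode form.
    static String toUnicode(String host) {
        return IDN.toUnicode(host);
    }

    public static void main(String[] args) {
        // Hypothetical host name; \u00fc is the letter "u" with diaeresis.
        String unicodeHost = "b\u00fccher.example";

        String asciiHost = toAscii(unicodeHost);
        System.out.println(asciiHost); // xn--bcher-kva.example

        // Round trip: the Punycode form maps back to the original host.
        System.out.println(toUnicode(asciiHost).equals(unicodeHost)); // true
    }
}
```

With IdnNormalization.NONE, as in the example above, the normalizer is asked not to perform this kind of conversion at all.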

Dependency Upgrades

  • Updates Tika to 2.1.0 (check and update your excludes if you are importing crawler4j into your own code base)
  • Updates Jackson to 2.13.0 (test scope only)
  • Updates PostgreSQL driver to 42.3.0 (examples only)
  • Updates Flyway to 8.0.1 (examples only)
  • Updates Guava to 31.0.1-jre
  • Updates Groovy to 3.0.9 (test only)

Additional Notes

Full Changelog: v4.6.0...v4.7.0