Browsertrix Crawler 1.1.0 Beta 5
Pre-release
What's Changed
- Separate writing pages to pages.jsonl + extraPages.jsonl to use with new py-wacz by @ikreymer in #535
- Adblock support by @ikreymer in #534
- Remove no longer needed invalid Brave update URLs by @tw4l in #539
- Better logging of all queue WARCWriter operations by @ikreymer in #536
- qa: filter out non-HTML pages by @ikreymer in #541
- Fix for --rolloverSize for individual WARCs in 1.x by @ikreymer in #542
- Set MIME type for HTML pages by @tw4l in #545
Full Changelog: v1.1.0-beta.4...v1.1.0-beta.5