Releases: benwbrum/fromthepage

November 2020 Release

01 Dec 01:46

Major release v20.11 includes substantial new support for metadata: new fields can be imported or uploaded in bulk and then used for user navigation, and works gain new editable fields that support documentary editing workflows. Notable bug fixes include more intelligent handling of hyphens in transcripts, fixes to the contents of project owner activity emails, and fixes for per-page image uploads.

User Enhancements

  • Project owners can upload spreadsheets to bulk-assign metadata to their works.
  • Uploaded or imported metadata values can be used to allow users to navigate works by facets
  • Reviewing a page logs a "Review" deed visible in the activity stream and in exports.
  • Rearranged metadata fields on work settings page to be more intuitive
  • Added EDTF-compliant Document Date field
  • Added missing messages to localization dictionaries
  • Zip file uploads now support folders with square brackets in their names.
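
As a rough illustration of what EDTF-compliant date values look like, the sketch below checks a small subset of EDTF Level 0: a year, a year and month, a full date, or an interval of two such values separated by a slash. The helper name and the regular expression are illustrative assumptions and are not taken from FromThePage's own validation code.

```python
import re

# Illustrative subset of EDTF Level 0: YYYY, YYYY-MM, YYYY-MM-DD.
_DATE = r"\d{4}(?:-(?:0[1-9]|1[0-2])(?:-(?:0[1-9]|[12]\d|3[01]))?)?"
# A value is either a single date or an interval written "start/end".
_EDTF_SUBSET = re.compile(rf"{_DATE}(?:/{_DATE})?")


def looks_like_edtf(value: str) -> bool:
    """Return True if value matches the small EDTF subset above (hypothetical helper)."""
    return _EDTF_SUBSET.fullmatch(value) is not None
```

Under these assumptions, values such as `1864`, `1864-07-04`, and `1864-07/1865-01` are accepted, while free-text dates such as `July 4, 1864` are rejected. The full EDTF specification covers much more (uncertain and approximate dates, for example) than this sketch does.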

Integration Improvements

  • Exposed new metadata fields in TEI-XML exports
  • Single sign on issuer is recorded for existing accounts

Bug fixes

  • Fixed bad "Questions and Notes" message on transcription screen
  • Works within document sets are now in the same order as they appear in collections
  • Activity emails sent to project owners now show details of activity.
  • Project owners can now upload individual page images when creating new pages or editing old ones.
  • New single-sign-on users whose logins conflict with existing users no longer see errors.
  • Hyphens on a line that do not immediately follow a word are no longer interpreted as soft-breaks.
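
The hyphen fix above can be pictured with a short sketch. It assumes that a soft break is a hyphen at the very end of a line, and that it only counts when the hyphen directly follows a word character. Both the rule as written here and the function name are assumptions for illustration, not FromThePage's actual implementation.

```python
import re

# A hyphen is treated as a soft break only when it ends the line AND
# directly follows a word character (assumed rule, for illustration).
_SOFT_BREAK = re.compile(r"(?<=\w)-\n")


def join_soft_breaks(text: str) -> str:
    """Join words split by a trailing hyphen; turn remaining newlines into spaces."""
    joined = _SOFT_BREAK.sub("", text)
    return joined.replace("\n", " ")
```

With this rule, `"transcrip-\ntion"` is joined into a single word, while a free-standing dash, as in `"paid $5 -\nreceived"`, is left in place and only the line break becomes a space.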

Serviceability Improvements

  • Log directories for CONTENTdm imports are now created if not already present.

The team is grateful to the National Endowment for the Humanities and University of Texas-Austin Libraries for funding and guidance, and to Diego Viola for development of the metadata and faceted browsing features. The team is also grateful to the National Historical Publications and Records Commission and to the Civil War Governors of Kentucky Digital Documentary Edition for supporting the newly editable and exportable metadata fields.

October 2020 Release

31 Oct 15:01

Major release v20.10 includes better support for document sets, including API access, bulk assignment during imports, and a more efficient interface for moving works between sets. It also includes support for multiple single-sign-on providers in the same installation, and substantial improvements to the IIIF APIs and TEI-XML export.

User Enhancements

  • Text created by project owners will no longer be flagged as potential spam
  • Searching for works within a collection or document set now searches metadata as well as work titles
  • Project owners may upload or import works to a particular document set within a collection
  • Work assignment to document sets now contains a work search bar, allowing easier management of works between document sets

Integration Improvements

  • Multiple SAML identity providers are now supported on one installation. Sponsored by the Church History Library
  • TEI export now includes geo elements for subjects containing latitude/longitude outside of the "placeography" in the TEI header
  • New IIIF top-level collection per user account
  • Contributions API now accepts a user slug, extending support to uploaded documents as well as those imported from IIIF content providers.
  • Document Sets are now exposed through the IIIF API

Bug fixes

  • Top-level IIIF collection no longer returns 404
  • Line breaks in searchable plaintext export are now replaced with a single space
  • Notes now display "n days ago"
  • Work search within document sets no longer returns documents from a different collection
  • When subjects have been created from verbatim text including parentheses or braces, autolink no longer breaks
  • Eliminated document set/collection confusion for the "joined project" deed.

Serviceability Improvements

  • Inclusive language in accordance with IETF draft
  • Full-text search from a subject article is now launched via a POST, addressing crawler-initiated performance problems

The team is grateful to the Church History Library (Church of Jesus Christ of Latter-day Saints) for funding multi-SSO integration.

September 2020 Release

24 Sep 20:16

Major release including performance and serviceability enhancements, improvements to user experience for collaborators on private projects, and several bug fixes.

User Enhancements

  • Show collaborators private collections and document sets on their dashboard (if they have access to them).
  • Higher-resolution page/work thumbnails for modern displays.
  • Subject tab performance improvements for collections with large numbers of subjects.
  • Removed "owned by" wording from collections.
  • Use English-language messages as a fall-back if no Spanish or Portuguese messages are found.
  • Searching for works now searches metadata imported from IIIF.
  • Improved performance of still_editing action.
  • Include OCR correction actions in user statistics.

Integration Improvements

  • Notify users if IIIF manifests have already been imported into the system.
  • Additional logging message for imported IIIF collections
  • Prohibited crawlers from executing full-text searches.
  • Fixed CONTENTdm transcript sync regression.

Bug fixes

  • Document sets created by staff owners are now owned by the organization account.
  • Autolink now supports subjects with unbalanced parentheses in their verbatim text.
  • Activity emails use absolute URLs again.
  • Individually uploaded page images, which were broken following the Rails 6 upgrade, work again.
  • Performance improvements when displaying pages linking to a subject.
  • IIIF endpoints now use absolute URLs again.
  • Latitude/Longitude and URI fields are persisted to the database again.
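
The autolink fix for unbalanced parentheses belongs to a common class of bug: building a search pattern from user-supplied text without escaping it. The sketch below shows the general idea in Python. It assumes the verbatim subject text is turned into a regular expression, which these notes do not confirm, and the function name is hypothetical.

```python
import re


def find_subject_mentions(verbatim: str, transcript: str) -> list[int]:
    """Return start offsets where the verbatim subject text occurs in the transcript."""
    # re.escape makes characters such as "(" or "[" match literally
    # instead of being read as regular expression syntax.
    pattern = re.compile(re.escape(verbatim))
    return [match.start() for match in pattern.finditer(transcript)]
```

Without the escaping step, a verbatim text such as `Smith (Capt.` would fail to compile as a pattern because of the unclosed parenthesis.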

Serviceability Improvements

  • STDERR now logged for all background tasks.
  • Dead code supporting an OAI-PMH client and Omeka Classic integration was removed.

August 2020 Release, Update 3

13 Aug 20:56

Removes wording specific to hosted version for deployment by self-hosted sites.

Updates version number in application.

August 2020 Release, Update 2

12 Aug 21:26

A point release to address regressions in SAML integration, as well as minor bug fixes and enhancements.

User Enhancements

  • OpenSeadragon is now used to display pages hosted on the Internet Archive
  • "Edit Fields" tab of collections relabeled to "Fields"
  • Administrator dashboard now displays number of transcribed pages

Integration Improvements

  • SAML authentication works on new code-base

Bug fixes

  • De-duplication/combination of subjects now works
  • Project owners can remove works from document sets

Serviceability Improvements

  • Rack Mini-profiler is available for performance monitoring

v20.8.1

05 Aug 18:14

A point release to fix problems many transcribers were encountering when transcribing or correcting OCR that had been imported via PDF.

User Enhancements:

  • handle form feed characters from PDFs
  • longer flash messages on uploads and concurrent edits
  • replaces whitelist with allowlist

Bug fixes:

  • dashes in text exports preserved

The team is grateful to Ben Companjen at the University of Leiden for his contribution to this release.

v20.8

03 Aug 21:27

Note that this release is the first release based on Ruby on Rails 6. It will require upgrades of many of the underlying components. For more details, please review this page on the wiki.

This release contains the following enhancements and bug fixes:

User Enhancements

  • Beta release of internationalization support. (Funded by a National Endowment for the Humanities Grant to FromThePage and the University of Texas Libraries.)
  • Beta release of translations to Spanish and Portuguese. (Funded by a National Endowment for the Humanities Grant to FromThePage and the University of Texas Libraries.)

Integration Enhancements

  • Bulk export of all collections in all formats, including TEI.
  • Bulk export includes page level granularity.

Serviceability Improvements

  • Disable subject linking on new projects by default.

The team is grateful to everyone who contributed to this release:

  • Diego Viola, who worked tirelessly on the upgrade to Rails 6 and keeping the code in sync between the old and new versions for way too long.
  • Diego, again, and Joshua Ortiz Baco for their work on internationalizing (Diego) and translating (Joshua) the FromThePage interface into Spanish and Portuguese with guidance from Allyssa Guzman and Albert Palacios.
  • Our summer interns, Isabela Barton and Josie Brumfield, for contributing bug fixes.

v20.6.1

03 Aug 17:09

Note that this release is the last scheduled release based on Ruby on Rails 4. Future releases will be based on Ruby on Rails 6.

This release contains the following enhancements and bug fixes:

User Enhancements

  • Status bars now correctly show pages marked blank as green, making the status bars consistent with the statistics.
  • Field-based transcription now accepts < and > signs as input.
  • When collections are inactive, the application no longer displays "start transcribing", "pages to transcribe", "add work" or similar buttons.
  • Display collection footers in all screens showing the collection.
  • References to the autosplit tool have been removed from the start-a-project screen, as more projects want to work with full openings than want to separate them into individual pages.
  • Subject URI fields are only hyperlinked if they contain a URI.

Integration Enhancements

  • Any originating IIIF manifest is now displayed on the About tab for a work.
  • TEI export now includes latitude/longitude coordinates on subjects that include them.
  • TEI export now displays categories for each subject.
  • Display real_name in the respStmt elements in the TEI export.
  • Internet Archive imports now accept URLs generated by the new Internet Archive responsive UI.
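
For readers unfamiliar with how coordinates are usually carried in TEI, the sketch below builds a `place` entry whose `geo` element holds latitude and longitude separated by a space, following the general TEI convention. The exact nesting that FromThePage emits is not stated in these notes, so the structure, the function name, and the sample coordinates are assumptions.

```python
import xml.etree.ElementTree as ET


def place_element(name: str, latitude: float, longitude: float) -> ET.Element:
    """Build a TEI-style <place> with a <geo> child holding "lat long"."""
    place = ET.Element("place")
    ET.SubElement(place, "placeName").text = name
    location = ET.SubElement(place, "location")
    # TEI's <geo> conventionally holds latitude then longitude, space-separated.
    ET.SubElement(location, "geo").text = f"{latitude} {longitude}"
    return place


xml_text = ET.tostring(place_element("Frankfort, Kentucky", 38.2, -84.87), encoding="unicode")
```

Here `xml_text` is a single-line XML fragment that could sit inside a placeography list in a TEI header.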

Serviceability Improvements

  • The system no longer emails administrators when uploads have been started or completed.

Bug Fixes

  • TEI export now places all subjects which are in child categories of 'People' in the personography, and behaves similarly with Places and the placeography in the TEI header.

The team is grateful to our summer interns, Isabela Barton and Josie Brumfield, for contributing bug fixes.

June 2020 Release

24 Jun 20:36

This release contains the following enhancements and bug fixes:

User Enhancements

  • When IIIF manifests are imported into FromThePage, the metadata block is now displayed in the work's "About" tab. (Funded by the Civil War and Reconstruction Governors of Mississippi project.)
  • Project owners can now upload a CSV file of entities to be used as indexable subjects. (Funded by the Frederick Douglass Papers.)
  • A project owner can now restrict all completed works within a collection to be editable only by project owners with a single button.
  • When a folder containing a PDF and a metadata.yml file is uploaded, the metadata.yml file is used to populate the work metadata. (Previously this was only available for folders containing image files.)
  • Improved logic for calculating minutes spent on site.
  • Clicking on the thumbnail for a work takes the user to that work.
  • When an administrator edits an owner record, the browser redirects to the owner list rather than the user list.
  • Wording improvements to transcriber sign-up and log-in.

Integration Enhancements

  • Prevent private collections from being exposed through IIIF API. (Funded by the Beinecke Rare Book and Manuscript Library, Yale University)
  • Imports from CONTENTdm now can ingest OCR from Full-Text Search fields regardless of their name.

Serviceability Improvements

  • Automated tests have been made more robust.
  • Updated robots.txt to prevent crawlers from launching full-text searches
  • When users save a page with no subjects linked, do not update the page_article_links or articles tables
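
The robots.txt change can be illustrated with Python's standard `urllib.robotparser`. The `Disallow` path, host name, and crawler name below are hypothetical, since these notes do not give the real route of the full-text search action.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule; the real search path is not given in these notes.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler checks each URL before requesting it.
search_ok = parser.can_fetch("ExampleBot", "https://example.org/search?q=letters")
browse_ok = parser.can_fetch("ExampleBot", "https://example.org/collections")
```

In this sketch `search_ok` is false and `browse_ok` is true. Note that robots.txt only deters crawlers that choose to honor it.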

Bug Fixes

  • Project owners editing their accounts no longer have their display names lost. (Contributed by Jacob Creedon.)
  • Line breaks were being compressed in word-wrap mode; now they are replaced with a single space.
  • Large collections were not displayed on the carousel if fewer than 1% of their works had been completed.

The team is grateful to the Frederick Douglass Papers, the Civil War and Reconstruction Governors of Mississippi, and the Beinecke Library for funding several enhancements, and to Jacob Creedon for contributing bug fixes.

April 2020 Release

24 Apr 20:09
51502e7

This release contains the following enhancements and bug fixes:

User Enhancements

  • Better support for markdown tables
  • Project owners can now download email addresses for all users who have worked on their projects, along with opt-out information and activity summaries, so that they can add them to monthly newsletters or other communications.
  • Project owners are required to confirm that they want to blank out a collection.
  • Collaborators on private collections now see those collections on their user dashboard.
  • Performance enhancements to the Find a Project page
  • Removed full-text-search launcher from robots.txt, reducing database load due to crawlers.

Integration Enhancements

  • Limit API access to private collections. Only private collections that have "Allow access to transcriptions via IIIF API" checked will expose the collection data via the API. (Sponsored by the Beinecke Library at Yale University)
  • OCR imports from CONTENTdm now ingest raw OCR from full fields in addition to transcr fields.

Bug Fixes

  • Document sets were handled incorrectly in some cases.
  • Fixed LaTeX encoding problems in transcriptions.