CTDA System Migration


Project Update, June 14, 2022

The latest iterative designs for collection and object pages are available here: https://xd.adobe.com/view/21823df5-0689-473e-ad83-1b5d076ad532-74d1/ . Click on the comments for an explanation of the changes.

Checksum and fixity checking are in the final stages of review.

Taxonomy mapping is the next feature that will be tested. We are examining the ability to bulk load terms from external vocabularies, including vocabularies from non-customary domains. This is supported by research being done by Rachael Nutt for the CTDA in Context project. Our current list of potential vocabularies is available: https://docs.google.com/document/d/1CPelIcZvit4ghR4D5Zms_tTnk6PTUlYieUl34ZwxYxY/edit#

Project Update, June 8, 2022

I can’t believe that it has been almost a month since the last project update! Lots to report. The metadata profile is finished, thanks to everyone in the metadata review group for their help.

Presentation layer designs are progressing. The most current iteration, which could change quickly, is here: https://xd.adobe.com/view/bd1b11d7-e85d-40e6-95b1-d261575227de-e848/

The Mirador viewer implementation is waiting for development work on search term highlighting in full text to be completed. This is a NEW function for the open source Mirador viewer and the code we develop will be contributed to both the Mirador and Islandora open source communities.

New features being implemented include the new “Groups” feature, which gives top-level collection owners the ability to manage the logo and color scheme of their collections, as well as manage their own user accounts. And even more exciting, Group managers will be able to configure the layout of their item-level pages themselves. (So if you want the metadata to display first, second, on the left, on the right, etc., it is your decision!)

Other backend developments include implementation of checksum checking and fixity checking, which ensure the authenticity of your content.
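For readers curious about what a fixity check does, here is a minimal sketch in Python. It assumes SHA-256 checksums and a checksum recorded at ingest time; the update does not say which algorithm or storage layout the CTDA system uses, and the function names `sha256_of` and `fixity_ok` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 checksum of a file as a hex string.

    The file is read in chunks so that large media files do not
    have to fit in memory. SHA-256 is an assumption for this sketch.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded_checksum: str) -> bool:
    """Compare the file's current checksum with the one recorded earlier.

    A mismatch means the stored bytes have changed since the checksum
    was recorded.
    """
    return sha256_of(path) == recorded_checksum.lower()
```

In this sketch a checksum is generated once when a file is stored, and a fixity check later recomputes it and compares the two values. Any difference signals that the file is no longer identical to what was deposited.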

Coming soon will be implementation of presentation designs in the CTDA Sandbox.

Project Update, May 11, 2022

Not much news to report: the metadata profile is nearing completion, with just a few details left. These include decisions about taxonomies and how they will work in both management and facet displays on search results.

Preliminary work has begun on the Presentation Layer, beginning with a redesign of the “Front” page, or the landing page of the entire site. We will be putting out a call for volunteers to review our design and information architecture ideas shortly. Stay tuned!

Project Update addendum, April 22, 2022

I forgot to mention that the new system will be implementing the Mirador viewer for all supported object types. Mirador is a community-developed, open source viewer that allows for highly customizable, user-configurable presentations of content. A good example of how Mirador handles text navigation and highlighting is here: https://mirador-textoverlay.netlify.app . Note that clicking on the "Hamburger" icon on the top left will toggle the metadata, and you will be able to click around on other icons to change the display interface. The specific features available in our implementation are yet to be determined.

Project Update, April 22, 2022

The Metadata Review group has finished its analysis of the metadata crosswalk between the current system and Islandora 2. A document will be available soon. The new system allows us to customize metadata management to our local needs while still remaining consistent with national standards based on MODS. Not surprisingly, the review turned up a lot of “non-standard” metadata, mostly in the form of typographical errors and valid terms entered in the wrong fields. Mike and Heather are working on metadata normalization.

Feature testing continues as new updates become available. Most work is being done on backend systems. Feature testing has been completed on Collection search configuration, versioning, fixity checking, and checksum generation.

Feature testing will ramp up as metadata standards are implemented in the system and we begin to work on presentation information architecture and theme development.

The new Groups feature being developed will allow members to manage their user accounts and individual permissions. It will also provide a browser-based way to manage your collection’s color scheme, logos, and other aspects of your organization’s public look and feel if you so choose, giving every member many of the features of an independent site while remaining within the CTDA system.

Project Update, March 21, 2022

Both the Metadata review and the Feature testing groups have been formed and have had at least initial meetings. Both groups have open channels in CTDA Slack (#feature-testing, #metadata_review) if you are interested in following along. If you want access to the documentation from dgi about testing or metadata, please ask us at the CTDA for read access to the Google Drive.

Project Update, March 4, 2022

Most questions relating to storage, checksum checking, and meeting preservation standards have been ironed out. Next up on the agenda are two areas that begin to touch content managers and end users: Metadata profile and functionality.

We will put out a call next week for metadata experts to review the proposed i2 metadata profile. The new CTDA will not store MODS records as XML datastreams but in database tables and nodes in the Drupal system. Content managers will still work in the MODS schema on the front end and in spreadsheet ingest, but how the data is managed will change. The MODS profile will translate our MODS language into database language. We will be looking to our metadata experts to review these mappings.

Second, we have created a small group to test system functionality. Led by Anna Newman (CSL) and including Molly Woods (CHS), Sean Parke (UofH), Betsy Pittman (UConn), and Tina Panik (Avon PL), this group will begin testing core functional elements of the i2 system as they are released to the test server. These will begin with fairly mundane activities like logging in, adding a user, creating permissions, creating objects, using metadata forms, and such. After preliminary testing of the system is complete, we will open the site for read-only access by all members. Once the system is stable, probably this summer, we will begin training on using the i2 system.

The testing group will be very interested in your feedback and will be communicating with the membership as they go along.


Project Update, February 21, 2022

The last couple of weeks have been spent looking at questions relating to preservation, storage, checksum, and other deeply buried foundational frameworks that most users will never see or need, but must exist to guarantee authenticity and reliability over time. These “not shiny” parts are the heart of the preservation system.

We are also implementing a “sandbox/development” site that will begin with an out-of-the-box implementation of the Islandora 2 system. This system will be enhanced with features specific to the CTDA as illustrated in the feature list at right and in the spreadsheet. As soon as the sandbox is ready, we will open it for read-only access so members can get a look at the beginnings of the new system. NOTE: The sandbox will NOT have the CTDA theme implemented at the beginning. That will come later.

Project Update, February 4, 2022

Next week we will start working on presentation layer theming in the test environment and metadata mapping.

A Note on Metadata

Although we will still do our work within the MODS metadata schema, the data will no longer be stored in MODS-based XML files, but in data tables in the Islandora system. This is more efficient for the system and will result in enhanced performance. This change will not affect our day-to-day work in the system, as MODS elements will continue to be the basis for our metadata forms and spreadsheet ingest metadata. When MODS records are required, such as for DPLA uploads, those MODS files will be generated by the system.
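To illustrate the idea of generating a MODS record on demand from stored fields, here is a minimal sketch in Python. It is not how Islandora does it: the field names (`title`, `date_issued`) and the function `build_mods` are hypothetical, and only a tiny part of the MODS schema is shown.

```python
import xml.etree.ElementTree as ET

# Namespace of the MODS version 3 schema.
MODS_NS = "http://www.loc.gov/mods/v3"


def build_mods(fields: dict) -> str:
    """Build a small MODS XML record from a dictionary of stored fields.

    The dictionary stands in for a database row; its keys are
    assumptions made for this example only.
    """
    ET.register_namespace("mods", MODS_NS)
    root = ET.Element(f"{{{MODS_NS}}}mods")

    # Every record in this sketch must have a title.
    title_info = ET.SubElement(root, f"{{{MODS_NS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = fields["title"]

    # The issue date is optional and only written when present.
    if fields.get("date_issued"):
        origin = ET.SubElement(root, f"{{{MODS_NS}}}originInfo")
        ET.SubElement(origin, f"{{{MODS_NS}}}dateIssued").text = fields["date_issued"]

    return ET.tostring(root, encoding="unicode")
```

The point of the sketch is that the XML file does not need to be stored: it can be produced from the data tables whenever an export, such as a DPLA upload, calls for it.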

Project Update, January 14, 2022

The project officially began with a kickoff meeting between the discoverygarden team and the UConn project team on January 12, 2022.

Project Update, December 1, 2021

We continue to wait for UConn’s purchasing and contracts unit to give final approval to begin work. 

Project Update, November 1, 2021

Our project timeline is on hold while we wait for UConn's purchasing and contracts department to give the final OK to the contract with discoverygarden to implement the system migration.

Project Update, October 1, 2021

We have created a spreadsheet to connect items mentioned in the Final System Analysis Report from dgi to features and activities we are all used to doing in the current CTDA system. Like many spreadsheets, it can be hard to read in its current PDF form, but we wanted to get something out there in case people had questions.

The spreadsheet has a column called "WBS Item" which links the feature on the left with the item in the Work Breakdown Structure of the Final System Analysis Report received from dgi. If the WBS column says "Core", there is no specific WBS activity for this feature because it is part of the standard installation. Another column labeled "Parity or New Feature" explains the nature of the feature and whether it enhances an existing i7 feature or is new to the system altogether.

Project Update, September 10, 2021

The final analysis report from discoverygarden was approved by the CTDA in the first week of September. This report is now available. The report is filled with technical language and jargon, so we will produce an executive summary written in nontechnical language in the coming days.

The ultra-condensed version is that we are focusing on "feature parity" as a first step for the new system. That means that we are going to ensure that everything you can do now in the CTDA you will be able to do in the new system. This is more difficult than it may seem, since the underlying method of data management is much different from how the system currently manages data internally.

Because of the amount of work we did developing the assessment report and roadmap, we can move directly to implementation, saving months of analysis and planning time.

The Project

The CTDA is embarking on a major upgrade to its infrastructure to bring you better service and more features. Our current Islandora 7 system was originally installed in 2011 and runs on three core open-source tools: Islandora version 7, Drupal version 7, and FedoraCommons version 3.6. All three of these systems will be discontinued in 2022. We started planning last summer to migrate to more up-to-date versions of these software applications and wanted to report our progress to date and solicit feedback in some areas.

Current Technology

First, we will be expanding the technology architecture of the CTDA system. The current system, implemented in 2018, was scaled for approximately 50TB and 1.5 million objects. The current CTDA has over two million objects, and we have all been aware of slowdowns in the system, especially in derivative generation on the back end. The new infrastructure will be scaled for a Petabyte-sized system with millions of objects and will have the ability to automatically balance the load of system resources based on demand. We should see significant improvements in system performance and continued good performance as the system grows. We will be working with UConn ITS to implement the new technology infrastructure.

Migration Roadmap

Next, we worked with our software support vendor discoverygarden (dgi) to analyze the current system and create a roadmap for migration to the new system that preserves a “one to one” feature set between the two. That is, all current features available in the CTDA will be available in the new system. Some of these features may work a bit differently behind the scenes, and there will be some changes in how certain things are done in the new system. We will develop training programs and documentation in the Resource Center to smooth the transition. Islandora 8 (as it is called) also comes with a host of new features already built in.

A few of the new features that are standard in Islandora 8 include:

  • A Taxonomy feature that will make metadata management easier, faster, and more responsive.

  • A redesigned management interface which will make content management easier to understand. This new interface uses a “common sense” approach to content management without the need to understand obscure things like “RelsExt” datastream terminology and syntax.

  • A new “universal media viewer” that supports all types of multi-object content, including books, newspapers, and compound objects in a streamlined interface.

  • A browser-based, user-friendly “batch update” tool that will allow you to make metadata updates to multiple files at once.

New CTDA Features

Finally, we want to take this opportunity to implement new features in the system to enhance the user and content manager experience. While we have many ideas of our own, we want to hear from you. We’ve created an Islandora 8 wishlist channel in CTDA Slack for you to contribute your wishes. And since this is a “wish” list, don’t worry about how we might do what you are asking, or what it would cost. We just want to hear about it. We hope that this wish list will spark conversation among the membership about what you want the CTDA to become.

Project Timeline Overview
(January 2022)

The project officially began on January 12, 2022 with a kickoff meeting between representatives from discoverygarden and the UConn project team. The following list is a high-level overview of the activities in each phase with the approximate date of completion for each phase:

The project will proceed through five phases:

  • Implementation Planning (February 1)

    • Finalize the activities of the other planning phases

  • Infrastructure and Installation (March 15)

    • Deploy a “sandbox” environment on UConn servers

    • Configure hardware infrastructure with UITS (moved to later in the process)

    • Install Islandora framework and stack components (moved to later in the process)

    • Meet Core Trust Seal requirements

  • Presentation Layer Configuration (March 15)

    • Member collection branding

    • Collection limited search

    • Collection search in page header

    • Representative object view configuration

    • Member item branding

    • Paginated item display with interaction (waiting for development)

    • Newspaper display and interaction (waiting for development)

    • Iterative refinement and theme completion (in process)

  • Functional Management Configuration (July 15)

    • Permissions and access: Groups (in process)

    • Media versioning

    • Checksum features

    • Taxonomy updates (in process)

    • Content type updates (in process)

    • Third party ingest process (combined a number of other processes into one 6/22)

    • Handle service

    • Multipage CSV ingest rows

    • Manuscript content support

    • Bulk application of embargoes

    • Extracted text hit highlighting for paged content

    • IIIF/embedded frames in external web sites (added 1/22)

  • Migration (August 23)

    • Migrate I7 repository object metadata and files to new system

    • Maintain fidelity of metadata, digital files, and object relationships

    • Create taxonomies

    • Remap Handles for all repository items

    • Retire outdated domain names

System tuning and testing will begin in late August. Organizations will be migrated individually beginning in August, so there will be a time when some groups are “live” in the new system and others are still “live” in the old system. Following the last organizational transfer, the new system and the old system (with no updates) will run in parallel for a time to be determined. Final retirement of the I7 system is expected no later than December 31, 2022.