Sunday 17 September 2023

Setting up the revised Collation Editor: some history (2023)

I am a huge fan of the "Collation Editor", built by Cat Smith of the Institute for Textual Scholarship and Electronic Editing (ITSEE) at the University of Birmingham, with substantial input from Troy Griffitts, now at the Göttingen Academy of Sciences and Humanities in Lower Saxony. Some history is required. The roots of the Collation Editor lie in my Collate software, written for the Macintosh computer from 1989 on and, in its day, used heavily by multiple editing projects. Notable among these were two groups editing Biblical texts: those associated with the Institute for New Testament Textual Research (INTF) at Münster, Germany, and David Parker and the scholars working with him at the University of Birmingham (now ITSEE).

Part of the story of how Collate begat CollateX, and CollateX begat the Collation Editor, is told in other blogs on this site: https://scholarlydigitaleditions.blogspot.com/2014/09/the-history-of-collate.html and https://scholarlydigitaleditions.blogspot.com/2014/09/collate-2-and-design-for-its-successor.html. These blogs, though here dated 2014, were written in 2007. Other parts can be deduced from an article about the evolution of digital methods in the INTF and ITSEE, written by David Parker, Hugh Houghton, Klaus Wachtel and me (you can read that article at my Academia site, or via its DOI).

The first part of this begetting is the making of CollateX. CollateX fulfilled completely the first part of the agenda I laid out in the blogs on this site: to create a system for comparison of multiple texts which was modular and independent of any one hardware or software implementation. CollateX is a marvel, and a remarkable achievement by the team of software engineers who made it (prominently, Ronald Dekker of the Huygens Institute, Amsterdam). 

The second part of this begetting was the making of the Collation Editor. This creates an entire environment in which editors can determine, through a point-and-click interface, exactly which words collate with which and how the collation is to be expressed. Essentially, the Collation Editor is an interface to, and an extension of, CollateX: it permits editors to adjust the CollateX collations until they have exactly the collations they want. For me, the test of the Collation Editor, and its implementation of CollateX, was simple: could we achieve exactly the same complex collations with the Collation Editor/CollateX as we could, from 1995 to around 2015, with Collate? The answer is, triumphantly, yes. Indeed, we could achieve far more with the Collation Editor than we ever could with Collate. Here is the tool I dreamed of in 2007. (Somewhere, I said that it would take a team of ten people ten years to make the replacement for Collate. I was not far wrong.)

Accordingly, in 2016 I started work on integrating the Collation Editor into Textual Communities. We have now used this integrated implementation to collate some four thousand lines of the Canterbury Tales, in preparation for our forthcoming Critical Edition of the Tales. You can see how this works in a video I made, collating just one line of the Tales. As you can see, the Collation Editor can create exactly the highly complex collations we want. In recent years, it has become an absolutely vital part of our work on the Tales.

However, the version we integrated in 2016, and which is still the version we are using, is now seriously outdated. Many improvements have been made to the Collation Editor since 2016 (or, in effect, 2019, when we last updated our implementation of the Collation Editor) and finally, thanks to a sabbatical, I am setting out to bring the Textual Communities version of the Collation Editor up to date. This task should be greatly eased by the re-organization and rewriting of the Collation Editor since 2019. The Collation Editor code has now been cleanly divided into a "core" code library, designed so that the whole core can fit inside any implementation and be easily updated, and a "services" code library, which connects the core to whatever implementation you want. In our case, we use MongoDB document databases to store all our information about our texts, and hence everything the Collation Editor needs to function has to be linked to our MongoDB databases.
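To make the core/services division concrete, here is a minimal sketch of the general pattern in Python. It is not the Collation Editor's actual code or API: the names `WitnessStore`, `InMemoryStore` and `gather_unit` are hypothetical, the sample texts are placeholders, and a plain in-memory dictionary stands in for a document database such as MongoDB. The point is only that the "core" function never needs to know where the texts are stored.

```python
from typing import Protocol


class WitnessStore(Protocol):
    """Hypothetical 'services' interface: how the core asks for a witness text."""

    def get_witness_text(self, witness_id: str, unit_id: str) -> str: ...


class InMemoryStore:
    """Stand-in for a document database; each document is a plain dict."""

    def __init__(self, documents: list[dict]) -> None:
        # Index the documents by (witness, unit) for direct lookup.
        self._by_key = {(d["witness"], d["unit"]): d["text"] for d in documents}

    def get_witness_text(self, witness_id: str, unit_id: str) -> str:
        return self._by_key[(witness_id, unit_id)]


def gather_unit(
    store: WitnessStore, witness_ids: list[str], unit_id: str
) -> dict[str, list[str]]:
    """'Core' side: collect whitespace-tokenised readings for one unit.

    It only talks to the store interface, so the storage can be swapped.
    """
    return {w: store.get_witness_text(w, unit_id).split() for w in witness_ids}


if __name__ == "__main__":
    # Placeholder data, invented purely for illustration.
    store = InMemoryStore(
        [
            {"witness": "A", "unit": "line-1", "text": "the quick brown fox"},
            {"witness": "B", "unit": "line-1", "text": "the quik brown fox"},
        ]
    )
    print(gather_unit(store, ["A", "B"], "line-1"))
```

Swapping `InMemoryStore` for a class that reads from a real database would leave `gather_unit` untouched, which is the property that makes a separately updatable core possible.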

In the next posts, I will explain how I went about setting up the updated core collation tools of the Collation Editor to work within Textual Communities, in the same way as a series of blogs on StaticSearch explains how I got this to work with our data.


