If you see a character looking like a p with a bar through the descender in the title of this post, and you see it here too ꝑ, then ... read on. And if you don't, then read on (and let me know!)
Thirty years ago, when Tim Berners-Lee, Lou Burnard, the web and I were much younger, every "special character" was a challenge, and a potential triumph or failure. "Special" meant something beyond ASCII 127 (ah, the acronyms!). It meant anything non-English, in the most limited BREXIT sense. E-acute was used by people from across the Channel, and a few Canadians, and was not to be used without Special Equipment (in those days, a Macintosh computer). Devanagari was a distant dream, and right-to-left writing an impossibility.
Nowadays, thanks to Unicode, and the work of many unsung heroes of font design, with a special shout-out to those who sat on myriad committees and shepherded the whole process to every smartphone on the planet, we have become so used to everything appearing just right, with no effort at all on our part, that we are in danger of forgetting how many miracles had to occur so that I can insert a ꝑ in my document, and you can see it. (The best miracles are made by people working together, of course.) But every now and then, something happens to remind us of how many ducks make a row.
Like many medievalists, I am a fan of Peter Baker's beautiful Junicode font. For years, I have been happily typing ꝑ into transcriptions, Word and PDF documents. This and a few other characters are very common in many medieval vernacular and Latin manuscripts. ꝑ is used as an abbreviation for per or par, as in "person" and "parish", and so found everywhere in Chaucer manuscripts (think of the Parson and the Pardoner). One of the great joys of Junicode is that it shows this character in a particularly elegant form.
By this time, we had graduated to bundling the Junicode font with our developing site, so that readers would not have to download the font to their computer. This is a well-documented process, and Font Squirrel documents it and provides neat tools to convert any font to a "webfont", easily embeddable in any web page. So I began investigating. On my computer, the character appeared fine:
- if I had Junicode on my computer, and the font embedded in the page
- if I had Junicode on my computer, and the font NOT embedded in the page
I began digging. The Unicode code point for p with a bar is A751. This is in the "general use" area of Unicode, which major fonts will support as a matter of course: so you can paste the ꝑ from this document into a Word document and use it in Times New Roman, Geneva, etc. When I looked at Junicode on my computer, using Apple's Font Book, p with a bar appeared as glyph 2007, Unicode A751, exactly as it should.
However, on my collaborator's computer, the same character appeared in a quite different place: as glyph 2066, Unicode E670 (on my computer, Junicode has a quite different character at glyph 2007).
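For anyone who wants to poke at these two code points without opening Font Book, here is a minimal Python sketch using only the standard library. The helper name `describe` is my own invention for illustration; it simply reports what Python's copy of the Unicode database knows about a character.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return a one-line summary of a single character's Unicode properties."""
    code_point = f"U+{ord(ch):04X}"
    # Private-use code points have no official name, so supply a default.
    name = unicodedata.name(ch, "<no name assigned>")
    # General category: "Ll" is a lowercase letter, "Co" is private use.
    category = unicodedata.category(ch)
    utf8_hex = ch.encode("utf-8").hex(" ")
    return f"{code_point}  {name}  category={category}  utf-8={utf8_hex}"

# The standard code point, then the older private-use one.
for ch in ("\ua751", "\ue670"):
    print(describe(ch))
```

The first line of output should carry an official character name and the category `Ll`; the second has no name at all and the category `Co`, which is the database's way of saying "whatever your font decides".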
What is going on? Why is her Junicode different from mine? On digging about, it appears that some time in the past, Junicode indeed had this character at E670. The range from E000 to F8FF is Unicode's "Private Use" area, where font makers may place whatever they like, and it appears that up to the time when p with a bar was allocated A751 in the "general use" area, Junicode put p with a bar in the "private use" area, with that encoding. This is a rather long story, involving a group called the Medieval Unicode Font Initiative (MUFI). One of the aims of this group was to have "core" characters judged as essential to scholars working with medieval western European texts incorporated into the "official" Unicode encoding. As of Unicode 5.1, 152 MUFI characters, among them p with a bar, had made it into official Unicode. It appears that my version of Junicode reflects this shift of p with a bar into official, post-5.1 Unicode. The version of Junicode on Prue's computer did not.
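If older transcriptions still carry the private-use encoding, one way to bring them into line is a simple translation table. What follows is only a sketch: the table `LEGACY_PUA_TO_STANDARD` and both function names are hypothetical, and the table holds just the one pair discussed here (E670 to A751). A real project would need the full list of characters that moved, which I do not reproduce.

```python
import unicodedata

# Hypothetical mapping table: only the single pair mentioned in this post.
LEGACY_PUA_TO_STANDARD = {0xE670: 0xA751}

def normalize_legacy(text: str) -> str:
    """Replace known legacy private-use code points with their standard ones."""
    return text.translate(LEGACY_PUA_TO_STANDARD)

def remaining_private_use(text: str) -> list:
    """List any private-use code points still present, as U+XXXX strings."""
    found = {f"U+{ord(c):04X}" for c in text if unicodedata.category(c) == "Co"}
    return sorted(found)

old_style = "\ue670son"              # "person" typed with the older encoding
new_style = normalize_legacy(old_style)
print(new_style == "\ua751son")      # the abbreviation now sits at A751
print(remaining_private_use(new_style))
```

Running `remaining_private_use` over a whole transcription is a cheap way to find out whether any other private-use characters are lurking before the text goes anywhere near a web page.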
More digging. By this time, I was suspecting that the embeddable version of Junicode did not have p with a bar at A751. But why did it display correctly on my computer? It appears that somewhere deep in the innards was an instruction to the effect: if the browser could not find the character in the embedded font, look elsewhere. So it looked in the Junicode on my computer, found it and displayed it. It did this even when I tried to fool it by calling the embedded font something else ("junicoderegular") in the CSS style sheet. However, on my collaborator's computer the Junicode did not have the character at A751, and so the browser showed an A751 from another font altogether.
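The behaviour I was seeing can be pictured with a toy model. This is emphatically not how any browser is implemented; it is a sketch of the general idea of per-character fallback, and every font name and coverage set below is invented for illustration.

```python
def pick_font(code_point, font_stack):
    """Return the name of the first font in the stack that covers code_point."""
    for name, covered in font_stack:
        if code_point in covered:
            return name
    return None  # nothing covers it: the reader sees a box or a blank

P_BAR = 0xA751

# Hypothetical coverage sets, invented for illustration only.
basic_latin = {ord(c) for c in "abcdefghijklmnopqrstuvwxyz"}
embedded_subset = set(basic_latin)        # embedded webfont lacking p with a bar
local_junicode_new = basic_latin | {0xA751}
local_junicode_old = basic_latin | {0xE670}
other_system_font = basic_latin | {0xA751}

my_machine = [
    ("embedded Junicode", embedded_subset),
    ("local Junicode (newer)", local_junicode_new),
    ("another system font", other_system_font),
]
collaborator_machine = [
    ("embedded Junicode", embedded_subset),
    ("local Junicode (older)", local_junicode_old),
    ("another system font", other_system_font),
]

print(pick_font(P_BAR, my_machine))
print(pick_font(P_BAR, collaborator_machine))
```

In this model an ordinary letter is served by the embedded font on both machines, which is why nothing looked wrong until a rare character came along.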
Eventually, after scores of emails and hours of digging, I concluded that the root of the problem lay in the embedded font. Somehow, this embedded Junicode did not have p bar where it should be. So I set to trying to correct this. First I went to the Font Squirrel webfont generator.
I uploaded the Junicode TTF from my computer, Font Squirrel converted it to a "webfont", and all seemed fine. Nope. Same problem. I dug deeper. I went to Peter Baker's "Junicode" page on Font Squirrel and used the "webfont kit" generator on that page. Nope. Same problem. With increasing desperation, I noticed that the page offered a choice of "subsets".
So, I chose "no subsetting" and created the webfont. And at last! it worked!
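With hindsight, a few lines of Python would have caught the subsetting problem before any emails were sent. The sketch below assumes the third-party fontTools library is installed; `load_code_points` and `missing_code_points` are names of my own, and the file name in the comment is hypothetical. The idea is simply to compare the non-ASCII characters a text uses against the code points the generated font file actually maps.

```python
def load_code_points(font_path):
    """Read the set of code points a font file maps to glyphs.

    Requires the third-party fontTools package (pip install fonttools).
    """
    from fontTools.ttLib import TTFont
    font = TTFont(font_path)
    try:
        cmap = font.getBestCmap() or {}   # None if the font has no Unicode cmap
        return set(cmap)
    finally:
        font.close()

def missing_code_points(text, covered):
    """Return the non-ASCII code points used in text that are not in covered."""
    needed = {ord(c) for c in text if ord(c) > 127}
    return sorted(needed - covered)

# Demonstration with a hand-made coverage set standing in for a subsetted font.
subsetted = set(range(32, 127))
for cp in missing_code_points("\ua751son and \ua751ish", subsetted):
    print(f"missing U+{cp:04X}")

# With a real file (hypothetical name):
# covered = load_code_points("junicode-webfont.woff")
# print(missing_code_points(open("transcription.txt", encoding="utf-8").read(), covered))
```

An empty result means every special character in the text has a home in the font; anything listed will be left to fallback on the reader's machine.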
All this for characters which appear just five times in some 2400 pages of manuscript transcription.
This tale casts into relief the many rough edges that exist in the interplay of fonts, glyphs, code points, Unicode ranges, and encoding systems (UTF-8? or 16? BOM or not?), all playing against multiple versions as all of these evolve and agreements are forged and renewed. The wonder is that problems like these occur so rarely.