December 2023 fixes

I made a bunch of fixes to Latin verse (also the Odyssey). hic and hoc should mostly have correct, conventional macrons now (I originally made a pedantic decision to macronize them all when they are long by hidden quantity). hujus, cujus and their friends are also better behaved.
I have a version of Aeschylus’ Agamemnon ready, but I need to do some work on the presentation of the different sections (stasimon, kommos, anapaests, etc.).

More standoff attributes

I am also considering structural attributes for the standoff, so as to mark lyric stanzas, episodes, etc. Some ideas follow. These would be applied to each syllable in a section, but would refer to the entire section. I know this could be done with enclosing tags, but for the sake of simplicity and flexibility the verse XML is intended to be equivalent to CSV:

div="prologue" style="ia"
div="episode" style="an"
div="episode" style="ia"
div="episode" style="tr"
div="choral" style="de" stanza="strophe"
div="choral" style="de" stanza="antistrophe"
div="choral" style="ae" stanza="epode"
div="choral" style="do" stanza="epode"
div="kommos" style="do" stanza="strophe"
div="choral" style="io" stanza="strophe"
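
Since each syllable would repeat its section's attributes, sections can be recovered by grouping consecutive syllables that share the same values. Here is a minimal sketch of that idea in Python; the syllable records and the `sections` helper are hypothetical illustrations, not part of the actual files.

```python
from itertools import groupby

# Hypothetical per-syllable records: every syllable repeats the attributes
# of the section it belongs to (as in a flat, csv-like table).
sylls = [
    {"text": "a", "div": "prologue", "style": "ia"},
    {"text": "b", "div": "prologue", "style": "ia"},
    {"text": "c", "div": "choral", "style": "de", "stanza": "strophe"},
    {"text": "d", "div": "choral", "style": "de", "stanza": "strophe"},
    {"text": "e", "div": "choral", "style": "de", "stanza": "antistrophe"},
]

def section_key(syll):
    """The attribute triple that identifies a syllable's section."""
    return (syll.get("div"), syll.get("style"), syll.get("stanza"))

def sections(sylls):
    """Group consecutive syllables with identical section attributes."""
    return [
        (key, [s["text"] for s in group])
        for key, group in groupby(sylls, key=section_key)
    ]

for key, texts in sections(sylls):
    print(key, texts)
```

One limitation of this flat approach: two adjacent sections with identical attributes (say, two iambic episodes with nothing between them) would merge into one group, so a real scheme might need a counter or id attribute as well.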

Are there other formal sections that deserve marking? Agon? Epirrheme/Syzygy? I guess I can deal with these as I come to them (esp. when I get to comedy), but it would be nice to have a clear view of the ontology early on. The trick is to distinguish between form and content (e.g. what makes an agon an agon?).
There are four top-level categories here: prologue, episode, choral, kommos. I am tempted to see prologue as a type of episode, and kommos as a type of choral, but for now they seem legitimately distinct. I don’t currently plan to mark parodos and exodos, since these are always choral[0] and choral[-1] (right?). Well, I know there is in fact a variety of ways a chorus is introduced to an audience or says goodbye to it (frogs and furies come to mind), but that seems like a good reason not to overprivilege the parodos/exodos category.
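
If parodos and exodos really are just the first and last choral sections, they can be derived rather than marked. A small sketch, assuming an ordered list of section labels (the `play` list is invented for illustration and is not taken from any real text):

```python
def first_and_last_choral(divs):
    """Return (index of first choral section, index of last), or None."""
    choral = [i for i, d in enumerate(divs) if d == "choral"]
    if not choral:
        return None
    return choral[0], choral[-1]

# Hypothetical section order; kommos is kept distinct from choral here.
play = ["prologue", "choral", "episode", "choral", "episode", "kommos", "choral"]

print(first_and_last_choral(play))  # (1, 6)
```

Note that whether a kommos counts as choral changes the answer when a play ends with one, which is another reason to settle the top-level categories first.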
These are the ‘features’ a syllable can have (listed below). colon and period mark the final syllable in lyric units, though colon is arguably equivalent to line, and so might not be used. lbn (long by nature) marks long α, ι, υ where accent or iota subscript doesn’t already show the length. bil is brevis in longo. empty is for added syllables without a conjecture about content. link is for link syllables in dactylo-epitrite (D-E).

'diastole', 'systole', 'res1', 'res2', 'lbn', 'hiatus', 
'bil', 'synizesis', 'crasis', 'elision', 'empty',
'anceps', 'added', 'link', 'colon', 'period'
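
A closed vocabulary like this is easy to check mechanically. The sketch below validates the value of a features attribute against the list above. It assumes that multiple features would be separated by whitespace, which the samples here do not confirm, and `parse_features` is a hypothetical helper name.

```python
FEATURES = frozenset({
    'diastole', 'systole', 'res1', 'res2', 'lbn', 'hiatus',
    'bil', 'synizesis', 'crasis', 'elision', 'empty',
    'anceps', 'added', 'link', 'colon', 'period',
})

def parse_features(attr):
    """Split a features attribute into names and reject unknown ones.

    Assumption: multiple features are whitespace-separated.
    """
    names = attr.split()
    unknown = [n for n in names if n not in FEATURES]
    if unknown:
        raise ValueError("unknown feature(s): " + ", ".join(unknown))
    return names

print(parse_features(""))            # []
print(parse_features("crasis"))      # ['crasis']
print(parse_features("bil period"))  # ['bil', 'period']
```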

Treebanks, Tragedy, PDFs and Standoff XML

I’ve been stalled here for a while. I spent a little time connecting my metrical analyses with the Perseus treebanks, and realized that it makes much more sense to use those (where available) as a base text for my work. It’s taken me some time to get my head around what it will take to incorporate them into my workflow, and since there are treebanks for most of Sophocles and Aeschylus, this has been something of a blocker; but I’m just about there. The plan is now to use forked treebanks (i.e. my own lightly edited versions, hosted on GitHub) as base texts, and to construct standoff XML files with metrical info:

1. The standoff XML will link to the treebank via sentence and word id.
2. The standoff XML will be syllable based, not word based.

Here’s a sample:

<syll features="" line="92" metre="ia6" qty="short" sentence="2900129" speaker="Ἰσμήνη">
<seg word="5">πρέ</seg>
</syll>
<syll features="" line="92" metre="ia6" qty="long" sentence="2900129" speaker="Ἰσμήνη">
<seg word="5">πει</seg>
</syll>
<syll features="crasis" line="92" metre="ia6" qty="long" sentence="2900129" speaker="Ἰσμήνη">
<seg word="8">τ</seg>
<seg word="6">ἀ</seg>
</syll>
<syll features="" line="92" metre="ia6" qty="long" sentence="2900129" speaker="Ἰσμήνη">
<seg word="6">μή</seg>
</syll>
<syll features="" line="92" metre="ia6" qty="short" sentence="2900129" speaker="Ἰσμήνη">
<seg word="6">χα</seg>
</syll>
<syll features="bil" line="92" metre="ia6" qty="long" sentence="2900129" speaker="Ἰσμήνη">
<seg word="6">να</seg>
</syll>

The main change to the treebanks is to split all instances of crasis, so that two words which share a syllable are properly identified there (e.g. τἀμήχανα). A speaker attribute is also added to the sentence element. Minor changes to line numbers and text are unavoidable when creating a sane map of lyric patterns (Storr’s Sophocles, for instance, is based on Jebb, which is out of date, but even then Storr made errors in transcription of his source).
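
To illustrate how the sentence and word ids tie syllables back to treebank words, here is a rough sketch using Python's standard `xml.etree.ElementTree`. It reuses the sample above, wrapped in a `<verse>` root element that I am assuming only so the fragment parses as one document; the function names are likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The sample syllables, wrapped in a hypothetical <verse> root element.
ATTRS = 'line="92" metre="ia6" sentence="2900129" speaker="Ἰσμήνη"'
SAMPLE = f"""<verse>
<syll features="" qty="short" {ATTRS}><seg word="5">πρέ</seg></syll>
<syll features="" qty="long" {ATTRS}><seg word="5">πει</seg></syll>
<syll features="crasis" qty="long" {ATTRS}>
<seg word="8">τ</seg>
<seg word="6">ἀ</seg>
</syll>
<syll features="" qty="long" {ATTRS}><seg word="6">μή</seg></syll>
<syll features="" qty="short" {ATTRS}><seg word="6">χα</seg></syll>
<syll features="bil" qty="long" {ATTRS}><seg word="6">να</seg></syll>
</verse>"""

def words_from_standoff(xml_text):
    """Map (sentence id, word id) to the word text rebuilt from its segments."""
    words = defaultdict(str)
    for syll in ET.fromstring(xml_text).iter("syll"):
        for seg in syll.iter("seg"):
            words[(syll.get("sentence"), seg.get("word"))] += seg.text or ""
    return dict(words)

def shared_syllables(xml_text):
    """List (line, word ids) for syllables whose segments span several words."""
    shared = []
    for syll in ET.fromstring(xml_text).iter("syll"):
        ids = [seg.get("word") for seg in syll.iter("seg")]
        if len(set(ids)) > 1:
            shared.append((syll.get("line"), ids))
    return shared

for key, text in words_from_standoff(SAMPLE).items():
    print(key, text)
print(shared_syllables(SAMPLE))
```

Each `(sentence, word)` key could then be looked up in the forked treebank, and the second function picks out exactly the syllables (such as the crasis here) that motivated splitting words in the treebank.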

There is some duplication of info right now, but I think in the finished product speaker and line number will only need to be in the treebank (I know speakers sometimes finish each other’s sentences in antilabe, but I don’t think those ever count as the same sentence in the treebanks… right?).

Syllable-level tagging in the metrical XML allows for one very important option: splitting lines mid-word in lyric. Such splits are largely an artifact of the physical page, but without them one does end up with some rather long lines. I’m doing some work on tagging schemes that will allow for identifying sub-units in lines and cola (e.g. the choriamb in a glyconic, or the D in a D-E line), and I’m hopeful that will allow for moving beyond some of the frustrations imposed by the print/manuscript tradition.

This work is ongoing and moving quickly (Soph OT and Ant are largely done), so initial results should be available here soon.

PDFs:

This summer I did a lot of work on automatic conversion of these metrical texts to PDF. Hopefully that work will show up here soon.

Conspectūs Lectionum:

Part of the PDF work involved building a laborious workflow for developing tables of alternative readings found in modern editions. That’s done for Vergil’s Aeneid, and when time allows, I’ll pursue it for other texts.