arca- A digital implementation of Athanasius Kircher's device for automatic music composition, the Arca musarithmica of 1650

Copyright: (c) 2022 Andrew A. Cashner
Safe Haskell: None




This module reads (lectio, Latin, "I read") and processes input text to be set to music using the ark.

Kircher's specification

Kircher expects the user to prepare a text by segmenting it into phrases according to the poetic meter and prosody. In his description the texts are Latin, but he also demonstrates how the machine could be used with Aramaic and other languages, ideally by Jesuit missionaries.


XML input

In our implementation we also expect the user to mark the input text by dividing the syllables with hyphens and marking the long syllables with accent symbols (`, placed before the relevant syllable), for example:

Lau-`da-te `Do-mi-num `om-nis `ter-rae. Al-le-`lu-ia. A-`men.

This implementation takes input in the form of an XML document, in which the text is syllabified and accented as just demonstrated, and divided into one or more sections. In the attributes for each <section> element, the user sets the values we need as input for the ark:

  • text meter: e.g., Prose or Adonium
  • musical meter: Duple, TripleMinor, or TripleMajor
  • style: Simple (= Syntagma I) or Florid (= Syntagma II)
  • tone: e.g., Tone1

Within each section the text is divided into one or more line groups (<lg>) and lines (<l>). (These elements are borrowed from TEI.)



In Prose meter, Kircher leaves it up to the user to divide the text into phrases. We are currently using a very simple algorithm to divide the text into phrase groups within the correct size range. It would be better to use a more sophisticated algorithm to parse the text into optimal groups.

Reading and parsing the input file

The main function is prepareInput, which reads and parses the file and produces a list of LyricSections.

This module reads the input file, parses the XML tree to extract the text and needed parameters for setting the text (within each section), and then packages the text into its own data structures to pass on to the other parts of the program (Cogito for processing and Scribo for writing output).

Capturing XML data

The text is first grouped into intermediate data structures that closely reflect the XML structure. Each <section> becomes an ArkTextSection, containing a nested list of strings (line groups and lines from XML) and an ArkConfig with the parameters from the XML section attributes. The list of these is packaged into a single ArkInput structure containing metadata for the whole document (taken from the XML <head>), and a list of ArkTextSections.

Preparing for musical setting

The module then processes this data and converts it into a list of LyricSections that the other modules will use. Below are the structures that are passed on to other modules, from top down. Each structure contains the element below it, plus information about it (length, number of syllables, etc.). To get that information, these structures are created with methods that calculate the data upfront.

LyricSection: group of sentences (from <section>)
  • also contains an ArkConfig with the text-setting parameters
LyricSentence: group of phrases (from <lg>)
LyricPhrase: group of words (from <l>)
Verbum: individual word, broken into syllables

Read input file

Global settings for input format

hyphenChar :: Char Source #

The character used to demarcate syllables (default '-')

accentChar :: Char Source #

The character used at the beginning of syllables to show long (or accented) syllables (default '`')

Storing XML data

data ArkMetadata Source #

Header information


Show ArkMetadata Source # 
Defined in Lectio

data ArkInput Source #

The input to the ark is the metadata for the whole document plus a list of ArkTextSections. Each section pairs an ArkConfig (tone, style, and meter) with the text of that section, whose strings will become LyricSentences.

Show ArkInput Source # 
Defined in Lectio

data ArkTextSection Source #

A section of input text (from the XML <section> element)




Show ArkTextSection Source # 
Defined in Lectio

xmlSearch :: String -> QName Source #

Create a QName to search the XML tree

xmlNodeText Source #

  :: String    -- element name
  -> Element   -- the node
  -> String    -- node text

Get the text from a node

cleanUpText :: [String] -> [String] Source #

For each string in the list, split the text at newlines, strip leading and trailing whitespace from each resulting line, and remove any empty strings.

strip :: String -> String Source #

Strip leading and trailing whitespace from a String
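
The two text-cleaning helpers described above could be sketched as follows. This is a minimal illustration that uses only base library functions; it assumes the behavior stated in the descriptions and is not necessarily how the module implements them.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Strip leading and trailing whitespace from a String.
strip :: String -> String
strip = dropWhileEnd isSpace . dropWhile isSpace

-- | Split every string at newlines, strip each resulting line,
-- and drop the lines that are empty after stripping.
cleanUpText :: [String] -> [String]
cleanUpText = filter (not . null) . map strip . concatMap lines
```

Under these assumptions, cleanUpText ["  Lau-`da-te \n\n `Do-mi-num "] yields the two strings "Lau-`da-te" and "`Do-mi-num".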

readInput :: String -> ArkInput Source #

Read an XML string and return the data for input to the ark (ArkInput)

parseSection :: Element -> ArkTextSection Source #

Parse an XML node tree into a section with configuration and parsed text.

Hierarchical text groupings by word, phrase, and sentence

Verbum: Single words and syllables

type SylLen = PenultLength Source #

Every syllable is either Long or Short.

data Verbum Source #

Our data type for a word includes the original text of the word, that text chunked into syllables, the count of those syllables, and a marker of whether the penultimate syllable is short or long.




Eq Verbum Source # 
Defined in Lectio


(==) :: Verbum -> Verbum -> Bool #

(/=) :: Verbum -> Verbum -> Bool #

Ord Verbum Source # 
Defined in Lectio

Show Verbum Source # 
Defined in Lectio

LyricPhrase: Multiple words

data LyricPhrase Source #

A LyricPhrase is a group of Verbum items (words): it contains the list of words, the total count of syllables in the phrase, and a marker for the phrase's penultimate syllable length.




LyricSentence: Multiple phrases

type PhrasesInLyricSentence = Int Source #

Each sentence includes the number of phrases therein

type PhrasesInLyricSection = [PhrasesInLyricSentence] Source #

A list of totals of phrases in a section

LyricSection: Multiple sentences with parameters for text-setting

data LyricSection Source #

A LyricSection includes a list of LyricSentences and an ArkConfig.

Including an ArkConfig structure makes it possible to structure the input text and program the ark to change meters or tones for different sections.

Get phrase lengths for prepared text

sectionPhraseLengths :: LyricSection -> PhrasesInLyricSection Source #

Get the number of phrases per sentence for a whole section.

inputPhraseLengths :: [LyricSection] -> [PhrasesInLyricSection] Source #

Get the phrase lengths for the whole input structure
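
As a rough illustration of how the phrase counts could be collected, the sketch below uses simplified stand-in types. The field names sentencePhrases and sectionSentences and the helper sentencePhraseCount are hypothetical, and phrases are reduced to lists of plain strings; the real LyricSection also carries an ArkConfig.

```haskell
type PhrasesInLyricSentence = Int
type PhrasesInLyricSection  = [PhrasesInLyricSentence]

-- Stand-in types (assumptions): a phrase is just a list of words here.
newtype LyricSentence = LyricSentence { sentencePhrases :: [[String]] }
newtype LyricSection  = LyricSection  { sectionSentences :: [LyricSentence] }

-- | Number of phrases in one sentence (hypothetical helper).
sentencePhraseCount :: LyricSentence -> PhrasesInLyricSentence
sentencePhraseCount = length . sentencePhrases

-- | Number of phrases per sentence for a whole section.
sectionPhraseLengths :: LyricSection -> PhrasesInLyricSection
sectionPhraseLengths = map sentencePhraseCount . sectionSentences

-- | Phrase counts for every section of the input.
inputPhraseLengths :: [LyricSection] -> [PhrasesInLyricSection]
inputPhraseLengths = map sectionPhraseLengths
```

For a section whose first sentence has two phrases and whose second has one, sectionPhraseLengths gives [2, 1].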

Methods to read and store textual data into the above structures

newLyricPhrase :: [Verbum] -> LyricPhrase Source #

Take a simple list of Verbum items and make a LyricPhrase structure from it: the original list is stored as phraseText, and the phraseSylCount and phrasePenultLength are calculated from that list. The phraseSylCount is the sum of all the sylCounts of the words in the list. The phrasePenultLength is the penultLength of the last list item.
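
The calculation described above can be sketched as follows. The Verbum type here is a reduced stand-in: verbumText is a hypothetical field name, and only sylCount, penultLength, phraseText, phraseSylCount, and phrasePenultLength come from the description. The treatment of an empty list is an assumption, since the description does not cover that case.

```haskell
data PenultLength = Long | Short deriving (Eq, Show)

-- Reduced stand-in for Verbum with only the fields this sketch needs.
data Verbum = Verbum
  { verbumText   :: String        -- hypothetical field name
  , sylCount     :: Int
  , penultLength :: PenultLength
  } deriving (Eq, Show)

data LyricPhrase = LyricPhrase
  { phraseText         :: [Verbum]
  , phraseSylCount     :: Int
  , phrasePenultLength :: PenultLength
  } deriving (Eq, Show)

-- | Build a phrase and calculate its summary data upfront.
newLyricPhrase :: [Verbum] -> LyricPhrase
newLyricPhrase ws = LyricPhrase
  { phraseText         = ws
  , phraseSylCount     = sum (map sylCount ws)
  , phrasePenultLength = lastPenult ws
  }
  where
    -- Assumption: an empty phrase counts as Short.
    lastPenult [] = Short
    lastPenult xs = penultLength (last xs)
```

With the words Laudate (3 syllables, Long) and Dominum (3 syllables, Short), the resulting phrase has a phraseSylCount of 6 and a phrasePenultLength of Short.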

newVerbum :: String -> Verbum Source #

Take a String and create a Verbum structure:

  • strip the text of markup by removing hyphenChar and accentChar characters
  • extract syllables by stripping accents and splitting at hyphens
  • get syllable count from list created in previous step
  • get penultimate syllable length from list of syllables including accents, using penultValue
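
The first three steps above might look like the following sketch. Apart from hyphenChar and accentChar, every name here (plainText, splitAtChar, syllables, accentedSyllables, syllableCount) is hypothetical and chosen for illustration; the fourth step is left to penultValue.

```haskell
-- Marker characters, using the default values documented for this module.
hyphenChar, accentChar :: Char
hyphenChar = '-'
accentChar = '`'

-- | Step 1: the plain word, with both kinds of marker removed.
plainText :: String -> String
plainText = filter (`notElem` [hyphenChar, accentChar])

-- | Split a string at every occurrence of a delimiter character.
splitAtChar :: Char -> String -> [String]
splitAtChar c s = case break (== c) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitAtChar c rest

-- | Step 2: syllables with the accent marks removed.
syllables :: String -> [String]
syllables = splitAtChar hyphenChar . filter (/= accentChar)

-- | Syllables with accent marks kept, as needed for the fourth step.
accentedSyllables :: String -> [String]
accentedSyllables = splitAtChar hyphenChar

-- | Step 3: the number of syllables.
syllableCount :: String -> Int
syllableCount = length . syllables
```

For the input "Lau-`da-te", plainText gives "Laudate", syllables gives ["Lau", "da", "te"], and accentedSyllables gives ["Lau", "`da", "te"].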

Helper methods for parsing

penultValue :: [String] -> SylLen Source #

Determine the length of the next-to-last syllable in a list of syllables. If the list has fewer than two items, or if the penultimate syllable (found using penult) does not begin with accentChar, then the result is Short; otherwise it is Long.

penult :: [a] -> Maybe a Source #

Return the next-to-last item in a list.
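
A minimal sketch of these two helpers, following the rules stated above. The PenultLength declaration is a stand-in for the type the module uses, with the two constructors named in the description.

```haskell
-- Stand-in for the module's syllable-length type.
data PenultLength = Long | Short deriving (Eq, Show)
type SylLen = PenultLength

accentChar :: Char
accentChar = '`'

-- | Return the next-to-last item of a list, if there is one.
penult :: [a] -> Maybe a
penult xs = case reverse xs of
  (_ : p : _) -> Just p
  _           -> Nothing

-- | Long only when a penultimate syllable exists and starts with accentChar.
penultValue :: [String] -> SylLen
penultValue syls = case penult syls of
  Just (c : _) | c == accentChar -> Long
  _                              -> Short
```

Here penultValue ["Lau", "`da", "te"] is Long, while penultValue ["`Do", "mi", "num"] is Short because the accent falls on the first syllable rather than the penultimate one.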

Grouping prose

rephrase Source #

  :: Int            -- maximum syllable count per group
  -> LyricPhrase    -- text already parsed into a LyricPhrase
  -> [LyricPhrase]  -- old phrase broken into list of phrases

Regroup a phrase into groups of words, with the total syllable count of each group not exceeding a given maximum.

TODO: Replace with more sophisticated algorithm:

  • what to do if word is longer than maxSyllables? (break it into parts?)
  • optimize this for best grouping, not just most convenient in-order
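
One simple in-order strategy of the kind described here is greedy grouping: keep adding words to the current group until the next word would push it over the maximum. The sketch below shows that idea on plain (word, syllable count) pairs rather than on LyricPhrase values. The name groupBySyllables is hypothetical, and this is not necessarily the algorithm the module uses. A word longer than the maximum is simply placed in a group of its own, which is one possible answer to the open question in the TODO, not a recommendation.

```haskell
-- | Greedy in-order grouping of (word, syllable count) pairs so that each
-- group's total stays within maxSyl wherever that is possible.
groupBySyllables :: Int -> [(String, Int)] -> [[(String, Int)]]
groupBySyllables maxSyl = go [] 0
  where
    -- acc holds the current group in reverse order; n is its syllable total.
    go acc _ [] = [reverse acc | not (null acc)]
    go acc n (w@(_, k) : ws)
      -- An empty group always accepts the next word, even an oversized one.
      | null acc || n + k <= maxSyl = go (w : acc) (n + k) ws
      -- Otherwise close the current group and start a new one with this word.
      | otherwise                   = reverse acc : go [w] k ws
```

With a maximum of 6, the words Laudate (3), Dominum (3), omnis (2), and terrae (2) are split into two groups: Laudate Dominum, then omnis terrae.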

Read the whole text

prepareInput :: ArkInput -> [LyricSection] Source #

Prepare the entire input structure