Research on women in late antiquity is often very difficult because of a glaring lack of sources. Letters addressed to women are rare, writings by women even rarer. The correspondence of Augustine, bishop of Hippo Regius, written between 386 and 430 AD, is no exception. Of the 252 letters he wrote himself, only 17 are addressed exclusively to women. Their responses are not extant.
With such uneven material, one way to start is to find a letter to a woman and a letter to a man written in similar situations. Ep. 218 to Palatinus (CSEL 57, p. 425,9 - p. 428,4) and ep. 266 to Florentina (CSEL 57, p. 647,1 - p. 650,20) appear to be suitable candidates.
Palatinus, a young nobleman, had chosen a life of asceticism, which Augustine encourages him to continue. Florentina, also of very noble birth, had chosen the same life and wants Augustine to teach her. However, it was Florentina’s mother who informed Augustine about her daughter’s wish, because the girl was too shy to ask him herself. He offers Florentina his services and encourages her to ask him questions if she finds no answer by herself. In both cases, Augustine acts as a mentor helping young people to hold on to their chosen Christian way of life. But does he really treat Palatinus and Florentina the same way? Is it possible to find differences in his tone?
In this post, I want to find an answer to this question by focusing on the vocabulary. If I simply compare unmodified word lists, the results are not very meaningful. The letter to Palatinus contains 570 words; the letter to Florentina, with 738 words, is a bit longer. The most frequent words to Palatinus are et (26), in (23), ut (12), non (11), and te (10); to Florentina et (22), in (21), est (18), quod (14), and non (13). The explanatory power of this outcome is rather limited and does not feel very exciting. (If you have a digitized version of the text, you can easily generate word statistics by using Voyant Tools.)
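For readers who prefer a script to Voyant Tools, the raw counts described above can be approximated as follows. This is only a minimal sketch: the function name `word_frequencies` and the sample sentence are made up for illustration, and the simple `[a-z]+` tokenizer assumes plain ASCII Latin text.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count lowercase word tokens in a plain-text string."""
    # Assumption: the text is plain ASCII Latin, so [a-z]+ is enough.
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


# Hypothetical sample sentence, not taken from either letter.
sample = "Et in terra et in caelo non est ut te"
freqs = word_frequencies(sample)

# The two most frequent tokens with their counts.
top = freqs.most_common(2)
print(top)
```

In a real run, `sample` would be replaced by the contents of one letter read from a text file.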
Small words can be very important for authorship attribution, where the position and frequency of prepositions or conjunctions can be characteristic of a particular author. However, if we are interested in content, they get in the way, which is why we need a list of “stop words” that are excluded from the count. (I use a stop word list by Gene Diaz which I have extended: a, ab, ac, ad, at, atque, aut, autem, cum, de, dum, e, enim, erant, erat, ergo, est, et, etiam, ex, haec, hic, hoc, iam, in, ipse, ita, me, ne, nec, neque, non, per, qua, quae, quam, que, qui, quia, quibus, quidem, quippe, quo, quod, quoniam, re, rebus, rem, res, sed, si, sic, sicut, sive, sum, sunt, tamen, tandem, tantus, te, unde, ut, vel, velut.) You can easily add the stop word list in Voyant Tools by clicking on the symbol marked by the red arrow (see screenshot below). In the window that appears, click on Edit List and type in (or, better, copy and paste) the stop words (but be careful: one term per line).
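Outside Voyant Tools, the same filtering step could look roughly like this. The stop word list is the one given above; the helper name `content_frequencies` and the sample tokens are hypothetical and serve only as illustration.

```python
from collections import Counter

# The extended stop word list quoted above, one set entry per term.
STOP_WORDS = set("""
a ab ac ad at atque aut autem cum de dum e enim erant erat ergo est et
etiam ex haec hic hoc iam in ipse ita me ne nec neque non per qua quae
quam que qui quia quibus quidem quippe quo quod quoniam re rebus rem res
sed si sic sicut sive sum sunt tamen tandem tantus te unde ut vel velut
""".split())


def content_frequencies(tokens):
    """Count only those tokens that are not on the stop word list."""
    return Counter(t for t in tokens if t not in STOP_WORDS)


# Hypothetical token sequence, not a quotation from the letters.
tokens = "et in te confido et tuus bonus est".split()
content = content_frequencies(tokens)
print(content.most_common())
```

Only confido, tuus, and bonus survive the filter here, because every other token in the sample is on the list.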
Yet stop words alone are not enough to produce reliable results. The counts are still skewed by different inflected forms of the same word. Augustine might use the verb “go” ten times, but if the text is not lemmatized, we may find “go” four times, “went” three times, and “gone” three times. If there are nine cases of “walk” in the text, it may then seem as if Augustine is referring to “walking” rather than “going,” which would be wrong in this case. This is why we should lemmatize the text. (To do this, I processed my text files with the TreeTagger; I described my experiences with that tool in an earlier blog post.)
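To illustrate the effect of lemmatization, here is a small sketch that counts lemmas instead of surface forms. It assumes the tagger output has been saved as tab-separated lines of token, part-of-speech tag, and lemma, with `<unknown>` marking words the tagger could not lemmatize. The function name `lemma_counts`, the sample lines, and the tag labels in them are invented placeholders, not real output for the letters.

```python
from collections import Counter


def lemma_counts(tagged_lines):
    """Count lemmas from lines of the form token<TAB>tag<TAB>lemma."""
    counts = Counter()
    for line in tagged_lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        token, _tag, lemma = parts
        if lemma == "<unknown>":
            lemma = token  # fall back to the surface form
        counts[lemma.lower()] += 1
    return counts


# Hypothetical tagged lines; the tags are placeholders.
tagged = [
    "confidis\tV\tconfido",
    "confidimus\tV\tconfido",
    "tuam\tPRON\ttuus",
    "Palatine\tN\t<unknown>",
]
lemmas = lemma_counts(tagged)
print(lemmas.most_common())
```

The two different forms confidis and confidimus are now counted together under the lemma confido.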
If we compare the two letters now, it looks like this:
|Ep. 218 to Palatinus||Ep. 266 to Florentina|
In ep. 218, Augustine seems to be quite empathetic. He speaks directly to his addressee, using mostly tuus and tu. Ep. 266 looks more distant: the first person (ego, me) is used nearly as frequently as the second person. While Augustine includes Palatinus in a shared “we” (noster), he keeps his distance from Florentina (meus). In the letter to Palatinus, we can find very positive words (bonus, confido); the words in ep. 266 are more neutral (scio, dico, do, doceo).
Well, an analysis like this still makes for a rather weak argument. Comparing raw frequencies alone is too simple to yield further insights. We can go a step further by asking not how frequently a word is used in a letter, but how unusually frequent it is in that letter compared to the rest of the corpus. In other words, I wanted to find the keywords of the letters.
To achieve this, I acquired the software WordSmith (a single user license is available for 50 pounds).
First, I had to click on “WordList” to transform all letters into a word list format; luckily, you can select all files together. (I have the letters as raw text files, one letter per document, and I chose the versions that had already been lemmatized by the TreeTagger.) Afterwards, I generated another word list using all letters in one file, so that I had a single wordlist for every letter and another wordlist covering all letters together.
Finally, I was able to generate the keywords. I chose all wordlist files in the first line. As a reference corpus, I took the file which covered all letters together. Then I ran a batch which produced an Excel file covering all letters and their keywords. After some modifications to make it more user-friendly, I published my result as ‘Keywords in Augustine’s Letters (KAL)’.
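The idea behind such a keyword list can be sketched in a few lines. The example below uses the log-likelihood statistic, one common keyness measure. Whether it corresponds to the settings used for KAL is an assumption on my part, and the function names and all counts are invented for illustration. As in the workflow above, the reference word list covers the whole corpus, including the letter itself.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Keyness of a word occurring a times in a text of c words
    and b times in a reference corpus of d words."""
    e1 = c * (a + b) / (c + d)  # expected count in the text
    e2 = d * (a + b) / (c + d)  # expected count in the reference
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(target, reference, top_n=5):
    """Rank words that are over-represented in target by keyness."""
    c = sum(target.values())
    d = sum(reference.values())
    scored = []
    for word, a in target.items():
        b = reference.get(word, 0)
        if a / c > b / d:  # keep only positive keywords
            scored.append((word, log_likelihood(a, b, c, d)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Invented lemma counts for one letter and for the whole corpus.
letter = Counter({"tuus": 8, "confido": 3, "dico": 2, "deus": 2})
corpus = Counter({"tuus": 40, "confido": 4, "dico": 120,
                  "deus": 200, "scio": 36})
key = keywords(letter, corpus)
print(key)
```

With these made-up numbers, tuus and confido come out as keywords, while dico and deus do not, because they are relatively more frequent in the corpus than in the letter.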
But what does a keyword analysis contribute to our analysis of the letters to Palatinus and Florentina?
Ep. 218 to Palatinus
Ep. 266 to Florentina
The results are even clearer than before. While tuus and confido, which reveal a friendly atmosphere, are keywords in ep. 218 to Palatinus, ep. 266 emphasizes only the issue at hand: becoming a teacher (doctor) for the girl. As late as 1935, Keenan (p. 53) described this letter as a sign of Augustine’s ‘characteristic kindness and modesty,’ a view that many researchers have shared. From a more quantitative perspective, however, Augustine appears rather standoffish in this letter.
Whether Augustine’s different attitudes in these letters depend on gender cannot be decided on this basis, but the result gives a first hint that it could be so. To answer the question, we have to analyze the texts more deeply and combine our results with interpretative close reading. Nevertheless, this quantitative analysis of Augustine’s vocabulary has already shown that things are not always what they seem to be.
Keywords in Augustine’s Letters, ed. Christopher Alexander Nunn (online resource: https://doi.org/10.11588/data/R5RAXO), heiDATA: Heidelberg Research Data Repository 2018.
TreeTagger, ed. Helmut Schmid (online resource: http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/), Stuttgart 1994.
Voyant-Tools, ed. Stefan Sinclair / Geoffrey Rockwell (online resource: http://www.voyant-tools.org), 2016.
WordSmith Tools 7.0, ed. Mike Scott (online resource: http://www.lexically.net/wordsmith/index.html), Oxford 2016.
Augustini Epistulae 185-270, ed. Alois Goldbacher, vol. 4, CSEL 57, Prag / Wien / Leipzig 1911.
Keenan, Sr. Mary Emily: The Life and Times of St. Augustine as Revealed in His Letters, Washington D. C. 1935.