It started as a toy problem for flexing my new skills with the R programming language: how might the nuances of philosophical texts be represented visually, and efficiently, for students?
The basic unit is an ideme: my term for a word or phrase that carries meaning but cannot be understood apart from its context, like “an evil.” An ideme alone cannot tell us what the text says, much less the sense in which a word with multiple meanings, like “evil,” is being used. In a given sentence, “an evil” could refer to something bad that happens to someone, or to an evil person.
With these issues in mind, simply picking out “important” uncommon words like “evil” tells us very little about the nuances of an argument within a text. But how to find these idemes?
From reading William James’s The Will to Believe, I compiled a list of tag words, like “because” and “since.” I store these in a dictionary that tells R where an ideme occurs in a given sentence. I borrowed the idea of tag words from another method of parsing linear information: DNA sequencing. Genetic sequencing machines break up genomes by tagging particular combinations of nucleotides, and my program does the same with linear philosophical texts.
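A minimal sketch of how a tag-word lookup could work in base R. This is not the original program: the names `tag_words` and `find_tags` are hypothetical, the sentences are invented for illustration rather than quoted from James, and only “because” and “since” come from the text above.

```r
# Hypothetical tag-word dictionary; only these two words are named in the text.
tag_words <- c("because", "since")

# Invented example sentences (not quotations from The Will to Believe).
sentences <- c(
  "We hold this belief because it helps us act.",
  "Doubt is itself a choice.",
  "Since the evidence is incomplete, we must decide."
)

# Return the tag words that occur in one sentence,
# matched as whole words and ignoring case.
find_tags <- function(sentence, tags) {
  hits <- vapply(
    tags,
    function(tag) {
      grepl(paste0("\\b", tag, "\\b"), sentence,
            ignore.case = TRUE, perl = TRUE)
    },
    logical(1),
    USE.NAMES = FALSE
  )
  tags[hits]
}

# One entry per sentence, listing the tag words found in it.
tagged <- lapply(sentences, find_tags, tags = tag_words)

# TRUE where a sentence contains at least one tag word,
# i.e. where an ideme is expected under this sketch's assumption.
has_ideme <- lengths(tagged) > 0
```

Matching on whole words keeps a tag like “since” from firing inside a longer word such as “sincere.”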
A separate dictionary stores operator words, which I use in a similar way: they tell R that the idemes on either side are related to one another in a particular way. I borrowed the idea from computer logic operators like “and” and “or.” Now that I have broken the text apart, I have to learn to put it back together.
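One way the operator idea might look in base R, again as a sketch under assumptions rather than the actual implementation: `operator_words` and `split_on_operator` are hypothetical names, the example phrase is invented, and the function only handles the first operator it finds.

```r
# Hypothetical operator dictionary, using the two operators named in the text.
operator_words <- c("and", "or")

# Split a sentence at the first operator word and return the text
# on each side, so the two sides can be treated as related idemes.
split_on_operator <- function(sentence, operators) {
  pattern <- paste0("\\b(", paste(operators, collapse = "|"), ")\\b")
  hit <- regexpr(pattern, sentence, ignore.case = TRUE, perl = TRUE)
  if (hit[1] == -1L) {
    return(NULL)  # no operator word in this sentence
  }
  start <- as.integer(hit[1])
  end <- start + attr(hit, "match.length")[1] - 1L
  list(
    left = trimws(substr(sentence, 1, start - 1)),
    operator = tolower(substr(sentence, start, end)),
    right = trimws(substr(sentence, end + 1, nchar(sentence)))
  )
}

# Invented example phrase, not a quotation.
relation <- split_on_operator(
  "the belief is useful and the evidence is incomplete",
  operator_words
)
```

Here `relation$left` and `relation$right` hold the text on either side of the operator, and `relation$operator` records which kind of link joins them.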