Category Archives: Programs

I was reading the paper "A meta-algorithm for brain extraction in MRI" and wondered what other studies have used the Dice coefficient as a metric. While searching, I came across a program called WordHoard from Northwestern University.

// The WordHoard project is named after an Old English phrase for the verbal treasure ‘unlocked’ by a wise speaker. It applies to highly canonical literary texts the insights and techniques of corpus linguistics, that is to say, the empirical and computer-assisted study of large bodies of written texts or transcribed speech. In the WordHoard environment, such texts are annotated or tagged by morphological, lexical, prosodic, and narratological criteria. They are mediated through a ‘digital page’ or user interface that lets scholarly but non-technical users explore the greatly increased query potential of textual data kept in such a form.

Look out, John Grisham.

I’m still deciding which similarity metric to use, although this blurb on comparing texts has been the most helpful thing I’ve read so far.
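
For reference, here is a minimal sketch of the Dice coefficient applied to two short texts. The Dice coefficient of two sets A and B is 2|A ∩ B| / (|A| + |B|). The sketch assumes each text is reduced to a set of lowercase, whitespace-separated words. That tokenization, the function names, and the handling of two empty texts are my own simplifications for illustration, not something taken from WordHoard or the MRI paper.

```python
def tokenize(text):
    """Split text into lowercase words (a deliberately naive tokenizer)."""
    return text.lower().split()


def dice_coefficient(a, b):
    """Dice coefficient between two collections, each treated as a set."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        # Assumption: two empty inputs count as identical.
        return 1.0
    overlap = len(set_a & set_b)
    return 2 * overlap / (len(set_a) + len(set_b))


if __name__ == "__main__":
    first = tokenize("the cat sat on the mat")
    second = tokenize("the cat lay on the rug")
    print(dice_coefficient(first, second))
```

A score of 1.0 means the two word sets are identical and 0.0 means they share no words. Because the texts are treated as sets, repeated words and word order are ignored, which may or may not suit a given comparison.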