MLA 2014 Session Notes

*Original plaintext notes (in markdown) are available online.*

## 98. Vulnerable Texts in Digital Literary Studies

### Jeremy Douglass

* Subject is _Meanwhile_, the fiction that began as a poster, became a website, then a tabbed book, then an iOS app. 
* Branching narratives as networks.
* What is _Meanwhile_’s status?
* poster – print – canvas UI
* hypertext – electronic – page UI
* tabbed book – print – page UI
* iOS app – electronic – canvas UI
* See also: Queneau’s branching narrative and “choose your own adventure” books. 

### Rachel Sullivan

* Subject is code comments. 
* References: Galloway’s _Protocol_ (2004), Hayles’ _My Mother Was a Computer_ (2005). See also work by Nick Montfort and Rita Raley (code surface || code depth). Adrian Mackenzie, _Cutting Code_ (2006).
* reader/user –> reader/user/programmer
* Bit rot, code bloat…
* Her examples: /ti-explorer/kernel/arrays.lisp
* Jeremy Douglass gave a paper at the 2011 Critical Code Studies conference: article published in _Vectors_ journal. 

### John David Zuern

* Subject is something he is calling “curatorial reading” which he is triangulating / differentiating from close/distant reading. 
* Stuart Moulthrop … Stephanie Strickland … (+?) … These are all examples of electronic literature being preserved by the ELO, Electronic Literature Organization.
* His pantheon of critics:  Stephen Best, Sharon Marcus, Heather Love.
* Flash fiction whose status is unknown “My Name Is Captain, Captain” (Judd Morrissey and Lori Talley): catalog of airshow maneuvers codes each maneuver with a particular symbol. Symbols used in text.
* Curators place objects along various historical and cultural axes.

### Q & A ###

* On code executability: different browsers interpret code differently –> computational environments change and code no longer executes –> a move to preserve computational environments (see Kirschenbaum’s article).

* * *

## 155. Literary Criticism at the Macroscale

### Andrew Piper ###

* Subject: the Wertherian exotext.
* _Sorrows of Werther_ was translated, imitated, etc.
* “post-mimetic” 
* Genette’s 5-part schema of textuality: intertext, paratext, metatext, hypertext, architext + exotext.
* Not looking for the Wertherian, which texts are most like _Werther_, but the Werthericity of these texts: which texts are most like each other. 
* “Scalar reading” 
* English_324: words common to Werther – words common to all novels
* Voronoi —> community detection —> nodes that have most links among themselves + nodes that are most “between” 
* [I want to make sure I understand the difference between community detection and topic modeling.]
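
A minimal sketch of the two graph ideas in the notes above: counting links inside a group, and scoring how “between” a node is. The toy graph and both function names are assumptions for illustration, not Piper’s data or method; betweenness is computed with Brandes’ algorithm for unweighted graphs.

```python
from collections import deque

# Hypothetical toy network: two triangles joined by the edge c-d.
graph = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}

def internal_links(graph, group):
    """Count edges whose two endpoints both fall inside `group`."""
    group = set(group)
    return sum(len(graph[v] & group) for v in group) // 2

def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search from `source`, counting shortest paths.
        order = []
        preds = {v: [] for v in graph}
        n_paths = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += n_paths[v] / n_paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted once from each end.
    return {v: s / 2 for v, s in score.items()}

scores = betweenness(graph)
```

In this toy graph each triangle is a tightly linked group, while `c` and `d` get the highest betweenness because every shortest path between the triangles runs through them.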

### Hoyt Brown ###

* —> site for Global Literary Networks
* used naive Bayes out of the NLTK
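
For reference, a minimal sketch of what a naive Bayes text classifier does, written in plain Python rather than through NLTK. The function names, toy documents, and labels `A` / `B` are all assumptions for illustration; the notes do not record what was being classified or with which features.

```python
import math
from collections import Counter, defaultdict

def train(labelled_docs):
    """labelled_docs: iterable of (list_of_tokens, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for tokens, label in labelled_docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def classify(tokens, model):
    """Return the label with the highest log posterior."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        # Log prior plus log likelihoods, with add-one smoothing.
        score = math.log(n_docs / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy training set.
model = train([
    ("moon pond frog".split(), "A"),
    ("moon blossom snow".split(), "A"),
    ("factory street smoke".split(), "B"),
    ("street crowd smoke".split(), "B"),
])
guess = classify("frog moon".split(), model)
```

The “naive” part is the assumption that words occur independently of one another given the label, which is why the per-word log probabilities can simply be added.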

### Underwood’s Commentary + Q & A ###

* TU: Hobbes’ description of the Leviathan.
* “fuzzy matching of 3, 4, 5, 7-grams”
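
A rough sketch of fuzzy n-gram matching, assuming word-level n-grams compared by character similarity. The notes do not say how the matching was actually done, so the helper names and the 0.8 threshold are illustrative assumptions; `difflib.SequenceMatcher` is from the Python standard library.

```python
from difflib import SequenceMatcher

def ngrams(text, n):
    """All runs of n consecutive words, lower-cased."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def fuzzy_matches(a, b, n, threshold=0.8):
    """Pairs of n-grams from a and b whose similarity meets the threshold."""
    hits = []
    for ga in ngrams(a, n):
        for gb in ngrams(b, n):
            ratio = SequenceMatcher(None, " ".join(ga), " ".join(gb)).ratio()
            if ratio >= threshold:
                hits.append((ga, gb))
    return hits

# Try the n-gram sizes mentioned in the notes on two near-identical phrases.
for n in (3, 4, 5, 7):
    found = fuzzy_matches("the sorrows of young werther",
                          "the sorrow of young werter", n)
    print(n, len(found))
```

Comparing every n-gram against every other is quadratic, so this only suits short passages; a larger corpus would need some form of indexing first.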

* * *

## 233. Seeing with Numbers: Sociological and Macroanalytic Approaches to Literary Exclusion

### Andrew Goldstone

* Seeking to answer Moretti’s question: “Who counts?” 200 canonized novels = 0.5% of novels published. (Moretti in audience.)
* “Let’s confront sublimity with computational methods.” (More striking when he said it.)
* -> John Guillory, _Cultural Capital_.
* Scholarly reading as a practice.
* Reception vs. consumption
    * BookScan
    * Primary texts are really secondary (from units sold)
* -> [Has anybody looked at sales v. influence? -> Maybe in the sociology of reading?]
* Methodology
    * Down-weighted book chapters: collections skewed results.
    * Gini coefficient used to measure inequality
* Authors symbolically “rich” get “richer” as measured by operationalizing prestige.
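
A minimal sketch of the Gini coefficient mentioned above, using the standard rank-weighted formula. The function name and the example counts are assumptions for illustration, not Goldstone’s data.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 means perfectly equal."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, ranks i from 1.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

# Hypothetical counts of scholarly articles per author.
equal = gini([5, 5, 5, 5])        # everyone gets the same attention
skewed = gini([0, 0, 0, 10])      # one author gets all of it
```

With four authors the skewed case comes out at 0.75, the maximum for a sample of that size, (n - 1) / n.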

### Richard So

* Modernism
    * 12000 poems
    * 2200 poets
    * 10s of periodicals
* re: poet: “Is there something in his literary genetic code that authorizes his exclusion?”

### Matthew Jockers

* He first began imagining a “macroeconomics of literature” in 2005.
* Reviewed distinction between macro as quant analysis and micro qual assessment. [Noted: the analysis / assessment distinction.]
* For Brown and So: “What are the literary primitives [from genetics] and what do they signal?”

### Q & A

* Relationship of BoW to style. Word choice is one answer. Others?
* Opportunity: 40,000 new fiction titles / year.
* -> Jim English, _The Economy of Prestige_.
* Focus of a study by an audience member: _McSweeney’s_.

* * *

## What is Data in Literary Studies

Session speakers are in pairs:

* First pair: different modes of reading
* Second pair: ontology of literature vs ontology of data
* Third pair: literary data as conceptual resource, as a system

### First Pair

* Daniel Rosenberg on “data” (as rhetorical device).
* Versions of data (Foucault vs cybernetics [see Pickering]).
* Bechdel test is algorithmic.
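
To make the “algorithmic” point concrete, here is a sketch of the Bechdel test as a rule: at least two named women, who talk to each other, about something other than a man. The `Conversation` record and its hand-coded `about_a_man` flag are assumptions for illustration; the rule is mechanical, but the inputs still depend on human judgement.

```python
from dataclasses import dataclass

@dataclass
class Conversation:
    speakers: frozenset   # names of the characters talking
    about_a_man: bool     # hand-coded judgement about the topic

def passes_bechdel(named_women, conversations):
    """True if two named women talk to each other about something other than a man."""
    named_women = set(named_women)
    if len(named_women) < 2:
        return False
    for conv in conversations:
        women_present = conv.speakers & named_women
        if len(women_present) >= 2 and not conv.about_a_man:
            return True
    return False

# Hypothetical coding of a short work.
women = {"Ann", "Bea"}
scenes = [
    Conversation(frozenset({"Ann", "Bea"}), about_a_man=True),
    Conversation(frozenset({"Ann", "Carl"}), about_a_man=False),
    Conversation(frozenset({"Ann", "Bea"}), about_a_man=False),
]
result = passes_bechdel(women, scenes)
```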

### Second Pair

* What isn’t data?
* Data in the context of topic modeling (machine learning): inventoried all the things that get dropped: function words, honorifics, proper nouns, dialect … and this doesn’t include stemming. By the time everything has gone through this scrubbing process, it is *not* literature. But it is data. … LDA does make particular assumptions about how texts get composed. (If those assumptions are not accurate, what does that mean for the utility of probabilistic modeling? I.e., reverse genesis.)
* -> _Raw Data Is an Oxymoron_.
* -> AAAS has visiting fellowships.
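
A small sketch of the scrubbing described above, to show how little survives before a topic model ever sees the text. The word lists are tiny placeholders and the `scrub` function is an assumption for illustration; dialect handling and stemming are left out, as they would remove even more.

```python
import re

# Tiny illustrative lists; real stop lists are far longer.
FUNCTION_WORDS = {"the", "a", "an", "of", "and", "to", "in", "was", "he", "she", "it"}
HONORIFICS = {"mr", "mrs", "miss", "dr", "sir", "lady"}

def scrub(text, proper_nouns=()):
    """Return the tokens that survive a typical pre-modeling clean-up."""
    dropped = FUNCTION_WORDS | HONORIFICS | {p.lower() for p in proper_nouns}
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in dropped]

survivors = scrub("Mr. Darcy was in the garden.", proper_nouns=["Darcy"])
```

Here a six-word sentence is reduced to the single token `garden`, which is the point made in the session: what remains is data, not literature.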

### Third Pair

* [Cutting edge in sociology of reading in this room: Latour and Goffman. E.g., Heather Love.]
* Literature as primary source *and* conceptual source.
* -> Reference to _Objectivity_: see also Cynthia Wall’s account

### Q & A

* Role (goal) of prediction? Literature as a model vs literature as a scenario.
* Johanna Drucker reference that I didn’t follow.
* When book history entered the academy 20-25 years ago, it allowed us to see new things.
* -> How does literature store information? The Homeric epic as technology of information storage and retrieval. [See Lord and Rubin.]
* Moretti: the Annalists turned to the archive to ask different questions about history; it is not yet clear what the archive will mean for literary history.

* * *

## Surface Reading

* “The Way We Read Now” was a special issue of _Representations_.

### Heather Love & Sharon Marcus

* _Reading Methods in Literary Studies_: Recent debates about reading have questioned the core methodological commitments of literary studies. This course will serve as an introduction to these debates through an exploration of topics including the exhaustion of critique, post-hermeneutic criticism, and the relation between interpretation and description. Many of our readings will situate new methods such as surface reading, just reading, distant reading, and reparative reading in the longer history of the discipline, exploring the links between these methods and New Criticism, New Historicism, and Marxist and psychoanalytic criticism. The course will also introduce students to some of the alternatives to…
* _DH Assignments_:
    * Get set up on Co-Ment with _Benito Cereno_
    * Scan assigned pages of _Benito Cereno_
    * OCR: convert TIFF into .txt and clean up the .txt file
    * Comment on doc using Co-Ment
    * Text analysis of BC (data mining and visualization) using Voyant, TAPoR, Google Ngram, WordHoard
    * Read TEI guidelines, text body, Chapters 1-4, and select one additional chapter. Explore TEI tutorials.
    * Comment on your pages of BC using Co-Ment
    * Diagnose BC symptom
* When it came time to collaborate, they needed to develop a controlled vocabulary. They used a Google Doc to develop the CV.

### Ted Underwood

* “Computer scientists are alien objects stored somewhere else on campus, from the point of view of many humanists.” 
* CS is really philosophical: what does it mean to learn.
* Alan Liu’s essay from last summer. [> Published where? DHQ? PMLA?]
* Trying to model literary characters in the same fashion as topic models: the work is made difficult because character is more complex than it looks.
* Reading Todorov 1971.

### Alex Gil

* A theory built on top of the work of Jerome McGann: a theory about everything being surfaces. McGann’s work is on topology.
* Levenshtein distance is the minimum number of edits that it takes to turn one string of text into another. (Like the game that turns one word into another by changing only one character at a time?)
* E.g., Borges’ library.
* Thanks to computational methods, we are finally paying attention to the ways that texts relate to themselves.
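
A minimal sketch of Levenshtein distance using the standard dynamic-programming table, kept to two rows. The function name is an assumption for illustration. Unlike the word-ladder game, the edits include insertions and deletions as well as substitutions, and the intermediate strings need not be real words.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or keep
        prev = curr
    return prev[-1]

distance = levenshtein("kitten", "sitting")  # 3
```

The same function works on lists of words instead of strings of characters, which is closer to how one might compare two versions of a text.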

### Questions & Answers

* Beware temporal compression, even elision, that occurs in so-called social networks of fictions. E.g., Moretti’s graphs of Shakespeare.
* Question of using tools to replicate what we already do by hand …  TEI becomes a form of reception history.
* Andrew Stauffer has an essay: the importance of libraries keeping old books for the marginalia. 
* Teaching a DH class on _Benito Cereno_ got Heather Love to re-read _S/Z_ and made her appreciate the text.
* TU prefers not to frame critical methods in opposition to each other.

Possible project: _Structuralism for Digital Humanists_.

* * *

## Making Sense of Big Data

### Anupam Basu ###

* He has a postdoc at Washington University; interested in machine learning and informatics. 
* Big data = terabytes. Compared to genome of fruit fly, humanities “big” is still small.
* EEBO: Early English Books Online:
* Two categories of texts: those in ESTC (super set) and those digitized in EEBO-TCP.
* Complexity is a relatively well-defined concept in computer science. [What is it?]
* Standardization of spelling occurs during the expansion of print during the English civil war, circa 1630. E.g., moue > move.
* Another use of Gini: this time to measure the variation of English orthography. 

### Mark Algee-Hewitt

* ECCO Archive: Eighteenth Century Collections Online.
* Now Bakhtin?! Look at “From the Prehistory of Novelistic Discourse”, “Epic and Novel” and “Discourse in the Novel.”
* _Bleak House_ in five chunks — how were the chunks chosen? — for PCA. [I think he’s using R.]
* Testing Bakhtin’s supposition that novelistic discourse will be more heterogeneous than poetry? (Not sure what Bakhtin actually claimed, but this is interesting.) Arrived at an “H score” (heteroglossia).
* Followed H score with a t-test that he box plotted.
* K-L divergence?
* Overall: fiction is more self-similar than poetry.
* Was Bakhtin wrong about competing discourses contained within the novel or is it simply the case that the heteroglossia take place at a different order / level of discourse? [How to quantify register?]
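
On the K-L divergence question above: a sketch of the textbook definition applied to word frequencies from two chunks of text. The notes do not record how, or whether, it fed into the “H score”, so the function names, the add-one smoothing, and the toy chunks are all assumptions for illustration.

```python
import math
from collections import Counter

def word_distribution(tokens, vocab, alpha=1.0):
    """Smoothed relative frequencies over a shared vocabulary."""
    counts = Counter(tokens)
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def kl_divergence(p, q):
    """D_KL(P || Q) in bits; p and q must share the same keys."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p if p[w] > 0)

# Hypothetical chunks of one text.
chunk_a = "fog fog fog mud mud court".split()
chunk_b = "court court law law fog paper".split()
vocab = set(chunk_a) | set(chunk_b)

p = word_distribution(chunk_a, vocab)
q = word_distribution(chunk_b, vocab)
a_from_b = kl_divergence(p, q)
b_from_a = kl_divergence(q, p)
```

The measure is zero when the two distributions are identical and is not symmetric, so `a_from_b` and `b_from_a` generally differ. Smoothing matters because a word with zero probability in `q` would otherwise make the divergence infinite.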

### Laura Mandell

* Early Modern OCR Project (eMOP).
* Mellon Grant is on-line — it’s the hardest book she ever wrote.
* Goal is to make the entire TCP available: roughly 300,000 texts (128,000 EEBO + 182,000 ECCO).
* Using tesseract-ocr (originally developed at HP, later maintained by Google).
* 70% (OCR) accuracy is gauged as good enough by historians. But it’s not in our field.
* If you edit a text, you can get the text for free.
* TCP was done by sweat shop labor.

### Questions

* For Mark A-H: his work on Bakhtin highlights something that has been a consistent thread through a number of panels: how do we negotiate the move past texts as bags of words? That is, what are your thoughts on developing a computational identification of speech registers? > MAH has developed something but he wasn’t comfortable describing it.
* One of the things that seems interesting here is that you can develop an algorithmic infrastructure and then use it over many projects.
* _Ah hah!_ Mark is Piper’s collaborator. 

* * *

## 402. Beyond the Digital: Pattern Recognition and Interpretation

* Notes for talks are online.

* A series of 7-minute talks.
* Topic modeling as both a navigational tool and as an interpretive practice.
* Viral Texts draws from the LoC’s _Chronicling America_ archive of newspapers. (Holes in archive are due to states that have not contributed.)
* Nineteenth century newspapers look a lot like contemporary websites and Facebook. 
* Newsy pieces moved fast, average life of 3 months.
* Literary pieces moved slow, average life of 5 years.
* Cordell’s grant to NEH is available online.
