Shakespeare and Statistics

Professor Hugh Craig, University of Newcastle


Shakespeare and statistics is not an obvious pairing, but digital texts of his plays and poems are now widely available, and scholars are getting used to analysing them with computers. This lecture will reflect on what we have learned from these studies.

The best-known findings concern authorship. The 2016 New Oxford Shakespeare relies heavily on quantitative work: it includes forty-two plays rather than the usual thirty-eight, and it advances the idea of Shakespeare as a collaborator as well as a sole author.

Quantitative work has provided new perspectives on where Shakespeare is exceptional, and on where he (surprisingly) is not. There have also been discoveries about the overall shape of his work, revealing patterns that are not obvious to the naked eye and become visible only through ‘distant reading.’ Comedies and histories, for instance, emerge as the main contrast in genres, rather than comedies and tragedies. Through numerical synopsis, Shakespeare’s plays, scenes and dramatic characters can be seen to cluster and contrast with those of his contemporaries, just as some features invisible on the ground become clear from an aerial photograph.

The lecture will also discuss the discomfort many Shakespeareans feel at this quantitative turn in Shakespeare studies. Is something about literary study, or about the humanities, betrayed when a ruthlessly numerical ruler is run over these works, which mean so much to so many? Computers can count, but they can’t read, so what, finally, can be learned from them? Are we in danger of being seduced by the quantitative because it is now possible, rather than because it is useful?

RSVP by 12 September 2018

Presenter Bio:

Renaissance literature expert Professor Hugh Craig is a man of letters. But the computational stylist is equally a man of numbers.

Craig is the Director of the Centre for Linguistic and Literary Computing. He has been an advocate of computer-assisted analysis of language in literature since the controversial field began to emerge in the 1980s. He has devoted decades of research to proving that statistics can help us analyse and appreciate literary texts.

Craig says computational analysis has two applications in the field of literature: it can help establish the authorship of works that are anonymous or suspected to have been wrongly attributed, and it can be used to profile or define a writer's particular style.

"It is still controversial because people in the literary world don't like numbers, they don't trust numbers, and they don't understand how you can do something as banal as counting things in a literary context," he says. "That is why it is fun; because it does challenge people and threaten some people. As you can imagine, I get in some pretty heated discussions."

Craig's work is based largely on frequency data and has led to several breakthrough findings about Shakespeare's works. Using computational techniques, he found that Shakespeare was the likely author of a number of scenes in the play The Spanish Tragedy that had previously been attributed to the playwright Ben Jonson. The results are presented in his 2009 co-edited book Shakespeare, Computers and the Mystery of Authorship.

He has also established that Shakespeare did not have the wide vocabulary many credited him with. "There was a myth that Shakespeare had an extraordinarily large vocabulary, but our analysis shows that he didn't. His talent was in the way he used regular, ordinary words," Craig explains. "What we did was look at the words he used and the frequency with which he used them and compared that to what other playwrights of the time were doing. Our research showed the difference in vocabulary was not striking."  


Terrace Room, Level 6,
Sir Llew Edwards Building (#14), St Lucia

Light refreshments will be served following the lecture