Questions and Searches on Old Bailey Online

23 Feb

The digitized records of the Old Bailey proceedings are extremely large in scope.  This collection of over 100,000 trials lets historians examine a wide range of questions about the British justice system across several centuries.  Instead of taking a micro view, historians can step back to look for gradual changes over time and identify patterns and deviations through statistical analysis.  Other digital tools can also help historians make connections within the text itself.

The proceedings data set would be unwieldy to work with without the help of various tools.  Luckily for the researcher, Old Bailey Online provides many tools for filtering and organizing its data.[1]  Besides basic searches by categories including name, gender, offense, and time period, the web site also allows for statistical analysis.  One can chart, for example, the number of deception cases by decade, or narrow the chart to show only those cases where the defendant was female and/or found guilty.  In this type of analysis, specifics are reduced to numbers, and it is the quantity that matters.  Historians begin by asking “how much/many…?” and then move on to other questions based on what they discover.
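To make the “how many?” step concrete, here is a minimal sketch of that kind of tabulation in Python. The records, the field names, and the function count_by_decade are all invented for illustration; this is not real Old Bailey data and not how the site itself computes its charts.

```python
from collections import Counter

# Hypothetical, hand-made records; these are NOT real Old Bailey data.
trials = [
    {"year": 1745, "offence": "deception", "gender": "female", "verdict": "guilty"},
    {"year": 1748, "offence": "deception", "gender": "male", "verdict": "not guilty"},
    {"year": 1752, "offence": "theft", "gender": "female", "verdict": "guilty"},
    {"year": 1756, "offence": "deception", "gender": "female", "verdict": "guilty"},
    {"year": 1759, "offence": "deception", "gender": "female", "verdict": "not guilty"},
]


def count_by_decade(records, **filters):
    """Count records per decade, keeping only those that match every filter."""
    counts = Counter()
    for record in records:
        if all(record.get(field) == value for field, value in filters.items()):
            decade = record["year"] // 10 * 10  # e.g. 1745 -> 1740
            counts[decade] += 1
    return dict(sorted(counts.items()))


# All deception cases, then only those with a female defendant found guilty.
all_deception = count_by_decade(trials, offence="deception")
guilty_women = count_by_decade(
    trials, offence="deception", gender="female", verdict="guilty"
)
print(all_deception)
print(guilty_women)
```

Each extra filter only narrows the count, which is why this kind of tool answers “how many?” well and “why?” not at all.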

The graphing of data is important in encouraging those new questions, which again deal with amounts and change. How does this amount compare to that amount?  How does the number of x change?  Are these numbers reliable? Mapping also adds a “where” to the equation.[2]  Statistical questions yield statistical answers, yet historians know that that is only the beginning of an exploration.  Franco Moretti addressed this when he wrote that graphs do not provide interpretation and that maps often do not provide explanations, but “at least [show] us that there is something that needs to be explained.”[3]  It is the “why” that interests historians, and while these tools can address some of the basic questions and provide a framing context, they cannot go much deeper.

The usefulness of these search tools also depends on the focus a historian takes.  Looking at statistical information related to gender or certain types of crime is straightforward because those are default search categories.  However, looking for other types of text-based information can be more difficult.  For example, motives and evidence are not easily expressed as keywords, and you have to know exactly what you’re searching for (and all possible spellings) to receive results.

Looking at the text itself is a whole different approach.  Magnus Huber’s research into the spoken language of the proceedings shows one way to analyze language usage statistically.[4]  He used the search function to look for appearances of certain words.  For a linguist, however, the search function had limitations, because there was no way to restrict a search to a particular type of text, such as the parts that were transcribed speech.  This is where text markup (using XML) and linked data come into play.
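As a rough illustration of what markup makes possible, the sketch below searches for a word only inside passages tagged as speech. The XML fragment is made up, and the `<u>` (utterance) element and its who attribute are assumptions for the example; they are not taken from the actual Old Bailey files or from Huber’s annotation scheme.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical fragment: the <u> element and "who" attribute are illustrative
# and are not copied from the real Old Bailey XML.
SAMPLE = """
<trial id="t-example">
  <p>The prisoner was indicted for stealing a watch.</p>
  <u who="witness-1">I saw the prisoner take the watch.</u>
  <p>The watch was produced in court.</p>
  <u who="defendant-1">I never had the watch in my hand.</u>
</trial>
"""


def search_speech(xml_text, word):
    """Return (speaker, text) pairs for utterances that contain the word."""
    root = ET.fromstring(xml_text)
    hits = []
    for utterance in root.iter("u"):  # only speech, never the narrative <p>
        text = "".join(utterance.itertext()).strip()
        tokens = re.findall(r"[a-z']+", text.lower())
        if word.lower() in tokens:
            hits.append((utterance.get("who"), text))
    return hits


speech_hits = search_speech(SAMPLE, "prisoner")
print(speech_hits)
```

A plain keyword search would also match the first narrative paragraph, which mentions “prisoner” but is not anyone’s speech. Once the speech is tagged, the search can leave it out.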

We talked about this briefly in class, but I think it is worth considering all of the connections one could add to the Old Bailey proceedings through XML.  Marking parts of speech could be useful to a linguist.  For historians, topic modeling could make it possible to search by topic rather than by keyword, making it easier to discover information and even conventions of the time.[5]  Also important is what Huber is doing: linking the speech to the sociobiographical information of the speaker.  For historians, it would be useful to have a database with more information about the people who appear in the proceedings as well.  While details like age and occupation are encoded, it should be clearer who each individual person is, which could be established through some type of name authority file.  Linking to data from other sources (like census records) for the individuals listed is one way to provide historians with information about the actors in these court proceedings.  I’m interested to hear other ideas as well.
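To show what I mean by a name authority file, here is a small sketch. Everything in it is hypothetical: the identifiers, the names, the “census” entry, and the helper functions build_lookup and resolve are invented for illustration, and a real project would need far more careful matching, since the same name does not always mean the same person.

```python
# Hypothetical authority file: one identifier per person, with known spellings.
AUTHORITY = {
    "person-0001": {
        "preferred": "Mary Smith",
        "variants": {"Mary Smith", "Mary Smyth", "M. Smith"},
    },
    "person-0002": {
        "preferred": "John Brown",
        "variants": {"John Brown", "Jno. Brown"},
    },
}

# Hypothetical external data keyed by the same identifiers (invented values).
CENSUS = {"person-0001": {"occupation": "servant"}}


def normalize(name):
    """Lower-case a name and collapse repeated whitespace."""
    return " ".join(name.lower().split())


def build_lookup(authority):
    """Map every normalized spelling to its person identifier."""
    lookup = {}
    for person_id, entry in authority.items():
        for variant in entry["variants"]:
            lookup[normalize(variant)] = person_id
    return lookup


def resolve(name, lookup):
    """Return the identifier for a name as written, or None if unknown."""
    return lookup.get(normalize(name))


LOOKUP = build_lookup(AUTHORITY)
person_id = resolve("Mary Smyth", LOOKUP)
print(person_id, AUTHORITY[person_id]["preferred"], CENSUS.get(person_id))
```

The point is that the identifier, not the spelling, becomes the thing other sources link to.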

While Old Bailey Online is well set up for statistical analysis, implementation of other digital tools can help add more context and the ability to search and analyze content in an effort to explore the “why” questions.

[1] Old Bailey Online: The Proceedings of the Old Bailey, 1674-1913.

[2] Locating London’s Past.  On this site, the trial data can be overlaid with historical maps to show spatial patterns.

[3] Franco Moretti, Graphs, Maps, Trees: Abstract Models for a Literary History (London: Verso, 2005), 9, 39.

[4] Magnus Huber, “The Old Bailey Proceedings, 1674-1834. Evaluating and annotating a corpus of 18th- and 19th-century spoken English,” Varieng 1 (2007).

[5] Tim Hitchcock, “Textmining British Studies: an Overview of Recent Developments,” History Working Papers (2012).


3 Responses to “Questions and Searches on Old Bailey Online”

  1. Elena February 23, 2013 at 6:50 pm #

    Reblogged this on Musings on History.

  2. Tim Rainesalo February 25, 2013 at 10:28 pm #

    This is a very nice piece and raises some interesting questions. I like your description of historians as being concerned primarily with the ‘why’ of events and the irony that, in spite of our detective roots, questions of motivation and evidence are still difficult to find online quickly and efficiently, even in databases as revolutionary as the Old Bailey. This also harkens back to our central concern for preserving and communicating the contextual importance of the document in its historical setting, both on its own and when compared to many, many similar documents.

    Your discussion of the usefulness of sociobiographical analysis with regard to this database is also interesting and makes me wonder if, instead of adding such features to an existing database, it might be just as practical to create a separate, related one to address these needs. Of course, this brings up the issue of financial feasibility. And perhaps even suggesting this speaks to the ‘traditional’ historian’s need to compartmentalize the past even as we try to embrace a greater diversity of perspectives.

    • Elena February 26, 2013 at 8:34 pm #

      Tim, I agree that having some sort of database would be a big production, and if it would be a central document of authority, the question might also be who would host the project and give that authority. I think a sociobiographical database could certainly be helpful, but I think it would also need to have limits to not get out of hand and possibly too personal in some ways. I think we tend to compartmentalize as an organizational strategy, but you’re right that that can also be a danger because of what we might leave out.
