Archive | March, 2013
4 Mar

ngoodlin

Going through the course readings this week, I was struck by déjà vu as I read Michael Simeone’s piece regarding the difficulties of interdisciplinary collaboration in the project he was involved with, Digging into Image Data to Answer Authorship-Related Questions (DID-ARQ).  The part that leapt out at me was where he discussed how it “was crucial for our…team to establish a common means to collect, share, annotate, and examine large amounts of image data.”[1]  This resonated with me, as it recalled a prior internship experience I had at a small archives, in which I had to dig through the accumulated metadata on digital pictures of several years of past interns in order to find all the various spelling, spacing, and capitalization variations that existed and standardize it so that the search function would work effectively.  The process was time-consuming and paranoia-inducing, as I had to constantly check my…
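That kind of cleanup lends itself to a short script. Below is a minimal Python sketch, with invented example values, of how one might normalize spelling, spacing, and capitalization variants in accumulated metadata so a search index can match them consistently; it illustrates the general approach rather than the workflow the intern actually used.

```python
import re

def normalize(value: str) -> str:
    """Collapse common metadata inconsistencies: stray whitespace,
    irregular spacing around commas, and capitalization."""
    value = value.strip()
    value = re.sub(r"\s*,\s*", ", ", value)   # one space after each comma
    value = re.sub(r"\s+", " ", value)        # collapse runs of whitespace
    return value.title()                      # standardize capitalization

# Hypothetical variants of one photographer's name, accumulated over
# several years of interns entering metadata by hand.
variants = ["smith, john", "Smith,John ", "SMITH , JOHN", "smith,  john"]

print({normalize(v) for v in variants})  # {'Smith, John'} -- one searchable form
```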



Getting the Jump on History

4 Mar

Much of the conversation around data mining has focused on the potential value of large data sets for the purpose of macro-analysis. The practice has generated concern among some historians who fear that emphasis on quantitative analysis somehow runs counter to the central narrative mission of historians. Likewise, because quantification of large data sets enables historians to chart, map or otherwise visually represent their findings using customized digital tools, some critics have questioned whether digital historical methods can effectively answer historical questions.[1]

I think it is worth thinking about how digital historians are really engaged in two separate, but inextricably related projects: One involves the compilation and presentation of large sets of data to public or semi-public audiences. The other involves the interpretation and analysis of those data sets or subsets on a variety of scales.

Projects of the first type generally require cooperation between library and information science professionals, information technology experts, and content specialists from a variety of disciplines, as well as the archivists, librarians, and museum professionals whose collections are to be digitally archived, catalogued and curated online. Even an individual working on a small-scale project will ideally incorporate best practices from these various fields. These projects amount to something more than a virtual archive, because their products reflect the perspectives of their various designers and the digital tools used to access the collections reflect the purposes of the various developers vis-à-vis the intended user audience.

The second type of project can look very much like historical practice common throughout most of the twentieth century, but without the necessity of travel to multiple archives and with less wear and tear on archival materials. It can also appear to be entirely divorced from the traditions and conventions of historical scholarship, as with Dr. Michael Kramer’s work on Digital Sonification. Most digital historical projects fall somewhere between these two extremes, with some element of data visualization or large-scale data mining and analysis.

Toni Weller, in her concluding essay in History in the Digital Age,[2] emphasizes the importance of recognizing that current historical practice allows for a continuum of engagement with the digital aspects of history. Participants do not have to be digital specialists or even identify as digital practitioners in order to incorporate elements of the digital in their scholarship. Indeed, it seems safe to say that historians universally utilize some computing technology in the production of their scholarship. The degree to which they do so, and the extent to which they frame their investigations as digital projects and/or digital products varies greatly depending on the scholar, their area of interest and the purpose of their scholarship.

Whether historians identify as digital humanists or not, existing digital history projects and those in development now present opportunities for scholars to expand the scope of their inquiries and amplify the impact and reach of their findings through digital visualization tools and the use of open access publishing and social and professional media networks. The collaborative work of digital specialists is reducing the barriers to collaboration for all scholars, and has established new avenues for communicating findings with our students, colleagues and general readers. These conduits of communication are also multi-directional, allowing for communication from readers and enthusiasts who may have relevant evidence or expertise to share, thereby enriching the product.

The rapid developments in the digital humanities have given rise to a third kind of project that amounts to a form of meta-scholarship about the work of digital humanities scholars: the study of how practitioners in a variety of fields work in partnership to arrive at innovative solutions to technical, disciplinary and interdisciplinary challenges. The Cascades, Islands or Streams project provides a good example of this type of endeavor involving large data sets in the natural sciences. Another 2011 recipient of the Digging into Data Challenge Award, DiggiCORE, is attempting to create a software infrastructure that will enable scholars to trace, measure and analyze the interactions of scholars working with a large-scale repository of linked data across multiple platforms. This mind-bending scholarship aims to get out in front of the history of historical scholarship in order to (hopefully) improve our understanding of the past.


[1]Scheinfeldt addresses this question head-on, concluding that digital humanities can and should answer questions, but that it does not have to do so yet, in his essay: “Where’s the Beef? Does Digital Humanities Have to Answer Questions?,” Found History, May 12, 2010, http://www.foundhistory.org/2010/05/12/wheres-the-beef-does-digital-humanities-have-to-answer-questions/.

 

[2] Toni Weller, History in the Digital Age (London; New York: Routledge, 2013), http://lib.myilibrary.com?id=417373.

 

The Perks and Perils of Visualizing the Digital Past

4 Mar

Historians have long recognized the utility of graphs, maps and tables in augmenting textually driven studies with visualizations of the past. As John Theibault points out, "It is, after all, much easier and more informative to create a chart of lines of descent to the Kings of France than it is to describe the lineage in paragraphs of 'begats.'"[i] However, it is not enough to add these elements simply for show. Effective visual representations must be "transparent, accurate and information rich."[i] All three qualities are paramount in a field where our research resources offer increasingly vast amounts of data ripe for interpretation, far more than we could ever hope to engage with personally in several lifetimes. Given this abundance, I believe visually interpreting big data can make us more efficient and effective historical explorers, but as James Grossman rightly points out, such data are "useless until we have them organized into conceptual frameworks able to answer useful questions."[ii] Throughout this semester, we have grappled with the need to engage in more "distant reading" to maintain a fuller perspective of the past on a grand scale without sacrificing historical context. In exploring several projects, I can see that this dream, while still a work in progress, is achievable.

In a previous post, I briefly explored the ways interpreting big data challenges blossoming historians to reconceptualize our relationship to our sources, our peers and our audiences. I wholeheartedly agree with those arguing for increased transparency of digital methodologies and support John Theibault's assertion that this is especially important when trying to visually represent and interpret big data. Letting the viewer know how and why we choose to visually represent the past helps preclude misinterpretation of the image's message and fosters a feeling of inclusiveness.[iii] This is important given that the public imbues graphs, maps and other visual aids with a greater weight of scientific objectivity, even though our field is characterized by ambiguity and interpretive uncertainty.[iv]

Indeed, attempting to visually represent the past forces us to grapple with many questions facing digital historians today. Innovative projects like the recurring Digging into Data Challenge, an interdisciplinary program devoted to building networks and pushing historians to think and collaborate in new ways, are already proving that big data is uniquely suited to visual representation and interpretation. A deep-seated interest in Egyptology led me to focus on the IMPACT Radiological Mummy Database. As stated on its project homepage, this multi-institutional collaborative project is dedicated to studying "mummified remains and the mummification traditions that produced them, through non-destructive medical imaging technologies." According to an interim report, this ongoing project seeks to move beyond simple case studies to look at the human body's transformation into an artifact through tradition and ceremony and, through a collaborative effort among archaeologists, Egyptologists, historians and even doctors, will create a "virtual mummy museum" that will allow the fragile remains to be studied digitally. Data gleaned from CT and other imaging scans of the remains will allow for analysis of the prevalence of certain medical conditions among specimens, providing a valuable window into the lives of past peoples.

Restricting analysis to a digital environment through a PACS (Picture Archiving and Communication System) protects the fragile artifacts. More interesting, though, is the way this system addresses questions of access, data storage, and the need to protect an individual creator's intellectual property rights in an environment based on open access between partners, a big issue for today's digital historians.[v] Using ClearCanvas, a thin-client web server, the project keeps the original scan on a secure server, letting the user see a graphical representation of it without downloading the original data. Not only does this protect the data and the intellectual property rights of the contributor, but it also bypasses the need for specialized visualization software and spares the user from sacrificing large amounts of storage space. It is very gratifying to see someone managing these issues of intellectual property rights without sacrificing a collaborative environment!
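To make the thin-client idea more concrete, here is a rough Python sketch of the general pattern, not the project's actual ClearCanvas or PACS configuration, which I have no access to: the full-resolution scan never leaves the server, and the client receives only a downsampled preview. Flask and Pillow are assumed to be installed, and the directory layout and route name are hypothetical.

```python
import io
from pathlib import Path

from flask import Flask, abort, send_file
from PIL import Image

app = Flask(__name__)
SCAN_DIR = Path("/secure/scans")  # originals never leave this directory

@app.route("/preview/<scan_id>")
def preview(scan_id: str):
    """Return a downsampled JPEG preview of a stored scan."""
    source = SCAN_DIR / f"{scan_id}.tif"
    if not source.exists():
        abort(404)
    img = Image.open(source)
    img.thumbnail((1024, 1024))            # downsample for display only
    buffer = io.BytesIO()
    img.convert("RGB").save(buffer, format="JPEG", quality=85)
    buffer.seek(0)
    return send_file(buffer, mimetype="image/jpeg")

if __name__ == "__main__":
    app.run()
```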

Exploring this project also informed my experience with Railroads and the Making of Modern America, and I'll touch on a few of its most intriguing features:

TokenX: This application allows users to search for the frequency of word usage in the speeches delivered by William Jennings Bryan during his 1896 presidential campaign, conducted along several railways. This is particularly fascinating because it will combine the use of GIS imaging with lexical analysis, allowing us to track how his rhetoric and subject matter changed as he moved from stop to stop. This easily leads viewers to ask how the social geography of his route affected the content of his speeches and how he tailored them for particular audiences. (A rough sketch of this kind of lexical analysis appears after these features.)

Geographic/spatial analysis: This aspect underlies several different selections on the site, though I particularly enjoyed the Strikes, Blacklists, and Dismissals–Railroad Workers' Spatial History on the Great Plains and the Land Sales in Nebraska features. Taken together, these features offer us the chance to view the railroad not only as a means of transportation but also as a force of social change, showing how the ethnic demographics of migration patterns (and thus settlement, economics and local political demographics) changed with the prevalence of the railroads at the local and national level throughout the 1800s. Although I wish the interface detailing Strikes, Blacklists, and Dismissals provided a bit more context, it is still an effective learning tool and pushes us to consider how the industrial built environment changes our society.
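As promised above, here is a minimal Python sketch of the kind of lexical analysis a tool like TokenX performs: counting word frequencies per speech and comparing a term across stops. The speech snippets and stop names are invented placeholders, not the actual Bryan corpus.

```python
from collections import Counter
import re

# Hypothetical snippets standing in for transcribed speeches at two stops;
# the real tool works against the full digitized collection.
speeches = {
    "Lincoln, NE": "the free coinage of silver will restore prosperity ...",
    "Chicago, IL": "you shall not crucify mankind upon a cross of gold ...",
}

def word_counts(text: str) -> Counter:
    """Lowercase the text, strip punctuation, and count word frequencies."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

counts = {stop: word_counts(text) for stop, text in speeches.items()}
for stop, c in counts.items():
    print(stop, c.most_common(5))

# Comparing the same term across stops hints at how the rhetoric shifted.
for term in ("silver", "gold"):
    print(term, {stop: c[term] for stop, c in counts.items()})
```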

Exploring each of these projects has helped me grasp the potential for visual interpretation and manipulation of big data. Although optimistic, I am still cautious, for being effective in the digital realm means approaching our interpretations with the respectful dedication to historical contextualization expected of traditional monographic studies. This responsibility extends to the need for transparent digital methodologies in order to ensure that both the audience and our collaborators can get the most out of every project. And as we have seen, successful collaboration will require us to overcome some tough obstacles, with the need to protect individual intellectual property rights among the most difficult to surmount. Difficult as this may be, historians must adapt, for globalization is breaking down traditional barriers to intellectual ownership and the distribution of information. I am by no means suggesting that we should sacrifice claims to individual contributions for the betterment of a project, but issues of ownership should not overshadow our duty to provide historically authentic and meaningful interpretation, or keep us from collaborating at all. Such barriers are not insurmountable; overcoming them merely requires that we change our ways of thinking without sacrificing our fundamental duty to deep, contextual meaning. After all, using images to explore big data may help us see the past in new ways, but without the proper context, they can do more harm than good.


[i] John Theibault, “Visualizations and Historical Arguments” in Writing History in the Digital Age, ed. Jack Dougherty and Kristen Nawrotzki (Spring 2012).

[ii] James Grossman, “‘Big Data’: An Opportunity for Historians?,” Perspectives on History (March 2012)

[iii] Theibault.

[iv] Data Visualization, Tooling Up, Stanford University.

[v] Michael Simeone, Jennifer Guiliano, Rob Kooper, Peter Bajcsy, “Digging into data using new collaborative infrastructures supporting humanities-based computer science research” First Monday, 16, no. 5 (2 May 2011).


Railroaded

4 Mar

I started this blog post by trying to use different software programs to construct a visual representation of the arguments I wanted to present.  Although I was extremely frustrated, I think I learned some valuable lessons: the process requires an understanding of the technology, a match between strategy and content, and a rethinking of how the argument is constructed.  (It was a bit like watching my two-year-old eat with a spoon.  It takes so much longer and is so messy it is painful to watch, but unless he experiments he will be stuck eating with his hands for his whole life.)

So, after a bit, I decided that the experiment had been instructive and that I would construct an analog narrative instead.

In monographs and peer-reviewed journal articles over the last 50 years, which I will refer to as academic projects, historians have used visualization strategies to help convey information.  Because the past is a "foreign country," it has been incumbent on historians to use rich description, drawings, photographs, maps, timelines and other visualizations to help reanimate the past. Due to the cost of reproducing images, the limitations on their use, and the generally conservative nature of the field, these visualization strategies have been more circumscribed than would be optimal.  The headshot photograph, a standard of academic projects, does little to help the reader visualize the topic and reenter the world of the past. This is perhaps a last vestige of the great man theory of history.

To select the best visualization strategy, it is incumbent on the author or researcher to identify their goals, their audience, the nature of the information to be conveyed, the resources available, and the limitations (as well as the tools to circumvent them). The digital environment opens up rich possibilities for data visualizations, as well as interactive visualizations, in which the reader not only shares authority but also shares in creation.  To utilize these models, we need to move past using the "digital" to conduct primarily analog tasks, albeit with more speed, and instead experiment with and embrace new digital epistemological paths. Tim Hitchcock, in his recent article "Confronting the Digital: Or How Academic History Writing Lost the Plot," explores the current state and future of the retrieval, analysis and presentation of data in historical narratives.[1] He advocates a fresh look at digital tools and the nature of the narratives historians create.

The Old Bailey Project is an example of the digital environment being used for a variety of purposes. The search functions allow the database to be used as a research tool for the retrieval of information.  The "Statistics" functions allow for the visualization of that information.  The background historical sections can be used to contextualize it. The resource section is didactic, using the project to teach about the past, about research, about the project itself, and about the digital humanities. While the site does contain some interpretations, particularly in the context section, it is largely a tool for the creation of interpretation.

Railroaded by Richard White, created in partnership with Stanford's Spatial History Project, does an excellent job of using visualizations for the interpretation, creation, and presentation of narrative to create new histories.[2] This interactive project website works in tandem with the print book to tell the story of the development of the transcontinental railroad. The site includes over twenty different visualizations and more than 2,000 interactive footnotes to accompany the book.[3] Some of the visualizations are static, such as How to Run a Transcontinental Railroad, a digital storybook with cut-paper slides, while others are interactive, such as the Hart Photo interactive, which allows you to digitally compare a tour along the railroad in 1869 with one in 2001, as well as to explore the photographs spatially.[4] Each visualization relies, to varying degrees, on the knowledge of the viewer, but there is an explanatory section for each graph.  The strength of this project, as opposed to others cited in John Theibault's "Visualizations and Historical Arguments," is that the analog narrative and the digital visualizations can stand on their own, but together they have a synergistic effect that strengthens the narrative. This approach has led to White's new interpretation of the development of the railroad, which, he has argued, he would not have been able to come to without the digital data analysis. I am convinced, but many historians disagree with the methodology and conclusions.[5]

Railroaded is a project of the Stanford University Spatial History Project. The team has created a geodatabase, which they call the "Western Railroads Geodatabase," that "serves as a container that helps us organize, access, and analyze primary source data. It also bridges spatial and nonspatial temporal data to allow for analyses of discrete and seemingly unrelated primary sources, such as historic maps and railroad freight tables."  This important and innovative way of bridging the axes of historical inquiry, across space, time, and source, may also hold the key "to control how researchers and the general public access our data and to maintain quality control."[6] The project sponsors also hope that this book will serve as a hybrid publishing model.[7]


[1] Tim Hitchcock, "Confronting the Digital: Or How Academic History Writing Lost the Plot," Cultural and Social History 10, no. 1 (March 2013): 9-23.

[3] I originally ran across this site in relationship to the book, but it is mentioned in John Theibault, “Visualizations and Historical Arguments” in Writing History in the Digital Age, ed. Jack Dougherty and Kristen Nawrotzki (Spring 2012).

[4] This interactive is based in Google Earth and is essentially the same idea as our walking tour, with an increased level of complexity.

Visualizing for the Public

4 Mar

According to AHA Executive Director James Grossman, we have entered an age of 'Big Data,' which requires 'Big History.'[1] This phenomenon finds historians tackling large corpuses of data with digital tools to find new patterns and questions to focus their research. It also requires historians to reexamine their approaches to displaying their research. Working with complex data sets creates unique opportunities to move beyond graphs, maps, and trees, or at least to create interactive interfaces that allow readers to interact with results.

Visualization is one of these display options, a way to create "images derived from processing information… which presents that information more effectively than regular texts could."[2] John Theibault expounds on two distinct uses for these visualizations within historical scholarship: as a way to identify patterns within big data sets and pursue new research questions, or as a means of enhancing an argument.[3] The former method requires that digital tools be incorporated from the inception of the project, an approach that Richard White identifies. He explains that visualization is "a means of doing research; it generates questions that might go unasked… and it undermines or substantiates stories upon which we built our own versions of the past."[4] As many on this blog explored last week, visualizations created from big data sets, combined with both close and far reading of this material, can provoke historians to push their research and explore new avenues of inquiry.

As an aspiring public historian, my mind typically wanders from academic research toward public applications for these visualizations. This method provides an exciting and refreshing avenue for engaging audiences with complex historical topics. Digital projects especially provide a space for historians to rethink how they display their research and even to make these data sets interactive.

An Example from The Spatial History Project http://goo.gl/ts186


Stanford University’s Shaping the West project is a prime example of such endeavors. Produced in conjunction Richard White’s research on the transcontinental railroad, this project developed digital tools to analyze visually how railroad influenced experiences in the nineteenth century American West. The map produced explore how perceptions of settlers in the west were influenced by patterns, like land holdings, communications and commerce. These visualizations move beyond simple maps, and vary tremendously, from chronicles of railroad accidents, to geographic residencies of Railroad Company stockholders or cattle production in the American West. Further, most maps are interactive, allowing viewers to alter data sets by years or geographic location and explore changes which occur. All of this allows views to see historical research and interpretation (because I would argue that visualization is a way of interpreting historical data) in a very different way from a customary book format. The variety of graphs, also provide insight into the myriad of influences the transcontinental railroad affected. By providing data for not solely on train stations, but economic, social, and cultural factors explores the multitude of historical sources available to examine this topic.


Other successful digital history visualizations allow more interactivity within data sets. Users can in essence play with the data, exploring and finding their own interpretations within visualizations of their own creation. Unlike Shaping the West, which has already done much of the historical interpretation, Mapping Texts, a project out of Stanford University and the University of North Texas, allows viewers to create their own visualizations from a data set. Here, the project took over 230,000 digitized Texas newspaper pages from Chronicling America and paired them with a University of North Texas project called the Portal to Texas History. The website allows viewers to conduct a qualitative survey of the newspapers, place these searches within a contextualized timeline, and see visualizations. These allow for analysis of patterns based on common words, era, or location. This visualization provides an opportunity to break down the large data set and creates opportunities for exploration for visitors. This project, and others like it, create an avenue for non-historians and historians alike to connect to historical data and provide a tactile and straightforward method for finding patterns within data.

Of course, visualizations come with their own cautionary tales, and the potential to confuse viewers or distort data. But with these precautions in mind, visualizations coupled with digital history projects can provide exciting opportunities to engage the public with sophisticated and complex research conclusions in an understandable way.


[1] James Grossman, "'Big Data': An Opportunity for Historians?," Perspectives on History (American Historical Association, March 2012).

[2] John Theibault, “Visualizations and Historical Arguments,” in Writing History in the Digital Age, ed. Jack Dougherty and Kristen Nawrotzki (Spring 2012).

[3] Ibid.

[4] Richard White, “What is Spatial History?,” Spatial History Lab: Working Papers (February 1, 2010)

Visualizations for Historians and the Public

4 Mar

Digital techniques have allowed historians to research and present their findings in new ways.  Now historians can add topic modeling and visualizations to their tools for historical interpretation. As a fairly visual person, I was intrigued to look into some ways in which historians have been using visualizations in their work.  I will offer some guiding thoughts on visualizations and review a few projects, particularly looking at presentations for the public.

First I want to consider the purposes of visualization.  The Stanford introduction to visualization breaks it into two components – visualization as a research process and visualization for communication purposes.[1]  However, the boundaries between these components can be fluid, since historians can also publicly present their research.

The form of the visualization may depend on its purpose and the data available.  The Periodic Table of Visualization breaks down visualizations even further to distinguish between a) those that chart and simplify pure data and those that are also representative of concepts and strategies, and b) those that show processes and those that show structures.[2]  This latter distinction is important for whether one is looking at change over time or at a single point of time.

For large data sets, visualization can be a useful way to analyze data.  Not only can visualization organize the data, it can also reveal new information or new questions of inquiry. In my opinion, one successful visualization that fits the research model is the "Transcontinental Railroad Development, 1879-1893" map on the Stanford Spatial History Project web site.  This visualization was made in order to answer a specific question: were railroads built ahead of demand? By charting railroad lines against population growth, one can more clearly visualize the answer to this question.[3]

To present visualizations to a broader audience, a historian must keep several things in mind.  Above all, the visualization must be clearly understandable and usable.  The data presented should be accompanied by any illuminating context that explains what the visualization reveals and why it is important.  There should also be a key to the visualization, explaining the factors used, the meanings of colors, lines, etc.  This is also a good time to make issues of scale clear, so as not to misrepresent any trends in the data, as well as any limits of the data set itself, for a historian must craft a visualization with the same openness as an essay. I think the average person is so used to encountering scientific graphs that purport to be accurate that we must make it even clearer that the data used for historical graphs are subject to the same possible limitations in scope and bias as all of our sources.[4]
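As a concrete illustration of that checklist, here is a minimal Python sketch using matplotlib. The numbers are invented placeholders; the point is only to show a title, labeled axes, a key, and an explicit note about sources and limits, not to model any real data set.

```python
import matplotlib.pyplot as plt

# Invented illustrative figures -- the point is the labeling, not the data.
years = [1870, 1875, 1880, 1885, 1890]
rail_miles = [5000, 7400, 9300, 12800, 15100]
population = [120, 190, 450, 740, 1060]  # thousands

fig, ax1 = plt.subplots()
ax1.plot(years, rail_miles, color="tab:blue", label="Track mileage")
ax1.set_xlabel("Year")
ax1.set_ylabel("Miles of track", color="tab:blue")

ax2 = ax1.twinx()  # second axis so the two different scales stay explicit
ax2.plot(years, population, color="tab:red", linestyle="--",
         label="Population (thousands)")
ax2.set_ylabel("Population (thousands)", color="tab:red")

ax1.set_title("Hypothetical railroad growth vs. population")
fig.legend(loc="upper left")             # the 'key' the paragraph calls for
fig.text(0.01, 0.01,                     # state sources and limits up front
         "Source: illustrative figures only; counts exclude unreported lines.",
         fontsize=7)
plt.tight_layout()
plt.show()
```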

Another good example of a visualization used for both research and presentation purposes is a visualization from the Stanford site examining the Railway Unions.[5]  It is very well created for users to understand and interact with.  There is an about section that gives background and explains what trends to look for.  Colors provide reference to when a change has taken place and a graph on the side also tracks the number of unions.  The key explains the colors and symbols and a timeline allows for the animation of development over time.  Animations seem to be a good way to show a flow of change over time.  My one critique is that their sources for the union data are not cited.


View of “Mapping the Republic of Letters”

"Mapping the Republic of Letters" is another visualization project, one that depicts the geographical correspondence connections of the Enlightenment.[6]  It factors in space, time, and authors, also showing in graphical form the most-networked cities and correspondents, and allows the user to filter the display. This project allows one to more clearly explore the geographical and chronological spread of the Enlightenment and who its greatest actors were.

However, in some ways the data set is so large that it overwhelms the visual. The geographical density of many European countries means that, by examining connections over a large span of time, one is confronted with a mass of lines.  Viewing which city each node represents then crowds the space further.  Certain selections, like flow, are explained in a key, but at least to me the animation did not seem readable enough to learn much from.  This is partially because the site seems to focus primarily on correspondents, allowing for greater filtering of them, while places cannot be filtered.  Though the site is visually striking, I personally found it difficult to identify some of the larger patterns at work.  I think the data could benefit from being overlaid on an actual (historical) map, like some of the maps on http://geocommons.com/.
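To make the filtering point concrete, here is a small, hypothetical sketch using Python's networkx library. The letter records are invented stand-ins for the project's data, and the place filter illustrates the kind of geographic filtering I wished the site offered.

```python
import networkx as nx

# Toy correspondence records (sender, recipient, year, origin, destination)
# standing in for the Republic of Letters data, which I cannot access here.
letters = [
    ("Voltaire", "Catherine II", 1769, "Ferney", "St Petersburg"),
    ("Voltaire", "d'Alembert", 1760, "Ferney", "Paris"),
    ("Locke", "Limborch", 1685, "Amsterdam", "Amsterdam"),
    ("Franklin", "Voltaire", 1778, "Paris", "Paris"),
]

def build_network(records, start=None, end=None, city=None):
    """Build a letter network, optionally filtered by year range or city."""
    g = nx.MultiDiGraph()
    for sender, recipient, year, origin, dest in records:
        if start is not None and year < start:
            continue
        if end is not None and year > end:
            continue
        if city is not None and city not in (origin, dest):
            continue
        g.add_edge(sender, recipient, year=year, origin=origin, dest=dest)
    return g

full = build_network(letters)
paris_only = build_network(letters, city="Paris")
print(full.number_of_edges(), paris_only.number_of_edges())  # 4 vs 2
```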

This might point to a general challenge of visualizing relationships between points.  Peoplemovin is a good example that focuses on the connections of one item (here, a country) at a time, but it could get very complicated if you imagine trying to compare multiple countries in its current format.[7]  Another Stanford map relating networks between the boards of railroad corporations is also more difficult to read because of its spatial organization.  While the spaces between people and the lengths of the lines have no significance, it is hard to ignore this visual organization.[8]

Something else I'd like to consider is what role visualizations can play in public history.  Public historians can learn a lot from journalists in their construction of narratives.  The New York Times is a good example of how stories can be presented using data and various graphs.[9]  Sometimes they ask for contributions, for example words that people use to describe their feelings on a certain topic, which are then graphed.[10]  This interactivity is something that many visualization projects could benefit from.  Though many historians will deal with historical rather than contributed data sets, this article also shows that visualizations can allow comment sections.[11]  These can let users comment on how they've interacted with the maps, conclusions they've arrived at, contextual information, and other suggestions.  How else do you think public historians can use visualizations?

Visualizations can be important in the research of historians and allow the ability to address different factors in large data sets.  Historians have been working on ways to visualize their data, from the growth of unions to correspondence connections, which haven’t been without challenges in how to best represent data.  I’ve been most interested in the usability of these visualizations by the greater public.  The best projects offer good explanations, allow for the user to filter by various factors, and are visually clear to understand.  I think that more public historians should use visualizations in order to present data to the public and offer another way for them to interact with history.


[1] Stanford University, “Data Visualization: Getting Started”, Tooling Up. http://toolingup.stanford.edu/?page_id=1265

[2] Ralph Lengler and Martin J. Eppler, “A Periodic Table of Visualization Methods.” http://www.visual-literacy.org/periodic_table/periodic_table.html

[3] Toral Patel, Killeen Hanson, et al., "Transcontinental Railroad Development, 1879-1893," Stanford Spatial History Project. http://www.stanford.edu/group/spatialhistory/cgi-bin/site/viz.php?id=341&project_id=0

[4] Several of these general ideas are also discussed in John Theibault, “Visualizations and Historical Arguments” in Writing History in the Digital Age, ed. Jack Dougherty and Kristen Nawrotzki (Spring 2012), http://writinghistory.trincoll.edu/evidence/theibault-2012-spring/.

[5] Evgenia Shnayder, Killeen Hanson, et al., "The Rise in the American Railway Union, 1893-1894," Stanford Spatial History Project. http://www.stanford.edu/group/spatialhistory/cgi-bin/site/viz.php?id=139&project_id=0

[6] “Mapping the Republic of Letters,” http://www.stanford.edu/group/toolingup/rplviz/rplviz.swf

[7] Carlo Zapponi, PeopleMovin, http://peoplemov.in/

[8] Emily Brodman, Stephanie Chan, et al., "Western Railroads and Eastern Capital: Regional Networks on Railroad Boards of Directors, 1872-1894," Stanford Spatial History Project, http://www.stanford.edu/group/spatialhistory/cgi-bin/site/viz.php?id=113&project_id=0.

[10] Gabriel Dance, Andrew Kueneman and Aron Pilhofer, “What One Word Describes Your Current State of Mind?” November 3, 2009, http://www.nytimes.com/interactive/2008/11/04/us/politics/20081104_ELECTION_WORDTRAIN.html.

[11] Jeffrey Heer, Fernanda Viegas, and Martin Wattenberg, "Voyagers and Voyeurs," http://vis.berkeley.edu/papers/sense.us/2007-sense.us-CHI.pdf.

Mapping the Lakes

3 Mar

As the digital corpus of historical data continues to expand, historians are faced with the task of making sense of the "zillions of pieces of information that traverse the Internet."[1] It is becoming clearer and clearer that the vast digital archive presents great opportunities as well as numerous challenges. We touched on a few of these challenges in class last week, and they became even more apparent as I did the readings assigned for this week's class. On one hand, visual representations of "long data" tend to require supplemental interpretation. Further, corpuses of data can become "blobs" of non-knowledge, so much the product of mathematical algorithms that all traces of lived human experience become nearly impossible to extract.[2] In this post, I will explore the challenge of creating historical visualizations and look at a project that successfully employs textual and visual analysis to interpret the past.

In paying particular attention to the challenges associated with visual representations of the past, I was struck by the idea in John Theibault's piece "Visualizations and Historical Arguments" that, as visual representations of the past become more complex and less self-explanatory, there is a risk of widening the gap between "expert and novice interpreters."[3] Users who do not have a great deal of experience reading complex charts or other visual representations of data (be they qualitative or quantitative) might be deterred from taking the time to explore visualizations that are not easy to understand at first glance. Complex visualizations might have the same effect on "digital natives" who find themselves taking less and less time to consume information before moving on to explore other projects or articles. Theibault uses several examples of visual representations of the past to illustrate his point that visual histories are becoming more and more complex as we move deeper into the digital age. For instance, he cites Shaping the West as one project whose complexity necessitates "help" and "about" sections on its webpage in order to assist those users who have trouble navigating the site and interpreting its data, or who might be inclined to navigate away from the site when it does not at first seem self-explanatory.

In thinking about the risk of creating a gap between expert and novice users, I recalled one project that reconciles its more complex digital visualization and textual analysis aspects with a great deal of contextualization and critical commentary. Mapping the Lakes: A Literary GIS emerged from the work of the Wordsworth Centre for the Study of Poetry at Lancaster University. Though the leaders of this project are not historians, per se, but digital humanists and literary scholars, it is very much in the vein of the projects being launched by historians. The project aims to create spatial interpretations of England's Lake District based on the travel writings of the poets Thomas Gray (1716-1771) and Samuel Taylor Coleridge (1772-1834). Analyses of the authors' writings reveal feelings related to their travels and also allow the creation of visual representations of the routes they took. An example of the maps used on the website can be seen here.
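For readers curious how such a route map might be assembled, here is a small Python sketch using the folium mapping library. The stops, coordinates, and "mood" scores are rough, illustrative stand-ins rather than the project's actual GIS data; the sketch only shows the basic move of plotting a writer's itinerary and annotating each stop.

```python
import folium

# Approximate Lake District stops on Gray's 1769 tour; the points and the
# "mood" scores are invented for illustration, not taken from the project.
stops = [
    ("Penrith",   54.664, -2.754, 2),
    ("Ullswater", 54.587, -2.855, 4),
    ("Keswick",   54.601, -3.139, 5),
    ("Grasmere",  54.459, -3.024, 3),
]

m = folium.Map(location=[54.58, -2.95], zoom_start=10)

# Draw the route in order, then mark each stop, sized by its mood score --
# a crude version of the project's "mood map" idea.
folium.PolyLine([(lat, lon) for _, lat, lon, _ in stops], weight=3).add_to(m)
for name, lat, lon, mood in stops:
    folium.CircleMarker(location=(lat, lon), radius=3 + 2 * mood,
                        popup=f"{name} (mood {mood})").add_to(m)

m.save("gray_route.html")  # open in a browser to explore interactively
```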

Mapping the Lakes is an example of successful textual and visual interpretation and overcomes challenges associated with representing complex datasets due in part to the fact that it includes a great deal of methodological transparency and theoretical explanation. It also combines visual and textual analysis in an interesting way. The project ultimately asks how digital technology can facilitate thinking about the spatial history of the Lake District. It explores how Gray and Coleridge had different attitudes regarding space and therefore experienced their travels in very different ways (see the mood map).[4] Users might not realize this just by looking at the comparative map provided, so the supplemental explanation about theories of space and the context on the writings of Gray and Coleridge are immensely helpful. The methodological transparency of the Mapping the Lakes project is further evident on the “Aims and Objectives” section of the website, where users can read basic information about GIS and more detailed explanations regarding the progression of the project from idea to reality.[5] Explanations such as these make the project accessible to all users.

Some might argue that supplemental text such as the methodology section and the information provided on theories of space is no different from the "help" and "about" sections on the Shaping the West project, which cater to the "novice" user. In addition, if digital users don't have the attention span required to understand complex visualizations without supplemental text, why should they be expected to take the time to read the contextualizations and explanations provided on the Mapping the Lakes site? I think the answer lies in how the material is presented. For me, the project website is easy to use and unimposing. The supplemental text is sufficient but not overwhelming. In addition, I found myself reading the text as I scrolled down the page to find the maps that illustrate the authors' journeys. The text even delves into such things as why Google Earth "facilitates a further understanding of the ways in which both Gray and Coleridge document physical movement through environment."[6] The supplemental material on Mapping the Lakes is a key component and adds a great deal to the project's interpretation of the travel writings. While the creators of the project take time to explain why their methods work and facilitate spatial understandings of the Lake District, they also acknowledge the challenges and shortcomings within the project. This self-criticality enhances the quality and transparency of the project.

Ultimately, Mapping the Lakes overcomes some of the most common challenges associated with creating visualizations of past events and places. It does so namely by balancing a more traditional, yet complementary set of historical explanations and written analyses with its digital applications of data. The transparency of method evident in this project is also unique, and sets an inspiring example for other digital (and non-digital) humanists.


[1] James Grossman, “‘Big Data’: An Opportunity for Historians?,” Perspectives on History (March 2012), http://www.historians.org/perspectives/issues/2012/1203/Big-Data_An-Opportunity-for-Historians.cfm (accessed March 1, 2013).

[2] John Theibault, “Visualizations and Historical Arguments” in Writing History in the Digital Age, ed. Jack Dougherty and Kristen Nawrotzki (Spring 2012),

http://writinghistory.trincoll.edu/evidence/theibault-2012-spring/ (accessed March 1, 2013).

[3] Ibid.

[4] Lancaster University Department of English and Creative Writing, “Gray and Coleridge: Comparative Maps”, Mapping the Lakes: A Literary GIS, http://www.lancs.ac.uk/mappingthelakes/Comparative%20Maps.html (accessed March 1, 2013).

[5] Lancaster University Department of English and Creative Writing, “Aims and Objectives”, Mapping the Lakes: A Literary GIS, http://www.lancs.ac.uk/mappingthelakes/GIS%20Aims.htm (accessed March 1, 2013).

[6] Lancaster University Department of English and Creative Writing, “Interactive Maps”, Mapping the Lakes: A Literary GIS, http://www.lancs.ac.uk/mappingthelakes/GIS%20Aims.htm (accessed March 1, 2013). 

Visualizations: Seeing history in new ways

2 Mar
An Example from The Spatial History Project http://goo.gl/ts186


When assigned a book to read for class, I usually get very excited when I see graphs and images because it means there are fewer pages for me to read.[1] Within the realm of digital history, however, visuals are becoming much more than page-fillers. Digital visualizations are being used by many disciplines to conceive of and illustrate scholarly works in new ways. For historians, visualizations help them interpret big data, gain new perspectives, and understand spatial relationships.

Although most of us can probably guess what visualization is, it is helpful to clarify what I mean when I use this term. A visualization is an image, graphic, or map that illustrates data. There are two primary functions of a visualization. The first is probably the one we are most familiar with and would be most likely to use: visuals used as illustrations, supplementing in visual form what has already been conveyed in writing. The second type of visualization takes things a step further and conveys new information in a way that can only really be done with an image. According to Geoff McGhee, "the real power of visualization comes in its ability to make powerful arguments, and show data in a way that raises new questions."[2] The fact of the matter is that although we primarily convey information through words, some things are best understood through images. The clarity that visuals can bring depends on both the visualization and the data. In order to be effective, and not just confusing and unnecessary, a visualization must be "transparent, accurate, and information rich."[3] With careful attention to detail and audience, visualization can serve as a powerful tool.

Even though visuals and images have arguably always been used to some extent in the field of history, digital technology is exponentially increasing the use of this tool. Digital technology not only facilitates the creation of visualizations that could only have been dreamed of several decades ago, it has also contributed to the recent rise of big data that I discussed in my last blog post. Big data is unique among historians' sources in that it lends itself to visualization more than most. The Digging into Data Challenge, which occurred in 2009, 2011, and again this year, presents numerous examples of "how 'big data' changes the research landscape for the humanities and social sciences."[4] Several of the projects that have been funded by this program use visualizations to ask new historical questions and interpret the past through images. One example of this is the Trading Consequences project, which is investigating the nineteenth-century economic and environmental consequences of commodity trading. This project is ongoing, and the team is currently exploring the best way to create visualizations of the data they have digitized. In their project blog they are clear that their goal is "the development of visualization concepts that will reveal a range of temporal, geographic and content-related perspectives on the commodity data, and that will highlight different conceptual angles and relations within the data."[5]  Visualizations for this project have the potential to aid the historical process by providing a variety of new perspectives on a topic.

One other effective use of visualizations in a history project is the Spatial History Project at Stanford University. This project is a collection of interrelated history projects that all specifically focus on visualization. They describe the purpose of their projects as follows: "We organize our data in geospatial databases to better facilitate the integration of spatial and nonspatial data, and then use visual analysis to help identify patterns and anomalies…We embrace visualization as a way not simply to illustrate conclusions, but a means of doing research."[6] These projects are representative of the most popular way for historians to utilize visualizations. Digital visualizations are immensely helpful for mapping and understanding the past in its spatial dimensions. In projects like these, images aid understanding and demonstrate relationships in ways that text never could.

Visualizations are an emerging tool in the historian's toolbox. Digital technology has provided new ways for historians to work with large data sets. When used effectively, visualizations can enhance a history project that works with big data. As these few projects illustrate, visualizations help historians interpret big data, gain new perspectives, and understand spatial relationships.

Sometimes, a picture is worth a thousand words.


[1] Admittedly this may not be the best attitude to have about assigned reading but I have a feeling I am not alone.

[2] Tooling Up for Digital Humanities, “Data Visualization”, Stanford University, http://toolingup.stanford.edu/?page_id=1255 (accessed February 25, 2013).

[3] John Theibault, “Visualizations and Historical Arguments” in Writing History in the Digital Age, ed. Jack Dougherty and Kristen Nawrotzki (Spring 2012), (accessed February 26, 2013).

[4] Digging into Data Challenge, “Welcome to the Challenge,” Digging into Data Challenge, http://www.diggingintodata.org/ (accessed March 1, 2013).

[5] “Progress to Date on Trading Consequences Visualization,” Trading Consequences (blog), February 28, 2013, http://tradingconsequences.blogs.edina.ac.uk/2013/02/28/progress-to-date-on-trading-consequences-visualizations/

[6] CESTA, “About Us: About the Project,” The Spatial History Project, 2013, http://www.stanford.edu/group/spatialhistory/cgi-bin/site/page.php?id=1

Are We Entering The Age of Big Data?

2 Mar

The preponderance of online digital content, resources, and technology has signaled the beginning of many new and exciting projects within the field of history. James Grossman, Executive Director of the American Historical Association, recently shared some personal musings on whether we have entered "The Age of Big Data" in historical research, and there is reason to believe that we have in fact entered a new age of change. So much digital content now exists that historians must begin learning how to incorporate research tools that help analyze data sets too large for one person to study. We are working in a world of overabundance, where thousands of letters, land sales, wills, and maps are readily available at our fingertips.[1] According to Tim Hitchcock, this means we are entering "a different and generically distinct form of historical analysis [that] fundamentally changes the character of the historical agenda that is otherwise in place."[2] If this is true, what digital tools can we use to better understand these character changes?

Throughout the semester, our class has been exploring the validity of utilizing "graphs, maps, and trees" in historical research. We have also contemplated the possibility that these visualizations may supplant the "close reading" techniques we have learned throughout our entire educational careers. I believe such visualizations are complementary to "close reading" and enhance our studies for several reasons. For one, I wholeheartedly agree with those who have rightly pointed out that data mining and visualization have helped us ask new questions, detect unique patterns, and provide new answers about the past.[3] However, what is more important, in my opinion, is that these analytical tools help us make sense of big datasets and enable us to be more efficient in our reading of large amounts of content. We cannot ask questions about big data until we put that data into a form that is readily understandable for both creator and researcher. John Theibault brought this to my attention with his paper "Visualizations and Historical Arguments," which analyzes what visualization methods best "deploy the visual capabilities of the computer to show what we wish to communicate."[4] Clearly defined and readable digital content is, after all, just as important as a readable book.

I explored several different websites to see data mining and visualization projects in action. My research interests focus primarily on nineteenth-century American history, so I found myself browsing sites with digital projects on this period. The first site I visited was Robert K. Nelson's "Mining the Dispatch," which, following Theibault's description of visualization, is strikingly beautiful in its density (amount of information) and transparency (the ease with which an audience can understand the information).[5] Nelson utilized the MALLET topic modeling program to extract specific topics from a large collection of newspaper articles from the Richmond Daily Dispatch, which he then broke up into smaller categories and subcategories that were turned into visual graphs.[6] These graphs reveal interesting information about the role that slavery, nationalism, economics, politics, and many other topics played for the people who subscribed to the Richmond Daily Dispatch during the Civil War. I personally find it interesting that the number of "for hire and wanted ads" looking for African American labor always spiked around the month of January. Why was this?
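For readers curious about the mechanics, here is a small, hedged sketch of the general topic-modeling workflow in Python. It uses scikit-learn's LDA rather than the MALLET toolkit Nelson actually used, and the article snippets are invented placeholders, so it illustrates the idea rather than reproducing his results.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Placeholder snippets; Nelson's project ran over the full run of the
# Richmond Daily Dispatch, far beyond a toy example like this one.
articles = [
    "negro man for hire apply at the office wages paid monthly",
    "wanted a cook and washer for a small family liberal wages",
    "the army of northern virginia advanced upon the enemy lines",
    "general lee reports the enemy repulsed with heavy losses",
    "flour corn and bacon prices advanced again at the market",
    "gold and treasury notes quoted higher in the richmond market",
]

vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(articles)

# Three topics for six snippets only demonstrates the mechanics.
lda = LatentDirichletAllocation(n_components=3, random_state=0)
doc_topics = lda.fit_transform(counts)

terms = vectorizer.get_feature_names_out()
for i, weights in enumerate(lda.components_):
    top = [terms[j] for j in weights.argsort()[-5:][::-1]]
    print(f"topic {i}:", ", ".join(top))

# The doc_topics rows could then be grouped by publication month to chart
# how much of each issue a topic occupies over time, as Nelson's graphs do.
```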

I also enjoyed going through the University of Nebraska-Lincoln’s website “Railroads and the Making of Modern America,” which has a wide range of digital tools for data mining and visualization.[7] To wit:

1. TokenX: Described as a "text visualization, analysis, and play tool," this program, in conjunction with the RMMA project, allows users to conduct text mining on a collection of speeches William Jennings Bryan made on four railroad trips during his unsuccessful bid for the presidency in 1896. Users can pick out specific speeches and then organize them into a word cloud, as I did with this August 8, 1896, speech.

2. The Aurora Project: This visualization project outlines the relationship between slavery and the growth of railroads in America from 1840 to 1870. I liked being able to study the interactive pie charts and bar graphs that helped me visualize the growth of both (until the Civil War, in the case of slavery) on a year-to-year basis, although I was unable to get the interactive map at the top of the page to work in my internet browser. I also appreciated the "Scholarly Interpretation" page, which helped provide context for the visual data and reinforced the fact that many slaves did much more than farm labor at plantations.

3. Land Sales in Nebraska: This simple, animated cinematic map achieves several important tasks. Although I'd like a little more functionality and interactivity, the map gives us a partial glimpse into the relationship between people in Nebraska and their geography from 1865 to 1886 and, perhaps more importantly, shows us how this relationship changed over time. We can also detect interesting patterns with this animation. For example, one will notice that from roughly 1873 to 1876, railroad construction slowed dramatically, a result of the Panic of 1873 that crippled the entire U.S. economy.

It is important for us to once again remember that digital content requires us to proceed with caution when analyzing big data sets. When we create visualizations that document change over time, we need to display "rhetorical honesty" and document our entire research process, not just the end results, so that the audience can see where our numbers come from, much like an economist's or scientist's report that utilizes quantitative data.[8] When we use text mining, we still need to provide a responsible interpretation that puts those words into their proper historical context. Most importantly, collaboration is key in helping us make sense of big data. To this end, I agree with Michael Simeone, Jennifer Guiliano, Rob Kooper, and Peter Bajcsy that we should look for ways to build "cloud computing" repositories for collecting resources and sharing digital artifacts among scholars in an interdisciplinary setting. Through these systems, we may begin to "think about joining the historian's analytical and narrative skills to the statistician's methods of organization and analysis[,] or the historian's facility with sifting and contextualizing information to the computer scientist's (or marketing professional's) ability to generate and process data."[9] Indeed, we can use digital resources to find new opportunities to make the entire field of the humanities a more important part of our lives.


[1] James Grossman, “‘Big Data:’ An Opportunity for Historians?” accessed February 26, 2013. http://www.historians.org/perspectives/issues/2012/1203/Big-Data_An-Opportunity-for-Historians.cfm; Michael Simeone, Jennifer Guiliano, Rob Kooper, Peter Bajcsy, “Digging into data using new collaborative infrastructures supporting humanities-based computer science research,” First Monday, 16, no. 5 (2 May 2011). Accessed February 26, 2013. It should also be pointed out that most archival material is not digitized. A recent report by the Smithsonian states that the institution is hoping to digitize 14 million objects in its collection, which amounts to roughly 10 percent of its total collection. See “Digitization of Smithsonian Collections.” Accessed February 21, 2013.  http://newsdesk.si.edu/factsheets/digitization-smithsonian-collections

[2] Tim Hitchcock, “Academic History Writing and the Headache of Big Data.” Accessed March 1, 2013. http://historyonics.blogspot.com/2012/01/academic-history-writing-and-headache.html

[3] Grossman, “‘Big Data’: An Opportunity for Historians?”; Council on Library and Information Resources, “Using Zotero and TAPoR on the Old Bailey Proceedings: Data Mining with Criminal Intent.” Accessed February 27, 2013. http://www.clir.org/pubs/reports/pub151/case-studies/dmci

[4] John Theibault, “Visualizations and Historical Arguments” (Spring 2012 Version). Accessed February 27, 2013.

[5] Ibid. Robert K. Nelson, “Mining the Dispatch.” Accessed February 27, 2013.

[6] For more info on topic modeling, see Scott B. Weingart, “Topic Modeling for Humanists: A Guided Tour,” accessed February 28, 2013.

[7] William G. Thomas et al., “Railroads and the Making of Modern America.” Accessed February 27,2013.

[8] Theibault, "Visualizations and Historical Arguments."

[9] Simeone, Guiliano, Kooper, and Bajcsy, "Digging into Data"; quote from Grossman, "'Big Data': An Opportunity for Historians?"

Visualizing Change

1 Mar

There is a point in every historian's research when they feel inundated with materials and completely overwhelmed. This happens when a large research project hits critical mass, and it is what happened to me this week as I tried to understand data mining and visualization. One of the ways we survive this crazy world is to take things one word, one sentence, or one paragraph at a time; otherwise we would be crushed. As I was forced to think big this week, I felt that crushing weight. What happens when there are too many documents and literally not enough man-hours in one's entire life to analyze them all? This kind of source overload is known as big data. In order to combat this deluge, historians have begun to employ data mining and visualization techniques to study these monstrous collections.  The resulting projects are changing the way we analyze history by giving us a big-picture view of a narrow topic.

The Digging into Data Challenge offers money to collaborative projects that find useful ways to employ big data.[1] It funded the IMPACT Radiological Mummy Database, which contains mummy scans from 49 institutions. The site includes information about "provenience, dating, mummification features, metric and non-metric testing, damage, restorations, and any associated artifacts, as well as metadata on the imaging studies."[2]  I found the concepts on their page fascinating. I wish I had a reason to apply to use this database, because the casual user cannot access the data.[3] However, if the board approves a researcher, they can access not only the scans but also the coding of the site, so they can manipulate the search features. This kind of database is important for scientific and historical studies, as it enables new connections between objects held at various institutions across the globe.[4]

While data mining through massive databases is changing the way some historians research, it is just the first step. Next we have to grapple with how to interpret that data. Visualization tools guide historians' research and help to display vast amounts of information in an easily digestible format. Sites like VisualEyes offer historians tools to create visualizations of their research that are easily manipulated and interpretive. The Texas Slavery Project was born from a graduate student's research project. It brings life to what would otherwise have been a boring chart. The maps and graphs demonstrate change over time in shades of startling red. Moreover, the tutorial demonstrates that creating this visualization was not that complicated.[5] There are also more sophisticated visualization projects that result from interdisciplinary and collaborative efforts, such as Mapping the Republic of Letters. Yet the Texas Slavery Project and Mapping the Republic of Letters achieve the same end goal despite the difference in the scale of the projects. They take complicated qualitative and quantitative data and distill those complex data sets into simpler visualizations. This creates projects that enable historians to answer different sets of questions than if they had just sat in an archive reading manuscript collections. Moreover, it allows a broader audience to access and understand their work.

One of the key reasons many visualization and data mining projects are successful and revolutionary is because they require collaboration between humanists, computer scientists, and other specialists. This collaboration can be difficult, as sometimes the people you end up working with live on the other side of an ocean, but can be worth it.[6] These collaborative ventures allow historians to see the landscape of the field from a different scale.

As noted by professionals, this kind of massive project requires innovative thinking. It stops being feasible to employ traditional research tactics to analyze big data, because the question of "when" a historian can finish a project turns into a question of "if" it is even feasible given the volume of sources.[7] Therefore, these new methods change the field, the questions historians can ask, and the product. Our first blog assignment was to define digital history. This led to a discussion by many of my peers as to whether digital history would change our field. Projects that involve data mining and visualization are proof that the ways historians ask questions and prove their theories are changing.


[1] Digging into Data Challenge, “Welcome to the Challenge,” Digging into Data Challenge, http://www.diggingintodata.org/ (accessed February 28, 2013).

[2] IMPACT, “IMPACT Context Database,” IMPACT Radiological Mummy Database, http://www.impactdb.uwo.ca/IMPACTdb/Context_db.html (accessed February 28, 2013).

[3] IMPACT, “How to Create a Custom Search of IMPACT,” IMPACT Radiological Mummy Database, http://www.impactdb.uwo.ca/IMPACTdb/Create_Reports_files/How%20To%20Search%20IMPACT%20-%20Custom.pdf (accessed February 28, 2013). The restrictive nature of the database protects the institutions’ intellectual property. One does not have to pay to use it. However, they must be approved by a board.

[4] Although I could not see the data, I chose to reflect on this specific project because of the interdisciplinary promise of this site. It is useful to scientists and humanists. In a world that seems to increasingly be dominated by STEM, these kinds of data mining projects hold great promise for historians. And for full disclosure, I also chose this project to reflect on because I think mummies are cool.

[5] University of Virginia, “VisualEyes Tutorial” VisualEyes, http://www.viseyes.org/VisualEyesTutorial.pdf (accessed February 28, 2013). Certainly building a site would take a little time, but as a novice I was able to understand the basics of how this program worked.

[6] One of the main points of this article was to explain how these collaborative big data projects worked and how to overcome some of the issues that arise when undertaking such a project. Michael Simeone, Jennifer Guiliano, Rob Kooper, Peter Bajcsy, “Digging into data using new collaborative infrastructures supporting humanities-based computer science research” First Monday, 16 no. 5 (May 2, 2011).

[7] Ibid.