kholterhoff.github.io

Artifact #3: Data Visualization

Students in groups of 4 or 5 will formulate a research question and create a data visualization using a periodical from HathiTrust. With the help of Georgia Tech’s Data Visualization Librarian Ximin Mi (ximin.mi@library.gatech.edu), students will build upon the skills gained in Artifact #2 by using data sets to make arguments about literary history and culture. Unlike Artifacts #1 and #2, which practiced close reading, this assignment asks students to practice distant reading on an entire corpus (composed of numerous issues of a periodical) in order to make claims. Students will argue for the relevance of their research question and present their findings in the form of a coherent, well-designed, and easily comprehensible data visualization.

This assignment is an exercise in Distant Reading, a phrase coined by Franco Moretti to distinguish a way of studying literature that privileges quantitative rather than qualitative research. As Moretti explains in “Conjectures on World Literature” (New Left Review, 2000):

“the trouble with close reading (in all of its incarnations, from the new criticism to deconstruction) is that it necessarily depends on an extremely small canon…you invest so much in individual texts only if you think that very few of them really matter…Distant reading: where distance, let me repeat it, is a condition of knowledge: it allows you to focus on units that are much smaller or much larger than the text: devices, themes, tropes—or genres and systems. And if, between the very small and the very large, the text itself disappears, well, it is one of those cases when one can justifiably say, Less is more. If we want to understand the system in its entirety, we must accept losing something.”

For this artifact, students will demonstrate their ability to use large sets of data to ask specific and focused questions. They will then share their findings using data visualizations. But what is Data Visualization? According to Andy Kirk’s Seeing Data:

“Visualisations aim to help people make sense of and explore data. Experts believe representing data in visual ways can help communicate what the data means. It can also allow people the opportunity to analyse and examine large datasets, which would otherwise be difficult to understand.”

Goals

This assignment challenges students in four ways:

Formulating a Research Question

  “Judge a man by his questions rather than by his answers.” ― Voltaire

Writing a good question will be the most difficult part of this assignment. Groups will select a periodical from the list below (one group per periodical). Using this corpus, students will formulate a research question that they can answer by querying this massive data set. Students will likely need to formulate and debate several questions before selecting one that can be answered.

Students should conduct some preliminary research into the history and significance of their periodical. Some of these periodicals specialize in politics, others in literature. Some are illustrated, others are not. Some are politically conservative, others liberal. Some began early in the 19th century, others appeared later. Some are high-brow, others low-brow. Some are newspapers, others are magazines. The type of periodical you select will determine the kind of questions you may ask.

Groups may ask historical questions like: Did the Franco-Prussian War of 1870-71 increase the number of references to France in fiction? Or, did the 1857 Matrimonial Causes Act change how closely references to divorce appear alongside references to sin? I suggest looking over historical timelines of English history or Victorian legislation to see what types of questions your data set will permit you to ask.
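To give a sense of how questions like these become countable, here is a minimal sketch in Python. It assumes your group has already extracted page text with a year attached. The `PAGES` sample, the function names, and the 10-word window are all hypothetical placeholders, not part of HathiTrust or any course tool.

```python
import re
from collections import Counter

# Hypothetical sample: (year, page text) pairs standing in for a real corpus.
PAGES = [
    (1869, "The divorce court sat again; many called it a sin."),
    (1870, "News from France arrived late. France is at war."),
    (1871, "Paris has fallen, and France sues for peace."),
    (1872, "A quiet season in the country, with a long account of the harvest."),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def mentions_per_year(pages, term):
    """Count how many times `term` appears in each year's pages."""
    counts = Counter()
    for year, text in pages:
        counts[year] += tokenize(text).count(term)
    return dict(counts)


def near_each_other(pages, first, second, window=10):
    """Count occurrences of `first` with `second` within `window` tokens on either side."""
    hits = 0
    for _, text in pages:
        tokens = tokenize(text)
        for i, token in enumerate(tokens):
            if token == first:
                start = max(0, i - window)
                if second in tokens[start:i + window + 1]:
                    hits += 1
    return hits


if __name__ == "__main__":
    print(mentions_per_year(PAGES, "france"))
    print(near_each_other(PAGES, "divorce", "sin"))
```

Raw counts like these are only a starting point. A real project would likely need to normalize by the number of pages or words per year before comparing years.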

Groups may ask questions about the aesthetics of these periodicals. What is the average number of illustrations accompanying different types of fiction? Were some authors more likely to have their fiction illustrated than others? What is the average number of illustrations accompanying a Sherlock Holmes story in The Strand? What is the distribution of illustrators who appeared in this periodical (do most artists appear only once or more often)?
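Questions about averages and distributions usually reduce to grouping and counting once each story is recorded as a row of data. The following sketch assumes a hand-built list of records. The `STORIES` entries, field names, and function names are invented for illustration and do not describe any real periodical.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical records: one dictionary per story, with made-up values.
STORIES = [
    {"title": "Story A", "genre": "detective", "illustrator": "Artist X", "illustrations": 8},
    {"title": "Story B", "genre": "detective", "illustrator": "Artist X", "illustrations": 10},
    {"title": "Story C", "genre": "romance", "illustrator": "Artist Y", "illustrations": 3},
    {"title": "Story D", "genre": "romance", "illustrator": "Artist Z", "illustrations": 5},
]


def average_illustrations_by_genre(stories):
    """Return the mean number of illustrations per story for each genre."""
    grouped = defaultdict(list)
    for story in stories:
        grouped[story["genre"]].append(story["illustrations"])
    return {genre: mean(counts) for genre, counts in grouped.items()}


def illustrator_frequency(stories):
    """Count how many stories each illustrator contributed to."""
    return Counter(story["illustrator"] for story in stories)


def one_time_illustrators(stories):
    """Count the illustrators who appear exactly once."""
    return sum(1 for n in illustrator_frequency(stories).values() if n == 1)


if __name__ == "__main__":
    print(average_illustrations_by_genre(STORIES))
    print(illustrator_frequency(STORIES))
    print(one_time_illustrators(STORIES))
```

How you define a category such as "genre" is itself an interpretive decision, so be ready to justify it in your proposal.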

Keep the “So What” of your research question in mind. Why does answering your research question matter within the context of our course themes? The impact of your question should be readily apparent. If you want to know how often shoes are mentioned in the Pall Mall Magazine between 1893 and 1914, I expect a compelling argument to accompany this research question. Be prepared to defend your question to your instructor and your peers in a Research Proposal.

List of Periodicals

Selected Periodicals

Research Proposal

Groups will submit an 800-1,000 word Research Proposal prior to beginning their Data Visualization. A research proposal articulates what your group plans to accomplish, why this research matters, and the methods you will employ.

What your group plans to accomplish: All groups will submit a visualization summarizing their results, which limits the type of questions you can ask. For instance, “Yes”/“No” questions do not translate well into visualizations. Questions should be specific and complex enough that you can generate a well-designed visualization to show audiences what you discovered about this topic.

Why this research matters: Explain the “So What” of your project. Why will answering this question matter? Groups will conduct research to address this portion of the research proposal. For instance, if your research question concerns the role of women journalists in the Illustrated London News, you may want to cite Barbara Onslow’s Women of the Press in Nineteenth-Century Britain (Palgrave, 2000). Proposals must cite a minimum of three academic secondary sources to demonstrate the value of your project.

The methods you will employ: After practicing extracting data from HathiTrust and massaging this data, groups should have some idea of the steps they will take to conduct this research. Explain, step by step, the methods your group will use to complete this part of the assignment.

Data Visualization

Once groups collect their data sets from HathiTrust and organize their findings in order to answer their questions, they will use these findings to create a visualization. I suggest students review visualization types and options using Douglas Armstrong’s D3 Gallery on GitHub, Andy Kirk’s Visualising Data, and A Periodic Table of Visualization Methods. We will use the tools Tableau and Plotly to create our visualizations.
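Both Tableau and Plotly can work from a plain CSV file, so one possible hand-off step is to write your counts out as a "tidy" table with one row per observation. The sketch below is an assumption about how a group might do this, not a required workflow. The function name, column names, and sample counts are hypothetical.

```python
import csv
import io


def to_tidy_csv(counts_by_year, term):
    """Return CSV text with one row per year: year, term, mentions."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["year", "term", "mentions"])
    # Sort the years so the rows read chronologically.
    for year in sorted(counts_by_year):
        writer.writerow([year, term, counts_by_year[year]])
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    sample_counts = {1871: 1, 1870: 2}
    with open("mentions.csv", "w", newline="") as handle:
        handle.write(to_tidy_csv(sample_counts, "france"))
```

Keeping one observation per row makes it easier to try several chart types on the same file before settling on the one that best supports your argument.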

In class we will use your data sets from Artifact #2 to practice creating data visualizations. I will offer extra credit to any student who submits a polished version of one of these practice visualizations for inclusion in Visualizing Visual Haggard. If you choose to create a visualization for VVH, you must 1) meet with me to discuss your visualization some time prior to your final submission; and 2) email me your project by Monday, 4/23, at midnight (kate.holterhoff@gmail.com).

Be creative and brainstorm as many visualization ideas and hypotheses as possible. Keep the acronym ASK in mind during this process (Accuracy, Story, Knowledge). Although your visualization should be Accurate, think about the different types of Stories and the different forms of Knowledge your visualization might create. While groups should try out several options as part of their process, each team must determine which visualization works best and be able to explain why.

Visualizations are arguments about datasets. For this reason, visualizations must have a header (stating your thesis) and must grapple with a focused portion of the data. Creating visualizations will enable groups to think both critically and creatively about the implications of their data. An important component of this artifact is justifying how to make sense of the data at a distance, rather than up close, in order to identify pertinent trends and patterns.

Visualizations must include:

Timeline

Reflection