Down and Dirty Guide to Literary Research with Digital Humanities Tools: Text Mining Basics

[Image: Miao and Jason get things done with computers!]

As part of the final Digital Pedagogy seminar of fall 2012, Margaret Konkol, Patrick McHenry, Olga Menagarishvili, and I will lead the discussion on “trends in the digital humanities.” You can find out more about our readings and other DH resources by reading our TECHStyle post here.

As part of my contribution to the seminar, I will give a demo titled “Down and Dirty Guide to Literary Research with Digital Humanities Tools: Text Mining Basics.” In my presentation, I will show how traditional literary scholars can employ computers, cameras, and software to enhance their research.

To supplement my presentation, I created the following outline with links to useful resources.

Down and Dirty Guide to Literary Research with Digital Humanities Tools: Text Mining Basics

  1. Text Analysis and Text Mining
    1. My working definition of text mining: “Studying texts with computers and software to uncover new patterns, overlooked connections, and deeper meaning.”
    2. What is Text Analysis: Electronic Texts and Text Analysis by Geoffrey Rockwell and Ian Lancashire
    3. Text mining on Wikipedia
    4. Text Mining as a Research Tool by Ryan Shaw (an excellent resource with a presentation and links to more useful material on and offline)
  2. Advantages of Digital Research Materials
    1. Ask Interesting Questions That Would Otherwise Be Too Difficult or Time Consuming to Ask
    2. Efficiency
    3. Thoroughness
    4. Find New Patterns
    5. Develop Greater Insight
  3. Types of Digital Research Materials
    1. Your Notes
    2. eBooks
    3. eJournals
  4. Digitizing Your Own Research Materials
    1. What to Digitize
      1. Primary Sources
      2. Secondary Sources
    2. How to Digitize
      1. Acquire
        1. Camera > high resolution JPG
        2. Scanner > high resolution TIFF or JPG
      2. Collate as PDF
        1. Adobe Acrobat X Pro (now XI!)
        2. PDFCreator
        3. Mac OS X Preview
      3. Perform Optical Character Recognition (OCR) to generate machine readable/searchable plain text
        1. Adobe Acrobat X Pro
          1. Print PDF to a letter size PDF
          2. Tools > Recognize Text
        2. DevonThink
        3. Use Google
        4. Others?
      4. Save As/Export plain text > .txt files
      5. Engage the “Text” in New Ways
        1. New Ways of Seeing “Texts”
          1. Keyword Search
          2. Line Search
          3. Word Counts
          4. Concordance
          5. Patterns
        2. Tools to Help with Seeing “Texts”
          1. AntConc
          2. BBEdit (“It doesn’t suck” ®)
          3. Mac OS X and Linux: cat, find, grep, and print (use “man cat” and “man grep” in the Terminal to learn more; more info here, here, here, here, and here)
          4. DevonThink
          5. Notepad++
          6. Mac OS X Spotlight/Windows 7 Search
          7. TextEdit
          8. Others?
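
To make the “New Ways of Seeing ‘Texts’” items in the outline a little more concrete, here is a minimal Python sketch of a line search, a word count, and a simple keyword-in-context concordance run over a plain-text string. It is only an illustration under stated assumptions, not how AntConc or any other tool listed above works internally: the function names, the sample passage, and the tokenizing rule (lowercase runs of letters, digits, and straight apostrophes) are all my own choices for the example.

```python
import re
from collections import Counter

# Hypothetical sample passage; in practice you would read one of your
# exported .txt files instead, e.g. open("my_notes.txt", encoding="utf-8").read()
SAMPLE = """The sea was calm that night.
No one spoke of the sea again.
Morning came, and the ship sailed on."""


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Assumption: a "word" is a run of letters, digits, or straight apostrophes.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def line_search(text, keyword):
    """Return (line_number, line) pairs for lines containing the keyword.

    This is a case-insensitive substring match, similar in spirit to grep -i -n.
    """
    needle = keyword.lower()
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if needle in line.lower()
    ]


def word_counts(text, top=5):
    """Return the `top` most frequent word tokens with their counts."""
    return Counter(tokenize(text)).most_common(top)


def concordance(text, keyword, width=3):
    """Return (left context, keyword, right context) for each exact token match.

    `width` is the number of tokens of context kept on each side.
    """
    tokens = tokenize(text)
    needle = keyword.lower()
    rows = []
    for i, token in enumerate(tokens):
        if token == needle:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows


if __name__ == "__main__":
    for number, line in line_search(SAMPLE, "sea"):
        print(f"{number}: {line}")
    print(word_counts(SAMPLE, top=2))
    for left, token, right in concordance(SAMPLE, "sea", width=2):
        print(f"{left:>10} [{token}] {right}")
```

Because the concordance works on tokens rather than lines, the context for a match can run across a line break, which is usually what you want with OCR output where line breaks follow the printed page rather than the sentence.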

[Image: Miao awaits digitization.]

I am a professor of English at the New York City College of Technology, CUNY. My teaching includes composition and technical communication, and my research focuses on 20th/21st-century American culture, science fiction, neuroscience, and digital technology.

Posted in Georgia Tech, Pedagogy, Research

Who is Dynamic Subspace?

Dr. Jason W. Ellis shares his interdisciplinary research and pedagogy on DynamicSubspace.net. Its focus includes the exploration of science, technology, and cultural issues through science fiction and neuroscientific approaches. It includes vintage computing, LEGO, and other wonderful things, too.

He is an Assistant Professor of English at the New York City College of Technology, CUNY (City Tech) where he teaches college writing, technical communication, and science fiction.

He holds a Ph.D. in English from Kent State University, M.A. in Science Fiction Studies from the University of Liverpool, and B.S. in Science, Technology, and Culture from Georgia Tech.

He welcomes questions, comments, and inquiries for collaboration via email at jellis at citytech dot cuny dot edu or Twitter @dynamicsubspace.
