Activity 2.2.6 (Text mining and analysis): Use Voyant to explore a text corpus

  1. Review the following example of a researcher using Voyant: Using corpus linguistics and data visualization to understand trends in apocalyptic fiction
  2. Select a corpus of text to analyze. You have several options for generating this corpus:
    1. Go to Project Gutenberg and pick a book whose full text is available (for example, the English version of Noli Me Tangere by José Rizal). Use the body of the text as the corpus in Voyant.
    2. Go to Google Dataset Search and find some data you can use. For example, here are the results of a search related to book publications. You can download the dataset and then use it in your text analysis.
  3. Prepare the text for analysis as described in Finding and Preparing Text.
  4. Go to Voyant and use it to explore trends in your text corpus.
  5. Are there any interesting patterns that you can see? Be prepared to share your findings.
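
If you chose a Project Gutenberg book, the preparation step usually includes cutting away the license header and footer so that only the body of the book reaches Voyant. The following is a minimal Python sketch of that clean-up, not the method described in Finding and Preparing Text. It assumes the file wraps the book in lines beginning with "*** START OF THE PROJECT GUTENBERG EBOOK" and "*** END OF THE PROJECT GUTENBERG EBOOK"; the exact wording varies between files, so check your own download. The function name extract_body and the sample text are made up for illustration.

```python
import re

# Assumption: the book body sits between a "*** START OF ..." line and an
# "*** END OF ..." line. Wording differs between files, so adjust if needed.
START_RE = re.compile(
    r"^\*\*\*\s*START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$",
    re.MULTILINE | re.IGNORECASE,
)
END_RE = re.compile(
    r"^\*\*\*\s*END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$",
    re.MULTILINE | re.IGNORECASE,
)


def extract_body(raw_text: str) -> str:
    """Return the text between the start and end markers.

    If a marker is missing, fall back to the start or end of the file.
    """
    start = START_RE.search(raw_text)
    begin = start.end() if start else 0
    # Look for the end marker only after the start marker.
    end = END_RE.search(raw_text, begin)
    finish = end.start() if end else len(raw_text)
    return raw_text[begin:finish].strip()


# Tiny made-up sample standing in for a downloaded plain-text file.
sample = (
    "Title: Example\n"
    "*** START OF THE PROJECT GUTENBERG EBOOK EXAMPLE ***\n"
    "Chapter 1\n"
    "The story begins.\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK EXAMPLE ***\n"
    "License text\n"
)
body = extract_body(sample)
print(body)
```

To use it on a real download, read the file with open(path, encoding="utf-8").read(), pass the result to extract_body, and paste or upload the returned text into Voyant.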

Advanced version of this assignment: If you can get a copy of Benedict Anderson’s book, Why Counting Counts, you can learn more about his interesting use of text mining to argue that national consciousness changed between the writing of Noli Me Tangere and El Filibusterismo, and that the change can be seen in the shift in Rizal’s use of language. See whether you can verify his claims!
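
One way to start checking a claim like this outside Voyant is to compare how often chosen words appear in each novel, scaled by the length of the text. The Python sketch below is only an illustration under stated assumptions: the function name relative_frequencies, the placeholder texts, and the example terms are all hypothetical and are not taken from Anderson's book. Replace them with the prepared text of each novel and the terms you want to investigate.

```python
import re
from collections import Counter


def relative_frequencies(text: str, terms: list[str]) -> dict[str, float]:
    """Return occurrences per 10,000 words for each term (case-insensitive)."""
    # Tokenize into runs of letters; digits, punctuation, and underscores are dropped.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return {term: 0.0 for term in terms}
    return {term: counts[term.lower()] * 10_000 / total for term in terms}


# Placeholder strings; in practice, load the prepared text of each novel here.
text_a = "the people and the land"
text_b = "the nation and the people of the nation"

# Hypothetical terms chosen only to demonstrate the function.
terms = ["people", "nation"]

freq_a = relative_frequencies(text_a, terms)
freq_b = relative_frequencies(text_b, terms)

for term in terms:
    print(f"{term}: {freq_a[term]:.1f} vs {freq_b[term]:.1f} per 10,000 words")
```

Scaling by text length matters because the two novels are not the same size, so raw counts alone could mislead. Note that a translation may not preserve the word choices of the original, which limits what this kind of count can show.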