Participants will be introduced to Voyant, a freely accessible text analysis tool, and through hands-on examples will learn its features for analyzing both single and multiple texts.
These are BIG data files, so they are not easy to work with on a laptop. Ask for our help.
Voyant can read the following file types: .txt, .html, .htm, .xml, MS Word .doc or .docx, .rtf, and .pdf.
A single document or multiple documents can be analyzed. A collection of texts analyzed together is called a corpus (plural: corpora).
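If you are preparing a folder of files ahead of the workshop, a short script can flag anything outside the list above before you upload. This is only a sketch: the function name `is_supported` and the sample file names are made up for illustration, and the extension set simply copies the list in this guide.

```python
from pathlib import Path

# Extensions copied from the list in this guide.
SUPPORTED_EXTENSIONS = {".txt", ".html", ".htm", ".xml", ".doc", ".docx", ".rtf", ".pdf"}


def is_supported(filename: str) -> bool:
    """Return True if the file's extension appears in the list above (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


# Hypothetical file names, used only to show the check.
for name in ["red_cross_textbook.txt", "article.PDF", "photo.png"]:
    status = "ok" if is_supported(name) else "not in the list"
    print(f"{name}: {status}")
```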
You can load text one of three ways:
Open the American Red Cross Text-Book from the text box to the left, then open either the first HTML version or the plain text UTF-8 version. Select all of the text using Ctrl+A on a PC or Command+A on a Mac, copy it, then paste it into the box on the Voyant home page.
Click the REVEAL button and in a few minutes you'll see the results.
The Voyant result screen is divided into 5 segments or "skins," each displaying a different format of text analysis.
Mouse over the upper right corner of the Cirrus skin to reveal the task icons available in this skin.
The word cloud is the most obvious of the visualizations available in Voyant. The default minimum is 55 words:
Take a moment to notice what happens in each of the other 4 skins as you click on individual terms in the word cloud.
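A word cloud sizes each term by how often it appears in the text. To get a feel for the counting behind it, here is a minimal sketch. The stop-word list and the sample sentence are invented for illustration, and this is not how Voyant itself is implemented.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list (an assumption, not Voyant's own list).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "for"}


def top_terms(text, limit=10):
    """Return the `limit` most frequent words, ignoring case and stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return counts.most_common(limit)


# Invented sample sentence.
sample = "The nurse washes the wound and the nurse dresses the wound."
print(top_terms(sample, limit=2))
```

The terms with the highest counts are the ones a word cloud would draw largest.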
Explore the Trends skin:
Customize Your Trends Skin:
Let's explore the relationship of several terms in this text. I'm interested in how the words "mother," "father," and "child" relate to one another throughout the text. Type each of these words into the search box at the bottom of the Trends skin to add them to the visualization.
Hover your mouse pointer over the ? to see a list of syntax variations you can perform. Perhaps most useful is the option for proximity searching: ~5 to locate words within 5 words of each other (or other number that is appropriate for the situation).
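To see what a proximity search is doing conceptually, the sketch below finds places where two words occur within a given number of words of each other. The function name `within_distance` and the sample sentence are hypothetical, and Voyant's own matching rules may differ in detail.

```python
import re


def within_distance(text, first, second, max_gap=5):
    """Return (position_of_first, position_of_second) pairs no more than max_gap words apart."""
    words = re.findall(r"[a-z']+", text.lower())
    first_positions = [i for i, word in enumerate(words) if word == first]
    second_positions = [i for i, word in enumerate(words) if word == second]
    return [
        (i, j)
        for i in first_positions
        for j in second_positions
        if i != j and abs(i - j) <= max_gap
    ]


# Invented sample sentence.
sample = "The mother carried the child to the doctor while the father waited."
print(within_distance(sample, "mother", "child"))  # one nearby pair
print(within_distance(sample, "father", "child"))  # none within 5 words
```

Raising `max_gap` widens the window, in the same way that choosing a larger number after the ~ does in the search box.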
The Context skin is where we will explore the other visualization tools that are part of Voyant.
Move your mouse into the title bar on the Context skin and click on the Window icon that will appear. You'll see a menu of options. Let's look at the Corpus Tools option. You're familiar with several of these (Cirrus, Terms, and Summary) as they are already visible in the standard Voyant display. We'll take a closer look at the Word Tree, Topics, and Scatterplot tools.
Lastly, we'll look at how Voyant processes and helps you analyze several texts as a unit, known as a corpus.
My example for this workshop uses several articles I've downloaded from my Zotero file which I am hypothetically using to write a literature or systematic review article. I want to see if Voyant will help me identify the key themes among the articles and the relationships of those themes across the selected literature.
Note changes in the Reader, Trends, and Summary skins so that the separate documents are identified.
A variety of sources can be used to obtain text for analysis:
Your first pass may reveal data you don't care about:
These are indications that your text is "dirty": it contains extra markup or older typefaces that Voyant either reads and adds to your results, or cannot read and replaces with a best approximation. You will want to clean these up before running the text through the analyzer again.
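One way to do a first cleaning pass is with a short script before you paste or upload the text. The sketch below strips leftover HTML or XML tags, decodes entities such as `&amp;`, and folds old-style ligatures and the long s into ordinary letters. The function name `clean_text` and the sample string are made up for illustration, and real texts will often need extra rules of their own.

```python
import html
import re
import unicodedata


def clean_text(raw):
    """Remove markup and normalize characters so only readable text remains."""
    text = re.sub(r"<[^>]+>", " ", raw)         # replace HTML/XML tags with a space
    text = html.unescape(text)                  # decode entities, e.g. &amp; becomes &
    text = unicodedata.normalize("NFKC", text)  # fold ligatures and the long s into plain letters
    text = re.sub(r"\s+", " ", text)            # collapse runs of whitespace
    return text.strip()


# Invented sample: a tag, an "fi" ligature (\ufb01), an entity, and stray spaces.
dirty = "<p>The \ufb01rst &amp; last   aid</p>"
print(clean_text(dirty))
```

Save the cleaned text as a plain .txt file and load it into Voyant as before.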