Text analysis with R : for students of literature
| Main Author: | |
|---|---|
| Corporate Author: | |
| Other Authors: | |
| Format: | eBook |
| Language: | English |
| Published: | Cham : Springer, 2020. |
| Edition: | 2nd ed. |
| Series: | Quantitative methods in the humanities and social sciences. |
| Subjects: | |
Table of Contents:
- Intro
- Preface to the Second Edition
- Preface from the First Edition (Still Relevant)
- Contents
- About the Authors
- List of Figures
- List of Tables
- Part I Microanalysis
- 1 R Basics
- 1.1 Introduction
- 1.2 Download and Install R
- 1.3 Download and Install RStudio
- 1.4 Download the Supporting Materials
- 1.5 RStudio
- 1.6 Let's Get Started
- 1.7 Saving Commands and R Scripts
- 1.8 Assignment Operators
- 1.9 Practice
- References
- 2 First Foray into Text Analysis with R
- 2.1 Loading the First Text File
- 2.2 A Word About Warnings, Errors, Typos, and Crashes
- 2.3 Separate Content from Metadata
- 2.4 Reprocessing the Content
- 2.5 Beginning Some Analysis
- 2.6 Practice
- 3 Accessing and Comparing Word Frequency Data
- 3.1 Introduction
- 3.2 Start Up Code
- 3.3 Accessing Word Data
- 3.4 Recycling
- 3.5 Practice
- 4 Token Distribution and Regular Expressions
- 4.1 Introduction
- 4.2 Start Up Code
- 4.3 A Word About Coding Style
- 4.4 Dispersion Plots
- 4.5 Searching with grep
- 4.6 Practice
- Reference
- 5 Token Distribution Analysis
- 5.1 Cleaning the Workspace
- 5.2 Start Up Code
- 5.3 Identifying Chapter Breaks with grep
- 5.4 The for Loop and if Conditional
- 5.5 The for Loop in Eight Parts
- 5.5.1
- 5.5.2
- 5.5.3
- 5.5.4
- 5.5.5
- 5.5.6
- 5.5.7
- 5.5.8
- 5.6 Accessing and Processing List Items
- 5.6.1 rbind
- 5.6.2 More Recycling
- 5.6.3 apply
- 5.6.4 do.call (do dot call)
- 5.6.5 cbind
- 5.7 Practice
- 6 Correlation
- 6.1 Introduction
- 6.2 Start Up Code
- 6.3 Correlation Analysis
- 6.4 A Word About Data Frames
- 6.5 Testing Correlation with Randomization
- 6.6 Practice
- 7 Measures of Lexical Variety
- 7.1 Lexical Variety and the Type-Token Ratio
- 7.2 Start Up Code
- 7.3 Mean Word Frequency
- 7.4 Extracting Word Usage Means
- 7.5 Ranking the Values
- 7.6 Calculating the TTR inside lapply
- 7.7 A Further Use of Correlation
- 7.8 Practice
- Reference
- 8 Hapax Richness
- 8.1 Introduction
- 8.2 Start Up Code
- 8.3 sapply
- 8.4 An Inline Conditional Function
- 8.5 Practice
- 9 Do It KWIC
- 9.1 Introduction
- 9.2 Custom Functions
- 9.3 A Tokenization Function
- 9.4 Finding Keywords and Their Contextual Neighbors
- 9.5 Practice
- Reference
- 10 Do It KWIC(er) (and Better)
- 10.1 Getting Organized
- 10.2 Separating Functions for Reuse
- 10.3 User Interaction
- 10.4 readline
- 10.5 Building a Better KWIC Function
- 10.6 Fixing Some Problems
- 10.7 Practice
- Part II Metadata
- 11 Introduction to dplyr
- 11.1 Start Up Code
- 11.2 Using stack to Create a Data Frame
- 11.3 Installing and Loading dplyr
- 11.4 Using mutate, filter, arrange, and select
- 11.4.1 Mutate
- 11.4.2 filter
- 11.4.3 select
- 11.4.4 arrange
- 11.5 Practice
- 12 Parsing TEI XML
- 12.1 Introduction
- 12.2 The Text Encoding Initiative (TEI)
- 12.3 Parsing XML with R Using the Xml2 Package
- 12.4 Accessing the Textual Content