Natural Language Processing Data Exploration

This text analysis uses NLP functions from the R tm package to explore and understand the corpus before implementing an n-gram prediction model.
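
As a rough illustration of the kind of exploration that precedes an n-gram model, the sketch below counts unigram and bigram frequencies over a few lines of text. It is written in plain Python rather than with the R tm package, and it is not the code behind the report: the function names `tokenize` and `ngram_counts`, the simple regex tokenizer, and the sample sentences are all assumptions made for this example.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(lines, n):
    """Count n-grams within each line; n-grams never span two lines."""
    counts = Counter()
    for line in lines:
        tokens = tokenize(line)
        # Slide a window of length n over the tokens of this line.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Hypothetical sample lines standing in for the real corpus.
sample = [
    "The cat sat on the mat.",
    "The cat ate the fish.",
]

unigrams = ngram_counts(sample, 1)
bigrams = ngram_counts(sample, 2)

# Show the most frequent terms, the raw material for frequency charts.
print(unigrams.most_common(3))
print(bigrams.most_common(3))
```

Sorted counts like these can feed frequency charts, and they show how quickly the number of distinct n-grams grows as n increases, which matters when sizing a prediction model.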

The corpus is the “English-US” dataset obtained from HC Corpora. See their readme file for details on the available corpora.

See the report here.

Figure: frequency charts