Introduction to Topic Modelling in R

Offered By
Coursera Project Network
In this Guided Project, you will:

Load textual data into R, and pre-process it

Convert textual data into a document feature matrix

Run an LDA topic model on your data

Clock1
Beginner
No download needed
Split-screen video
English
Desktop only

By the end of this project, you will know how to load and pre-process a data set of text documents, convert it into a document feature matrix, and reduce its dimensionality. You will also know how to run an LDA (Latent Dirichlet Allocation) topic model, an unsupervised machine learning method. Finally, you will know how to plot the change in topics over time and explore the distribution of topic probabilities in each document.

Skills you will develop

  • Sampling
  • Topic Modelling
  • Unsupervised Learning
  • Data Visualization (DataViz)
  • Text Corpus

Learn step-by-step

In a video that plays in a split-screen with your work area, your instructor will walk you through these steps:

  1. Load textual data into R and pre-process it to prepare it for topic modelling.

  2. Convert textual data into a document feature matrix and reduce its dimensionality before applying the model.

  3. Run an LDA topic model on your data and explore the topics identified by the model as well as the most frequently used words associated with each topic.

  4. Plot the change in topics over time in your data and explore the distribution of topic probabilities in each of your textual documents.
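
The four steps above can be sketched in a few lines of R. This is only an illustrative outline, not the instructor's code: the course description does not name the packages it uses, so quanteda and topicmodels are assumptions here, and the toy documents, the year column, the trimming threshold and the choice of two topics are all hypothetical.

```r
# Assumed packages (not named in the course description)
library(quanteda)
library(topicmodels)

# Hypothetical toy data: six short documents, each tagged with a year
docs <- data.frame(
  text = c(
    "The central bank raised interest rates to slow inflation",
    "Inflation and interest rates worry the bank and investors",
    "The team won the match after scoring a late goal",
    "A late goal decided the match for the home team",
    "Investors expect the bank to hold interest rates",
    "The home team lost the match despite an early goal"
  ),
  year = c(2019, 2019, 2020, 2020, 2021, 2021),
  stringsAsFactors = FALSE
)

# Step 1: load the text and pre-process it
corp <- corpus(docs, text_field = "text")
toks <- tokens(corp, remove_punct = TRUE, remove_numbers = TRUE)
toks <- tokens_tolower(toks)
toks <- tokens_remove(toks, stopwords("en"))

# Step 2: build the document feature matrix and reduce its dimensionality
# by dropping terms that occur in fewer than two documents (assumed threshold)
dfmat <- dfm(toks)
dfmat <- dfm_trim(dfmat, min_docfreq = 2)

# Step 3: fit an LDA model with k = 2 topics (k is an assumption) and
# list the five highest-ranked terms for each topic
dtm <- convert(dfmat, to = "topicmodels")
lda <- LDA(dtm, k = 2, method = "Gibbs", control = list(seed = 42))
top_terms <- get_terms(lda, 5)
print(top_terms)

# Step 4: per-document topic probabilities, then their mean per year
theta <- posterior(lda)$topics
colnames(theta) <- paste0("topic_", seq_len(ncol(theta)))
topic_by_year <- aggregate(theta,
                           by = list(year = docvars(dfmat, "year")),
                           FUN = mean)
print(topic_by_year)

# Plot the change in mean topic probability over time (one line per topic)
matplot(topic_by_year$year, as.matrix(topic_by_year[, -1]),
        type = "b", pch = 1, xlab = "Year",
        ylab = "Mean topic probability")
```

Each row of `theta` is one document's distribution over topics, so its entries sum to one. Averaging those rows within each year is one simple way to trace topics over time; on real data the number of topics and the trimming threshold would need tuning.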

How Guided Projects work

Your workspace is a cloud desktop right in your browser; no download is required.

In a split-screen video, your instructor guides you step-by-step

Frequently asked questions


More questions? Visit the Learner Help Center.