In this guided project you will learn how to import textual data stored in raw text files into R, turn these files into a corpus (a collection of textual documents), reshape the corpus from documents into paragraphs, and tokenize the text, all using the R package quanteda. You will then learn how to classify the texts using the Naive Bayes algorithm.
Introduction to Text Classification in R with quanteda
Taught in English
Instructor: Nicole Baerg
Guided Project
What you'll learn
Import text documents, reshape texts from documents to paragraphs, and turn your texts into a machine-readable format.
Classify presidential concession speeches by political party using a Naive Bayes algorithm and assess the accuracy of the predictions.
About this Guided Project
Learn step-by-step
In a video that plays in a split-screen with your work area, your instructor will walk you through these steps:
Load text documents into RStudio, convert them into a corpus, and extract data from the document file names into a new column in a data frame.
Reshape the dataset from documents into paragraphs and check for balance in your labels.
Split a text corpus into tokens (individual words and punctuation marks). Then clean the data by removing specific words and spellings.
Create a document-feature matrix, divide it into training and test sets, and run a Naive Bayes model. Then examine the model's prediction accuracy and learn about accuracy, precision, and recall.
Run Naive Bayes models for a second and third time. Then examine the models' predictions and compare the model outputs with results from the previous task.
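The course materials themselves are not reproduced on this page. As a rough sketch of what the steps above might look like in quanteda, the example below assumes the quanteda and quanteda.textmodels packages are installed. The toy texts, the document names, the "party" and "id" variables, and the 75/25 split are all invented for illustration and are not taken from the course.

```r
library(quanteda)
library(quanteda.textmodels)

# Made-up stand-ins for imported speech files. Each text holds two
# paragraphs separated by a blank line; names mimic file names.
speeches <- c(
  dem_1 = "We fought hard for working families.\n\nHealth care remains our cause.",
  dem_2 = "Our movement for fair wages continues.\n\nWe stand with teachers and nurses.",
  rep_1 = "We campaigned for lower taxes.\n\nA strong defense keeps the nation safe.",
  rep_2 = "Small business drives our economy.\n\nLimited government protects our freedom."
)

# Step 1: build a corpus and pull the label out of the document names.
corp <- corpus(speeches)
docvars(corp, "party") <- sub("_.*$", "", docnames(corp))

# Step 2: reshape from documents to paragraphs and check label balance.
corp_par <- corpus_reshape(corp, to = "paragraphs")
docvars(corp_par, "id") <- seq_len(ndoc(corp_par))
print(table(docvars(corp_par, "party")))

# Step 3: tokenize, then drop punctuation and English stopwords.
toks <- tokens(corp_par, remove_punct = TRUE)
toks <- tokens_remove(toks, pattern = stopwords("en"))

# Step 4: document-feature matrix, train/test split, Naive Bayes.
dfmat <- dfm(toks)
set.seed(42)
train_id <- sample(seq_len(ndoc(dfmat)), size = floor(0.75 * ndoc(dfmat)))
dfmat_train <- dfm_subset(dfmat, id %in% train_id)
dfmat_test  <- dfm_subset(dfmat, !id %in% train_id)

nb <- textmodel_nb(dfmat_train, y = docvars(dfmat_train, "party"))

# The test matrix must have the same features as the training matrix.
dfmat_test <- dfm_match(dfmat_test, features = featnames(dfmat_train))
pred <- factor(predict(nb, newdata = dfmat_test), levels = c("dem", "rep"))
actual <- factor(docvars(dfmat_test, "party"), levels = c("dem", "rep"))

# Confusion table and metrics, treating "dem" as the positive class.
# With a test set this small, precision or recall may be NaN.
conf <- table(actual = actual, predicted = pred)
accuracy  <- sum(diag(conf)) / sum(conf)
precision <- conf["dem", "dem"] / sum(conf[, "dem"])
recall    <- conf["dem", "dem"] / sum(conf["dem", ])
print(conf)
```

Accuracy is the share of all test paragraphs labeled correctly, precision is the share of paragraphs predicted as "dem" that really are, and recall is the share of true "dem" paragraphs that the model found. Rerunning with a different seed or different cleaning choices is one way to compare model outputs across runs, as the final step describes.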
Recommended experience
Basic knowledge of the statistical programming language R
How you'll learn
Skill-based, hands-on learning
Practice new skills by completing job-related tasks.
Expert guidance
Follow along with pre-recorded videos from experts using a unique side-by-side interface.
No downloads or installation required
Access the tools and resources you need in a pre-configured cloud workspace.
Available only on desktop
This Guided Project is designed for laptops or desktop computers with a reliable Internet connection, not mobile devices.
Frequently asked questions
By purchasing a Guided Project, you'll get everything you need to complete it, including access to a cloud desktop workspace through your web browser that contains the files and software you need to get started, plus step-by-step video instruction from a subject matter expert.
Because your workspace contains a cloud desktop that is sized for a laptop or desktop computer, Guided Projects are not available on your mobile device.
Guided Project instructors are subject matter experts who have experience in the skill, tool or domain of their project and are passionate about sharing their knowledge to impact millions of learners around the world.