This course provides a unique opportunity for you to learn key components of text mining and analytics, aided by real-world datasets and a text mining toolkit written in Java. Hands-on experience in core text mining techniques, including text preprocessing, sentiment analysis, and topic modeling, helps learners train to become competent data scientists.
By combining lecture notes with lab sessions based on the y-TextMiner toolkit developed for the class, learners will be able to develop interesting text mining applications.
What's included
4 videos, 1 reading, 1 peer review
4 videos•Total 53 minutes
1.1 Description of the course including the objectives and outcomes•15 minutes
1.2 Explanations of the y-TextMiner package and the datasets•9 minutes
1.3 How-to-do: workspace installation and setup•15 minutes
1.4 How-to-use: the y-TextMiner package (download it at http://informatics.yonsei.ac.kr/yTextMiner/yTextMiner1.2.zip)•14 minutes
1 reading•Total 10 minutes
What is Text Mining?•10 minutes
1 peer review•Total 60 minutes
y-TextMiner installation and a simple Java program•60 minutes
Text Preprocessing
Module 2•2 hours to complete
What's included
5 videos, 1 reading, 1 peer review
5 videos•Total 67 minutes
2.1 Description of possible project ideas•10 minutes
2.2 What is text mining?•10 minutes
2.3 Description of preprocessing techniques•12 minutes
2.4 How-to-do: normalization including tokenization and lemmatization•21 minutes
2.5 How-to-do: N-Grams•14 minutes
1 reading•Total 10 minutes
Text Preprocessing•10 minutes
1 peer review•Total 60 minutes
Preprocessing Practice•60 minutes
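As an illustration of the tokenization and N-gram steps named in lectures 2.4 and 2.5, here is a minimal sketch in plain Java. It does not use the y-TextMiner API; the class name `NGramSketch` and both method names are hypothetical, and the regular-expression tokenizer is a simplifying assumption rather than the method taught in the course.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class NGramSketch {

    // Lowercase the text and split on runs of characters that are
    // neither letters nor digits (an assumed, very simple tokenizer).
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Slide a window of size n over the tokens and join each window with a space.
    public static List<String> ngrams(List<String> tokens, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        List<String> result = new ArrayList<>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            result.add(String.join(" ", tokens.subList(i, i + n)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("Text mining is fun.");
        System.out.println(tokens);            // [text, mining, is, fun]
        System.out.println(ngrams(tokens, 2)); // [text mining, mining is, is fun]
    }
}
```

Lemmatization, also covered in lecture 2.4, needs a dictionary or a trained model and is left out of this sketch.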
Text Analysis Techniques
Module 3•2 hours to complete
What's included
6 videos, 2 readings, 1 peer review
6 videos•Total 62 minutes
3.1 Description of stopword removal, stemming, and POS tagging•13 minutes
3.2 Explanations of named entity recognition•12 minutes
3.3 Explanations of dependency parsing•8 minutes
3.4 How-to-do: stopword removal and stemming•14 minutes
3.5 How-to-do: NER and POS Tagging•6 minutes
3.6 How-to-do: constituency and dependency parsing•9 minutes
2 readings•Total 20 minutes
Stemming and Lemmatization•10 minutes
Named Entity Recognition•10 minutes
1 peer review•Total 60 minutes
Text Analysis Practice•60 minutes
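To make the stopword removal step from lecture 3.4 concrete, the following is a small sketch in plain Java. It is not taken from the y-TextMiner toolkit: the class `StopwordSketch`, its method, and the eight-word stopword list are all assumptions chosen for illustration, and real stopword lists are much longer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopwordSketch {

    // A tiny illustrative stopword list (assumed, not the course's list).
    private static final Set<String> STOPWORDS = new HashSet<>(
            Arrays.asList("a", "an", "the", "is", "of", "and", "to", "in"));

    // Keep only the tokens whose lowercase form is not in the stopword list.
    public static List<String> removeStopwords(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!STOPWORDS.contains(token.toLowerCase(Locale.ROOT))) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("The", "plot", "of", "the", "movie", "is", "thin");
        System.out.println(removeStopwords(tokens)); // [plot, movie, thin]
    }
}
```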
Term Weighting and Document Classification
Module 4•2 hours to complete
What's included
5 videos, 2 readings, 1 peer review
5 videos•Total 52 minutes
4.1 Explanations of TF*IDF•9 minutes
4.2 Explanations of document classification•11 minutes
4.3 Explanations of sentiment analysis•10 minutes
4.4 How-to-do: computation of TF*IDF weighting•11 minutes
4.5 How-to-do: classification with Logistic Regression•11 minutes
2 readings•Total 20 minutes
Text Classification•10 minutes
TF-IDF•10 minutes
1 peer review•Total 60 minutes
Document Classification Practice•60 minutes
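The TF*IDF weighting named in lectures 4.1 and 4.4 can be sketched in a few lines of plain Java. This sketch assumes one common variant, raw term count multiplied by ln(N / df), which may differ from the variant used in the course or in y-TextMiner. The class `TfIdfSketch` and its method are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TfIdfSketch {

    // Weight each term of one tokenized document against a tokenized corpus.
    // tf  = number of times the term occurs in the document
    // idf = ln(N / df), where N is the corpus size and df is the number
    //       of corpus documents that contain the term
    public static Map<String, Double> tfidf(List<String> doc, List<List<String>> corpus) {
        Map<String, Integer> termCounts = new HashMap<>();
        for (String term : doc) {
            termCounts.merge(term, 1, Integer::sum);
        }
        Map<String, Double> weights = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : termCounts.entrySet()) {
            int df = 0;
            for (List<String> other : corpus) {
                if (other.contains(entry.getKey())) {
                    df++;
                }
            }
            // If the document is not part of the corpus, df can be 0; use weight 0 then.
            double idf = df == 0 ? 0.0 : Math.log((double) corpus.size() / df);
            weights.put(entry.getKey(), entry.getValue() * idf);
        }
        return weights;
    }

    public static void main(String[] args) {
        List<List<String>> corpus = Arrays.asList(
                Arrays.asList("good", "movie"),
                Arrays.asList("bad", "movie"),
                Arrays.asList("good", "good", "plot"));
        // Print the weights of the third document, sorted by term.
        System.out.println(tfidf(corpus.get(2), corpus));
    }
}
```

Under this variant a term that appears in every document gets a weight of zero, so terms shared by the whole corpus contribute nothing to a downstream classifier.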
Sentiment Analysis
Module 5•2 hours to complete
What's included
6 videos, 1 reading, 1 peer review
6 videos•Total 59 minutes
5.1 Explanations of sentiment analysis with supervised learning•10 minutes
5.2 Explanations of sentiment analysis with unsupervised learning•11 minutes
5.3 Explanations of sentiment analysis with CoreNLP, LingPipe and SentiWordNet•10 minutes
5.4 How-to-do: sentiment analysis with CoreNLP•9 minutes
5.5 How-to-do: sentiment analysis with LingPipe•10 minutes
5.6 How-to-do: sentiment analysis with SentiWordNet•10 minutes
1 reading•Total 10 minutes
Opinion mining and sentiment analysis by Bo Pang and Lillian Lee•10 minutes
1 peer review•Total 60 minutes
Sentiment Analysis Practice•60 minutes
Topic Modeling
Module 6•3 hours to complete
What's included
5 videos, 1 reading, 1 peer review
5 videos•Total 55 minutes
6.1 Description of Topic Modeling•8 minutes
6.2 Explanations of LDA and DMR•10 minutes
6.3 Description of Topic Modeling with Mallet•14 minutes
6.4 How-to-do: LDA•11 minutes
6.5 How-to-do: DMR•11 minutes
1 reading•Total 10 minutes
Introduction to Probabilistic Topic Models by David Blei•10 minutes
Yonsei University was established in 1885 and is the oldest private university in Korea.
Yonsei’s main campus is situated minutes away from the economic, political, and cultural centers of Seoul’s metropolitan downtown. Yonsei has 3,500 eminent faculty members who are conducting cutting-edge research across all academic disciplines. There are 18 graduate schools, 22 colleges and 133 subsidiary institutions hosting a selective pool of students from around the world.
Yonsei is proud of its history and reputation as a leading institution of higher education and research in Asia.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade, but it also means that you will not be able to purchase a Certificate experience.
What will I get if I purchase the Certificate?
When you purchase a Certificate you get access to all course materials, including graded assignments. Upon completing the course, your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.