This course introduces the technologies behind web and search engines, including document indexing, searching and ranking. You will also learn different performance metrics for evaluating search quality, methods for understanding user intent and document semantics, and advanced applications including recommendation systems and summarization. Real-life examples and case studies are provided to reinforce the understanding of search algorithms.

Search Engines for Web and Enterprise Data


Instructors: Kenneth Wai-Ting Leung


2,222 already enrolled
16 assignments

There are 16 modules in this course
Welcome to the first module of this course! In this module, you will learn: (1) The major tasks involved in web search. (2) The history, evolution, impact and challenges of web search engines.
What's included
2 videos, 1 reading, 1 assignment
2 videos•Total 22 minutes
- Lecture 1.1 - Example of Search Engines & Federated vs Meta Search•12 minutes
- Lecture 1.2 - Difficulties & Document Retrieval Model & Evolution of Search Engines•10 minutes
1 reading•Total 60 minutes
- Introduction to Search Engines for Web and Enterprise Data•60 minutes
1 assignment•Total 30 minutes
- Lecture 1 - Introduction to Search Engines for Web and Enterprise Data•30 minutes
In this module, you will learn about the different business models of web search engines.
What's included
2 videos, 1 reading, 1 assignment
2 videos•Total 17 minutes
- Lecture 2.1 - Search Engine Business Model & Keyword Advertising•10 minutes
- Lecture 2.2 - Search Engine Related Jobs & Charging Methods & Business History•7 minutes
1 reading•Total 60 minutes
- Lecture 2 - Search Engine Business Model•60 minutes
1 assignment•Total 30 minutes
- Lecture 2 - Search Engine Business Model•30 minutes
In this module, you will learn: (1) Different information retrieval models, including Boolean and statistical models. (2) How to determine the important words in a document using TFxIDF.
What's included
2 videos, 1 reading, 1 assignment
2 videos•Total 20 minutes
- Lecture 3.1 - Retrieval Models•11 minutes
- Lecture 3.2 - TFxIDF•9 minutes
1 reading•Total 60 minutes
- Lecture 3 - TFxIDF•60 minutes
1 assignment•Total 30 minutes
- Lecture 3 - TFxIDF•30 minutes
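As a rough illustration of the TFxIDF weighting named in this module, the sketch below scores terms in a toy corpus. The course may use a different variant (for example, normalized term frequency); the formula weight = tf * log2(N / df) is an assumption here, and the function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: weight = tf * log2(N / df), where tf is the raw count
    of the term in the document, N is the number of documents, and df is
    the number of documents that contain the term.
    """
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: count each term at most once per document.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log2(n_docs / df[t]) for t in tf})
    return weights

# Hypothetical toy corpus.
docs = ["web search engine", "enterprise search engine", "web crawler"]
weights = tf_idf(docs)
```

Under this variant, a term that appears in fewer documents (such as "crawler") receives a higher weight than one that appears in many (such as "web"), and a term found in every document would receive a weight of zero.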
In this module, you will learn: (1) How to represent a document or query as a vector of keywords. (2) How to determine the degree of similarity between a pair of vectors using different similarity measures, including Inner Product, Cosine Similarity, Jaccard Coefficient, and Dice Coefficient.
What's included
4 videos, 1 reading, 1 assignment
4 videos•Total 32 minutes
- Lecture 4.1 - Vector Space Model & Similarity•12 minutes
- Lecture 4.2 - Interesting Things We Can Do in VSM•5 minutes
- Lecture 4.3 - Choices of Similarity Measures & Query Term Weight•6 minutes
- Lecture 4.4 - Term Independence Assumption & Synonyms & Unbalanced Property of VSM•9 minutes
1 reading•Total 60 minutes
- Lecture 4 - Vector Space Model•60 minutes
1 assignment•Total 30 minutes
- Lecture 4 - Vector Space Model•30 minutes
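The four similarity measures listed for this module can be sketched for weighted term vectors as follows. The Jaccard and Dice forms shown are the common generalizations to real-valued vectors, which is an assumption about what the course covers; the function names and the sample vectors are hypothetical.

```python
import math

def inner_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Inner product divided by the product of the vector lengths."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )

def jaccard(a, b):
    """Assumed generalized form: a.b / (|a|^2 + |b|^2 - a.b)."""
    dot = inner_product(a, b)
    return dot / (inner_product(a, a) + inner_product(b, b) - dot)

def dice(a, b):
    """Assumed generalized form: 2 a.b / (|a|^2 + |b|^2)."""
    return 2 * inner_product(a, b) / (inner_product(a, a) + inner_product(b, b))

# Hypothetical weights over a three-term vocabulary.
doc = [1.0, 2.0, 0.0]
query = [1.0, 0.0, 1.0]
```

The sketch assumes neither vector is all zeros; a production version would need to guard against division by zero.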
In this module, you will learn: (1) How to index documents using inverted files. (2) How to perform updates and deletions on inverted files.
What's included
4 videos, 1 reading, 1 assignment
4 videos•Total 34 minutes
- Lecture 5.1 - Keyword Index & Postings List•11 minutes
- Lecture 5.2 - Pros and Cons & Extensions•8 minutes
- Lecture 5.3 - Insertion, Deletion and Update•10 minutes
- Lecture 5.4 - Scalability Issues and Possible Solutions•4 minutes
1 reading•Total 60 minutes
- Lecture 5 - Inverted Files•60 minutes
1 assignment•Total 30 minutes
- Lecture 5 - Inverted Files•30 minutes
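A minimal in-memory sketch of an inverted file with insertion and deletion is shown below. It is only meant to illustrate the idea of postings lists; the data layout (a dict of term to {doc_id: term frequency}) and the function names are assumptions, not the course's implementation.

```python
from collections import defaultdict

def add_document(index, doc_id, text):
    """Insert a document: update the postings of every term it contains."""
    for term in text.lower().split():
        index[term][doc_id] = index[term].get(doc_id, 0) + 1

def build_index(docs):
    """Map each term to a postings dict {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        add_document(index, doc_id, text)
    return index

def delete_document(index, doc_id):
    """Naive deletion: scan every postings list and drop emptied terms."""
    for term in list(index):
        index[term].pop(doc_id, None)
        if not index[term]:
            del index[term]

# Hypothetical two-document collection.
index = build_index({1: "web search engine", 2: "enterprise search"})
```

Deleting a document here touches every postings list, which hints at why deletion and update are treated as separate topics from lookup: a real system would typically avoid the full scan, for example by keeping a per-document term list or by marking documents as deleted.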
In this module, you will learn: (1) How to use the Extended Boolean Model to rank documents. (2) How to evaluate conjunctive and disjunctive queries using the Extended Boolean Model.
What's included
2 videos, 1 reading, 1 assignment
2 videos•Total 16 minutes
- Lecture 6.1 - Soft Operators and Observations•9 minutes
- Lecture 6.2 - Soft Operator Visualization & P-norm Model•6 minutes
1 reading•Total 60 minutes
- Lecture 6 - Extended Boolean Model•60 minutes
1 assignment•Total 30 minutes
- Lecture 6 - Extended Boolean Model•30 minutes
In this module, you will learn: (1) The history and evolution of link-based ranking methods. (2) How to determine query/document similarities using HyPursuit, WISE, and PageRank. (3) Possible extensions that can be applied to PageRank.
What's included
3 videos, 1 reading, 1 assignment
3 videos•Total 38 minutes
- Lecture 7.1 - HyPursuit and WISE•13 minutes
- Lecture 7.2 - PageRank•10 minutes
- Lecture 7.3 - Other aspects / Applications of PageRank•15 minutes
1 reading•Total 60 minutes
- Lecture 7 - PageRank•60 minutes
1 assignment•Total 30 minutes
- Lecture 7 - PageRank•30 minutes
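For readers who want a concrete picture of PageRank before taking this module, here is a minimal power-iteration sketch. It assumes the normalized formulation PR(p) = (1 - d)/N + d * sum(PR(q)/outdegree(q)) with damping factor d = 0.85, and it spreads the rank of pages without out-links evenly over all pages; the lectures may present a different formulation. The function name and the example graph are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively compute PageRank for a graph {page: [linked pages]}.

    Assumes every link target also appears as a key of `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links[p])
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            for target in links[p]:
                # Each page shares its rank equally among its out-links.
                new_rank[target] += damping * rank[p] / len(links[p])
        rank = new_rank
    return rank

# Hypothetical three-page web: A links to B and C, B to C, C back to A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
```

In this example C ends up with the highest score because it receives links from both A and B, and the scores sum to 1 under the normalized formulation.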
In this module, you will learn: (1) How to calculate the hub and authority scores of web documents using the HITS algorithm. (2) How the re-ranking process in the HITS algorithm works.
What's included
4 videos, 1 reading, 1 assignment
4 videos•Total 25 minutes
- Lecture 8.1 - HITS Algorithm•11 minutes
- Lecture 8.2 - Convergence and Normalization of HITS•3 minutes
- Lecture 8.3 - Integrating PR and HITS in Search Engines•4 minutes
- Lecture 8.4 - Observations of HITS and PR•8 minutes
1 reading•Total 60 minutes
- Lecture 8 - HITS Algorithm•60 minutes
1 assignment•Total 30 minutes
- Lecture 8 - HITS Algorithm•30 minutes
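The hub and authority updates at the core of HITS can be sketched as below. This is a simplified illustration: it omits the construction of the query-specific root and base sets, normalizes scores to unit Euclidean length (one common choice), and assumes the graph has at least one link. The function name and the example graph are hypothetical.

```python
import math

def hits(links, iterations=20):
    """Return (hub, authority) scores for a graph {page: [linked pages]}.

    Assumes every link target appears as a key and at least one link exists.
    """
    pages = list(links)
    hub = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages linking in.
        auth = {p: 0.0 for p in pages}
        for p in pages:
            for target in links[p]:
                auth[target] += hub[p]
        # Hub score: sum of authority scores of the pages linked to.
        hub = {p: sum(auth[t] for t in links[p]) for p in pages}
        # Normalize both score vectors to unit length.
        a_norm = math.sqrt(sum(v * v for v in auth.values()))
        h_norm = math.sqrt(sum(v * v for v in hub.values()))
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth

# Hypothetical graph: A and B both point to C.
graph = {"A": ["C"], "B": ["C"], "C": []}
hub, auth = hits(graph)
```

Here C is the only authority, while A and B are equally good hubs because they both point to it.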
In this module, you will learn: (1) How to evaluate the retrieval effectiveness of an information retrieval system using Precision, Recall, F-Measure, Average Precision, DCG, and NDCG. (2) Which subjective relevance measures can be used to evaluate an information retrieval system.
What's included
3 videos, 1 reading, 1 assignment
3 videos•Total 36 minutes
- Lecture 9.1 - Explicit Evaluation & Recall, Precision, and Fallout•13 minutes
- Lecture 9.2 - Handling Inconsistency & Finding Relevant Items & Plotting Graphs•12 minutes
- Lecture 9.3 - More Performance Measures•11 minutes
1 reading•Total 60 minutes
- Lecture 9 - Performance Evaluation of IR System•60 minutes
1 assignment•Total 30 minutes
- Lecture 9 - Performance Evaluation of IR System•30 minutes
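Several of the measures named in this module can be sketched in a few lines. The sketch assumes the balanced F-measure (F1) and the DCG variant that divides the gain at rank i by log2(i + 1); other variants exist and the course may use one of them. The function names and sample data are hypothetical.

```python
import math

def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and balanced F-measure (F1).

    Assumes both `retrieved` and `relevant` are non-empty.
    """
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved)
    recall = hits / len(relevant)
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

def dcg(gains):
    """Assumed variant: sum of gain_i / log2(i + 1), ranks starting at 1."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg(gains):
    """DCG divided by the DCG of the ideal (descending-gain) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0

# Hypothetical result list and relevance judgments.
p, r, f1 = precision_recall_f1(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5])
```

With these inputs, two of the four retrieved documents are relevant and two of the three relevant documents are retrieved, so precision is 0.5 and recall is 2/3. A ranking whose graded gains are already in descending order has an NDCG of 1.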
In this module, you will learn: (1) How to use the TREC collection for benchmarking. (2) The characteristics of the TREC collection.
What's included
1 video, 1 reading, 1 assignment
1 video•Total 12 minutes
- Lecture 10 - Benchmarking•12 minutes
1 reading•Total 60 minutes
- Lecture 10 - Benchmarking•60 minutes
1 assignment•Total 30 minutes
- Lecture 10 - Benchmarking•30 minutes
In this module, you will learn: (1) What stemming is. (2) Different context-sensitive and context-free stemming algorithms. (3) How to calculate Successor Variety and Entropy for stemming.
What's included
4 videos, 1 reading, 1 assignment
4 videos•Total 33 minutes
- Lecture 11.1 - Indexing Process Overview•8 minutes
- Lecture 11.2 - Stemming Overview & Affix removal Algorithms•10 minutes
- Lecture 11.3 - Corpora Based Statistical Stemming•10 minutes
- Lecture 11.4 - Purpose of Obtaining the Stem of a Word•4 minutes
1 reading•Total 60 minutes
- Lecture 11 - Stopword Removal and Stemming•60 minutes
1 assignment•Total 30 minutes
- Lecture 11 - Stopword Removal and Stemming•30 minutes
In this module, you will learn: (1) How to perform document space modification using relevance feedback. (2) How to perform query modification using relevance feedback.
What's included
3 videos, 1 reading, 1 assignment
3 videos•Total 28 minutes
- Lecture 12.1 - Overview & Manual vs Automatic Feedback & Implicit vs Explicit Feedback•6 minutes
- Lecture 12.2 - Query Modification•11 minutes
- Lecture 12.3 - Document Modification•11 minutes
1 reading•Total 10 minutes
- Lecture 12 - Relevance Feedback•10 minutes
1 assignment•Total 30 minutes
- Lecture 12 - Relevance Feedback•30 minutes
In this module, you will learn: (1) Why relative preference is more useful than absolute preference in personalization. (2) The importance of eye-tracking user studies in personalized web search. (3) How to model preferences as a weighted vector.
What's included
4 videos, 1 reading, 1 assignment
4 videos•Total 32 minutes
- Lecture 13.1 - Overview of Personalized Web Search•6 minutes
- Lecture 13.2 - Eye-tracking Experiment & Clickthrough Analysis•6 minutes
- Lecture 13.3 - Preference Mining Strategies•7 minutes
- Lecture 13.4 - Apply User Preferences to Ranking•13 minutes
1 reading•Total 60 minutes
- Lecture 13 - Personalized Web Search•60 minutes
1 assignment•Total 30 minutes
- Lecture 13 - Personalized Web Search•30 minutes
In this module, you will learn: (1) How to calculate discrimination values for index term selection. (2) The importance of word usage in documents for search engine design.
What's included
3 videos, 1 reading, 1 assignment
3 videos•Total 25 minutes
- Lecture 14.1 - Zipf’s Law•10 minutes
- Lecture 14.2 - Term Discrimination Values•9 minutes
- Lecture 14.3 - Term Discrimination Value vs Document Frequency & Applying DV in Term Selection•6 minutes
1 reading•Total 60 minutes
- Lecture 14 - Index Term Selection•60 minutes
1 assignment•Total 30 minutes
- Lecture 14 - Index Term Selection•30 minutes
In this module, you will learn: (1) How to use collocated terms in lieu of strict phrases in search. (2) How to identify collocated terms using Pointwise Mutual Information (PMI). (3) How to utilize N-grams for search.
What's included
3 videos, 1 reading, 1 assignment
3 videos•Total 25 minutes
- Lecture 15.1 - N-gram•9 minutes
- Lecture 15.2 - Collocation and Co-occurrence•9 minutes
- Lecture 15.3 - Pointwise Mutual Information•8 minutes
1 reading•Total 60 minutes
- Lecture 15 - Discovering Phrases and Correlated Terms•60 minutes
1 assignment•Total 30 minutes
- Lecture 15 - Discovering Phrases and Correlated Terms•30 minutes
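As an illustration of how Pointwise Mutual Information can flag collocated terms, the sketch below scores adjacent word pairs using PMI(x, y) = log2(P(x, y) / (P(x) P(y))). Estimating the probabilities from raw unigram and bigram counts in a single token list is an assumption made for brevity; the function name and the sample sentence are hypothetical.

```python
import math
from collections import Counter

def bigram_pmi(tokens):
    """Return {(x, y): PMI} for every adjacent word pair in `tokens`."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = len(tokens) - 1
    scores = {}
    for (x, y), count in bigrams.items():
        p_xy = count / n_bigrams          # estimated joint probability
        p_x = unigrams[x] / n_unigrams    # estimated marginal of x
        p_y = unigrams[y] / n_unigrams    # estimated marginal of y
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores

# Hypothetical sample text.
tokens = "new york is a big city and new york is busy".split()
scores = bigram_pmi(tokens)
```

A positive score means the pair co-occurs more often than independence would predict. PMI tends to overrate pairs of rare words, so in practice it is usually combined with a minimum frequency threshold.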
In this module, you will learn: (1) The challenges of enterprise search. (2) The differences between web search and enterprise search.
What's included
3 videos, 1 reading, 1 assignment
3 videos•Total 18 minutes
- Lecture 16.1 - Enterprise Search and Challenges•7 minutes
- Lecture 16.2 - Enterprise Search Engine•5 minutes
- Lecture 16.3 - Advanced Requirements of Enterprise SE•6 minutes
1 reading•Total 60 minutes
- Lecture 16 - Enterprise Search Engine•60 minutes
1 assignment•Total 30 minutes
- Lecture 16 - Enterprise Search Engine•30 minutes
Offered by

HKUST is a world-class research-intensive university that focuses on science, technology, and business as well as humanities and social science. HKUST offers an international campus, and a holistic and interdisciplinary pedagogy to nurture well-rounded graduates with a global vision, a strong entrepreneurial spirit, and innovative thinking.