Natural Language Processing for Semantic Search

By James Briggs

Learn how to make machines understand language as people do. This free course covers everything you need to build state-of-the-art language models, from machine translation to question-answering, and more.

Introduction

Semantic search has long been a critical component in the technology stacks of giants such as Google, Amazon, and Netflix. The recent democratization of these technologies has ignited a search renaissance, and these once-guarded technologies are being discovered and quickly adopted by organizations across every imaginable industry.

Why the explosion of interest in semantic search? It is an essential ingredient in many products and applications, the scope of which is still unknown but already broad. Search engines, autocorrect, translation, recommendation engines, error logging, and much more already rely heavily on semantic search. Any tool that can benefit from meaningful language search or clustering can be supercharged by it.

Two pillars support semantic search: vector search and NLP. In this course, we focus on the pillar of NLP and how it brings ‘semantic’ to semantic search. We introduce concepts and theory throughout the course before backing them up with real, industry-standard code and libraries.

You will learn what dense vectors are and why they’re fundamental to NLP and semantic search. We show how to build state-of-the-art language models for semantic similarity, multilingual embeddings, unsupervised training, and more. You will also learn how to apply these in the real world, where we often lack suitable datasets or masses of computing power.
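
To make the idea of dense vectors a little more concrete before we begin, here is a minimal sketch of how semantic similarity can be scored once text has been turned into vectors. The three-dimensional vectors, the example sentences, and the `search` helper are all made up for illustration; in practice the vectors would come from a language model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings (assumption: hand-written, not model output).
toy_embeddings = {
    "how do I reset my password": [0.9, 0.1, 0.0],
    "steps to recover account access": [0.8, 0.2, 0.1],
    "best pizza toppings": [0.0, 0.1, 0.9],
}


def search(query_vector, embeddings, top_k=2):
    """Return the top_k (score, text) pairs, most similar first."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in embeddings.items()
    ]
    scored.sort(reverse=True)
    return scored[:top_k]


# Hypothetical query vector, close to the two account-related sentences.
query_vector = [0.85, 0.15, 0.05]
results = search(query_vector, toy_embeddings)

for score, text in results:
    print(f"{score:.3f}  {text}")
```

The two account-related sentences score far higher than the unrelated one even though they share almost no words, which is the behaviour semantic search relies on. The NLP pillar is what produces vectors with this property; the vector search pillar is what makes the comparison fast at scale.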

In short, you will learn everything you need to know to begin applying NLP in your semantic search use cases.

Let’s begin!

Chapter 1
Dense Vectors
An overview of dense vector embeddings with NLP.
Chapter 2
Sentence Transformers and Embeddings
How sentence transformers and embeddings can be used for a range of semantic similarity applications.
Chapter 3
Training Sentence Transformers with Softmax Loss
The original way of training sentence transformers like SBERT for semantic search.
Chapter 4
Training Sentence Transformers with Multiple Negatives Ranking Loss
How to create sentence transformers by fine-tuning with MNR loss.
Chapter 5
Multilingual Sentence Transformers
How to create multilingual sentence transformers with knowledge distillation.
Chapter 6
Unsupervised Training for Sentence Transformers
How to create sentence transformer models without labelled data.
Chapter 7
An Introduction to Open Domain Question-Answering
An illustrated overview of open domain question-answering.
Chapter 8
Retrievers for Question-Answering
How to fine-tune retriever models to find relevant contexts in vector databases.
Chapter 9
Readers for Question-Answering
How to fine-tune reader models to identify answers from relevant contexts.
Chapter 10
Data Augmentation with BERT
Augmented SBERT (AugSBERT) is a training strategy to enhance domain-specific datasets.
Chapter 11
Domain Transfer with BERT
Transfer information from an out-of-domain (or source) dataset to a target domain.
Chapter 12
Unsupervised Training with Query Generation (GenQ)
Fine-tune retrievers for asymmetric semantic search using GenQ.
Chapter 13
Generative Pseudo-Labeling (GPL)
A powerful technique for domain adaptation using unstructured text data.

New chapters coming soon!

Chapter 14
Training Sentence Transformers
The most popular methods for training sentence transformers, and tips for each.
Chapter 15
And more...