Embedding
Also known as a "vector embedding", an embedding is a numerical representation of information in a high-dimensional vector space. For example, a token embedding (or word embedding) encodes the meaning of words for use in large language models (LLMs).
Because artificial neural networks (AI models) are built from sequences of mathematical operations, they require numerical structures as input. Vector embeddings are numerical structures that provide a way to express a wide range of complex concepts. They can be used to capture information about all sorts of things, including words, groups of words, sounds, images, and more.
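As a minimal sketch of what this looks like in practice, the snippet below turns text into vectors using the sentence-transformers package and the "all-MiniLM-L6-v2" model (both are assumptions here; any embedding model or API works the same way in principle):

```python
from sentence_transformers import SentenceTransformer

# Load a small, widely used embedding model (assumed choice for illustration).
model = SentenceTransformer("all-MiniLM-L6-v2")

# Each input string becomes a fixed-length numerical vector the network can consume.
vectors = model.encode(["bank loan", "river bank"])
print(vectors.shape)  # (2, 384): two inputs, 384 dimensions each
```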
For example, tokenizing a word like "bank" into a single number can't capture the different meanings in "bank loan" and "river bank." By converting the token into a high-dimensional vector, we can encode (or "embed") a range of word meanings, letting the model reason about word relationships through closeness along various vector dimensions (measured, for example, with Euclidean distance). When the model encounters the embedding for "bank," it can recognize its relationship to nearby words such as "loan" or "river" based on how close they are along different dimensions (perhaps a "finance" dimension versus a "geography" dimension learned during training).
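The toy example below makes this concrete. The two dimensions, their "finance" and "geography" labels, and the vector values are all made up for illustration; real embeddings have hundreds of learned, unlabeled dimensions.

```python
import numpy as np

# Hypothetical 2-dimensional embeddings: axis 0 is "finance-like", axis 1 is "geography-like".
embeddings = {
    "bank":  np.array([0.7, 0.6]),
    "loan":  np.array([0.9, 0.1]),
    "river": np.array([0.1, 0.9]),
}

def euclidean_distance(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.linalg.norm(a - b))

# "bank" sits fairly close to both "loan" and "river", but along different axes.
print(euclidean_distance(embeddings["bank"], embeddings["loan"]))   # ~0.54
print(euclidean_distance(embeddings["bank"], embeddings["river"]))  # ~0.67
```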
Word embeddings are a type of static embedding that encodes the meaning of individual words as input to an LLM. But an LLM also builds its own embeddings hidden inside the model: as it works out the relationship between the words in an input sequence, it folds more information into each token embedding based on the attention scores computed in the self-attention layer.
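The sketch below shows the core of that idea in NumPy: a single self-attention pass that replaces each token embedding with an attention-weighted mixture of all tokens' value vectors. The sizes and random weights are stand-ins for learned parameters, not any particular model.

```python
import numpy as np

def self_attention(x: np.ndarray, w_q: np.ndarray, w_k: np.ndarray, w_v: np.ndarray) -> np.ndarray:
    """One self-attention pass: each row of x (a token embedding) becomes
    an attention-weighted mix of every token's value vector."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])                            # how much each token attends to each other token
    weights = np.exp(scores) / np.exp(scores).sum(-1, keepdims=True)   # softmax over each row
    return weights @ v                                                  # contextualized embeddings

# Toy example: 3 tokens, 4-dimensional embeddings, random weights standing in for learned ones.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
w_q, w_k, w_v = (rng.normal(size=(4, 4)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # (3, 4): same shape as the input, now context-aware
```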