What is Sentence Transformers?
Sentence embeddings? Semantic search? Cosine similarity?!?! 😱 Just a few short weeks ago, these terms were so confusing to me that they made my head spin. I'd heard that Sentence Transformers was a powerful and versatile library for working with language and image data, and I was eager to play around with it, but I was worried that I would be out of my depth. As it turns out, I couldn't have been more wrong!
Sentence Transformers is among the libraries that Hugging Face integrates with, where it's described as follows:
Compute dense vector representations for sentences, paragraphs, and images
In a nutshell, Sentence Transformers answers one question: what if we could treat sentences as points in a multi-dimensional vector space? This means that ST lets you give it an arbitrary string of text (e.g., "I'm so glad I learned to code with Python!"), and it'll transform it into a vector, such as [0.2, 0.5, 1.3, 0.9]. Another sentence, such as "Python is a great programming language.", would be transformed into a different vector. These vectors are called "embeddings," and they play an essential role in Machine Learning. If these two sentences were embedded with the same model, then both would coexist in the same vector space, allowing for many interesting possibilities.
What makes ST particularly useful is that, once you've generated some embeddings, you can use its built-in utility functions to compare how similar one sentence is to another, including synonyms! 🤯 One way to do this is with the "Cosine Similarity" function. With ST, you can skip all the pesky math and call the very handy util.cos_sim function to get a score from -1 to 1 that signifies how "similar" the embedded sentences are in the vector space they share: the bigger the score, the more similar the sentences!
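For the curious, here is the "pesky math" that util.cos_sim handles for you, written from scratch: the cosine of the angle between two vectors is their dot product divided by the product of their magnitudes. The toy 4-dimensional vectors below are made up for illustration, not real model output:

```python
import math

def cosine_similarity(a, b):
    # Dot product of the two vectors
    dot = sum(x * y for x, y in zip(a, b))
    # Magnitudes (Euclidean norms) of each vector
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" for illustration only
emb1 = [0.2, 0.5, 1.3, 0.9]
emb2 = [0.3, 0.4, 1.1, 1.0]   # points in a similar direction to emb1
emb3 = [-0.9, 0.1, -1.2, -0.4]  # points roughly opposite to emb1

print(cosine_similarity(emb1, emb2))  # close to 1: very similar
print(cosine_similarity(emb1, emb3))  # negative: dissimilar
```

A score near 1 means the vectors point in nearly the same direction, 0 means they are unrelated (orthogonal), and -1 means they point in opposite directions.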