TIL 20250209

Things I Learnt today

  • Embedding models have a maximum context length and if your input is larger than that, you'll need to chunk it, ideally with overlap.
  • Some embedding models expect a short task prompt prepended to the input that says what the embedding is for. The prompt can differ between the documents you embed for comparison and the queries you embed for searching.
  • Faiss is very fast at k-means clustering.
  • Julia Evans writes interesting blog posts.
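
A rough sketch of the chunking and prompt notes above, so I remember the shape of it. Everything here is an assumption for illustration: whitespace-separated words stand in for real tokenizer tokens, the function names are made up, and the two prefix strings are placeholders (check the model card for whatever prompt your model actually expects, if any).

```python
# Hypothetical task prefixes; the real strings depend on the embedding model.
DOC_PREFIX = "search_document: "
QUERY_PREFIX = "search_query: "


def chunk_with_overlap(tokens, max_len, overlap):
    """Split a token list into windows of at most max_len tokens.

    Consecutive windows share `overlap` tokens, so text near a chunk
    boundary appears in both neighbouring chunks.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be in [0, max_len)")
    step = max_len - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_len])
        # Stop once a window has reached the end of the input.
        if start + max_len >= len(tokens):
            break
    return chunks


def prepare_documents(text, max_len=256, overlap=32):
    """Chunk a long text and prepend the document prefix to each chunk."""
    words = text.split()  # stand-in for a real tokenizer
    return [DOC_PREFIX + " ".join(chunk)
            for chunk in chunk_with_overlap(words, max_len, overlap)]


def prepare_query(query):
    """Prepend the query prefix to a search string."""
    return QUERY_PREFIX + query
```

The strings returned by prepare_documents and prepare_query are what you would pass to the embedding model. In real use the chunk size should be counted with the model's own tokenizer, since a word count only approximates the token count.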