
LLM vectors already have decent linear properties. For document embedding purposes, though, they are often further trained for retrieval via cosine similarity, which enhances this. For example, see Table 1 in [1]: average retrieval performance using BERT goes up from 54 to 76 after fine-tuning for embeddings.
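
For anyone unfamiliar with what "retrieval via cosine similarity" means in practice, here is a minimal sketch. The vectors are toy values made up for illustration, not output from BERT or any other model, and `rank_documents` is a hypothetical helper name, not something from the paper in [1].

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)

# Toy 3-dimensional "embeddings" (assumed values, not real model output).
docs = [
    [1.0, 0.0, 0.0],
    [0.6, 0.8, 0.0],
    [0.0, 0.0, 1.0],
]
query = [0.5, 0.5, 0.0]

# Indices of docs, best match first.
ranking = rank_documents(query, docs)
print(ranking)
```

Fine-tuning for embeddings roughly amounts to adjusting the model so that this kind of ranking puts the relevant documents first.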

[1] https://arxiv.org/pdf/1908.10084.pdf


