
You might be interested in "Text Embeddings Reveal (Almost) As Much As Text":

> We train our model to decode text embeddings from two state-of-the-art embedding models, and also show that our model can recover important personal information (full names) from a dataset of clinical notes.

https://arxiv.org/pdf/2310.06816.pdf

There's certainly information loss, but a lot of information is still present.



Yeah, that paper is what I was thinking about. https://simonwillison.net/2024/Jan/8/text-embeddings-reveal-...

“a multi-step method that iteratively corrects and re-embeds text is able to recover 92% of 32-token text inputs exactly”.
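To make that loop concrete, here's a very rough sketch of the iterative correct-and-re-embed idea in plain NumPy. The real system pairs a frozen neural text encoder with a trained corrector model conditioned on the target embedding; the toy encoder, tiny vocabulary, and greedy search below are all made-up placeholders for illustration, not the paper's actual components:

    import numpy as np

    # Toy stand-ins: the paper uses a frozen neural encoder and a trained
    # seq2seq "corrector"; here the encoder is a position-weighted bag of
    # random vectors and the corrector is a greedy search over a tiny vocab.

    VOCAB = ["the", "patient", "was", "seen", "by", "dr", "smith", "today"]
    DIM = 32
    rng = np.random.default_rng(0)
    TOKEN_VECS = {w: rng.normal(size=DIM) for w in VOCAB}

    def embed(tokens):
        # Position-weighted average so word order matters (toy encoder).
        return sum((i + 1) * TOKEN_VECS[t] for i, t in enumerate(tokens)) / len(tokens)

    def correct(hypothesis, target_emb):
        # One pass of greedy per-position edits: re-embed each candidate and
        # keep whichever token lands closest to the target embedding.
        hyp = list(hypothesis)
        for i in range(len(hyp)):
            hyp[i] = min(VOCAB, key=lambda w: np.linalg.norm(
                embed(hyp[:i] + [w] + hyp[i + 1:]) - target_emb))
        return hyp

    def invert(target_emb, length, max_steps=25):
        # Iteratively correct and re-embed until the hypothesis reproduces
        # the target embedding (the learned corrector plays this role in the paper).
        hypothesis = [VOCAB[0]] * length  # crude initial guess
        for _ in range(max_steps):
            hypothesis = correct(hypothesis, target_emb)
            if np.allclose(embed(hypothesis), target_emb, atol=1e-8):
                break
        return hypothesis

    secret = ["patient", "seen", "by", "dr", "smith"]
    print("secret:   ", secret)
    print("recovered:", invert(embed(secret), length=len(secret)))

Greedy search only gets away with this because the toy embedding is tiny and linear; the point is just the control flow the paper describes: guess some text, embed it, compare to the target vector, revise, and repeat.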



