
This paper presents a novel approach to incorporating memory into a transformer, but it does not demonstrate that the approach works in a useful manner. While the idea is interesting, I’m skeptical that the RNN has enough capacity to encode the memory in its output. I would have liked to see more detail on the synthetic benchmark they used; the memory component may be learning the benchmark rather than a generalized feature.



