
A couple of thoughts. 1) An alternative might be to add an explicit extra NULL output to which attention can be diverted. This might be what existing models are already using commas for; the NULL would just make it explicit. 2) What he proposes has a similar effect on the other weights without the NULL being explicitly present. In that light it should work, but does it have the advantage he thinks it does?
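
To make the comparison in 1) and 2) concrete, here is a rough sketch. It assumes (the comment doesn't spell this out) that the proposal being discussed is the softmax variant with an extra 1 in the denominator, and that the explicit NULL slot has its logit fixed at 0. The function names are made up for illustration.

```python
import math


def softmax(logits):
    """Ordinary softmax, shifted by the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_with_null(logits, null_logit=0.0):
    """Option 1: append an explicit NULL slot, then split off its weight.

    null_logit=0.0 is an assumption made for this sketch.
    """
    weights = softmax(list(logits) + [null_logit])
    return weights[:-1], weights[-1]


def softmax_plus_one(logits):
    """Option 2 (assumed form): exp(x_i) / (1 + sum_j exp(x_j)).

    Shifting by m = max(logits, 0) turns the 1 into exp(-m).
    """
    m = max(max(logits), 0.0)
    exps = [math.exp(x - m) for x in logits]
    denom = math.exp(-m) + sum(exps)
    return [e / denom for e in exps]


# A head that has nothing useful to attend to: all scores are low.
scores = [-4.0, -5.0, -3.5]
explicit, null_weight = softmax_with_null(scores)
implicit = softmax_plus_one(scores)

print("explicit NULL:", explicit, "NULL weight:", null_weight)
print("plus-one     :", implicit, "missing mass:", 1.0 - sum(implicit))
```

Under those assumptions the two give the same weights on the real positions; the weight that would have gone to NULL is simply the mass missing from the plus-one version. The one practical difference is that an explicit NULL could carry a learned logit, whereas the plus-one form pins it at 0.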

