
I recently answered a Quora question about this, with many more details: http://qr.ae/1viyQ

The core NLP behind Prismatic is topic modeling; we don't use anything off the shelf, but something crafted pretty specifically to our needs.

We do model user similarity based upon interest overlap and social graph analysis. So we know how likely you are to care about what someone in your extended network shares.
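To make "interest overlap" concrete, here is a minimal sketch, not Prismatic's actual model (which isn't described in this thread). It assumes each user is represented by a set of interest topics and a set of social connections, and it blends two Jaccard overlaps with a hypothetical `weight` parameter. The function names and the blending scheme are illustrative assumptions only.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        # Two empty sets: treat as no evidence of similarity.
        return 0.0
    return len(a & b) / len(a | b)


def user_similarity(interests_a, interests_b, friends_a, friends_b, weight=0.5):
    """Blend interest overlap with social-graph overlap.

    `weight` (hypothetical) is the share given to interest overlap;
    the remainder goes to overlap in social connections.
    """
    interest_score = jaccard(interests_a, interests_b)
    social_score = jaccard(friends_a, friends_b)
    return weight * interest_score + (1 - weight) * social_score


if __name__ == "__main__":
    score = user_similarity(
        {"nlp", "ml"}, {"ml", "systems"},
        {"alice", "bob"}, {"alice", "bob"},
    )
    print(round(score, 3))
```

A real system would likely weight topics by strength of interest and look beyond direct connections, but the set-overlap version shows the basic idea.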



Do you plan to publish the algorithm/model at some point, or is it too closely connected to the business's "secret sauce"?

The latter would be perfectly understandable, so this isn't meant as a hostile question; it's more curiosity, out of personal interest, about how you handle that aspect of private-sector research. Worries about that are part of what keeps me personally from jumping from academia. I'd like to do less paper-writing and more applied work, which fits nicely, but I'd still want to do some paper-writing and be able to discuss techniques publicly, which seems to fit more awkwardly. I assume it's possible to pull off both, but all the people I know who've left academia for startups have stopped publishing completely, and many of them doing ML/big-data stuff are quite secretive about their techniques.


We will publish on interesting aspects of our models. Frankly, our main reason for not having done so yet is a lack of time and resources.

As for industry research and publishing on the whole: it is true that you can't publish everything, but look at Google. Arguably some of the most influential systems papers of the last decade or so have come out of Google. Google doesn't publish all the details of their search algorithms, but it turns out that because they address real-world problems they've done enough great stuff that some of it can safely be shared without endangering their moat.

Another aspect to consider is that while industry publishes less, we do tend to churn out useful open-source software (Prismatic will definitely be doing that soon) that is of at least comparable utility to people out there as most papers.


Has academia published any relevant systems papers since 2007?


Hard to say: systems has a long lag time before good ideas get "proven" good by being successes in the marketplace. For example, paravirtualization was investigated from around 2000, and then Cambridge released Xen in 2003. It caught on by the end of the decade, in the late 2000s. If something released in 2010 will end up having similar impact, we'll know it by the late 2010s...



