Show HN: See what words are trending on HN (hackernewstrends.herokuapp.com)
29 points by hpvic03 on Nov 28, 2013 | 21 comments



This is a little something I did for fun. The trending hashtags on Twitter are an interesting metric for what people are talking about, so I thought it would be cool to make something like that for Hacker News. But we don't have hashtags, so the app just uses words.

Note that you can click on a word to get some of the posts it's mentioned in.

Edit: Also, a heads up -- it's running on a Heroku free dyno, and it's already feeling a little slow.


Can you try stemming the words (bitcoin, bitcoins) as well as combining synonymous words (go, golang)?
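
A rough sketch of what that normalization could look like, assuming a hand-maintained alias table and a very crude plural rule. `ALIASES` and `normalize` are hypothetical names, not anything from the app, and a real stemmer (for example a Porter stemmer) would handle far more cases than this:

```python
from collections import Counter

# Hypothetical alias table; the single entry is illustrative only.
ALIASES = {"golang": "go"}


def normalize(word):
    """Map a raw word to the key it should be counted under."""
    word = word.lower()
    # Crude plural stripping, not a real stemmer: drop a trailing "s"
    # from words longer than three letters that do not end in "ss".
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        word = word[:-1]
    # Fold known synonyms into one canonical word.
    return ALIASES.get(word, word)


counts = Counter(normalize(w) for w in ["bitcoin", "Bitcoins", "go", "golang"])
print(counts)
```

With this, "bitcoin" and "Bitcoins" are counted together, as are "go" and "golang". The plural rule is deliberately naive and will mangle words like "news", so treat it as a placeholder.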


Good stuff! This would be awesome as a word cloud.


That's a good idea. I could do that.


I posted this on Twitter this morning, but watch http://www.google.com/trends/explore?q=bitcoin#q=bitcoin&cmp... and the chart in this post carefully if you bought Bitcoin speculatively (i.e., not because you actually care about Bitcoin or want to be long on it). When the Google chart drops and this goes off the top 5 on HN but the price hasn't plummeted, you're at a mid-term peak and you should sell.

The price of Bitcoin is being driven by media and hype right now - deserved or undeserved, in the near term there WILL be a dip in that and the price will drop for no "good" reason. Keep an eye on that if you're looking for a peak at which to sell, or (maybe more importantly) a bottom at which to buy.


Couple of suggestions:

Exclude commonly trending words such as Google. It isn't trending if it's already huge.

I've got '[' and ']' for the last week, exclude these too.

EDIT: As a matter of fact, something seems to be wrong if Google is classified as trending. There hasn't been a big surge in posts about Google in the recent past. Either something is wrong, I am wrong, or the algorithm is still training (e.g., learning the normal rate of appearance of certain words).


I've added [ and ] to the filter list, though I'll have to remove some stuff from the db to get rid of the past results.

When I was developing it I saw Google trending all the time. I think Google just comes up in conversation on HN very frequently.

I suppose I could remove Google if it doesn't have meaning. But maybe it does. This is interesting: if Google stops trending, perhaps that's a canary signal that it's not relevant anymore.


I think what the parent means is that the derivative of the appearances is a lot more interesting than the number of appearances (and possibly as a percentage of the total number of appearances). Google moving from 100 to 105 isn't very interesting, but a new language going from 0 to 10 mentions might be very significant. (edit: and Google owning x% and staying there isn't very interesting either)

In other words, frequency of occurrence is interesting, but statistically unlikely occurrences (more or less frequent than expected) are even more interesting.
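
One way to put a number on "more or less frequent than expected" is to scale the change in mentions by the size of the baseline. This is only a sketch of the idea, not what the app does: `trend_score` is a hypothetical helper, and dividing by the square root of the baseline (plus one, to avoid dividing by zero) is just one reasonable choice of scaling.

```python
import math


def trend_score(current, baseline):
    """Change in mentions, scaled so that busy words need bigger jumps.

    `baseline` is the count from the previous window (assumed to be the
    "expected" rate); `current` is the count from the window being scored.
    """
    return (current - baseline) / math.sqrt(baseline + 1)


# The two cases from the comment above.
print(trend_score(105, 100))  # Google going from 100 to 105: small score
print(trend_score(10, 0))     # a new language going from 0 to 10: large score
```

Google moving from 100 to 105 scores about 0.5, while a word going from 0 to 10 scores 10.0, so the new word ranks far higher even though its raw count is much lower. A word that holds steady scores zero regardless of how often it appears.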


Fun to browse, and I could certainly see this enabling me to notice some interesting things I had missed on HN.

Two minor things I noticed:

Currently, 7 is listed as trending now with seven mentions. Not that numbers trending could never be interesting, e.g. if 600 were trending due to the discussion about the lowering of the prime gap [0], but 7 seems to be trending just because of submission titles.

Both [ and ] are trending in the past week. This seems to be due to submission titles tagged with e.g. [video], [pdf], [<year>].

[0] https://news.ycombinator.com/item?id=6784383


I've added [ and ] to the filter list, so they won't be included in the future.

It's true that there is a bit of noise. I think it's pretty good overall though. There is a filter list of 866 words that greatly improved the results after I implemented it.


Wow! The top 5 has "bitcoin", "income", "bitcoins", and "bank".

Is HN all about the money?

EDIT: also, very neat work! However, you should consider excluding "]" and "[" from the list.


Thanks. Haha, it seems so, at least this week.

I just added [ and ] to the filter list.


Filter out words that are less than 3 characters maybe?
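
A minimal sketch of that kind of filtering, assuming titles are split into lowercase alphanumeric tokens first. `STOP_WORDS` is a tiny hypothetical stand-in for the app's real filter list, and `candidate_words` and `MIN_LENGTH` are made-up names:

```python
import re

# Hypothetical stand-in for the app's filter list; the real list is much longer.
STOP_WORDS = {"the", "and", "for", "with"}
MIN_LENGTH = 3


def candidate_words(title):
    """Return the words in a title that are worth counting."""
    # Keep runs of letters, digits, and apostrophes; this drops
    # brackets and other punctuation such as "[" and "]".
    words = re.findall(r"[a-z0-9']+", title.lower())
    return [w for w in words if len(w) >= MIN_LENGTH and w not in STOP_WORDS]


print(candidate_words("Show HN: 7 tips for Go [video]"))
```

This keeps "show", "tips", and "video" and drops "7", the brackets, and "for". The catch is that a length cutoff also drops legitimately interesting short words such as "go", so an allow-list for those might be needed.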


Right now "ockhams" is trending with 17 mentions and "razor" is trending with 15 mentions.

Does that mean that someone just said "ockhams" twice?


You just did. And it shows in the stats I happen to see right now.


If there's not a bug in the app, then yes.


Or someone misspelled it twice.


Hnwatcher.com provides a graph that shows the number of times a specific word was mentioned on HN over time.


Nice work. Before clicking, I knew bitcoin would hold the top spot. Kind of hope this craze dies down a bit so more non-bitcoin stories can appear.


Are you using Lucene or Elasticsearch for indexing words?

It would be great to read more about your project.


No, it's nothing fancy. It's just your standard Rails app. It saves a record for each word found along with time data, then runs queries grouping by word count and filtering between now and whatever time you choose. I'm sure that's a naive implementation and the app could certainly be optimized.

I could open source it if you really want to take a look.
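
For anyone curious, the approach described above (one record per word with a timestamp, then a grouped count over a time window) can be sketched roughly like this. It uses Python and SQLite rather than Rails, and the `word_mentions` table, its columns, and `top_words` are all assumptions for illustration, not the app's actual schema:

```python
import sqlite3
from datetime import datetime, timedelta


def top_words(conn, since, limit=5):
    """Count mentions per word seen at or after `since`, most frequent first."""
    rows = conn.execute(
        "SELECT word, COUNT(*) AS mentions FROM word_mentions "
        "WHERE seen_at >= ? GROUP BY word "
        "ORDER BY mentions DESC, word ASC LIMIT ?",
        (since.isoformat(), limit),
    )
    return rows.fetchall()


# In-memory database with one row per word occurrence.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE word_mentions (word TEXT, seen_at TEXT)")

now = datetime(2013, 11, 28, 12, 0, 0)
samples = [("bitcoin", 1), ("bitcoin", 2), ("bank", 3), ("bitcoin", 200)]  # (word, hours ago)
conn.executemany(
    "INSERT INTO word_mentions VALUES (?, ?)",
    [(word, (now - timedelta(hours=hours)).isoformat()) for word, hours in samples],
)

print(top_words(conn, now - timedelta(days=7)))
```

Timestamps are stored as ISO 8601 strings so that plain string comparison matches chronological order. With one row per occurrence the table grows quickly, so an index on `seen_at` or periodically rolled-up counts would be an obvious first optimization.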



