I have no idea what "4.9 level" means. When I used to use Uber (before it stopped working reliably in the UK about 5 years ago) they would ask me "how many stars" on a range of 1-5. There was no indication of what this meant or how to judge it. Presumably the idea is that the star ratings are relative and follow a normal distribution, so most are in the middle (say about 70% between 2/could be better and 4/really good, with about 15% being 5/excellent and 15% being 1/terrible).
It's a bad system, but pretty much every other online rating system is uncalibrated too (they just have different social expectations than the "always give 5 stars" that most people use for taxis and deliveries).
There are plenty of things that could be done, like using the algorithm that was developed for Netflix's prize contest, but nobody seems to be doing them.
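To make the normal-distribution guess above concrete, here is a rough sketch of what star shares would look like if ratings really were drawn from a bell curve centred on 3. The function name `star_shares`, the spread `sigma=1.45`, and the bucket edges at the half-stars are all my own assumptions, picked so the tails land near the 15% figure; nothing here reflects how Uber actually works.

```python
from statistics import NormalDist

def star_shares(mu=3.0, sigma=1.45):
    """Share of ratings per star if scores were normal(mu, sigma).

    Hypothetical model: a continuous score is rounded to the nearest
    star, and anything below 1.5 or above 4.5 is clamped to 1 or 5.
    """
    dist = NormalDist(mu, sigma)
    edges = [1.5, 2.5, 3.5, 4.5]          # boundaries between stars
    cdfs = [0.0] + [dist.cdf(e) for e in edges] + [1.0]
    # Star k covers the interval between cumulative points k-1 and k.
    return {star: cdfs[star] - cdfs[star - 1] for star in range(1, 6)}

if __name__ == "__main__":
    for star, share in star_shares().items():
        print(f"{star} stars: {share:.1%}")
```

With those assumed parameters, roughly 15% of ratings land on 1, 15% on 5, and the remaining 70% on 2 to 4, which is nothing like the "nearly everyone gets 5" pattern people describe for taxis and deliveries.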
No system built for a society that (sometimes) values kindness over honesty will ever give you honest answers. My wife still wants to leave big tips when the service is lousy "they're counting on it to be able to make rent". With stars or similar ratings systems, even cheapskates can be kind. Have you ever seen a thread where someone starts talking about "brutal honesty", and how people react to that term?
>If more than 50% of your replies are "this service was above average", then you have a meaningless system.
Only true mathematically if ratings are for internal comparison only. It could be that the service at restaurant A is consistently above average, because the service at restaurant B is consistently below average. Only their aggregate ratings should show it split half and half.
Does anyone know if Uber rescales the star ratings? For example if you have someone like GP who consistently gives out 3s or 4s, do those get rescaled back up to 5s based on the user’s average?
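I don't know whether Uber does this, but the kind of rescaling being asked about could look something like the sketch below: express each rating relative to that rater's own average and spread, then average those adjusted scores per driver. The function name `normalize_by_rater` and the `(rater, driver, stars)` tuple layout are made up for illustration, and z-scoring is just one of several possible choices.

```python
from collections import defaultdict
from statistics import mean, pstdev

def normalize_by_rater(ratings):
    """ratings: iterable of (rater, driver, stars) tuples (hypothetical layout).

    Returns {driver: mean z-score}, where each rating is measured
    against the habits of the person who gave it.
    """
    ratings = list(ratings)

    # Collect every star value each rater has handed out.
    by_rater = defaultdict(list)
    for rater, _driver, stars in ratings:
        by_rater[rater].append(stars)

    # Per-rater mean and population standard deviation.
    stats = {r: (mean(v), pstdev(v)) for r, v in by_rater.items()}

    by_driver = defaultdict(list)
    for rater, driver, stars in ratings:
        mu, sigma = stats[rater]
        # A rater who always gives the same score carries no signal: use 0.
        z = (stars - mu) / sigma if sigma > 0 else 0.0
        by_driver[driver].append(z)

    return {d: mean(zs) for d, zs in by_driver.items()}
```

Under this scheme a 4 from someone who usually gives 3s counts as a positive signal, while a 4 from someone who usually gives 5s counts as a negative one. The obvious weakness is that it needs several ratings per rater before the averages mean anything.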
Most of the general population have no idea what a normal distribution is. You have to put yourself into a common mindset to figure out the meaning of things designed for the general population. It doesn't need to be explained because half the population is too dumb to understand a written explanation anyway.
5 means they did good, anything less means you have complaints that you think are serious enough to warrant effective "demotion" of the driver.
So it's a stupid system with no instructions on how to calibrate.
How do you say someone did a better than expected job if the "baseline" is 5?
If they expect the normal to be 5, why give 1-4 as options? Why not "good" and "bad"?
Not to mention the cultural differences between countries. While America loves being over-the-top with fake "oh you are so awesome", other countries (say Japan and the UK) are far more reserved.
> If they expect the normal to be 5, why give 1-4 as options? Why not "good" and "bad"?
Some places do use a thumbs up/down rating but I think a lot of companies like the stress the 1-5 or 1-10 ratings put on workers. If you view your workers as easily replaceable, stress is a feature and the false precision lets you say that any particular outcome was data driven.
I think a 3-state scale would be good. Something like “happy”, “not happy”, and “feared for your life”. You’d want a way to identify outliers. But using a quantitative scale for this is not a good idea.
> Most of the general population have no idea what a normal distribution is.
They have an intuitive understanding of what an average is, though. And when provided a scale the vast majority of people assume that the average is in the middle. If some smart arse grading system is not intuitively understood by someone’s customers (and seriously, people have something like 10 seconds to make that choice), it’s not the users who are stupid.
No, most people when asked to rate something think back to their school days, where "has done all the work we asked for" is either the max grade (A/10/20), or just under (say, A-/9/18). So, most people give 5 stars (or more rarely 4 stars) to drivers that have met their expectations.
That is very country-specific. Here the scale is 4 to 10, with 4 being failure and 10 being essentially perfection. Average work is 7, with 6 below average, 8 pretty good and 9 good. Scaling this to 1 to 5 would mean 1 is failure, 2 is not entirely failed, 3 average, 4 good and 5 perfection.
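For what it's worth, the scaling described above works out if you assume a plain linear map from the 4-10 range onto 1-5. The helper `rescale` below is only an illustration of that arithmetic; the school grades are the ones from the comment, the function itself is not from any real system.

```python
def rescale(value, src_min, src_max, dst_min, dst_max):
    """Linearly map value from [src_min, src_max] onto [dst_min, dst_max]."""
    return dst_min + (value - src_min) * (dst_max - dst_min) / (src_max - src_min)

if __name__ == "__main__":
    # Map each school grade on the 4-10 scale to a 1-5 star value.
    for grade in range(4, 11):
        print(grade, "->", round(rescale(grade, 4, 10, 1, 5), 2))
```

The endpoints and the midpoint line up exactly (4 becomes 1, 7 becomes 3, 10 becomes 5), but the grades in between fall on fractions, so a 6 or an 8 has to be rounded one way or the other to fit a whole-star scale.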
Most people don't think a 3 of 5 rating is "good", rather it's half-way to utter failure. 5 star ratings as used by the general public are calibrated as 5 meeting all reasonable expectations and no special rating given for exceptional performance. Most people don't need instruction on how to rate on this scale, they just give everybody a 5 unless something happened to piss them off.