Math Whiz needed

We need a math whiz to provide advice for the Ultimate Guide.  The Ultimate Guide (UG) is a database of San Francisco Bay Area restaurants serving vegan and vegetarian food, as well as businesses owned by vegetarians.

The goal of UG is to promote eating veg by making it as simple as possible to find restaurants serving veg food.  To highlight the best restaurants based on your reviews, it maintains a Top Ten list of veg restaurants.  The rating is currently a straight average (total points divided by the number of reviews).

For example:
Millennium has 34 reviews and a rating (average) of 4.34
Udupi Palace has 9 reviews and a rating (average) of 4.33

We need a math whiz to help us create a formula that also reflects the number of reviews a restaurant has in its final rating.

And if you’ve got a website or blog or email signature, please link back to The Ultimate Guide – help us promote veg eating!



  1. Mark Kurowski says:

    Hey there!

    I don’t think it makes ANY sense to change the rating based on the number of reviews. A rating is a rating, and the number of reviews is the number of reviews. Think about it: if you’re a restaurant owner, would you rather be rated at 5 stars by 3 people, or have a lower-than-5 rating because you had only 3 reviewers instead of, say, 30? I’d rather not be rated at all if you’re going to penalize me for not having a large set of reviewers.

    Many sites have a practice of not posting a rating (at least not in search results) unless the reviewed item has a minimum number of reviews, e.g., 2, 3, or 5.

    One good option would be to provide “minimum number of reviews” as a variable along with “rating” (and all of the other variables “city”, etc.).

    Good luck, and thanks!

  2. BAVeg says:

    That’s an interesting point that you raise, Mark. We aren’t trying to “penalize” businesses that have only a few reviews. The whole goal of the UG is to promote veg food, and to promote the best veg food.

    The only place where individual ratings are compared to each other is in deriving the Top Ten list of restaurants.

    The Top Ten list already requires a minimum of ten reviews to be eligible.

    The more reviews a restaurant has, we believe, the more relevant its rating becomes, and therefore the more weight it deserves. In the example that I gave above, it seems like Millennium is deserving of a higher weighting of its reviews because it represents more reviews/people than does another restaurant with fewer reviews/people.

    It seems like there should be a way mathematically to express that, to provide a more favorable rating to restaurants with more reviews.

    I like your suggestion about including minimum number of reviews as well as rating as search criteria.
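One common way to express this mathematically (a sketch of my own, not a formula proposed anywhere in this thread) is a “damped” average that blends each restaurant’s score with a prior mean, so that restaurants with more reviews sit closer to their own average. The prior of 4.0 and the damping weight m=10 below are illustrative assumptions, not values from the Guide:

```python
# Damped (Bayesian-style) average: blend the restaurant's own average
# (weight n = its review count) with an assumed prior mean (weight m).
# prior=4.0 and m=10 are illustrative choices, not values from the Guide.

def damped_rating(avg, n, prior=4.0, m=10):
    """Pull avg toward prior; the pull fades as n grows."""
    return (n * avg + m * prior) / (n + m)

# With these assumptions, Millennium (34 reviews, avg 4.34) ranks above
# Udupi Palace (9 reviews, avg 4.33), because its larger review count
# earns it less damping.
print(round(damped_rating(4.34, 34), 3))
print(round(damped_rating(4.33, 9), 3))
```

With zero reviews a restaurant would simply sit at the prior mean, and as reviews accumulate the damped score converges to the straight average, which is the behavior BAVeg describes wanting.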


  3. Steve Simitzis says:

    I’m with Mark. I consider myself a “math whiz” and there is absolutely no value to skewing the average rating for higher number of reviews. In fact, such a metric would be misleading and inaccurate.

    Further, which ratings would be weighted more heavily for restaurants with larger numbers of reviews? Positive ratings? Negative ratings? All ratings (which would be the same as weighting none)? It doesn’t make sense and would, by definition, no longer be an average.

    I can understand wanting to provide more relevant options, and in that case, you can simply default to “most ratings” in your sort order. But I know of no other review system that intentionally skews averages. Or if any do, they are eventually uncovered and lose credibility as a result.

    Good luck!

  4. Mark Kurowski says:

    Ahah, I see. Well, I still stand by my thought: given that your list has a minimum of 10 reviews, the rating is the rating.

    Millennium has been around a long, long time, and is THE veg*n showcase restaurant, so why does it have only 34 reviews? Udupi Palace is also great, but it’s clearly an “ethnic” destination, so one would expect fewer reviews, no? And if the number of reviews has weight (beyond a minimum), wouldn’t that encourage fake reviews or incented-or-paid-for reviews? And I still think it’s unfair to new(er) restaurants.

    I think you’d have to have separate lists, e.g., Top 10 Indian, Top 10 Thai, Top 10 General. And that’ll get messy.

    On the other hand, if it were made clear that the list is a weighted list, i.e., not based solely on the rating, then you can choose any weight you want (and let the readers know) for all variables; sort of like the ratings of colleges and universities.

    Then you’d have to have a rating system for the number of reviews, or some other formula. For example, the number of reviews might be weighted at 10% of the overall score, and a rating based on 10-20 reviews would get 20% of that 10%, 21-40 reviews gets 40%, etc. Or that 10% can be based on some other formula (won’t go into further detail here).

    I just saw Steve’s reply, and Yes, exactly….the list can be sorted or even just eyeballed for number of reviews. After all, it’s a list of only 10, not of 100.


  5. greg says:

    “Millennium has 34 reviews and a rating (average) of 4.34
    Udupi Palace has 9 reviews and a rating (average) of 4.33”

    I think this is a policy issue more than a math issue. And the UG has already implemented one good policy for the top 10, i.e., a minimum of ten reviews.

    From a math perspective one thing you could do is publish the standard deviation, which is an indicator of how variable the opinions are about the restaurant.

    From a policy perspective you could do something like one of the following, to give greater weight to restaurants with more reviews:
    1 – add .005 points for each review.
    2 – reduce the gap between their score and 5.0 based upon, for example, number of reviews divided by 100.

    With option 1 Millennium goes from 4.34 to 4.51 and Udupi goes from 4.33 to 4.37.

    With option 2 Millennium goes from 4.34 to 4.56 and Udupi goes from 4.33 to 4.39.

    In both cases the policy says that a restaurant with more customers willing to write a review should get a greater weighting. The policy has the effect of under-weighting negative reviews, which is OK with me because negative reviews are often irrelevant or immaterial rants from customers who did not try the place more than once. Policy 2 is more aggressive in the under-weighting, but has the downside that a bad restaurant with 100 reviews gets a 5.00; policy 1 might be preferable in this regard. Policy 2 could be constrained to eliminate a maximum of, for example, 50% of the gap.
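greg’s two options are simple enough to check in a few lines. A minimal sketch, assuming ratings are on a 0-5 scale and using the review counts and averages quoted in the post:

```python
# greg's option 1: add .005 points per review.
def option1(avg, n):
    return avg + 0.005 * n

# greg's option 2: close the gap to 5.0 by n/100, with an optional cap
# (e.g. cap=0.5 to eliminate at most 50% of the gap, per his last note).
def option2(avg, n, cap=1.0):
    fraction = min(n / 100, cap)
    return avg + (5.0 - avg) * fraction

# Millennium: 34 reviews, avg 4.34; Udupi Palace: 9 reviews, avg 4.33.
print(round(option1(4.34, 34), 2))  # 4.51
print(round(option1(4.33, 9), 3))   # 4.375 (greg quotes 4.37)
print(round(option2(4.34, 34), 2))  # 4.56
print(round(option2(4.33, 9), 2))   # 4.39
```

The cap parameter implements greg’s final constraint; without it, any restaurant reaching 100 reviews under option 2 scores a flat 5.00 regardless of its average, which is the downside he flags.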

  6. will says:

    all points being made here already are quite relevant.

    greg’s idea of using the standard deviation to chart the amount
    of variance from review to review makes a lot of sense.

    i also agree that it would be beneficial to keep the usual review
    ratings (straight average), while posting the amount of variance
    as a _separate_ number — one that helps the reader understand
    the significance of the basic average score.
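To make the suggestion concrete: Python’s standard `statistics` module can produce both numbers directly. The ratings below are hypothetical example data, not reviews from the Guide:

```python
# Publish the straight average plus the spread of opinions, as greg and
# will suggest. The ratings list here is made up for illustration.
from statistics import mean, pstdev

ratings = [5, 5, 4, 5, 3, 5, 4, 5, 5]

print(f"rating:  {mean(ratings):.2f}")    # the usual straight average
print(f"std dev: {pstdev(ratings):.2f}")  # low value = reviewers agree
```

A restaurant with a 4.5 average and a small standard deviation is one everybody likes; the same average with a large deviation means strongly split opinions, which the bare average hides.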

  7. BAVeg says:

    Steve, that’s an interesting suggestion of using the number of ratings to sort results after a search. Currently we use the rating only in the Top Ten list. I wonder whether that would tend to emphasize the more established restaurants over newer ones, though.

    Mark, as to why Millennium only has 34 reviews — only 34 people have reviewed it. I think there are a lot of people who use the Guide as a reference, but unfortunately far fewer do reviews. Any suggestions for getting more people to do reviews?

    And, just to be clear, if we did implement a rating change that is no longer a straight average, we would have some explanation and link to what it is and how it’s calculated so it’s transparent to everyone.

    Greg, I agree, this is more of a policy issue than simply a math calculation. Thank you for adding that critical distinction. I will need to take a closer look at what you suggest (re: policy 1, policy 2). I agree that a negative review amidst generally positive ones can be an isolated incident. At one point in time, we discussed the value of discarding the highest and lowest reviews from the rating calculation.

    I agree with you and Will that perhaps adding standard deviation would also be helpful.

    I think there are many types of calculations out there and it’s just a matter of finding the most meaningful one.


  8. Mark Kurowski says:

    After having (re)looked at your Top 10 list, I don’t see a need for a change, as it’s clear at a glance that the list is ratings-based, and also how many reviews lie behind any given rating.

    Re my comment about Millennium, I just brought that up as an example of the possibly arbitrary nature of using the number of reviews to weight the rating. A separate issue that arises in a weighted system in this situation is price/cost of the food. Which is better, a 4.5-star gourmet place, or a 4.9-star take-out lunch place? Obviously that depends on what the eater’s desires are for any particular meal/occasion. Anyway, it all can get very complicated very quickly.

    Perhaps another search variable for restaurants (separate from the top 10 list) could be Average Cost.


  9. Sara J says:

    Hi all – I am NOT a math whiz but perhaps I can speak from a non-whiz perspective. I think weighting ratings based on the number of reviews would be misleading. Personally, if I see that one restaurant gets a higher rating than another restaurant, I am going to assume that on average, reviewers have rated the first restaurant higher. If I later find out that the first restaurant is rated higher because it got 30 reviews whereas the second restaurant got 20 reviews, I would feel misled, and to be honest, it would make me question all the other ratings, especially since I wouldn’t have the skills to “calculate” the “skew” factor nor would I want to spend time doing so.

    When I look up restaurants in the UG, I normally scan through all or most of the reviews, and one can see at a glance how many reviews a restaurant has, and also how variable the reviews are. I think people can be trusted to make reasonable assessments based on the number and range of reviews; most people understand that if there are only a few reviews, then the results might not be representative. My thoughts anyway…

  10. Erhhung says:

    While there’s a benefit to pointing out that a restaurant has been reviewed more than others, that should not be reflected in its ultimate ranking. You should have a separate top ten list--in fact, why not multiple top ten lists by category, like most recently reviewed, most reviewed, most helpful reviews, etc.

    Like most shopping sites, a visitor should be able to indicate whether a review is helpful without actually having to write a review (I don’t think that will deter someone from writing a review if s/he truly wants to write one). Once you have this extra data point, then you may use an algorithm akin to Google’s PageRank to give more weight (either positive or negative) to a rating by a reviewer who has gotten more nods from other reviewers in the Guide--basically giving weight to the most trustworthy reviewers.

    If a new restaurant only has the minimum required reviews, but they are all from well-respected critics, then that should count more (not necessarily as a higher score, but more affirmation) than a restaurant reviewed by a bunch of unknowns or spammers. You certainly don’t want to reward a restaurant if many people give it a thumbs down--and, frankly, many people only speak up when something bothers them, not when it’s just pretty good.

    So I support the “top ten by xyz” lists approach, and let people decide from which angle they’d like to see the list.

  11. Jon Spear says:

    I also believe that the simple rating system should remain, so that the average rating represents a true “mean” value.

    People who are curious for more information can look at the individual reviews to get more information, and notice both the frequency of the reviews and also the dates of their submission.

    If any changes were to be made for “weighting” the average, I would think that recent reviews should take precedence over old reviews.

    For instance, I have eaten regularly for several years at Manzanita Restaurant in Oakland, which, IMHO, has gone through some ups and downs. I recently upgraded my review from 4 stars to 5, to reflect what I think are positive changes in the restaurant.

    If you want to play with weighting the average, you could assign a factor to each vote. For example, reviews made within the previous year could count as one full vote. Reviews made from 12-24 months ago could count as 0.8 votes, those made 24-36 months ago could count as 0.6, etc. The weighted average would be:
    [the sum of all ratings, each one multiplied by its factor], divided by: [the sum of all the factors]

    But that might not be worth the trouble…
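Jon’s recency-weighted average can be sketched directly from his description. The decay schedule (one full vote within 12 months, 0.8 for 12-24 months, 0.6 for 24-36, and so on) is his; the floor of 0.2 for very old reviews is my own assumption, since he leaves the tail unspecified:

```python
def recency_factor(age_months):
    """1.0 for reviews under 12 months old, dropping 0.2 per further
    year, floored at 0.2 (the floor is an assumed extension)."""
    return max(1.0 - 0.2 * (age_months // 12), 0.2)

def weighted_average(reviews):
    """reviews: list of (stars, age_in_months) pairs.
    Sum of (rating x its factor) divided by the sum of the factors."""
    total = sum(stars * recency_factor(age) for stars, age in reviews)
    return total / sum(recency_factor(age) for _, age in reviews)

# Example: a recent 5-star review outweighs a 3-star review from two
# years ago: (5*1.0 + 3*0.6) / (1.0 + 0.6) = 4.25
print(round(weighted_average([(5, 3), (3, 26)]), 2))  # 4.25
```

This matches Jon’s Manzanita example: his upgraded recent review would carry full weight, while his older, lower ratings would fade rather than drag the average down indefinitely.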


  12. BAVeg says:

    It’s been interesting reading everyone’s thoughts about this.

    Jon, I agree with you. The ‘currency’ of a review is also an important factor. We are considering making that a factor in any revision we do to the Top Ten list. Thanks for updating your review of Manzanita. I wish more UG users would take the extra step of writing reviews.

    Erhhung, I like the ‘trusted reviewer’ idea. I am not sure whether we would see the return on investment for this (i.e., enough reviewers to make it worthwhile given the volunteer effort for coding and testing). Far more people use UG for searches, etc., than for writing reviews.

    So, in that vein, I would encourage everyone who has replied to this thread to take a minute right now and review your favorite 5 restaurants in the Guide. If you reviewed them and it was over 6 months ago, update the review. If your favorite restaurant isn’t in the Guide, then please add it.


  13. BAVeg says:

    Oh, wanted to also add to Erhhung’s suggestion – we already do list the most recent 20 reviews. It lives on the Top Ten page; just scroll down.

