Review Skeptic: An Algorithmic Approach to Fight Fake Reviews

I found this post interesting and thought I'd share it here:
Sometimes you can spot a fake review a mile away. (Remember that old post about the fake florist reviews in California?) But some of the services that offer phony reviews for a small fee are getting smarter and less obvious about their spammy ways.

Some really smart people are developing software programs that aim to spot fake reviews algorithmically, and you can play with one of them yourself.

It’s called Review Skeptic, and it’s as simple as pasting the review text into the site and letting the computer guess if it’s authentic. The public website is out there “for entertainment purposes only,” the site says … and it is kinda fun to see how accurate it is.


The site says it’s best to use English-language hotel reviews, but I did some limited testing of my own using a variety of business types: I cut-and-pasted the full text of my 10 most recent Yelp reviews, and Review Skeptic identified all of them as truthful. Whew.

Review Skeptic is the work of a group of Cornell University researchers, and it grew out of testing on 400 fake and 400 authentic hotel reviews. There’s a link at the bottom of the site to their research paper, which reports that the software had 90 percent accuracy during testing.
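
To put that 90 percent figure in context, here is a rough sketch of what "accuracy" means on a balanced set like this one. The label names and the do-nothing baseline are my own illustration, not anything taken from the researchers' paper or code.

```python
def accuracy(predicted, actual):
    """Fraction of reviews whose predicted label matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("label lists must be the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical test set shaped like the one described: 400 fake, 400 truthful.
actual = ["fake"] * 400 + ["truthful"] * 400

# A baseline that lazily calls every review truthful (assumption, for comparison only).
always_truthful = ["truthful"] * 800

print(accuracy(always_truthful, actual))  # 0.5
```

Because the set is split evenly, guessing the same label every time only gets half right. Scoring 90 percent on 800 reviews would mean getting about 720 of them right, which is well above that coin-flip baseline.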

How to Detect Fake Reviews

I’m no scientist, but the way I read the material in that PDF is that Review Skeptic classifies words and text patterns and looks for signs of authenticity or deception. And one of the researchers, Myle Ott, just explained it like this to TIME.com:

…the software takes note of subtle signs that most people overlook. “Truthful reviews tend to have more punctuation, such as dollar signs, which indicate a specific that’d only be known to someone who has been there,” he said. “There are also more specific details, like the hotel location or that the room was small or large.”

Fake reviews, by contrast, tended to have more superlatives and adverbs in the writing (makes sense) and more details that were “external to the hotel,” such as whom the reviewer was traveling with. The fakes were also filled with pronouns, rather than proper names — because someone who had never been to a hotel wouldn’t know the name of the bellman or the woman at the front desk.
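
For the curious, the cues described above can be counted with a few lines of Python. This is only a toy sketch: the word lists, the "-ly" adverb shortcut, and the function name `cue_counts` are my own assumptions, and the actual Review Skeptic software learns its patterns from data rather than from a hand-written list like this.

```python
import re

# Tiny illustrative word lists (assumptions, not the researchers' vocabulary).
SUPERLATIVES = {"best", "greatest", "most", "finest", "worst", "amazing", "perfect"}
PRONOUNS = {"i", "me", "my", "we", "us", "our", "he", "she", "they", "them", "his", "her"}


def cue_counts(review: str) -> dict:
    """Count a few of the surface cues mentioned in the article for one review."""
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", review)]
    return {
        "dollar_signs": review.count("$"),
        # Any character that is not a letter, digit, underscore, or whitespace.
        "punctuation": len(re.findall(r"[^\w\s]", review)),
        "superlatives": sum(w in SUPERLATIVES for w in words),
        # Crude adverb guess: words ending in "ly" (also catches "family", "only").
        "ly_adverbs": sum(w.endswith("ly") and len(w) > 3 for w in words),
        "pronouns": sum(w in PRONOUNS for w in words),
    }


specific = cue_counts("We paid $189 for a small room near Michigan Avenue.")
gushing = cue_counts("My husband and I absolutely loved it, truly the best hotel ever!")
print(specific)
print(gushing)
```

The first made-up review has a dollar sign and a concrete detail about the room, while the second leans on adverbs, a superlative, and pronouns. Counts like these don't prove anything about a single review; they are just the kind of signal a trained classifier could weigh across hundreds of examples.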

Interesting stuff, but I kinda wish the “secret sauce” was kept secret.

That TIME article mentions that no “major websites” are using the software behind Review Skeptic, but I’d be shocked if Google and other major review sites aren’t also using algorithms to identify review spam. Yelp, in fact, is well known for having a review filter in place — although my understanding is that Yelp’s filter focuses as much, if not more, on the user than on the words used in reviews.

Anyway, if algorithms and software can do a better job than we humans of identifying review spam, here’s hoping Review Skeptic and similar products catch on more widely. On that note, one last thing: According to this New York Times article from last summer, the Cornell researchers have been contacted by Amazon, TripAdvisor, Hilton Hotels and other sites … and Google contacted Ott to ask for his resumé.

(Stock image via Shutterstock.com. Used under license.)

