[PYTHON] Camouflaged Spam Reviewer Discovery Algorithm

Overview

Continuing from Algorithm for finding collusion spam reviewers, this post introduces FRAUDAR, an algorithm for finding spam reviewers on online shopping and restaurant review sites who camouflage themselves as ordinary reviewers.

FRAUDAR won the Best Paper Award at the 2016 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2016), and the authors' implementation is open to the public (https://www.andrew.cmu.edu/user/bhooi/projects/fraudar/index.html).

This time, we wrapped that implementation in the same interface as the one used in Algorithm for finding collusion spam reviewers (https://qiita.com/jkawamoto/items/d2284316cc37cd810bfd), so that the analysis can be run more easily.

How to use

The FRAUDAR wrapper created this time, rgmining-fraudar, is registered on PyPI, so it can be installed with the pip command.

$ pip install --upgrade rgmining-fraudar

This installs a package called fraudar, and fraudar.ReviewGraph in it is the graph class that implements the algorithm. The constructor of ReviewGraph takes two optional parameters: the number of camouflage patterns to consider, and the sub-algorithm used internally. The defaults are fine for both; if you change anything, adjust only the former to suit your dataset.

import fraudar

# For example, consider 10 camouflage patterns.
n = 10
graph = fraudar.ReviewGraph(n)

Then add reviewers, products, and reviews to the graph. This works the same way as the example in Algorithm for finding collusion spam reviewers.

reviewers = [graph.new_reviewer("reviewer-{0}".format(i)) for i in range(2)]
products = [graph.new_product("product-{0}".format(i)) for i in range(3)]
graph.add_review(reviewers[0], products[0], 0.2)
graph.add_review(reviewers[0], products[1], 0.9)
graph.add_review(reviewers[0], products[2], 0.6)
graph.add_review(reviewers[1], products[0], 0.1)
graph.add_review(reviewers[1], products[1], 0.7)
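When the reviews come from a file or a database rather than hand-written calls, a small loader can build the graph from rows. The following is a minimal sketch, not part of rgmining-fraudar: the helper name load_reviews and the row layout (reviewer name, product name, rating) are assumptions made for illustration. It relies only on the new_reviewer, new_product, and add_review methods used above.

```python
def load_reviews(graph, rows):
    """Add (reviewer_name, product_name, rating) rows to a review graph.

    `graph` is assumed to expose new_reviewer, new_product and add_review,
    as fraudar.ReviewGraph does in the example above.
    """
    reviewers = {}  # reviewer name -> object returned by new_reviewer
    products = {}   # product name -> object returned by new_product
    for reviewer_name, product_name, rating in rows:
        # Create each reviewer and product only once, on first appearance.
        if reviewer_name not in reviewers:
            reviewers[reviewer_name] = graph.new_reviewer(reviewer_name)
        if product_name not in products:
            products[product_name] = graph.new_product(product_name)
        graph.add_review(reviewers[reviewer_name], products[product_name], rating)
    return reviewers, products


# Hypothetical sample rows, mirroring the five reviews added above.
rows = [
    ("reviewer-0", "product-0", 0.2),
    ("reviewer-0", "product-1", 0.9),
    ("reviewer-0", "product-2", 0.6),
    ("reviewer-1", "product-0", 0.1),
    ("reviewer-1", "product-1", 0.7),
]

# Usage (requires rgmining-fraudar to be installed):
#   import fraudar
#   graph = fraudar.ReviewGraph()
#   load_reviews(graph, rows)
```

Keeping the created objects in dictionaries keyed by name avoids registering the same reviewer or product twice when it appears in several rows.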

Reviewers and products are created with the new_reviewer and new_product methods of ReviewGraph, and reviews are added with the add_review method.

To run the algorithm, call the update method exactly once.

graph.update()

Finally, retrieve the analysis result. The reviewer object returned by the new_reviewer method has an attribute called anomalous_score. It is set to 1 if the reviewer is judged to be anomalous (a spammer), and to 0 otherwise.

for r in graph.reviewers:
    print(r.name, r.anomalous_score)
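Since the score is either 0 or 1, the flagged reviewers can be picked out with a simple filter. This is a small sketch: the function name flagged_reviewers is hypothetical, and the stand-in objects below only imitate the name and anomalous_score attributes described above so that the snippet runs without the library.

```python
from types import SimpleNamespace


def flagged_reviewers(reviewers):
    """Return the names of reviewers whose anomalous_score equals 1."""
    return [r.name for r in reviewers if r.anomalous_score == 1]


# Stand-in objects for illustration only; real ones come from graph.reviewers
# after graph.update() has been called.
demo = [
    SimpleNamespace(name="reviewer-0", anomalous_score=0),
    SimpleNamespace(name="reviewer-1", anomalous_score=1),
]
print(flagged_reviewers(demo))  # prints ['reviewer-1']
```

With a real graph, the call would be flagged_reviewers(graph.reviewers).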

The product object returned by the new_product method has an attribute called summary. Its value is the average rating of the reviews written by reviewers who were not judged to be anomalous.

for p in graph.products:
    print(p.name, p.summary)
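To make the meaning of summary concrete, the computation described above can be sketched in plain Python. This is an illustration of the described behavior, not the library's actual code: the function name, the dictionary layout, and returning None when every reviewer is flagged are all assumptions of this sketch.

```python
def summary_without_flagged(ratings, flagged):
    """Average the ratings of reviewers that are not in `flagged`.

    ratings: dict mapping reviewer name -> rating for a single product.
    flagged: set of reviewer names judged anomalous.
    Returns None when no rating remains (an assumption of this sketch).
    """
    kept = [rating for name, rating in ratings.items() if name not in flagged]
    if not kept:
        return None
    return sum(kept) / len(kept)


# Hypothetical ratings for product-0, taken from the reviews added earlier.
ratings = {"reviewer-0": 0.2, "reviewer-1": 0.1}

# If reviewer-1 were flagged, only reviewer-0's rating would count.
print(summary_without_flagged(ratings, flagged={"reviewer-1"}))
# If nobody were flagged, this is the plain average of both ratings.
print(summary_without_flagged(ratings, flagged=set()))
```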

Summary

Following on from the Fraud Eagle algorithm introduced in Algorithm for finding collusion spam reviewers, we created a wrapper for the FRAUDAR algorithm. Both provide the common API (graph interface) required by Dataset for evaluation of spam reviewer detection algorithm, so their behavior should be easy to compare.
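As a rough idea of what such a comparison might look like, the sketch below collects scores by reviewer name and lists the reviewers on which two results disagree. The helper names and the 0.5 threshold are hypothetical; the threshold is only there in case the other algorithm reports scores that are not exactly 0 or 1. It assumes both graphs were populated with the same reviewer names and have already been run.

```python
def scores_by_name(reviewers):
    """Map each reviewer's name to its anomalous_score."""
    return {r.name: r.anomalous_score for r in reviewers}


def disagreements(scores_a, scores_b, threshold=0.5):
    """Names present in both results that exactly one of them flags.

    A reviewer counts as flagged when its score is >= threshold
    (0.5 is an arbitrary choice for this sketch).
    """
    common = scores_a.keys() & scores_b.keys()
    return sorted(
        name
        for name in common
        if (scores_a[name] >= threshold) != (scores_b[name] >= threshold)
    )


# Hypothetical scores, for illustration only.
result_a = {"reviewer-0": 0, "reviewer-1": 1, "reviewer-2": 1}
result_b = {"reviewer-0": 0.1, "reviewer-1": 0.9, "reviewer-2": 0.3}
print(disagreements(result_a, result_b))  # prints ['reviewer-2']
```

With real graphs, the inputs would come from scores_by_name(graph.reviewers) for each algorithm.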
