Identifying Hijacked Reviews

Daryani, Monika Manohar

dc.contributor.advisor	Caverlee, James
dc.creator	Daryani, Monika Manohar
dc.date.accessioned	2022-02-23T18:12:11Z
dc.date.available	2023-05-01T06:37:23Z
dc.date.created	2021-05
dc.date.issued	2021-04-23
dc.date.submitted	May 2021
dc.identifier.uri	https://hdl.handle.net/1969.1/195772
dc.description.abstract	Customers on online marketplaces must be cautious before buying a product because they cannot evaluate it physically. Reviews are crucial signals for gauging the quality and authenticity of an item. This dependence on reviews has led to a rise in the number of unethical sellers who exploit the review systems of e-commerce websites through fraudulent techniques such as fake reviews. Fake review detection has been actively researched for the last decade, both as an independent review problem and as a behavior pattern recognition problem. While most existing work focuses on detecting fake reviews, we look at a different facet of e-commerce fraud called “Review Hijacking” (or “Review reuse” or “Bait-and-Switch review”). Review hijacking is a new review manipulation tactic in which black-hat sellers “hijack” the existing review listings of a product and use them to sell their own products, which have no reviews. The hijacked listings may belong to discontinued and unrelated items but contain many positive reviews. More favorable ratings lead to better search ranking, make a new product appear well-reviewed and legitimate, and, ultimately, boost sales. There has been little academic research on this review scam. Hence, we introduce what review hijacking is, the methods used to carry it out, the challenges in identifying it, and the impact it has caused. We then explore techniques to uncover such cases. We analyze the extent of this problem by applying various Information Retrieval methods, such as Boolean Retrieval (BIR), TF-IDF, and topic modeling, to Amazon public datasets. Then, we synthetically label our data using Weak Supervision and by swapping product-review pairs so that we can run supervised learning models. We employ Deep Learning methods such as Siamese LSTM and BERT Sentence Pair Classification to detect this e-commerce fraud efficiently at a larger scale.	en
dc.format.mimetype	application/pdf
dc.language.iso	en
dc.subject	Review fraud	en
dc.subject	Amazon reviews	en
dc.subject	NLP	en
dc.subject	BERT	en
dc.subject	Review Hijacking	en
dc.subject	Synthetic data labeling	en
dc.subject	Information Retrieval	en
dc.subject	Siamese network	en
dc.subject	LSTM	en
dc.subject	Natural Language Processing	en
dc.subject	Machine Learning	en
dc.title	Identifying Hijacked Reviews	en
dc.type	Thesis	en
thesis.degree.department	Computer Science and Engineering	en
thesis.degree.discipline	Computer Science	en
thesis.degree.grantor	Texas A&M University	en
thesis.degree.name	Master of Science	en
thesis.degree.level	Masters	en
dc.contributor.committeeMember	Chaspari, Theodora
dc.contributor.committeeMember	Burkart, Patrick
dc.type.material	text	en
dc.date.updated	2022-02-23T18:12:11Z
local.embargo.terms	2023-05-01
local.etdauthor.orcid	0000-0002-0282-6995

Files in this item

Name: DARYANI-THESIS-2021.pdf
Size: 4.318Mb
Format: PDF

This item appears in the following Collection(s)

Electronic Theses, Dissertations, and Records of Study (2002– )
Texas A&M University Theses, Dissertations, and Records of Study (2002– )
