User Visible Fault Detection for E-Commerce Sites
Abstract
Over the past decade, online shopping has become part of everyday life as E-commerce has grown by leaps and bounds. However, unexpected faults and unplanned downtime of web servers, caused by both hardware and software problems, continue to cost E-commerce companies revenue. User-visible faults are the major reason E-commerce sites lose revenue and, moreover, customer loyalty. Detecting user-visible faults can be difficult because some of these faults do not produce any error message in the server log files. Therefore, a fault detector that can detect user-visible faults autonomously is needed to support E-commerce sites.
End-users appear to change their behavior toward an E-commerce site when they encounter a user-visible fault. This change is reflected in the request transition probability matrix, which is defined analogously to the state transition matrix of random processes such as Markov chains. This work proposes a system that detects abnormal changes in end-user behavior as reflected in the request transition probability matrix of web servers. The proposed fault detector not only detects changes in the request transition probability matrix but also analyzes the matrix to determine whether a real fault or just a special event is taking place, which reduces the false alarm rate. The detector can also localize faults by determining which request and which web page exhibit the fault.
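As a rough illustration of the idea, the request transition probability matrix can be estimated by counting consecutive request pairs within user sessions and normalizing each row. The sketch below is not the dissertation's detector: the function names, the sample request names, and the use of total variation distance as the per-request deviation measure are all assumptions made for illustration.

```python
from collections import defaultdict


def transition_matrix(sessions):
    """Estimate row-normalized request transition probabilities.

    `sessions` is a list of request sequences, one per end-user session
    (a hypothetical input format).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        # Count each consecutive (source request, next request) pair.
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1
    matrix = {}
    for src, row in counts.items():
        total = sum(row.values())
        matrix[src] = {dst: n / total for dst, n in row.items()}
    return matrix


def row_deviation(baseline, current):
    """Total variation distance per source request (an assumed measure).

    Only requests that appear as a source in both matrices are compared.
    """
    deviations = {}
    for src in set(baseline) & set(current):
        b, c = baseline[src], current[src]
        targets = set(b) | set(c)
        deviations[src] = 0.5 * sum(
            abs(b.get(t, 0.0) - c.get(t, 0.0)) for t in targets
        )
    return deviations


# Hypothetical sessions: normally users go from the cart to checkout;
# in the "current" window they leave the cart some other way instead.
baseline = transition_matrix([
    ["home", "search", "cart", "checkout"],
    ["home", "search", "cart", "checkout"],
])
current = transition_matrix([
    ["home", "search", "cart", "cart"],
    ["home", "search", "cart", "home"],
])
deviation = row_deviation(baseline, current)
suspect = max(deviation, key=deviation.get)  # request with the largest shift
```

Ranking requests by how much their outgoing transition probabilities shifted is one simple way to point at the request where behavior changed, which mirrors the localization goal described above. Distinguishing a real fault from a special event would need further analysis that this sketch does not attempt.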
The detection time for injected faults and the Receiver Operating Characteristic (ROC) curves at a given maximum detection time are used to analyze the performance of the proposed fault detector. The detection results are compared with those obtained by one of the most promising user-visible fault detectors published in the literature. For detection times longer than about 30 minutes, the results demonstrate that the proposed fault detector dramatically reduces the false alarm rate. Furthermore, it distinguishes between real faults and special events. It also provides stable detection times for injected faults irrespective of end-user workload and improves the detection times for two of the four injected faults. For detection times shorter than about 30 minutes, the proposed detector’s fault detection rate suffers, although its false alarm rate remains acceptable. In this regime, the comparable detector from the literature provides marginally better fault detection at significantly higher false positive rates. However, both detectors have unacceptably low detection rates in this regime.
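For readers unfamiliar with ROC analysis, the following minimal sketch shows how the points of an ROC curve (false alarm rate against detection rate) can be computed from detector scores and ground-truth labels. The function name, the scores, and the labels are hypothetical and are not taken from the dissertation's experiments.

```python
def roc_points(scores, labels):
    """Return (false_alarm_rate, detection_rate) pairs, one per threshold.

    `scores` are detector outputs (higher means more suspicious) and
    `labels` are 1 for a window containing an injected fault, 0 otherwise.
    Assumes at least one positive and one negative label.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    # Sweep the alarm threshold from strictest to most permissive.
    for threshold in sorted(set(scores), reverse=True):
        true_alarms = sum(
            1 for s, y in zip(scores, labels) if s >= threshold and y
        )
        false_alarms = sum(
            1 for s, y in zip(scores, labels) if s >= threshold and not y
        )
        points.append((false_alarms / negatives, true_alarms / positives))
    return points


# Hypothetical scores for four observation windows.
points = roc_points([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

Each point corresponds to one alarm threshold. A detector that reaches a high detection rate while its false alarm rate is still low traces a curve closer to the top-left corner.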
A real-time, online version of the fault detection system is also developed in this research. This fault detection system is easy to deploy because it does not require any modifications to the web server source code. It also does not impose significant overhead on the monitored web server. Overall, the proposed fault detector has all of the following desirable characteristics: early fault detection, false alarm rate reduction, easy deployment, and effective fault localization.
Citation
Chu, Pang-Chun 1974- (2011). User Visible Fault Detection for E-Commerce Sites. Doctoral dissertation, Texas A&M University. Available electronically from https://hdl.handle.net/1969.1/ETD-TAMU-2011-12-10547.