Songqiao Han · Xiyang Hu
Anomaly detection (AD), also known as outlier detection, is a key machine learning (ML) task with numerous applications, including anti-money laundering, rare disease detection, social media analysis [110, 114], and intrusion detection. AD algorithms aim to identify data instances that deviate significantly from the majority of data objects [35, 82, 87, 96], and numerous methods have been developed over the last few decades [2, 53, 64, 65, 76, 93, 104, 118]. Among them, the majority are designed for tabular data (i.e., data with no time dependency or graph structure). We therefore focus on tabular AD algorithms and datasets in this work.
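To make the task definition concrete, the following is a minimal, illustrative sketch of flagging instances that deviate from the majority, using a robust (median/MAD-based) z-score on a single feature; it is not one of the benchmarked methods, which operate on multivariate tabular data.

```python
def mad_outliers(values, threshold=3.5):
    """Flag indices whose robust z-score (median/MAD-based) exceeds threshold.

    The median and MAD are used instead of mean/std so that the anomalies
    themselves do not inflate the scale estimate.
    """
    s = sorted(values)
    n = len(s)
    median = (s[n // 2] + s[(n - 1) // 2]) / 2
    devs = sorted(abs(v - median) for v in values)
    mad = (devs[n // 2] + devs[(n - 1) // 2]) / 2
    if mad == 0:  # all points (essentially) identical: nothing to flag
        return []
    # 0.6745 rescales MAD to be comparable to a standard deviation
    # under a normal distribution.
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - median) / mad > threshold]

data = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 25.0]  # one obvious anomaly
print(mad_outliers(data))  # → [7]
```

Real tabular AD methods generalize this idea to high-dimensional data, e.g., via isolation trees, nearest-neighbor distances, or learned representations.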
Although several benchmark and evaluation works for tabular AD already exist [14, 23, 25, 30, 99], they generally share the following limitations: (i) a primary emphasis on unsupervised methods, without including emerging (semi-)supervised AD methods; (ii) limited analysis of algorithm performance w.r.t. anomaly types; (iii) a lack of analysis of model robustness (e.g., under noisy labels and irrelevant features); (iv) the absence of statistical tests for algorithm comparison; and (v) no coverage of more complex NLP and CV datasets, which have attracted extensive attention in recent years.
To address these limitations, we design (to the best of our knowledge) the most comprehensive tabular anomaly detection benchmark, called ADBench. By analyzing both research needs and industrial deployment requirements, we design the experiments around three major angles in anomaly detection (see §3.3): (i) the availability of supervision (e.g., ground-truth labels), by including 14 unsupervised, 7 semi-supervised, and 9 supervised methods; (ii) algorithm performance under different types of anomalies, by simulating environments with 4 types of anomalies; and (iii) algorithm robustness and stability under 3 settings of data corruption. Fig. 1 provides an overview of ADBench.
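As a hedged illustration of angle (ii), the sketch below shows one common way a benchmark can inject synthetic anomalies into normal tabular data: "global" anomalies drawn uniformly from a scaled per-feature range. The scale factor of 1.5 and the exact sampling scheme are illustrative assumptions, not necessarily ADBench's precise protocol.

```python
import random

def inject_global_anomalies(X, n_anomalies, scale=1.5, seed=0):
    """Append n_anomalies synthetic 'global' anomalies to normal data X.

    Each anomalous point is sampled uniformly, per feature, from a scaled
    version of the observed feature range (illustrative choice).
    Returns the augmented data and binary labels (0 = normal, 1 = anomaly).
    """
    rng = random.Random(seed)  # seeded for reproducibility
    n_features = len(X[0])
    lo = [min(row[j] for row in X) for j in range(n_features)]
    hi = [max(row[j] for row in X) for j in range(n_features)]
    anomalies = [
        [rng.uniform(scale * lo[j], scale * hi[j]) for j in range(n_features)]
        for _ in range(n_anomalies)
    ]
    y = [0] * len(X) + [1] * n_anomalies
    return X + anomalies, y

X = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.4]]  # toy "normal" samples
X_aug, y = inject_global_anomalies(X, 2)
```

Analogous generators for local, cluster, or dependency anomalies would perturb points relative to local neighborhoods or feature correlations rather than the global range.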