Explainable AI for Cybersecurity Detection on Small and Noisy Datasets: A Comparative Study

Explainable AI (XAI) is increasingly required for intrusion detection systems (IDS) because security analysts must justify alerts, prioritize responses, and audit model behavior. In operational environments, however, supervised IDS commonly face two constraints: limited labeled training data and imperfect supervision arising from delayed ground truth and weak labeling pipelines. This study presents a comparative evaluation of explainable multi-class intrusion detection under controlled small-data and noisy-label regimes using UNSW-NB15 and CICIDS2017. We simulate data scarcity through stratified downsampling of the training set, and label noise through both symmetric corruption and a security-realistic benignification mechanism that preferentially flips attack labels to the benign class. Representative detector families are trained both with empirical risk minimization and with noise-mitigation strategies, and explanations are generated using SHAP, LIME, and Integrated Gradients. The evaluation jointly considers detection effectiveness, probability reliability, and explanation quality, using Macro-F1, AUROC, AUPRC, Expected Calibration Error, and explanation metrics that capture faithfulness, stability, sparsity, and drift. Results show that detection performance and explanation reliability degrade nonlinearly when small data and noisy labels co-occur, with benignification noise causing the most severe losses. Noise-tolerant learning reduces these losses and improves calibration and explanation stability, indicating that training choices affect not only accuracy but also the reliability of analyst-facing explanations under scarce and noisy supervision.
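
To make the two noise regimes concrete, the sketch below shows one way symmetric corruption and benignification could be simulated on a label vector. It is a minimal illustration under stated assumptions, not the study's actual implementation: the function names (symmetric_noise, benignification_noise), the parameters (noise_rate, benign_class), and the toy label encoding (0 = benign) are all hypothetical.

    import numpy as np

    def symmetric_noise(labels, noise_rate, num_classes, rng):
        # Flip a fraction of labels, chosen uniformly at random, to some other class.
        noisy = labels.copy()
        flip_idx = rng.choice(len(noisy), size=int(noise_rate * len(noisy)), replace=False)
        for i in flip_idx:
            candidates = [c for c in range(num_classes) if c != noisy[i]]
            noisy[i] = rng.choice(candidates)
        return noisy

    def benignification_noise(labels, noise_rate, benign_class, rng):
        # Flip a fraction of attack labels to the benign class, mimicking weak
        # labeling pipelines that miss attacks rather than mislabel benign traffic.
        noisy = labels.copy()
        attack_idx = np.flatnonzero(noisy != benign_class)
        flip_idx = rng.choice(attack_idx, size=int(noise_rate * len(attack_idx)), replace=False)
        noisy[flip_idx] = benign_class
        return noisy

    # Toy usage: corrupt 20% of attack labels in a small label vector (0 = benign).
    rng = np.random.default_rng(0)
    y = np.array([0, 1, 2, 1, 0, 3, 1, 2])
    y_noisy = benignification_noise(y, noise_rate=0.2, benign_class=0, rng=rng)

The key contrast is that symmetric corruption spreads errors uniformly across classes, whereas benignification concentrates errors on attack classes, which is why it is the more damaging and more security-realistic regime in the evaluation.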