Bias lens: systemic bias detection with explainable analysis


Date

2025


Publisher

Department of Computer Science and Engineering

Abstract

Natural Language Processing (NLP) models have revolutionized the way information is processed, yet they often inherit and even amplify societal biases present in training data [3]. These biases can lead to unfair, discriminatory outcomes in high-stakes applications. Traditional bias detection methods generally rely on binary classification, which oversimplifies the complex and nuanced nature of biased language. In response, we propose Bias Lens—a comprehensive framework that combines fine-grained multi-label token classification with a suite of Explainable AI (XAI) methods to not only detect but also elucidate bias in text. Our approach leverages advanced techniques, most notably Integrated Gap Gradients (IG2), to provide detailed neuron-level attributions, while also incorporating complementary methods (LIME, SHAP, DeepLIFT, Attention, Counterfactual, LRP, Occlusion, and Saliency) to create a multifaceted explanation ecosystem. This dual capability enhances transparency and facilitates trust among users such as content moderators and AI auditors.
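As an illustration only (not the authors' implementation), the attribution idea behind methods such as Integrated Gradients — which IG2 builds on — can be sketched on a toy linear "bias scorer" standing in for one output logit of a multi-label token-classification head. All names here (`f`, `w`, the zero baseline) are assumed for the sketch; the completeness property checked at the end (attributions sum to the score difference between input and baseline) holds exactly for a linear model.

```python
import numpy as np

# Toy setup: a token is a 4-d embedding; the linear scorer f(x) = w . x
# stands in for one bias-label logit of a multi-label token classifier.
rng = np.random.default_rng(0)
w = rng.normal(size=4)          # hypothetical model weights
x = rng.normal(size=4)          # token embedding under analysis
baseline = np.zeros(4)          # all-zero reference embedding

def f(v):
    return float(w @ v)

def grad_f(v):
    return w                    # gradient of a linear model is constant

def integrated_gradients(x, baseline, steps=50):
    # Riemann-sum approximation of the path integral of the gradient
    # along the straight line from the baseline to the input.
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([grad_f(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

attr = integrated_gradients(x, baseline)
# Completeness axiom: per-feature attributions sum to f(x) - f(baseline).
print(np.allclose(attr.sum(), f(x) - f(baseline)))
```

In practice such per-feature scores are aggregated per token to highlight which words drive a given bias label, which is what makes the explanations actionable for content moderators and auditors.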
