Bias Lens: systemic bias detection with explainable analysis
Date
2025
Publisher
Department of Computer Science and Engineering
Abstract
Natural Language Processing (NLP) models have revolutionized the way information is processed, yet they often inherit and even amplify societal biases present in their training data [3]. These biases can lead to unfair, discriminatory outcomes in high-stakes applications. Traditional bias detection methods generally rely on binary classification, which oversimplifies the complex and nuanced nature of biased language. In response, we propose Bias Lens, a comprehensive framework that combines fine-grained multi-label token classification with a suite of Explainable AI (XAI) methods to not only detect but also explain bias in text. Our approach leverages Integrated Gap Gradients (IG2) to provide detailed neuron-level attributions, while also incorporating complementary methods (LIME, SHAP, DeepLIFT, Attention, Counterfactual, LRP, Occlusion, and Saliency) to create a multifaceted explanation ecosystem. This dual capability enhances transparency and fosters trust among users such as content moderators and AI auditors.
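To make the attribution idea concrete, the sketch below illustrates one of the simplest methods named in the abstract, occlusion: each token is masked in turn and the drop in the model's bias score is taken as that token's attribution. The `toy_bias_score` lexicon scorer is purely a hypothetical stand-in for a trained multi-label token classifier, not part of the Bias Lens system itself.

```python
def occlusion_attributions(tokens, score_fn, mask_token="[MASK]"):
    """Occlusion-based attribution: replace one token at a time with a
    mask and record how much the model's bias score drops."""
    base = score_fn(tokens)
    attributions = []
    for i in range(len(tokens)):
        occluded = tokens[:i] + [mask_token] + tokens[i + 1:]
        attributions.append(base - score_fn(occluded))
    return attributions


# Hypothetical toy scorer: fraction of tokens found in a tiny lexicon of
# gender-coded terms. A real system would use a trained classifier here.
BIASED_TERMS = {"bossy", "hysterical"}

def toy_bias_score(tokens):
    return sum(1.0 for t in tokens if t.lower() in BIASED_TERMS) / max(len(tokens), 1)


sentence = "She was bossy and hysterical".split()
attrs = occlusion_attributions(sentence, toy_bias_score)
# Tokens whose removal lowers the score ("bossy", "hysterical") receive
# positive attribution; neutral tokens receive zero.
```

Richer methods such as IG2, SHAP, or LRP replace this brute-force loop with gradient- or game-theoretic attributions, but all of them produce the same kind of per-token relevance map that the framework surfaces to auditors.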
