Bias lens: systemic bias detection with explainable analysis


Date

2025


Publisher

Department of Computer Science and Engineering

Abstract

Natural Language Processing (NLP) models have revolutionized the way information is processed, yet they often inherit and even amplify societal biases present in training data [3]. These biases can lead to unfair, discriminatory outcomes in high-stakes applications. Traditional bias detection methods generally rely on binary classification, which oversimplifies the complex and nuanced nature of biased language. In response, we propose Bias Lens—a comprehensive framework that combines fine-grained multi-label token classification with a suite of Explainable AI (XAI) methods to not only detect but also elucidate bias in text. Our approach leverages advanced techniques, most notably Integrated Gap Gradients (IG2), to provide detailed neuron-level attributions, while also incorporating complementary methods (LIME, SHAP, DeepLIFT, Attention, Counterfactual, LRP, Occlusion, and Saliency) to create a multifaceted explanation ecosystem. This dual capability enhances transparency and facilitates trust among users such as content moderators and AI auditors.
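As an illustration only (not the authors' implementation), the attribution idea behind methods such as Integrated Gradients — which IG2 builds on — can be sketched on a toy linear "bias scorer" standing in for one output logit of a multi-label token-classification head. All names here (`f`, `w`, the zero baseline) are assumed for the sketch; the completeness property checked at the end (attributions sum to the score difference between input and baseline) holds exactly for a linear model.

```python
import numpy as np

# Toy setup: a token is a 4-d embedding; the linear scorer f(x) = w . x
# stands in for one bias-label logit of a multi-label token classifier.
rng = np.random.default_rng(0)
w = rng.normal(size=4)          # hypothetical model weights
x = rng.normal(size=4)          # token embedding under analysis
baseline = np.zeros(4)          # all-zero reference embedding

def f(v):
    return float(w @ v)

def grad_f(v):
    return w                    # gradient of a linear model is constant

def integrated_gradients(x, baseline, steps=50):
    # Riemann-sum approximation of the path integral of the gradient
    # along the straight line from the baseline to the input.
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([grad_f(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

attr = integrated_gradients(x, baseline)
# Completeness axiom: per-feature attributions sum to f(x) - f(baseline).
print(np.allclose(attr.sum(), f(x) - f(baseline)))
```

In practice such per-feature scores are aggregated per token to highlight which words drive a given bias label, which is what makes the explanations actionable for content moderators and auditors.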
