This study evaluates the performance of generative artificial intelligence in detecting and mitigating bias in text datasets, addressing a critical challenge in the development of fair and ethical AI systems. It proposes a comprehensive evaluation framework that integrates bias detection and mitigation, two tasks typically studied separately in the existing literature. The methodology employs multiple text datasets, including social media posts, news articles, and hate speech corpora, to capture diverse forms of bias. Transformer-based generative models, particularly GPT-style and fine-tuned variants, are evaluated alongside baseline models. Bias detection is conducted using prompt-based, classifier-based, and lexicon-based approaches, while mitigation strategies include prompt engineering, debiasing algorithms, reinforcement learning from human feedback (RLHF), and data augmentation. Model performance is assessed with a combination of classification metrics (accuracy, precision, recall, F1-score), fairness metrics (demographic parity and equal opportunity), and text-quality measures (perplexity, coherence, and semantic similarity). The results indicate that all mitigation techniques reduce bias, with RLHF and hybrid approaches proving most effective, lowering bias scores by more than 50% while substantially improving fairness metrics. The study contributes to AI fairness research by proposing an integrated evaluation framework and demonstrating that substantial bias reduction is achievable without compromising overall model performance. The findings offer practical guidance for developing more transparent, reliable, and ethically aligned generative AI systems, supporting their responsible deployment in sensitive domains such as healthcare, finance, and hiring.
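For readers unfamiliar with the two fairness criteria named above, a standard formal statement is sketched below in conventional notation; the symbols \hat{Y}, Y, and A are generic placeholders, not the paper's own notation.

% Standard group-fairness definitions, assuming a binary classifier \hat{Y},
% true label Y, and binary protected attribute A (e.g., a demographic group).
\begin{align*}
\text{Demographic parity:} \quad & P(\hat{Y} = 1 \mid A = 0) = P(\hat{Y} = 1 \mid A = 1) \\
\text{Equal opportunity:}  \quad & P(\hat{Y} = 1 \mid Y = 1, A = 0) = P(\hat{Y} = 1 \mid Y = 1, A = 1)
\end{align*}

In practice, both criteria are commonly reported as the absolute difference between the two group-conditional probabilities, with values closer to zero indicating fairer model behavior.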