VoiceValor: Survivor-Led Auditing of AI Moderation

VoiceValor is an AI auditing framework that enables survivors of online abuse to evaluate how effectively social media platforms detect and moderate harmful content.

About Use Case

VoiceValor addresses a critical challenge in digital safety: the inability of automated moderation systems to consistently detect and respond to gender-based harassment and abuse online. Social media platforms increasingly rely on artificial intelligence to identify harmful content such as hate speech, harassment, and threats. However, these systems often fail to capture the nuanced forms of abuse that women experience, particularly when abuse takes the form of coded language, sarcasm, or coordinated campaigns.

The VoiceValor initiative introduces a survivor-led auditing model designed to evaluate and improve AI moderation systems. The platform invites individuals who have experienced online harassment to contribute anonymized examples of abusive interactions and participate in evaluating how moderation systems respond to such content. These contributions create datasets that help identify patterns in harassment and expose weaknesses in automated moderation models.

Machine learning techniques analyze submitted content and compare it with moderation decisions made by social media platforms. By identifying discrepancies between harmful content and platform responses, the system can highlight areas where moderation algorithms fail to protect users effectively. The findings are then used to generate recommendations for improving content moderation models and policies.
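The discrepancy analysis described above can be sketched in code. The snippet below is a minimal illustration, not VoiceValor's actual implementation: it assumes each audited post carries a survivor-provided harm annotation and a record of whether the platform took action, and it computes the per-category "miss rate" (the share of survivor-flagged harmful posts the platform left unmoderated). All field names, categories, and data are hypothetical.

```python
# Illustrative sketch of a survivor-led moderation audit.
# Field names and sample data are hypothetical, not from VoiceValor.
from collections import defaultdict

def audit_discrepancies(examples):
    """Compute the per-category miss rate: the fraction of posts that
    survivors labeled harmful but the platform did not act on."""
    harmful = defaultdict(int)   # survivor-flagged harmful posts per category
    missed = defaultdict(int)    # of those, posts the platform left up
    for ex in examples:
        if ex["survivor_label_harmful"]:
            harmful[ex["category"]] += 1
            if not ex["platform_action_taken"]:
                missed[ex["category"]] += 1
    return {cat: missed[cat] / harmful[cat] for cat in harmful}

# Hypothetical anonymized audit records.
sample = [
    {"category": "coded language", "survivor_label_harmful": True,
     "platform_action_taken": False},
    {"category": "coded language", "survivor_label_harmful": True,
     "platform_action_taken": False},
    {"category": "explicit threat", "survivor_label_harmful": True,
     "platform_action_taken": True},
    {"category": "explicit threat", "survivor_label_harmful": True,
     "platform_action_taken": False},
]
print(audit_discrepancies(sample))
# {'coded language': 1.0, 'explicit threat': 0.5}
```

A high miss rate in a category such as "coded language" is exactly the kind of gap the framework is designed to surface for platform recommendations.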

A key feature of VoiceValor is its emphasis on participatory governance. Survivors of online abuse are directly involved in defining evaluation criteria and reviewing system outputs. This approach ensures that the auditing process reflects lived experiences rather than relying solely on technical definitions of harmful content.

The platform also emphasizes transparency and accountability. Reports generated by the system provide insights into how moderation systems perform across different languages, cultural contexts, and forms of abuse. These insights can help technology companies improve their AI systems and ensure that moderation policies better protect vulnerable communities.

VoiceValor demonstrates how participatory AI governance can strengthen online safety systems. By combining machine learning analysis with survivor perspectives, the initiative provides a structured mechanism for evaluating and improving the effectiveness of automated content moderation technologies.

For additional context and detailed documentation of this use case, please refer to pages 39-44 in the attached Casebook.

Source Organization

IndiaAI

Tags

  • Gender Equality
  • Gender Empowerment

Sector

Social

Resources

Related Datasets

Bhasik Indian English Disfluency Corpus
Indian English disfluency corpus from the technical lecture domain (240k words; file size 555.72 KB).
Tags: Indian English, data-annotation, expert-annotated, english, language_creators:expert-generated, disfluency, corpora

DIGITAL INDIA BHASHINI DIVISION