Astrocade QA Calibration Tool
$ cat astrocade-qa-calibration-tool.json
title: Astrocade QA Calibration Tool
category: development
client: Astrocade AI
date: 2025–Present
stack: Python, LLM-Assisted Review, Moderation Tooling, Analytics, Human-in-the-Loop QA


Project Overview
Built and operated a human-in-the-loop QA calibration toolchain for Astrocade's UGC moderation pipeline: tuning precision/recall thresholds, reducing repeat rejections, and speeding daily publishing decisions.
Situation
• Overview
Astrocade needed a reliable moderation workflow that balanced automation speed with reviewer judgment quality while maintaining a one-day review turnaround target.
• Challenge
False or repeated rejections and inconsistent reviewer interpretation were creating friction for creators and slowing publishing throughput.
• Context
The moderation stack included auto-review logic, multiple human review layers, and creator feedback loops that required measurable calibration over time.
Task
• Overview
Design and run a QA calibration system that continuously measures moderation quality and improves decision consistency across automation and human review.
• Precision/Recall Tuning
Create operational controls for threshold and policy tuning to reduce false rejects while preserving safety standards.
• Workflow Throughput
Identify and remove bottlenecks across review queues, escalation paths, and feedback handling.
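The threshold-tuning task above can be sketched as a small sweep: given labeled audit data pairing an auto-review score with the human reviewer's final call, pick the highest-precision reject threshold that still meets a safety recall floor. This is a minimal illustration, not Astrocade's actual tooling; the data, function names, and the 0.5 recall floor are assumptions.

```python
def precision_recall_at(threshold, samples):
    """Treat score >= threshold as an automated reject; score against human calls."""
    tp = fp = fn = 0
    for score, human_rejected in samples:
        predicted_reject = score >= threshold
        if predicted_reject and human_rejected:
            tp += 1
        elif predicted_reject and not human_rejected:
            fp += 1
        elif not predicted_reject and human_rejected:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall

def pick_threshold(samples, recall_floor=0.9):
    """Highest-precision threshold that still meets the safety recall floor."""
    best = None
    for t in sorted({score for score, _ in samples}):
        p, r = precision_recall_at(t, samples)
        if r >= recall_floor and (best is None or p > best[1]):
            best = (t, p, r)
    return best

# Toy audit sample: (auto-review score, did a human reviewer reject it?)
audit = [(0.95, True), (0.90, True), (0.80, False), (0.70, True),
         (0.60, False), (0.40, False), (0.30, True), (0.10, False)]
```

Raising the recall floor trades creator friction for safety coverage; sweeping it per policy category keeps the trade-off explicit rather than baked into one global threshold.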
Action
1. Built calibration workflows
Implemented repeatable QA passes that compare model-assisted and human decisions, then flag disagreement patterns for targeted fixes.
2. Shipped tooling and backend fixes
Improved queue behavior, triage visibility, and reviewer ergonomics to reduce latency and increase throughput.
3. Ran recurring audits
Led calibration audits and documentation updates to align enforcement logic with reviewer judgment and policy intent.
4. Closed creator feedback loops
Integrated rejection feedback into workflow updates so recurring edge cases could be addressed systematically.
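The calibration pass in step 1 can be sketched as a disagreement report: compare automated and human decisions on the same items and surface policy categories where they diverge most often. The record fields, category names, and cutoffs below are illustrative assumptions, not the production schema.

```python
from collections import Counter

def disagreement_by_category(records, min_rate=0.2, min_items=5):
    """records: iterable of (category, auto_decision, human_decision).

    Returns categories whose auto/human disagreement rate exceeds
    min_rate, ignoring categories with too few audited items.
    """
    totals, disagreements = Counter(), Counter()
    for category, auto, human in records:
        totals[category] += 1
        if auto != human:
            disagreements[category] += 1
    flagged = {}
    for category, n in totals.items():
        rate = disagreements[category] / n
        if n >= min_items and rate >= min_rate:
            flagged[category] = round(rate, 2)
    return flagged

# Toy audit records: (policy category, automated call, human call)
reviews = [
    ("violence", "reject", "reject"),
    ("violence", "reject", "approve"),
    ("violence", "approve", "approve"),
    ("violence", "reject", "approve"),
    ("violence", "approve", "approve"),
    ("spam", "reject", "reject"),
    ("spam", "reject", "reject"),
    ("spam", "approve", "approve"),
    ("spam", "reject", "reject"),
    ("spam", "approve", "reject"),
]
```

Flagged categories become the targets for the next round of threshold tuning or policy-doc updates, which is what makes the QA passes repeatable rather than ad hoc.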
Result
• Higher moderation consistency
Improved alignment between automated decisions and human reviewers through recurring calibration and audit cycles.
• Reduced repeat/false rejections
Precision and recall tuning cut unnecessary friction for creators and improved trust in review outcomes.
• Faster daily operations
Pipeline and tooling improvements supported reliable day-level turnaround targets for publishing review.
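Consistency claims like the one above are easiest to track with a chance-corrected agreement statistic. As one hedged sketch (not necessarily the metric Astrocade used), Cohen's kappa on binary reject/approve decisions gives a comparable number per audit cycle:

```python
def cohens_kappa(auto, human):
    """Chance-corrected agreement between two binary decision lists (1 = reject)."""
    assert len(auto) == len(human) and auto
    n = len(auto)
    observed = sum(a == h for a, h in zip(auto, human)) / n
    # Expected agreement if each rater labelled independently at its marginal rate.
    p_auto = sum(auto) / n
    p_human = sum(human) / n
    expected = p_auto * p_human + (1 - p_auto) * (1 - p_human)
    if expected == 1:
        return 1.0  # both raters are constant and identical
    return (observed - expected) / (1 - expected)
```

A kappa that rises across recurring audits is stronger evidence of improved alignment than raw percent agreement, since the latter can be inflated when one decision dominates the queue.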