
Measuring what actually matters
Mizaan is a multi-agent AI scoring engine that evaluates work through reasoning, validates through auditing, and learns through human feedback.
90-minute discovery call with Fortune 500 CFO. Discussed pain points, demoed solution, scheduled follow-up.
See it in action
How Mizaan measures what matters
Scoring systems weren't built for the work people actually do
Opaque Algorithms
Black-box scoring gives you a number with no explanation. Teams lose trust when they can't understand why they got the score they did.
Human Bias
Manual reviews are inconsistent. Different reviewers give different scores for the same work, creating unfairness across teams.
Volume over Impact
Traditional metrics reward activity, not outcomes. Ten quick calls count more than one strategic conversation that closes a deal.
Three agents. One fair score.
Primary Scorer
Analyzes work against your rubric. Produces a score with detailed reasoning and confidence level.
Auditor
Reviews the primary score for consistency. Compares against historical patterns. Flags anomalies.
Human Interface
Presents conflicts to human reviewers. Captures override decisions as feedback for learning.
The system learns from every human decision.
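Conceptually, the pipeline maps onto a small state graph. The sketch below is a minimal illustration using LangGraph (the orchestration layer listed in the stack section); the state fields, node logic, and disagreement threshold are illustrative, not Mizaan's actual implementation.
from typing import TypedDict
from langgraph.graph import StateGraph, END

class ScoringState(TypedDict):
    item: str
    primary_score: float
    audit_score: float
    needs_human: bool

def primary_scorer(state: ScoringState) -> dict:
    # Call the LLM with the rubric; return a score plus reasoning (stubbed here)
    return {"primary_score": 8.5}

def auditor(state: ScoringState) -> dict:
    # Independently re-score and compare; flag a conflict above a tolerance
    audit = 8.0
    return {"audit_score": audit,
            "needs_human": abs(state["primary_score"] - audit) > 1.0}

def human_interface(state: ScoringState) -> dict:
    # Queue the conflict for a reviewer with both agents' reasoning attached
    return {}

graph = StateGraph(ScoringState)
graph.add_node("primary_scorer", primary_scorer)
graph.add_node("auditor", auditor)
graph.add_node("human_interface", human_interface)
graph.set_entry_point("primary_scorer")
graph.add_edge("primary_scorer", "auditor")
graph.add_conditional_edges(
    "auditor",
    lambda s: "human_interface" if s["needs_human"] else END,
)
graph.add_edge("human_interface", END)
pipeline = graph.compile()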
From input to insight in five steps
Input
Submit any work item — a sales call transcript, support ticket, code review, or performance report.
Reasoning
The Primary Scorer analyzes the work against your rubric, retrieving similar past cases to calibrate its judgment.
Validation
The Auditor independently reviews the score for consistency, checking against historical patterns and flagging anomalies.
Resolution
When agents disagree, conflicts are escalated to human reviewers with full reasoning from both sides.
Learning
Every human decision is embedded and stored. The system retrieves relevant feedback to improve future scores — no retraining needed.
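From the caller's side, Resolution and Learning might look like the sketch below. The status field, reasoning fields, and record_override method are hypothetical names used only to illustrate the flow; the setup lines mirror the code example further down.
from mizaan import ScoringEngine
from mizaan.providers import OllamaProvider

engine = ScoringEngine(
    provider=OllamaProvider(model="mistral"),
    rubric="sales-performance-v2"
)

result = engine.score(
    item="90-min discovery call with Fortune 500 CFO",
    context={"type": "sales_call", "duration": 90}
)

# Hypothetical field and method names, shown only to illustrate the flow
if result.status == "escalated":        # the agents disagreed (Resolution)
    print(result.primary_reasoning)     # both sides go to the human reviewer
    print(result.audit_reasoning)
    engine.record_override(             # the reviewer's decision is captured (Learning)
        item_id=result.item_id,
        score=9.0,
        reason="Strategic account; CFO committed to a pilot",
    )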
One engine. Every team.
Score sales activities by impact, not volume
- Evaluate calls by strategic depth, not just duration
- Recognize relationship-building over cold outreach volume
- Learn from manager overrides on deal-closing activities
- Compare rep performance with consistent, explainable criteria
Work Item
90-min discovery call with Fortune 500 CFO
Score
8.5
Reasoning
Extended engagement with C-suite decision-maker. Deep discovery of pain points with demo. Strategic relationship-building activity.
AI that earns its authority
Privacy-First
Run locally with Ollama. Your data never leaves your infrastructure. Zero third-party inference by default.
Explainable Reasoning
Every score comes with detailed reasoning. See exactly why the AI scored the way it did, dimension by dimension.
Human Authority
Humans always have the final say. Override any score with reasoning, and the system learns from your judgment.
Full Audit Trail
Every score, override, and learning event is logged. Complete transparency for compliance and review.
Built for engineers who care about the stack
LLM-Agnostic
Swap providers with one line. Ollama, Claude, GPT-4, Azure — same scoring logic, your choice of model.
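As a quick sketch of what that swap could look like (the Claude adapter is on the Platform roadmap, so its import path and name are assumptions):
from mizaan import ScoringEngine
from mizaan.providers import OllamaProvider
# Hypothetical import for the planned Claude adapter:
# from mizaan.providers import ClaudeProvider

engine = ScoringEngine(
    provider=OllamaProvider(model="mistral"),          # local, private by default
    # provider=ClaudeProvider(model="claude-sonnet"),  # same rubric, same scoring logic
    rubric="sales-performance-v2"
)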
Agent Orchestration
LangChain + LangGraph for mature multi-agent coordination with built-in tool use and memory management.
Feedback Loops
RAG-based learning via pgvector. Human overrides become few-shot examples — no fine-tuning infrastructure needed.
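A minimal sketch of that retrieval, assuming a feedback table with an embedding vector column and using psycopg; the schema and column names are illustrative:
import psycopg

def similar_feedback(conn: psycopg.Connection, item_embedding: list[float], k: int = 3):
    # pgvector's <=> operator is cosine distance; the rows returned here
    # would be injected into the scoring prompt as few-shot examples
    return conn.execute(
        """
        SELECT item_text, human_score, override_reason
        FROM feedback
        ORDER BY embedding <=> %s::vector
        LIMIT %s
        """,
        (str(item_embedding), k),
    ).fetchall()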
Secure APIs
FastAPI with async support, rate limiting, and structured validation. Production-ready from day one.
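For illustration only (the shipped endpoints aren't documented here), a scoring route wrapping the engine might look like this; request validation comes from the Pydantic model, and rate limiting is omitted for brevity:
from fastapi import FastAPI
from pydantic import BaseModel
from mizaan import ScoringEngine
from mizaan.providers import OllamaProvider

app = FastAPI()
engine = ScoringEngine(
    provider=OllamaProvider(model="mistral"),
    rubric="sales-performance-v2"
)

class ScoreRequest(BaseModel):
    item: str
    context: dict = {}

@app.post("/score")
async def score(req: ScoreRequest):
    # Validation happens in the Pydantic model; scoring is delegated to the engine
    result = engine.score(item=req.item, context=req.context)
    return {"score": result.score, "reasoning": result.reasoning}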
from mizaan import ScoringEngine
from mizaan.providers import OllamaProvider

# Initialize with local LLM — zero cloud dependency
engine = ScoringEngine(
    provider=OllamaProvider(model="mistral"),
    rubric="sales-performance-v2"
)

# Score a work item
result = engine.score(
    item="90-min discovery call with Fortune 500 CFO",
    context={"type": "sales_call", "duration": 90}
)

print(result.score)       # 8.5
print(result.reasoning)   # "Extended engagement with..."
print(result.confidence)  # 0.87
print(result.dimensions)  # {"impact": 9, "effort": 8, ...}
From engine to ecosystem
Engine
Core scoring pipeline with Ollama, single rubric, feedback learning, and REST API.
- Three-agent scoring pipeline
- RAG-based feedback learning
- Local LLM with Ollama
- Scoring & feedback API
Platform
Multi-provider support, custom rubrics, team dashboards, and analytics.
- Claude & OpenAI adapters
- Custom rubric builder
- Team analytics dashboard
- Webhook integrations
Ecosystem
Enterprise deployment, marketplace for rubrics, and cross-org benchmarking.
- Azure OpenAI & VPC deployment
- Rubric marketplace
- Cross-organization benchmarking
- Advanced cost optimization
“Every person deserves to have their work measured with the same care they put into doing it.”
Mizaan was born from a simple frustration: watching talented people get reduced to numbers that didn't reflect their actual impact. The word "Mizaan" means balance and scale in Arabic — fairness in measurement. Under Mannat AI, we're building tools that bring transparency and fairness to how work is evaluated. Not by removing humans from the loop, but by giving them better tools to make better decisions. This isn't about replacing judgment. It's about augmenting it with reasoning, validation, and continuous learning.
Founder, Mannat AI
Be among the first to measure what matters
Join the early access program and help shape the future of fair, explainable work scoring.