Can I evaluate agents that are not written in Python?

Yes, you can evaluate external services such as Bland AI agents by supplying the required authentication and pathway details to the evaluator.

How are custom metrics defined?

Create a Metric object with a name, a definition, and a scoring type (binary or continuous), then pass it when creating or loading a project.
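As a rough illustration of the shape described in this answer, the sketch below models a metric as a name, a definition, and a scoring type. The `Metric` dataclass here is a self-contained stand-in written for this example, not the package's own class, and the `politeness` metric is made up; the real constructor's parameter names may differ, so check the package documentation before relying on them.

```python
from dataclasses import dataclass

# Stand-in for illustration only; the real package ships its own Metric class,
# whose exact fields and parameter names may differ.
ALLOWED_SCORING = {"binary", "continuous"}


@dataclass(frozen=True)
class Metric:
    name: str
    definition: str
    scoring: str  # "binary" (pass/fail) or "continuous" (numeric score)

    def __post_init__(self):
        # Reject anything other than the two scoring types named above.
        if self.scoring not in ALLOWED_SCORING:
            raise ValueError(
                f"scoring must be one of {sorted(ALLOWED_SCORING)}, got {self.scoring!r}"
            )


# Hypothetical metric: the name and definition are invented for this example.
politeness = Metric(
    name="politeness",
    definition="Did the agent stay courteous for the whole call?",
    scoring="binary",
)
```

Validating the scoring type at construction time means a typo surfaces immediately rather than during a later scoring run.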

Does the package require an internet connection at runtime?

Yes, because transcription and metric scoring rely on calls to the configured model providers.

MixedVoices — Autonomous Agents Review, Install & Alternatives (2026)

What is MixedVoices?

MixedVoices is an open-source Python package that helps developers track and improve voice agent behavior through detailed call analysis and testing. It processes recordings to generate transcripts, success classifications, flow diagrams, and scores on metrics such as empathy, latency, and interruptions.

Users configure models for analytics and transcription, then add recordings or run simulated evaluations against custom agents or services like Bland AI. Results are stored per project and version so teams can compare iterations and identify patterns in conversation paths.

It is intended for teams building or maintaining production voice agents who need quantitative feedback before deployment and ongoing monitoring after launch.

Capabilities

Analyze AI voice agents

Track performance metrics

Visualize call flows

Run complex simulations

Evaluate production readiness

What you can build with MixedVoices

Post-call review

Upload recordings to obtain transcripts, success labels, and metric scores that highlight strengths and weaknesses in live conversations.

Pre-deployment testing

Generate test cases from transcripts or descriptions, then run simulated dialogues to validate agent responses and metric performance.

Version comparison

Create multiple agent versions with different prompts and review side-by-side flow charts and metric trends to select the best iteration.
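The comparison step can be pictured as averaging each metric per version and looking at the difference. The sketch below is a minimal, self-contained illustration with invented scores; the version names, metric names, and the `summarize` and `compare` helpers are all hypothetical and do not reflect how the package stores or reports results.

```python
from statistics import mean

# Hypothetical per-call scores, keyed by version and then by metric name.
scores = {
    "v1": {"empathy": [6, 7, 5], "latency_ok": [1, 0, 1]},
    "v2": {"empathy": [8, 7, 9], "latency_ok": [1, 1, 1]},
}


def summarize(version_scores):
    """Average each metric over all calls recorded for one version."""
    return {metric: mean(values) for metric, values in version_scores.items()}


def compare(all_scores, baseline, candidate):
    """Return candidate minus baseline for every metric both versions share."""
    base = summarize(all_scores[baseline])
    cand = summarize(all_scores[candidate])
    return {metric: cand[metric] - base[metric] for metric in base if metric in cand}


deltas = compare(scores, "v1", "v2")
for metric, delta in deltas.items():
    print(f"{metric}: {delta:+.2f}")
```

A positive delta means the candidate version scored higher on that metric than the baseline; with only a handful of calls per version, small differences should be read cautiously.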

Install MixedVoices

Install

pip install mixedvoices

Quick start

1. Run pip install mixedvoices to add the package.
2. Execute mixedvoices config and supply the required API keys for OpenAI or Deepgram.
3. Create a project and version with your agent prompt using the Python API.
4. Add recordings or generate test cases, then run analysis or an evaluator.
5. Open the dashboard to inspect flow charts, metrics, and simulation outcomes.

MixedVoices: pros & cons

Pros

+ Quick Python integration with both blocking and non-blocking recording analysis
+ Built-in metrics plus support for user-defined binary or continuous scores
+ Test case generation from multiple sources including existing transcripts
+ Support for evaluating both custom agents and third-party services like Bland AI

Cons

– Analytics and transcription currently limited to OpenAI and Deepgram models
– Requires separate implementation of the BaseAgent respond method for custom agents
– Dashboard and analysis features are tied to a local or self-hosted setup
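To give a sense of what implementing a respond method involves, here is a minimal sketch of the pattern. The `BaseAgent` class below is a stand-in defined for this example, not the package's own base class, and the `(reply, conversation_ended)` return shape is an assumption; `EchoReceptionist` and `run_dialogue` are hypothetical helpers used only to exercise the interface.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple


class BaseAgent(ABC):
    """Stand-in base class; the real package defines its own BaseAgent."""

    @abstractmethod
    def respond(self, input_text: str) -> Tuple[str, bool]:
        """Return (reply, conversation_ended). This return shape is an assumption."""


class EchoReceptionist(BaseAgent):
    """Toy agent: echoes the caller and ends the call on a goodbye."""

    def respond(self, input_text: str) -> Tuple[str, bool]:
        if "bye" in input_text.lower():
            return "Goodbye!", True
        return f"You said: {input_text}", False


def run_dialogue(agent: BaseAgent, user_turns: List[str]) -> List[Tuple[str, str]]:
    """Feed scripted user turns to the agent until it ends the conversation."""
    transcript = []
    for turn in user_turns:
        reply, ended = agent.respond(turn)
        transcript.append((turn, reply))
        if ended:
            break
    return transcript


transcript = run_dialogue(EchoReceptionist(), ["Hi", "Bye now", "Still there?"])
```

In a real setup the body of `respond` would call your own agent (an LLM, a dialogue engine, or a remote service), which is why this wrapper has to be written separately for each custom agent.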

Frequently asked questions

What models does MixedVoices support for analytics?

By default it supports all OpenAI GPT models from gpt-3.5 onward, with transcription options including OpenAI Whisper and Deepgram Nova-2.

Similar agents

Other automation options worth comparing.

AppAgent

Agent · Automation

Verified

Multimodal agent that controls Android apps through taps and swipes.

6.8k · Open source

Lindy

Hosted · Automation

Verified

AI assistant that manages emails, meetings, and daily admin tasks.

Paid

MultiOn

Hosted · Automation

Verified

Superintelligence running locally on your device for daily tasks.

Free early preview access
