
The open-source standard for federated medical AI benchmarking and clinical validation.
MedPerf is an open-source framework, spearheaded by MLCommons, that standardizes the evaluation of medical AI models on decentralized, real-world data. Its architecture addresses the critical bottleneck of data privacy in healthcare through 'Federated Evaluation': instead of moving sensitive patient data to a central server, MedPerf orchestrates the movement of models (encapsulated in MLCubes) to the data owners' infrastructure.

In the 2026 landscape, MedPerf has matured into a critical piece of the clinical validation pipeline, enabling researchers and regulatory bodies to assess algorithm performance across diverse populations without violating HIPAA or GDPR. The platform is built around three actor roles: Benchmark Owners, who define tasks; Data Owners, who provide local clinical data; and Model Owners, who submit algorithms for testing. By ensuring reproducibility through containerization and providing an auditable trail of performance metrics, MedPerf bridges the gap between laboratory development and clinical deployment, fostering trust in AI-driven diagnostic and prognostic tools.
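To make the data flow concrete, the sketch below models the federated-evaluation pattern in plain Python. It is illustrative only and does not use MedPerf's actual API: `Site`, `federated_evaluation`, and the toy records are hypothetical names, but the invariant they demonstrate is the real one — model and metric code travel to each site, and only aggregate scores travel back.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Illustrative only: MedPerf's real client/server API is different.
Record = dict            # one de-identified case (image, label, ...)
Metrics = Dict[str, float]

@dataclass
class Site:
    """A data owner. Raw records stay inside this object ("behind the firewall")."""
    name: str
    _records: List[Record] = field(repr=False, default_factory=list)

    def evaluate(self, model: Callable[[Record], int],
                 score: Callable[[List[int], List[int]], Metrics]) -> Metrics:
        # The model travels here and runs locally against private data.
        preds = [model(r) for r in self._records]
        labels = [r["label"] for r in self._records]
        # Only aggregate statistics leave the site, never the records.
        return score(labels, preds)

def federated_evaluation(model, score, sites: List[Site]) -> Dict[str, Metrics]:
    """Coordinator: collects per-site metrics without ever seeing raw data."""
    return {s.name: s.evaluate(model, score) for s in sites}

# Example: accuracy as the benchmark metric.
def accuracy(labels, preds):
    return {"accuracy": sum(l == p for l, p in zip(labels, preds)) / len(labels)}

sites = [Site("hospital_a", [{"x": 0.9, "label": 1}, {"x": 0.2, "label": 0}]),
         Site("hospital_b", [{"x": 0.7, "label": 1}])]
results = federated_evaluation(lambda r: int(r["x"] > 0.5), accuracy, sites)
print(results)  # {'hospital_a': {'accuracy': 1.0}, 'hospital_b': {'accuracy': 1.0}}
```

In the real system, the coordinator role is split between the MedPerf server (metadata and scheduling) and the clients running at each site, but the privacy boundary is the same: raw records never cross it.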
Uses MLCubes to wrap models and data preparation scripts, ensuring they run identically across different hardware (CPUs, GPUs, TPUs).
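As a concrete example, a client could drive one MLCube task from Python via the standard `mlcube` command-line runner, as in the hedged sketch below. The task names and directory layout are assumptions, and exact flags can vary by MLCube version.

```python
import subprocess

def run_mlcube_task(mlcube_dir: str, task: str, platform: str = "docker") -> None:
    """Run one task of an MLCube (e.g. "prepare", "infer", "evaluate").
    The container image pins the full runtime environment, which is what
    makes the same task behave identically on CPU, GPU, or TPU hosts."""
    subprocess.run(
        ["mlcube", "run",
         f"--mlcube={mlcube_dir}",   # directory containing mlcube.yaml
         f"--task={task}",
         f"--platform={platform}"],  # container runtime
        check=True,                  # raise if the task fails
    )

# Hypothetical usage: run a model cube's inference task locally.
run_mlcube_task("./model_cube", "infer")
```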
Only aggregate statistics and performance scores are transmitted to the server; raw data remains behind the hospital firewall.
Each dataset is uniquely identified by a hash, ensuring that the same data is used for consistent benchmarking over time.
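The idea can be illustrated with a content hash over the prepared dataset directory. The function below is a hypothetical stand-in, not MedPerf's actual ID scheme; it only demonstrates that byte-identical data always yields the same identifier.

```python
import hashlib
from pathlib import Path

def dataset_hash(root: str) -> str:
    """Hash file paths and contents in a deterministic order, so the same
    prepared dataset always produces the same hex identifier."""
    digest = hashlib.sha256()
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest.update(path.relative_to(root).as_posix().encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()

# Re-running on unchanged data returns the identical ID, so a benchmark
# result can always be tied back to the exact dataset version it used.
```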
The server manages coordination logic and scheduling while the clients handle the heavy lifting of data preparation and model execution, allowing the system to scale across many participating sites.
Automated checks to ensure clinical data matches the expected input format for specific medical tasks.
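Such a check might look like the sketch below for a hypothetical CT-segmentation task; the expected dtype, dimensionality, and intensity bounds are illustrative assumptions, since each benchmark defines its own input contract.

```python
import numpy as np

# Hypothetical input contract for a CT-volume task.
EXPECTED_DTYPE = np.float32
EXPECTED_NDIM = 3           # depth x height x width
HU_RANGE = (-1024, 3071)    # plausible Hounsfield-unit bounds

def validate_case(volume: np.ndarray) -> list:
    """Return a list of problems; an empty list means the case passes."""
    problems = []
    if volume.ndim != EXPECTED_NDIM:
        problems.append(f"expected {EXPECTED_NDIM}-D volume, got {volume.ndim}-D")
    if volume.dtype != EXPECTED_DTYPE:
        problems.append(f"expected dtype {EXPECTED_DTYPE}, got {volume.dtype}")
    if volume.size and (volume.min() < HU_RANGE[0] or volume.max() > HU_RANGE[1]):
        problems.append("intensities outside plausible Hounsfield range")
    return problems
```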
Allows benchmark owners to inject custom Python scripts for calculating specialized medical metrics like Dice scores or AUC-ROC.
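For instance, a Dice coefficient for segmentation masks is a few lines of NumPy, and AUC-ROC can be delegated to scikit-learn. The `evaluate` wrapper below is a hypothetical shape for such a script, not MedPerf's actual metrics interface.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def dice_score(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-7) -> float:
    """Dice coefficient for binary masks: 2|A∩B| / (|A| + |B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + truth.sum() + eps))

def evaluate(labels, scores, pred_mask, truth_mask) -> dict:
    """Bundle the scores a benchmark owner might report for one task."""
    return {
        "dice": dice_score(pred_mask, truth_mask),
        "auc_roc": float(roc_auc_score(labels, scores)),
    }
```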
Built-in approval workflows where data owners must explicitly approve models before execution.
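Conceptually the gate is simple, as in this minimal sketch (the enum and function are hypothetical, not MedPerf's schema): nothing runs against local data until the data owner has explicitly approved it.

```python
from enum import Enum

class Approval(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"

def can_execute(model_id: str, approvals: dict) -> bool:
    """The client refuses to run any model the data owner has not approved."""
    return approvals.get(model_id, Approval.PENDING) is Approval.APPROVED

assert not can_execute("lung_nodule_v2", {})                                 # default: blocked
assert can_execute("lung_nodule_v2", {"lung_nodule_v2": Approval.APPROVED})  # explicit opt-in
```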
A developer wants to test a lung nodule detection model across five different hospitals without the hospitals sharing images.
Each hospital evaluates the model locally, and the aggregated results are compared on a leaderboard.
An AI company needs to provide evidence of model robustness across different demographics for regulatory approval.
Ensuring a deployed AI model doesn't suffer from 'drift' as clinical equipment or patient populations change over time.