Microsoft has developed a system called MAI-DxO that lets AI work through a diagnosis **step by step, like a real doctor**: asking questions, ordering examinations, analyzing the results, and finally determining the cause of the disease, rather than jumping straight to an answer. It simulates several doctors working together, reasoning progressively toward an accurate diagnosis of complex cases, and it combines the strengths of different AI models to complete the diagnostic task.
What’s so good about it?
- **More accurate:** It diagnoses complex, difficult cases more reliably than doctors do, with a success rate of up to 85 percent.
- **Saves money:** It does not order a pile of unnecessary tests. It **tracks costs itself**, which makes it cheaper and more efficient.
- **More like a doctor:** It does not produce a single answer in one shot. Instead, it thinks the case through, asks questions, and analyzes step by step.
- **Joint consultation:** It is actually **several AIs (such as GPT and Claude) working in combination**, like a few doctors discussing a case together, which makes the result very thorough.
Microsoft had the AI take on 304 real cases published in a medical journal. It reached a diagnostic accuracy of 85%, while practicing doctors averaged 20%. That is a big difference!
What’s the advantage over a doctor?
MAI-DxO Detailed Functions and Characteristics:
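**Five virtual doctor roles diagnosing together (Panel of Virtual Doctors)**
MAI-DxO is not a single AI model but **a system that coordinates multiple AI models** and simulates collaboration between doctors, forming a virtual "doctor team". The team is made up of different roles, each with its own task.
Each "doctor" role makes suggestions according to its responsibilities and discusses them with the others before a final decision is reached. The approach uses **multiple rounds of decision-making** to simulate how doctors, working with limited information, keep gathering evidence, raising new questions, and ordering tests until the diagnosis is finally confirmed.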
**Three decision-making steps:**
- **Question**: The AI asks about the patient's symptoms, history, and so on.
- **Test**: The AI orders the appropriate diagnostic tests (e.g. CT, blood tests) based on its reasoning.
- **Diagnosis**: The AI gives its diagnosis once enough evidence has accumulated.
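The question, test, and diagnosis cycle above can be pictured as a small loop. The following is a minimal Python sketch under stated assumptions, not Microsoft's implementation: `Action`, `Case`, `run_encounter`, and `scripted_policy` are hypothetical names, the patient is a toy lookup table, and a fixed script stands in for the language model.

```python
from dataclasses import dataclass
from typing import Callable, Literal, Optional


@dataclass
class Action:
    kind: Literal["ask", "test", "diagnose"]
    content: str  # the question, the test name, or the final diagnosis


@dataclass
class Case:
    """Toy patient (illustrative only): lookup tables replace a real encounter."""
    answers: dict[str, str]
    test_results: dict[str, str]


# A policy looks at the evidence gathered so far and picks the next action.
Policy = Callable[[list[str]], Action]


def run_encounter(
    case: Case, policy: Policy, max_turns: int = 10
) -> tuple[Optional[str], list[str]]:
    """Loop over ask/test actions until the policy commits to a diagnosis."""
    evidence: list[str] = []
    for _ in range(max_turns):
        action = policy(evidence)
        if action.kind == "ask":
            reply = case.answers.get(action.content, "unknown")
            evidence.append(f"Q: {action.content} -> {reply}")
        elif action.kind == "test":
            result = case.test_results.get(action.content, "not available")
            evidence.append(f"T: {action.content} -> {result}")
        else:  # "diagnose" ends the encounter
            return action.content, evidence
    return None, evidence  # ran out of turns without a diagnosis


def scripted_policy(evidence: list[str]) -> Action:
    """Hypothetical stand-in for the model: ask once, test once, then diagnose."""
    if len(evidence) == 0:
        return Action("ask", "How long have you had the fever?")
    if len(evidence) == 1:
        return Action("test", "chest X-ray")
    return Action("diagnose", "pneumonia")


case = Case(
    answers={"How long have you had the fever?": "three days"},
    test_results={"chest X-ray": "right lower lobe consolidation"},
)
diagnosis, evidence = run_encounter(case, scripted_policy)
```

In a real system the policy would be a language model call. Keeping it as a plain callable makes it easy to swap one model for another without touching the loop.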
1. Multi-model teamwork
- **Multi-model collaboration:** MAI-DxO is not a single AI model but an orchestrator that combines several foundation language models (e.g. GPT, Claude, Llama) and has them work together.
- Like doctors from different specialties discussing a case together, these AI models can reason and make decisions about different aspects of it.

The MAI-Dx Orchestrator can turn any language model into a virtual clinical team of specialists: it can ask follow-up questions, order tests, or give a diagnosis, then run a cost check and validate its own reasoning before finally deciding whether to proceed.
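One way to read "turn any language model into a virtual clinical team" is that the same model is prompted once per role. The sketch below illustrates that idea only. The `Model` type, the role names and prompts in `ROLES`, and `consult_panel` are all assumptions made for illustration, and `echo_model` is a dummy that stands in for a real model call.

```python
from typing import Callable

# Assumption: any text-in, text-out function can serve as the model.
Model = Callable[[str], str]

# Hypothetical role prompts, loosely following the steps named in the text.
ROLES = {
    "questioner": "Suggest the most useful follow-up question.",
    "test_planner": "Suggest the most informative test to order.",
    "cost_checker": "Flag any proposed step that is not worth its cost.",
    "reviewer": "Check the reasoning so far for errors or gaps.",
}


def consult_panel(model: Model, case_summary: str) -> dict[str, str]:
    """Call the same underlying model once per role and collect the opinions."""
    opinions: dict[str, str] = {}
    for role, instruction in ROLES.items():
        prompt = f"Role: {role}\nTask: {instruction}\nCase: {case_summary}"
        opinions[role] = model(prompt)
    return opinions


def echo_model(prompt: str) -> str:
    """Dummy model for demonstration: returns the task line of the prompt."""
    return prompt.splitlines()[1]


opinions = consult_panel(echo_model, "adult with cough and fever")
```

Because `consult_panel` depends only on the `Model` signature, a different model can be dropped in without changing the roles.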
2. Progressive Diagnosis
- **Progressive diagnosis:** MAI-DxO does not rely on a one-shot diagnosis; it reasons step by step.
- For example, if a patient reports a cough and fever, MAI-DxO first considers whether it is a common cold, then requests blood tests, X-rays, and so on to pin down the cause.
3. Cost-benefit analysis (Cost-Efficiency)
- **Cost awareness:** MAI-DxO automatically assesses the cost of each step during the diagnostic process and makes **more economical decisions**. It does not blindly order every test; it weighs diagnostic accuracy against cost.
- This matters for reducing unnecessary medical spending and avoiding over-testing, especially in high-cost health systems.
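Weighing accuracy against cost can be illustrated with a simple value-per-cost rule. This is a sketch of one possible heuristic, not the method MAI-DxO actually uses: `CandidateTest`, `pick_next_test`, the prices, and the `expected_value` scores are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CandidateTest:
    name: str
    cost: float            # price in arbitrary units (made-up, must be > 0)
    expected_value: float  # hypothetical 0..1 estimate of how much the test helps


def pick_next_test(
    candidates: list[CandidateTest],
    remaining_budget: float,
    min_value_per_cost: float = 0.0,
) -> Optional[CandidateTest]:
    """Return the affordable test with the best value per unit cost, or None."""
    affordable = [t for t in candidates if t.cost <= remaining_budget]
    if not affordable:
        return None
    best = max(affordable, key=lambda t: t.expected_value / t.cost)
    # Skip testing entirely if even the best option is not worth its price.
    if best.expected_value / best.cost < min_value_per_cost:
        return None
    return best


candidates = [
    CandidateTest("blood count", cost=30, expected_value=0.3),
    CandidateTest("chest X-ray", cost=100, expected_value=0.6),
    CandidateTest("CT scan", cost=1000, expected_value=0.8),
]
choice = pick_next_test(candidates, remaining_budget=500)
```

With these illustrative numbers the cheap blood count wins, because it offers the most expected value per unit of cost, and the CT scan is never considered because it exceeds the budget.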
4. Self-checking and validation
- **Self-checking:** Whenever MAI-DxO makes an inference or a diagnosis, it **verifies its own reasoning** to check whether it is sound, and adjusts it if not.
5. Simulated team
- **Virtual doctor team:** MAI-DxO is more than a "single expert": it simulates **multiple doctors (with different professional backgrounds)** working together on a problem. It is designed like a virtual team of doctors whose diagnostic approaches complement one another, producing a diverse decision-making process.
- **Modular coordination:** This coordination mechanism lets MAI-DxO combine several different reasoning models so that it can make the best judgment in complex situations.
MAI-DxO effects and performance
MAI-DxO has demonstrated very strong performance in a series of rigorous evaluations, particularly on complex diagnostic tasks, as follows:
1. Performance in diagnostic accuracy:
- **85.5% accuracy:** On 304 NEJM (New England Journal of Medicine) cases, MAI-DxO paired with OpenAI o3 reached a diagnostic accuracy of 85.5% on these complex cases.
- **Comparison with doctors:** On the same task, 21 experienced doctors (from the United States and the United Kingdom, with 5 to 20 years of clinical experience) averaged only **20 percent** accuracy, well below MAI-DxO.
- MAI-DxO also improved the results of all the models on the 304 cases.
2. Performance in cost efficiency:
- MAI-DxO is not only accurate but also **lowers costs**. In conventional practice, doctors may order extra tests for fear of missing something, whereas MAI-DxO can **reduce unnecessary testing** and so lower the total cost of diagnosis.
3. Strong resource management:
- **Cost-benefit balance:** MAI-DxO has a cost-control function that keeps spending reasonable while preserving diagnostic accuracy.
- This capability matters most where resources are limited, medical costs are high, or large numbers of cases need to be handled.
4. Model collaboration:
- **Strengths of multiple models:** By having different AI models collaborate, MAI-DxO draws on the strengths of each to deliver more comprehensive and accurate results on **complex diagnostic tasks**.

By plotting the accuracy and cost of different AI systems, the chart above shows which systems (e.g. MAI-DxO) are both accurate and cost-effective in diagnosis, far exceeding the performance of doctors and single AI models.
MAI-DxO shows remarkable results, especially in:
- **High-accuracy** diagnosis and **cost-effectiveness**, far beyond traditional manual diagnosis.
- Its **multi-model collaboration** and **self-checking** capabilities give it a clear advantage on complex cases.

Nevertheless, MAI-DxO still needs more **clinical validation** and **real-world testing**, especially in the diagnosis of common everyday diseases. It is still in the **research phase** and will need further scrutiny of its safety and compliance.
The strengths and limitations of MAI-DxO
1. **Strengths**
- **Balance between accuracy and cost:** MAI-DxO can significantly reduce the cost of diagnosis without sacrificing accuracy.
- **Simulated team:** Collaboration among multiple virtual physician roles avoids individual bias and premature conclusions.
- **Efficient information gathering:** Through **step-by-step questioning and testing**, the system efficiently extracts valuable information and avoids redundant steps.
- **Lifting weaker models:** By coordinating multiple roles, MAI-DxO helps weaker models reason better and lowers their misdiagnosis rate.
2. **Limitations**
- **Skewed toward unusual cases:** The current experimental data come from **the most challenging NEJM cases**, which do not represent routine, common cases, so MAI-DxO's performance on common diseases has not been verified.
- **Does not address patients' emotional and ethical needs:** MAI-DxO focuses on diagnostic accuracy and cost-effectiveness; it does not handle the emotional, ethical, or communication side of patient care.
- **Regional differences:** Cost estimates are based mainly on United States prices and do not reflect actual costs in other health systems around the world.
3. **Outlook for the future**
- **Adapting to common diseases:** MAI-DxO's performance on common diseases, especially in **low-resource settings**, remains to be validated in future work.
- **Integrating more medical data:** Imaging data, genomic information, and similar sources could be integrated to further improve accuracy.
- **Clinical application:** In the future the system could be integrated into clinical settings and extended to areas such as smart consultations and intelligent assistants.
Paper: https://arxiv.org/pdf/2506.22405