What problem does DxGPT solve?
Medical diagnosis, especially for rare diseases, can be a complex and lengthy process. Doctors often need to review multiple symptoms, medical history, and test results, which can make reaching the correct diagnosis challenging. This not only delays appropriate treatment for the patient but also increases uncertainty in the decision-making process.
In the case of rare diseases, the challenge is even greater, as many doctors have little to no direct experience with these uncommon conditions. Therefore, access to tools that help generate potential diagnoses can be crucial to improving the quality of medical care.
How does DxGPT solve it?
DxGPT is built on language models developed by OpenAI. By default it uses the GPT-4 model, and it also offers an advanced mode that uses the o1 model and is intended to produce more accurate diagnoses. When a healthcare professional inputs a patient's symptoms, DxGPT generates a list of possible diagnoses based on those symptoms.
This not only saves time but can also help reduce errors, providing doctors with a foundation to investigate potential diseases, both common and rare. While DxGPT doesn’t replace medical judgment, it serves as a valuable support, offering initial hypotheses that doctors can use alongside other clinical data and tests.
The purpose of this tool is to facilitate the work of healthcare professionals, making it more efficient and accurate, enabling them to focus on the most probable options from the start and reducing the time to reach a definitive diagnosis.
How it works
When you input a patient's symptoms, DxGPT uses advanced AI models to generate a list of potential diagnoses. This initial list is a starting point. To refine the diagnosis, clinicians should gather additional information such as clinical data, lab tests, and the patient's medical history. Your feedback is crucial in helping us improve DxGPT's accuracy and usability.
Our commitment to improvement
We are continuously working on new features to enhance DxGPT. Collaboration is at the heart of our mission, and we are eager to partner with researchers and healthcare professionals interested in exploring the potential of AI in diagnostics. If you're interested in collaborating, please reach out to us.
Transparency and collaboration
In our pursuit of transparency and collaboration, we have made the source code of DxGPT available to the public. You can explore and contribute to our client and server repositories on GitHub. We believe that open collaboration with the scientific community will drive innovation and improve rare disease diagnostics.
Clinical studies
Hospital Sant Joan de Déu has conducted a comprehensive clinical study comparing the performance of DxGPT with that of highly qualified medical professionals, for both common and rare diseases. The results of this study indicate that DxGPT's performance is equivalent to that of a qualified medical professional. You can find more details about this study at this link:
Technical details about DxGPT accuracy
Rigorous evaluations ensure AI diagnostic tools are reliable for real-world medical use.
DxGPT [3], a diagnostic support tool based on the GPT-4 family of AI models, has shown promising results in diagnosing rare diseases. When multiple available models were tested on real-world rare disease cases from the RAMEDIS dataset:
- Claude 3 Opus performed best among previously tested models, achieving 55% strict accuracy (correct diagnosis as the top suggestion) and 70% top-5 accuracy (correct diagnosis within the top five suggestions) [1][2]
Two newer frontier models, GPT-4o and Claude 3.5 Sonnet, are currently being evaluated [4]. Early results on the RAMEDIS dataset suggest these models may offer slight improvements in strict accuracy, with top-5 accuracy close to that of Claude 3 Opus:
- GPT-4o: ~56% strict accuracy, ~68% top-5 accuracy
- Claude 3.5 Sonnet: ~60% strict accuracy, ~67% top-5 accuracy
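The two metrics quoted above can be illustrated with a minimal Python sketch. It is not the evaluation code used for these results: the function names, the toy cases, and the exact-match comparison (case-insensitive string equality) are assumptions made for illustration only, and a real evaluation would need a more careful way of deciding whether a suggested diagnosis matches the confirmed one.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical type: (ranked list of suggested diagnoses, confirmed diagnosis).
Case = Tuple[Sequence[str], str]


def rank_of_correct(predictions: Sequence[str], truth: str) -> Optional[int]:
    """Return the 1-based rank of the confirmed diagnosis, or None if absent.

    Assumption: a match is a case-insensitive exact string comparison.
    """
    target = truth.strip().lower()
    for rank, name in enumerate(predictions, start=1):
        if name.strip().lower() == target:
            return rank
    return None


def accuracy_at_k(cases: Sequence[Case], k: int) -> float:
    """Fraction of cases whose confirmed diagnosis is within the top k suggestions.

    k=1 corresponds to strict accuracy, k=5 to top-5 accuracy.
    """
    if not cases:
        raise ValueError("at least one case is required")
    hits = 0
    for predictions, truth in cases:
        rank = rank_of_correct(predictions, truth)
        if rank is not None and rank <= k:
            hits += 1
    return hits / len(cases)


# Made-up toy cases, not taken from RAMEDIS or any other dataset.
toy_cases = [
    (["Fabry disease", "Gaucher disease", "Pompe disease"], "Fabry disease"),
    (["Gaucher disease", "Pompe disease", "Fabry disease"], "Fabry disease"),
    (["Gaucher disease", "Pompe disease"], "Fabry disease"),
    (["pompe disease", "Fabry disease"], "Pompe disease"),
]

strict_accuracy = accuracy_at_k(toy_cases, 1)
top5_accuracy = accuracy_at_k(toy_cases, 5)
print(f"strict: {strict_accuracy:.2f}, top-5: {top5_accuracy:.2f}")
```

In these toy cases the confirmed diagnosis is the first suggestion in two of four cases and appears somewhere in the list in three of four, so strict accuracy is 0.50 and top-5 accuracy is 0.75. Top-5 accuracy can never be lower than strict accuracy, because every top-1 hit is also a top-5 hit.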
Importantly, GPT-4o and Claude 3.5 Sonnet are significantly faster and more cost-effective than earlier models such as Claude 3 Opus or GPT-4. This increased efficiency means that these diagnostic capabilities could be offered for free to a much larger number of users worldwide, greatly expanding access to AI-assisted rare disease diagnosis.
Moreover, both GPT-4o and Claude 3.5 Sonnet have set new records in diagnosing common diseases. In our tests using the Urgency dataset for common conditions, both models achieved an impressive top-5 accuracy of around 89%. This demonstrates that these AI models are not only making strides in rare disease diagnosis but are also highly capable when it comes to more frequently encountered medical conditions [4].
These findings indicate that the latest AI models are making significant progress in both rare and common disease diagnosis while becoming more accessible, which could have a substantial impact on global healthcare.
Here are some initial results of GPT-4o and Claude 3.5 Sonnet compared to the other models from our last paper:

In this graph, the green and orange bars correspond respectively to strict accuracy (the share of cases in which the correct diagnosis is the first suggestion) and top-5 accuracy (the share of cases in which the correct diagnosis is among the five suggestions).
The black line shows the difference in accuracy between common diseases (top) and rare diseases (bottom).
It's crucial to emphasize that DxGPT is still in development and is intended as a decision support tool, not a replacement for professional medical judgment [1][5]. The tool aims to assist healthcare professionals by generating diagnostic hypotheses based on patient symptoms and clinical data, potentially reducing the time to diagnosis for rare and common diseases [2][5]. Further validation on real clinical data and comparison with human expert diagnoses are necessary to fully assess its performance and potential impact on patient care [1][4]. We are also conducting these studies within hospitals, and the first results will be published soon.
Citations:
[1] https://www.medrxiv.org/content/10.1101/2024.05.08.24307062v1
[2] https://www.medrxiv.org/content/10.1101/2024.05.08.24307062v1.full.pdf
[3] https://github.com/foundation29org/Dx29_client_gpt
[4] https://github.com/foundation29org/dxgpt_testing/blob/main/README.md
Future projects
We are currently working with several hospitals to further advance the ongoing evaluation of the latest and most advanced language models as they become available.
In addition, we are improving the evaluation systems we use to understand which models are best suited for diagnostic tasks, both on our own and in collaboration with Hospital Sant Joan de Déu.
All results are continuously published on our GitHub and in their respective medRxiv publications, and this page will be updated with them in due course.
Because of the high demand for DxGPT, we are not able to respond to every support message we receive. We are therefore developing systems to help us process the feedback that users send through the different communication channels available in DxGPT.
This new functionality will allow us to handle a higher volume of feature requests, complaints, problem reports, and suggested improvements to the platform.
About Foundation 29
Foundation 29 is a non-profit organization dedicated to empowering patients through data and collaboration with doctors and institutions. We focus on rare diseases, and our name comes from February 29, World Rare Disease Day. Our mission is to create technologies that transform healthcare and empower users.
Our sponsors
We extend our heartfelt gratitude to our generous sponsors whose altruistic support is vital to our mission. Their contributions are purely philanthropic, with no commercial interests or data sharing involved. Our sponsors' commitment to medical innovation and improving healthcare exemplifies the power of collaboration and generosity.
Join us
If you are interested in becoming a sponsor or collaborating with us, please contact us. Together, we can continue to build a healthier future for everyone.
Thank you to our sponsors.