Researchers at Tsinghua University's Intelligent Industry Research Institute (AIR) and Department of Computer Science and Technology have built a virtual Agent Hospital in which AI doctors train without human intervention. They first created a simulation of an entire hospital, complete with staff and patients. AI doctors were then made responsible for diagnosing and treating thousands of virtual patients. The doctors learned from their mistakes, and their accuracy in examining, diagnosing, and treating patients rose significantly.
Virtual simulations, or simulacrums, replicate a real-world environment for safe and rapid training of AI. The computer does not need to wait for a sick patient to appear; instead, hundreds, thousands, or even millions of sick patients can be generated on demand. Such simulations also cost much less than training in a real clinical setting.
Tsinghua researchers trained virtual AI doctors on 10,000 virtual patients in the Agent Hospital simulation using a process they call MedAgent-Zero. The patients were created by feeding large language models information on eight types of diseases to produce electronic health records for 10,000 virtual patients, each with a different severity and presentation. The eight diseases were acute nasopharyngitis, acute rhinitis, bronchial asthma, chronic bronchitis, COVID-19, Influenza A, Influenza B, and mycoplasma infection. A separate set of 500 patient records was created for testing.
During simulations, the virtual doctor powered by gpt-3.5-turbo-1106 quickly developed their skills. After seeing 10,000 virtual patients, the doctor had success rates in examining, diagnosing, and treating patients as high as 88%, 95.6%, and 77.6% depending on the disease.
GPT is rapidly improving, so Tsinghua researchers also tested their MedAgent-Zero training method using the more powerful gpt-4-1106-preview. They compared the performance of GPT-3.5 and GPT-4 AI doctors by using 1,273 questions from the MedQA database, a large set of multiple-choice questions similar to those found on medical licensing tests such as the USMLE. On respiratory disease questions, the GPT-4 doctor scored 93.06% versus 84.72% for the GPT-3.5 doctor.
The ground-breaking performance of these AI doctors was achieved using just days of virtual training, and the Agent Hospital simulacrum opens the road to developing training methods for future AI doctors, as well as real doctors, that are significantly faster and more effective.
Source(s)
Machine translated by Edge browser:
AIR creates a virtual hospital to realize the self-evolution of AI doctors
Release time: 2024-05-24
Tsinghua University's Intelligent Industry Research Institute (AIR) and Department of Computer Science and Technology have jointly built a virtual hospital, Agent Hospital, and proposed MedAgent-Zero, a self-evolution method for medical agents. The method lets medical agents continuously improve their medical capabilities by generating large amounts of data in the virtual hospital without manual annotation, and it has been verified on real-world datasets. All patients, nurses, and doctors in Agent Hospital are played by autonomous agents driven by large models, which simulate the closed-loop "pre-hospital, in-hospital, post-hospital" process of onset, triage, registration, consultation, examination, diagnosis, medication, rehabilitation, and follow-up. Based on a knowledge base and a base model, Agent Hospital simulates how diseases arise and develop in virtual patients. Virtual doctors learn (i.e., read medical literature) and practice (i.e., interact with virtual patients and make diagnosis and treatment decisions) in Agent Hospital, summarizing experience from successful cases, reflecting on lessons from failed cases, and steadily improving their accuracy on multiple diagnosis and treatment tasks. After treating nearly 10,000 virtual patients (which would take a human doctor about 2 years), virtual doctors surpassed the current best methods on the respiratory disease subset of the MedQA dataset, achieving an accuracy of 93.06%. The study, co-authored by Assistant Prof. Ma Weizhi of AIR and Prof. Liu Yang, Executive Dean of AIR and Associate Dean of the Department of Computer Science, has received extensive attention and discussion from the artificial intelligence and medical communities at home and abroad since it was published on arXiv.
· Title of the paper: Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents
· Link to the paper: arxiv.org/pdf/2405.02957v1
In recent years, large language models have developed rapidly, and agent technology based on them has attracted much attention. Previous studies have used agents to simulate the real world, including interaction and game scenarios such as "Stanford Town" and "Werewolf Killing Game". Agents are also used for scheduling, planning, and collaboration in various tasks, but this mostly relies on high-quality manually annotated data. The research question is therefore whether real-world simulation can help improve the task-handling ability of agents.
Smart healthcare has attracted wide attention for its importance and practical value, and the research team has focused on applying large language models and agent technology to medical scenarios. In response to the research question above, the team believes that a realistic simulated environment can help agents improve and evolve their task abilities, so they carried out the Agent Hospital research, which combines real-world simulation with the improvement of medical ability. In this work, the team builds a hospital simulation environment and explores the autonomous evolution of medical agents within it. The aim is for agents to accumulate medical knowledge independently through diagnosis, treatment, and study, just as human doctors do, and so keep improving their medical capabilities.
The research team first focused on using large-model agents to simulate the critical medical processes of the real world. In Agent Hospital, the team designed 8 typical scenarios covering the path from disease onset to recovery, namely onset, triage, registration, consultation, examination, diagnosis, prescription, and recovery; patients also actively take part in follow-up feedback. All processes are driven by large models, and the roles interact autonomously.
Examples of major diagnosis and treatment sessions
The diagram above illustrates the closed loop: when the patient agent Kenneth Morgan becomes ill, he goes to the hospital for help. Triage nurse Katherine Li assesses Morgan's symptoms and directs him to a specific department. After Morgan completes registration, consultation, and the medical examinations the doctor orders, the doctor Robert gives him the final diagnosis and treatment plan. Morgan then goes home to rest as instructed and reports back to the hospital on his recovery, until the next time he gets sick and returns to the hospital.
As the example above shows, the research team designed two main types of roles for the hospital: medical staff and patients. All character information is generated by a large model (GPT-3.5), so characters can easily be scaled up and added. The details of some of the characters are shown in the figure below. The 35-year-old patient Kenneth Morgan currently has acute rhinitis, a history of hypertension, and a series of symptoms such as persistent vomiting. Zhao Lei is an experienced radiologist, and internist Elise Martin has excellent communication skills and specializes in the diagnosis and treatment of acute and chronic medical diseases. These complete character backgrounds enhance the realism of the hospital simulation.
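The closed loop described above can be sketched as a cycle over the eight named scenarios. This is only an illustration: the `Stage` enum and the helper names are hypothetical and do not come from the paper, and the real system drives each stage with LLM agents rather than a fixed ordering.

```python
from enum import Enum


class Stage(Enum):
    """The eight scenarios named in the text, in order."""
    ONSET = "onset"
    TRIAGE = "triage"
    REGISTRATION = "registration"
    CONSULTATION = "consultation"
    EXAMINATION = "examination"
    DIAGNOSIS = "diagnosis"
    PRESCRIPTION = "prescription"
    RECOVERY = "recovery"


STAGES = list(Stage)  # Enum iteration preserves definition order


def next_stage(current: Stage) -> Stage:
    """Advance one step; after recovery the loop closes back to onset."""
    index = STAGES.index(current)
    return STAGES[(index + 1) % len(STAGES)]


def one_visit(start: Stage = Stage.ONSET) -> list[str]:
    """Walk a single pass through the loop, from the start stage to recovery."""
    path = [start]
    while path[-1] is not Stage.RECOVERY:
        path.append(next_stage(path[-1]))
    return [stage.value for stage in path]
```

Calling `next_stage(Stage.RECOVERY)` returns `Stage.ONSET`, which models the patient falling ill again and re-entering the hospital.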
An introduction to the virtual character's information
In the medical simulation process described above, disease generation is the key step. Specifically, a large language model combined with medical knowledge generates a complete medical record for each patient, including the type of disease, symptoms, duration, and various examination results (see the appendix of the paper for details). To keep the simulation as realistic as possible, the patient agent perceives only the symptoms of the disease, not the specific disease, while the doctor agent can gather information only by talking to the patient agent and ordering tests. The examinations the patient agent needs, the type of disease, and the severity of the disease serve as the three key tasks for evaluating how well the medical agent diagnoses and treats virtual patients.
Most traditional methods for training medical models rely on pre-training, fine-tuning, and similar techniques, so they need a large amount of medical data and some high-quality manually annotated data. However, the research team believes that human doctors do not rely on such massive data to improve their skills: they accumulate experience from clinical practice during diagnosis and treatment, and they also build up key knowledge by reading medical literature. Medical agents in virtual hospitals should be able to evolve their capabilities in a similar way.
Therefore, the team designed an agent self-evolution algorithm named "MedAgent-Zero", which, like AlphaGo Zero, does not rely on manually annotated data but improves capability through learning (i.e., reading medical literature) and practice (i.e., interacting with virtual patients and making diagnosis and treatment decisions) in the virtual hospital. On the one hand, medical agents independently accumulate experience on the three disease diagnosis and treatment tasks; on the other hand, they also learn autonomously, simulating the study of medical documents by working through medical questions generated by the LLM.
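The information asymmetry described above (the patient sees symptoms only, while the doctor must ask and order tests) can be sketched as follows. All class, field, and function names here are hypothetical, and grading the examination task by exact set match is an assumption made for illustration, not the paper's scoring rule.

```python
from dataclasses import dataclass, field


@dataclass
class MedicalRecord:
    """Ground-truth record; field names are illustrative, not from the paper."""
    disease: str
    severity: str
    symptoms: list[str]
    exam_results: dict[str, str] = field(default_factory=dict)


class PatientAgent:
    """Exposes symptoms only; the disease label stays hidden from callers."""

    def __init__(self, record: MedicalRecord) -> None:
        self._record = record

    def describe_symptoms(self) -> list[str]:
        return list(self._record.symptoms)

    def take_exam(self, exam_name: str) -> str:
        # The doctor learns a result only for an exam it actually orders.
        return self._record.exam_results.get(exam_name, "not available")


def score_doctor(record: MedicalRecord, ordered_exams: list[str],
                 diagnosis: str, severity: str) -> dict[str, bool]:
    """Grade the three evaluated outputs against the hidden record."""
    return {
        # Assumption: an exam choice is correct only if it matches exactly.
        "examination": set(ordered_exams) == set(record.exam_results),
        "diagnosis": diagnosis == record.disease,
        "severity": severity == record.severity,
    }
```

The doctor side never receives a `MedicalRecord` directly; only the evaluator does, which is what makes the three task scores meaningful.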
MedAgent-Zero policy flow diagram
As shown in the figure above, MedAgent-Zero evolves in two ways: 1) Summarizing experience from successful cases: for diagnosis and treatment questions it answers correctly, the agent accumulates a case database, as a human doctor would. 2) Reflecting on lessons from failures: when it answers incorrectly, the agent actively reflects on the mistake. If the resulting lesson helps the agent answer the question correctly, it is kept and stored in the experience pool.
Finally, the research team applies both kinds of accumulation during training on virtual data. In each inference step, the agent retrieves the most similar content from the two databases and adds it to the prompt for in-context learning; it then stores the medical record or summarizes a lesson depending on whether the answer was correct, continuously improving its ability.
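A minimal sketch of this retrieve, answer, and accumulate loop is given below. It is not the authors' implementation: `answer_fn` and `reflect_fn` are hypothetical stand-ins for LLM calls, word-overlap (Jaccard) similarity is swapped in for whatever retrieval the paper uses, and the prompt layout is invented for illustration.

```python
from typing import Callable


def similarity(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets (a stand-in for real retrieval)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def retrieve(query: str, store: list[str], k: int = 2) -> list[str]:
    """Return the k stored entries most similar to the query."""
    return sorted(store, key=lambda entry: similarity(query, entry), reverse=True)[:k]


def build_prompt(question: str, cases: list[str], lessons: list[str]) -> str:
    """Prepend retrieved cases and lessons for in-context learning."""
    parts = ["Past cases:", *retrieve(question, cases),
             "Lessons:", *retrieve(question, lessons),
             "Question:", question]
    return "\n".join(parts)


def training_step(question: str, truth: str,
                  answer_fn: Callable[[str], str],
                  reflect_fn: Callable[[str, str, str], str],
                  cases: list[str], lessons: list[str]) -> bool:
    """One practice round; returns whether the first answer was correct."""
    answer = answer_fn(build_prompt(question, cases, lessons))
    if answer == truth:
        cases.append(f"{question} -> {truth}")  # keep the successful case
        return True
    lesson = reflect_fn(question, answer, truth)  # reflect on the mistake
    # Assumption: the candidate lesson is validated on its own before storing.
    retry = answer_fn(build_prompt(question, cases, [lesson]))
    if retry == truth:
        lessons.append(lesson)  # keep only lessons that fix the answer
    return False
```

Because both stores are plain lists that grow over time, no model weights change; any improvement comes entirely from what is retrieved into the prompt.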
In the virtual hospital, the research team constructed the medical records of tens of thousands of virtual patients for the autonomous evolution experiments, covering 8 respiratory diseases such as influenza A, influenza B, and COVID-19, and involving more than 10 different medical examinations. Assuming a human doctor treats about 100 patients a week, it would take about two years to see 10,000 patients, whereas the agent doctors need only a few days.
The team evaluated the ability of medical agents in the virtual hospital from two main angles. The first is medical competence within the virtual environment: as shown in the figure below, during training (left), the accuracy of the medical agent on the three key tasks keeps rising as the number of patients diagnosed and treated increases, and then gradually stabilizes. On the 500 test medical records (right), the agent's accuracy fluctuated slightly as the number of training patients increased but showed an overall upward trend.
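The two-year estimate follows from simple arithmetic, sketched below. The patient count and weekly throughput come from the text; the 52 working weeks per year is an assumption added here.

```python
PATIENTS = 10_000          # virtual patients treated, from the text
PATIENTS_PER_WEEK = 100    # human doctor throughput assumed in the text
WEEKS_PER_YEAR = 52        # assumption: no allowance for leave or holidays

weeks_needed = PATIENTS / PATIENTS_PER_WEEK
years_needed = weeks_needed / WEEKS_PER_YEAR

print(f"{weeks_needed:.0f} weeks, about {years_needed:.1f} years")
# prints "100 weeks, about 1.9 years"
```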
The task accuracy of the medical agent on the training set (left) and the test set (right).
Subsequently, the research team compared the diagnostic accuracy of medical agents on various diseases before and after their evolution, and found that they were all greatly improved, verifying the effectiveness of their autonomous evolution.
Diagnostic performance on different diseases before and after agent evolution
Second, the team used the respiratory disease subset of the external dataset MedQA to evaluate the medical agent's ability in real-world medicine. Surprisingly, even though no manually annotated data was used during agent evolution, after treating nearly 10,000 patients the medical agent surpassed the current best method on the dataset and achieved the highest accuracy, 93.06%, which verifies the effectiveness of autonomous evolution in the simulated environment.
Accuracy of different methods on a subset of MedQA
In addition, the research team ran ablation experiments, and the results showed that both the examples accumulated from successes and the lessons learned from failures help improve the medical capabilities of the model.
Ablation study results for MedAgent-Zero
In summary, this work constructs the first virtual hospital scenario, Agent Hospital, and proposes MedAgent-Zero, a medical agent evolution algorithm that does not rely on manual data annotation. The experimental results on virtual and real data provide preliminary evidence that the simulation environment improves medical agent capabilities, and they suggest new approaches for applying artificial intelligence, especially large language models and agent technology, in smart medical scenarios. However, the work still has limitations, and the team will continue to expand the disease types covered, refine the level of detail of the simulation environment, and improve the selection and optimization of the base model.
About the corresponding authors
Ma Weizhi is an assistant researcher at the Intelligent Industry Research Institute (AIR) of Tsinghua University and was selected for the "Young Talent Lifting Project" of the China Association for Science and Technology. His research interests include intelligent information acquisition and intelligent medical care. Personal homepage: mawz12.github.io.
Liu Yang is Professor of GDS, Executive Dean of the Intelligent Industry Research Institute (AIR), Deputy Dean of the Department of Computer Science, Tsinghua University, and winner of the National Fund for Distinguished Young Scholars. His research interests include artificial intelligence, natural language processing, and smart medicine. Personal homepage: nlp.csai.tsinghua.edu.cn/~ly.