Paper Based on the Lingxi Dataset Accepted at Top-Tier Conference AAAI 2025

Author: TICC Editorial Team
Published: December 10, 2024


MDD-5k: A New Diagnostic Conversation Dataset for Mental Disorders Synthesized via Neuro-Symbolic LLM Agents



The clinical diagnosis of most mental disorders primarily relies on the conversations between psychiatrist and patient. The creation of such diagnostic conversation datasets is promising to boost the AI mental healthcare community. However, directly collecting the conversations in real diagnosis scenarios is near impossible due to stringent privacy and ethical considerations. To address this issue, we seek to synthesize diagnostic conversation by exploiting anonymous patient cases that are easier to access. Specifically, we design a neuro-symbolic multi-agent framework for synthesizing the diagnostic conversation of mental disorders with large language models. It takes patient case as input and is capable of generating multiple diverse conversations with one single patient case. The framework basically involves the interaction between a doctor agent and a patient agent, and achieves text generation under symbolic control via a dynamic diagnosis tree from a tool agent. By applying the proposed framework, we develop the largest Chinese mental disorders diagnosis dataset MDD-5k, which is built upon 1000 cleaned real patient cases by cooperating with a pioneering psychiatric hospital, and contains 5000 high-quality long conversations with diagnosis results as labels. To the best of our knowledge, it's also the first labelled Chinese mental disorders diagnosis dataset. Human evaluation demonstrates the proposed MDD-5k dataset successfully simulates human-like diagnostic process of mental disorders.



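The research paper "MDD-5k: A New Diagnostic Conversation Dataset for Mental Disorders Synthesized via Neuro-Symbolic LLM Agents", co-authored by Yin Congchi (殷聪驰), Li Feng (李峰), Zhang Shu (张澍), Shao Jun (邵骏), and Jiang Xun (姜迅) of Shanda Theta (盛大 Theta) together with Chen Jianhua (陈剑华), a researcher at the TCCI Frontier Lab for AI and Mental Health, has been accepted to the 39th AAAI Conference on Artificial Intelligence (AAAI-25). AAAI is one of the world's top conferences in artificial intelligence; AAAI-25 will be held in Philadelphia, USA, from February 25 to March 4, 2025.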

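The paper proposes a multi-agent framework based on large language models. Starting from anonymized case records of patients with mental disorders, it builds a dynamic diagnosis tree to simulate how a human doctor conducts a diagnosis, and synthesizes high-quality psychiatric diagnostic conversations. This addresses the privacy and ethical obstacles to collecting data in real diagnostic settings. We developed MDD-5k, the world's largest Chinese mental disorder diagnosis dataset, containing 5,000 multi-turn conversations between doctors and patients. This work provides new tools and a practical foundation for AI research and applications in mental health.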
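
For readers who want a concrete picture of the doctor agent, patient agent, and tool agent interaction described in the abstract, the following is a minimal illustrative sketch. It is not the authors' implementation: the class `TopicNode`, the functions `next_topics`, `doctor_agent`, `patient_agent`, and `synthesize`, the keyword-matching rule, and the toy patient case are all hypothetical, and the two agent functions are simple stand-ins for what would be LLM calls in the real framework.

```python
from dataclasses import dataclass, field


@dataclass
class TopicNode:
    """One node of a hypothetical diagnosis tree: a topic the doctor should cover."""
    topic: str
    # Follow-up nodes, keyed by a keyword that must appear in the patient's reply.
    follow_ups: dict[str, "TopicNode"] = field(default_factory=dict)


def next_topics(node: TopicNode, patient_reply: str) -> list[TopicNode]:
    """Tool-agent step (assumed rule): select follow-ups triggered by the reply."""
    reply = patient_reply.lower()
    return [child for key, child in node.follow_ups.items() if key in reply]


def doctor_agent(topic: str) -> str:
    """Stand-in for an LLM call that phrases a question about the topic."""
    return f"Could you tell me about your {topic}?"


def patient_agent(topic: str, case: dict[str, str]) -> str:
    """Stand-in for an LLM call that answers from the patient case."""
    return case.get(topic, "Nothing unusual.")


def synthesize(roots: list[TopicNode], case: dict[str, str]) -> list[tuple[str, str]]:
    """Generate one doctor-patient dialogue under control of the topic tree."""
    dialogue: list[tuple[str, str]] = []
    pending = list(roots)
    while pending:
        node = pending.pop(0)
        answer = patient_agent(node.topic, case)
        dialogue.append(("doctor", doctor_agent(node.topic)))
        dialogue.append(("patient", answer))
        # The tree is "dynamic": the patient's reply decides which
        # follow-up topics are asked next, before the remaining topics.
        pending = next_topics(node, answer) + pending
    return dialogue


# Toy patient case (invented for illustration only).
sleep = TopicNode("sleep", {"insomnia": TopicNode("insomnia duration")})
mood = TopicNode("mood")
case = {
    "sleep": "I have had insomnia lately.",
    "insomnia duration": "About three weeks.",
    "mood": "I feel low most days.",
}
dialogue = synthesize([sleep, mood], case)
for speaker, text in dialogue:
    print(f"{speaker}: {text}")
```

In this toy run, the mention of insomnia in the patient's reply triggers a follow-up question before the doctor moves on to mood. That is the sense in which a symbolic structure can steer otherwise free-form generated text.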
