[Medical Paper] Building a Machine Learning Model to Predict Severe Upper Gastrointestinal Bleeding

Hello everyone! I'm Sato, a surgeon who studies machine learning day by day.

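This time, let's work through a paper published in Gastroenterology in 2020. It builds a machine learning algorithm to predict severe upper gastrointestinal bleeding, defined as bleeding that requires a therapeutic intervention or results in death within 30 days.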

Here is the paper I'm introducing this time.

You can find it by searching PubMed.

The paper

Paper: Validation of a Machine Learning Model That Outperforms Clinical Risk Scoring Systems for Upper Gastrointestinal Bleeding

Authors: Dennis L. Shung et al.

Journal: Gastroenterology 2020;158:160–167

Building a machine learning model to predict severe upper gastrointestinal bleeding

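＜Background＞

The authors used machine learning to develop a model that calculates the risk of hospital-based intervention or death in patients with upper gastrointestinal bleeding (UGIB), and compared its performance with other scoring systems.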

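＜Methods＞

The authors analyzed data collected from unselected patients who presented with UGIB at medical centers in 4 countries (the United States, Scotland, England, and Denmark) from March 2014 through March 2015. Using these data, they derived and internally validated a gradient-boosting machine learning model to identify patients who met a composite endpoint of hospital-based intervention (transfusion or hemostatic intervention) or death within 30 days. They compared the performance of the machine learning model with validated pre-endoscopic clinical risk scoring systems (the Glasgow-Blatchford score [GBS], the admission Rockall score, and AIMS65). The model was externally validated using data from 2 Asia-Pacific sites (Singapore and New Zealand, n = 399). Performance was measured by AUC.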

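＜Results＞

In the internal validation set, the machine learning model identified patients who met the endpoint with an AUC of 0.91. The clinical scoring systems identified them with AUC values of 0.88 for the GBS (P = .001), 0.73 for the Rockall score (P < .001), and 0.78 for AIMS65 (P < .001). In the external validation cohort, the AUC was 0.90 for the machine learning model, 0.87 for the GBS (P = .004), 0.66 for the Rockall score (P < .001), and 0.64 for AIMS65 (P < .001). At the cutoff scores where the machine learning model and the GBS each identified patients who met the endpoint with 100% sensitivity, specificity was 26% for the machine learning model and 12% for the GBS (P < .001).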

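＜Conclusion＞

The authors developed a machine learning model that identifies UGIB patients who meet the endpoint of hospital-based intervention or death within 30 days with a higher AUC and higher specificity (at 100% sensitivity) than validated clinical risk scoring systems. This model could increase the identification of low-risk patients who can be safely discharged from the emergency department and managed as outpatients.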

＜Background＞

Scoring systems are suboptimal for determining risk in patients with upper gastrointestinal bleeding (UGIB); these might be improved by a machine learning model. We used machine learning to develop a model to calculate the risk of hospital-based intervention or death in patients with UGIB and compared its performance with other scoring systems.

＜Methods＞

We analyzed data collected from consecutive unselected patients with UGIB from medical centers in 4 countries (the United States, Scotland, England, and Denmark; n = 1958) from March 2014 through March 2015. We used the data to derive and internally validate a gradient-boosting machine learning model to identify patients who met a composite endpoint of hospital-based intervention (transfusion or hemostatic intervention) or death within 30 days. We compared the performance of the machine learning prediction model with validated pre-endoscopic clinical risk scoring systems (the Glasgow-Blatchford score [GBS], admission Rockall score, and AIMS65). We externally validated the machine learning model using data from 2 Asia-Pacific sites (Singapore and New Zealand; n = 399). Performance was measured by area under receiver operating characteristic curve (AUC) analysis.
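Since everything in this paper is compared by AUC, it may help to see what that number means. Below is a minimal sketch in plain Python, not the authors' code: it computes AUC as the probability that a randomly chosen patient who met the endpoint gets a higher score than a randomly chosen patient who did not. The function name `auc` and all of the toy values are made up for illustration and have nothing to do with the study data.

```python
def auc(scores, labels):
    """AUC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive case has the higher score. Ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the paper): 1 = met the composite endpoint.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
model_risk = [0.9, 0.8, 0.4, 0.3, 0.2, 0.5, 0.1, 0.7]  # made-up model output
point_score = [12, 7, 2, 3, 1, 6, 0, 5]                # made-up bedside score

auc_model = auc(model_risk, labels)
auc_score = auc(point_score, labels)
print(f"model AUC = {auc_model:.4f}, score AUC = {auc_score:.4f}")
```

The pairwise loop is quadratic in the number of patients, which is fine for a toy example; for real cohorts a rank-based implementation from a statistics library would be the usual choice.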

＜Results＞

The machine learning model identified patients who met the composite endpoint with an AUC of 0.91 in the internal validation set; the clinical scoring systems identified patients who met the composite endpoint with AUC values of 0.88 for the GBS (P = .001), 0.73 for Rockall score (P < .001), and 0.78 for AIMS65 score (P < .001). In the external validation cohort, the machine learning model identified patients who met the composite endpoint with an AUC of 0.90, the GBS with an AUC of 0.87 (P = .004), the Rockall score with an AUC of 0.66 (P < .001), and the AIMS65 with an AUC of 0.64 (P < .001). At cutoff scores at which the machine learning model and GBS identified patients who met the composite endpoint with 100% sensitivity, the specificity values were 26% with the machine learning model versus 12% with GBS (P < .001).
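The "specificity at 100% sensitivity" comparison (26% versus 12% in the paper) is the clinically interesting part, so here is a rough sketch of how such a number can be derived. This is my own illustration in plain Python, not the authors' procedure: it assumes that higher scores mean higher risk, sets the cutoff at the lowest score seen among patients who met the endpoint (so none of them is missed), and then counts how many endpoint-free patients fall below that cutoff. The function name and the toy numbers are hypothetical.

```python
def specificity_at_full_sensitivity(scores, labels):
    """Return (cutoff, specificity) when everyone with score >= cutoff is
    flagged as high risk and the cutoff is chosen so that no positive case
    is missed (sensitivity = 100%). Assumes higher score = higher risk."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    cutoff = min(pos)  # the lowest-scoring positive must still be flagged
    true_negatives = sum(1 for s in neg if s < cutoff)
    return cutoff, true_negatives / len(neg)


# Hypothetical toy data (NOT from the paper): 1 = met the composite endpoint.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
model_risk = [0.9, 0.6, 0.35, 0.5, 0.3, 0.2, 0.1, 0.05]  # made-up model output
point_score = [10, 6, 2, 5, 3, 1, 1, 0]                  # made-up bedside score

model_cutoff, model_spec = specificity_at_full_sensitivity(model_risk, labels)
score_cutoff, score_spec = specificity_at_full_sensitivity(point_score, labels)
print(f"model: cutoff {model_cutoff}, specificity {model_spec:.2f}")
print(f"score: cutoff {score_cutoff}, specificity {score_spec:.2f}")
```

In this toy setting, the patients below the cutoff are the ones who could be considered for discharge without missing anyone who meets the endpoint, which is why a higher specificity at this operating point translates into more low-risk patients identified.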

＜Conclusion＞

We developed a machine learning model that identifies patients with UGIB who met a composite endpoint of hospital-based intervention or death within 30 days with a greater AUC and higher levels of specificity, at 100% sensitivity, than validated clinical risk scoring systems. This model could increase identification of low-risk patients who can be safely discharged from the emergency department for outpatient management.