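The American College of Surgeons has released an automated calculator for assessing surgical risk.
The ACS NSQIP risk calculator is a risk assessment tool recommended by the National Surgical Quality Improvement Program (NSQIP), which is run by the American College of Surgeons (ACS).
The tool is available here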
The risk calculator was built using data collected from over 5.0 million operations from 874 hospitals participating in ACS NSQIP from 2016-20. Entering the most complete and accurate patient information will provide the most precise risk information. However, the estimates can still be calculated if some of the patient information is unknown. Detailed information regarding ACS NSQIP methodology and risk calculator modeling can be found here:
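In other words, the risk estimates are built on a dataset of more than 5 million operations from 874 hospitals.
In this post I read through the papers cited as the basis for the calculation.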
1 Bilimoria KY, Liu Y, Paruch JL, et al. Development and evaluation of the universal ACS NSQIP surgical risk calculator: a decision aid and informed consent tool for patients and surgeons. J Am Coll Surg 2013; 217: 833–42.e1–3.
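This paper compares the universal risk calculator with the earlier procedure-specific risk calculators.
The comparison uses data on about 1.41 million patients from 393 hospitals.
The authors also built in a way to reflect the surgeon's own judgment.
Judged by the C-statistic and the Brier score, the universal calculator estimates risk about as accurately as the procedure-specific calculator (colorectal).
C-statistic: ranges from 0.5 (random concordance) to 1 (perfect concordance); values above 0.7 are generally considered good.
Brier score: ranges from 0 to 1; the closer to 0, the more accurate the prediction.
This result is the basis for recommending the universal risk calculator over the earlier procedure-specific ones.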
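To make the two metrics concrete, here is a minimal Python sketch that computes both by hand. The outcome labels and predicted risks are made-up illustration values, not NSQIP data, and the function names `c_statistic` and `brier_score` are my own.

```python
from itertools import product


def c_statistic(y_true, y_prob):
    """Share of (event, non-event) pairs in which the event case received
    the higher predicted risk; tied predictions count as half."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for p_event, p_non_event in product(events, non_events):
        if p_event > p_non_event:
            score += 1.0
        elif p_event == p_non_event:
            score += 0.5
    return score / (len(events) * len(non_events))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical patients: 1 = complication occurred, 0 = no complication.
y_true = [0, 0, 1, 0, 1, 0, 0, 1]
y_prob = [0.05, 0.10, 0.60, 0.20, 0.15, 0.02, 0.30, 0.70]

c_value = c_statistic(y_true, y_prob)
brier_value = brier_score(y_true, y_prob)
print(f"C-statistic: {c_value:.3f}")
print(f"Brier score: {brier_value:.3f}")
```

The C-statistic only looks at how patients are ranked, while the Brier score also penalizes predicted probabilities that are far from what actually happened, which is why papers usually report both.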
Note: for other surgical sites, there are also papers that dispute this.
Background: Accurately estimating surgical risks is critical for shared decision making and informed consent. The Centers for Medicare and Medicaid Services may soon put forth a measure requiring surgeons to provide patients with patient-specific, empirically derived estimates of postoperative complications. Our objectives were to develop a universal surgical risk estimation tool, to compare performance of the universal vs previous procedure-specific surgical risk calculators, and to allow surgeons to empirically adjust the estimates of risk.
Study design: Using standardized clinical data from 393 ACS NSQIP hospitals, a web-based tool was developed to allow surgeons to easily enter 21 preoperative factors (demographics, comorbidities, procedure). Regression models were developed to predict 8 outcomes based on the preoperative risk factors. The universal model was compared with procedure-specific models. To incorporate surgeon input, a subjective surgeon adjustment score, allowing risk estimates to vary within the estimate’s confidence interval, was introduced and tested with 80 surgeons using 10 case scenarios.
Results: Based on 1,414,006 patients encompassing 1,557 unique CPT codes, a universal surgical risk calculator model was developed that had excellent performance for mortality (c-statistic = 0.944; Brier score = 0.011 [where scores approaching 0 are better]), morbidity (c-statistic = 0.816, Brier score = 0.069), and 6 additional complications (c-statistics > 0.8). Predictions were similarly robust for the universal calculator vs procedure-specific calculators (eg, colorectal). Surgeons demonstrated considerable agreement on the case scenario scoring (80% to 100% agreement), suggesting reliable score assignment between surgeons.
Conclusions: The ACS NSQIP surgical risk calculator is a decision-support tool based on reliable multi-institutional clinical data, which can be used to estimate the risks of most operations. The ACS NSQIP surgical risk calculator will allow clinicians and patients to make decisions using empirically derived, patient-specific postoperative risks.
1 Cohen ME, Liu Y, Ko CY, Hall BL. An Examination of American College of Surgeons NSQIP Surgical Risk Calculator Accuracy. J Am Coll Surg 2017; 224: 787–95.e1.
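This is a rebuttal to papers that point out inaccuracies in the ACS risk calculator.
It concludes that, once sample size, number of institutions, and case-mix homogeneity are taken into account, the reported failures do not undermine the ACS risk calculator as a general-purpose tool.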
BACKGROUND:
The American College of Surgeons NSQIP offers a Surgical Risk Calculator (SRC) that provides detailed, patient-level, risk assessments for many adverse outcomes to surgeons, patients, and the general public. The SRC calculator was designed to help guide discussion and decisions by providing generally applicable (not hospital-specific) information about surgical risk using easily understood and broadly available preoperative variables. Although large, internal evaluations have shown that the SRC has good accuracy (model discrimination and calibration), external validations have been inconsistent and tend to favor a conclusion of inadequate performance.
STUDY DESIGN:
External studies, attempting to validate the SRC, were examined with respect to 3 design features: sample size (small samples reduce reliability), case-mix homogeneity (homogeneity reduces discrimination); and number of institutions providing data (few institutions reduces generalizability). The impact of each feature was then examined in several sets of simulation studies.
RESULTS:
Each of the 3 design features has the potential to act as an artifactual cause for apparent SRC predictive failure. In addition, demonstrations that SRC estimates are inferior to those from models that use additional (sometimes operation-specific) predictor variables were seen as not relevant with respect to the SRC’s intended scope.
CONCLUSIONS:
The SRC predictive failures, reported by studies with the described design limitations, should not be misunderstood as disqualifying the SRC as an accurate and appropriate tool for its intended purpose of providing a general purpose risk calculator, applicable across many surgical domains, using easily understood and generally available predictive information.
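The argument about study design can be illustrated with a small simulation. This is only a sketch under my own assumptions, not the simulation the authors ran: a single hypothetical risk factor `x`, a made-up logistic risk model, and a "calculator" that predicts the true risk exactly. Even with such an ideal model, the C-statistic drops when the case mix is homogeneous and swings widely when samples are small.

```python
import math
import random


def c_statistic(y_true, y_prob):
    """Rank-based C-statistic (AUROC); tied predictions share an average rank."""
    pairs = sorted(zip(y_prob, y_true))
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event and one non-event")
    rank_sum, i = 0.0, 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum += avg_rank * sum(y for _, y in pairs[i:j])
        i = j
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def simulate(n, rng):
    """Hypothetical cohort: x is one risk factor, and the predicted risk
    equals the true risk, so the model is perfectly calibrated."""
    xs, risks, ys = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        risk = 1.0 / (1.0 + math.exp(-(-3.0 + 1.5 * x)))  # assumed coefficients
        xs.append(x)
        risks.append(risk)
        ys.append(1 if rng.random() < risk else 0)
    return xs, risks, ys


rng = random.Random(2024)
xs, risks, ys = simulate(20000, rng)

# Broad case mix: the whole simulated cohort.
auc_broad = c_statistic(ys, risks)

# Homogeneous case mix: keep only patients with very similar risk factors.
keep = [i for i, x in enumerate(xs) if abs(x) < 0.3]
auc_homogeneous = c_statistic([ys[i] for i in keep], [risks[i] for i in keep])

# Small samples: the same model evaluated on 20 independent cohorts of 300.
small_aucs = []
for _ in range(20):
    _, r, y = simulate(300, rng)
    small_aucs.append(c_statistic(y, r))

print(f"broad case mix:       C = {auc_broad:.3f}")
print(f"homogeneous case mix: C = {auc_homogeneous:.3f}")
print(f"n = 300 cohorts:      C from {min(small_aucs):.3f} to {max(small_aucs):.3f}")
```

The model is identical in all three settings; only the evaluation cohort changes. That is the sense in which a poor C-statistic in a small, single-specialty validation study can be an artifact of the study design rather than a failure of the calculator.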
1 Liu Y, Ko CY, Hall BL, Cohen ME. American College of Surgeons NSQIP Risk Calculator Accuracy Using a Machine Learning Algorithm Compared with Regression. J Am Coll Surg 2023; 236: 1024–30.
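This paper asks whether a machine learning model can improve the accuracy of the ACS risk calculator.
It tests whether XGBoost (an algorithm based on gradient-boosted decision trees) can improve on the existing regression model, and reports that it does.
Because this paper is cited as the basis for the calculation, the calculator presumably now uses an XGBoost model.
BACKGROUND: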
The American College of Surgeons NSQIP risk calculator (RC) uses regression to make predictions for fourteen 30-day surgical outcomes. While this approach provides accurate (discrimination and calibration) risk estimates, they might be improved by machine learning (ML). To investigate this possibility, accuracy for regression-based risk estimates were compared to estimates from an extreme gradient boosting (XGB)-ML algorithm.
STUDY DESIGN:
A cohort of 5,020,713 NSQIP patient records was randomly divided into 80% for model construction and 20% for validation. Risk predictions using regression and XGB-ML were made for 13 RC binary 30-day surgical complications and one continuous outcome (length of stay [LOS]). For the binary outcomes, discrimination was evaluated using the area under the receiver operating characteristic curve (AUROC) and area under the precision recall curve (AUPRC), and calibration was evaluated using Hosmer–Lemeshow statistics. Mean squared error and a calibration curve analog were evaluated for the continuous LOS outcome.
RESULTS:
For every binary outcome, discrimination (AUROC and AUPRC) was slightly greater for XGB-ML than for regression (mean [across the outcomes] AUROC was 0.8299 vs 0.8251, and mean AUPRC was 0.1558 vs 0.1476, for XGB-ML and regression, respectively). For each outcome, miscalibration was greater (larger Hosmer–Lemeshow values) with regression; there was statistically significant miscalibration for all regression-based estimates, but only for 4 of 13 when XGB-ML was used. For LOS, mean squared error was lower for XGB-ML.
CONCLUSIONS:
XGB-ML provided more accurate risk estimates than regression in terms of discrimination and calibration. Differences in calibration between regression and XGB-ML were of substantial magnitude and support transitioning the RC to XGB-ML.
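Calibration in this abstract is measured with the Hosmer–Lemeshow statistic, so here is a minimal sketch of the usual grouped version of it. I am assuming the common 10-group form; the abstract does not say which variant the authors used, and the data below are simulated with made-up coefficients. The function name `hosmer_lemeshow` is my own.

```python
import math
import random


def hosmer_lemeshow(y_true, y_prob, n_groups=10):
    """Hosmer-Lemeshow statistic over equal-sized groups of sorted predictions.
    Larger values mean observed and expected event counts disagree more."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue
        n_g = len(group)
        observed = sum(y for _, y in group)   # events that actually occurred
        expected = sum(p for p, _ in group)   # events the model predicted
        mean_p = expected / n_g
        variance = n_g * mean_p * (1.0 - mean_p)
        if variance > 0:
            stat += (observed - expected) ** 2 / variance
    return stat


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


rng = random.Random(7)
y, p_calibrated, p_overestimate = [], [], []
for _ in range(20000):
    z = -3.0 + 1.5 * rng.gauss(0.0, 1.0)      # assumed true log-odds of the event
    risk = logistic(z)
    y.append(1 if rng.random() < risk else 0)
    p_calibrated.append(risk)                 # predictions equal to the true risk
    p_overestimate.append(logistic(z + 0.7))  # same ranking, inflated risk

hl_calibrated = hosmer_lemeshow(y, p_calibrated)
hl_overestimate = hosmer_lemeshow(y, p_overestimate)
print(f"Hosmer-Lemeshow, calibrated predictions:    {hl_calibrated:.1f}")
print(f"Hosmer-Lemeshow, overestimated predictions: {hl_overestimate:.1f}")
```

The inflated predictions keep every patient in the same rank order, so their AUROC is identical to that of the calibrated predictions, yet the Hosmer–Lemeshow value is far larger. This shows how two models with nearly the same discrimination can still differ substantially in calibration.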
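I read through the papers cited as the basis for the calculator, but the only thing they say about the calculation itself is the move from a regression model to an XGBoost model.
Has that actually been adopted? I could not find the details.
Other related papers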
Mansmann, U., Rieger, A. K., Strahwald, B., & Crispin, A. (2016). Risk calculators—methods, development, implementation, and validation. International Journal of Colorectal Disease, 31, 1111-1116. DOI: 10.1007/s00384-016-2589-3
El Asmar, A., Hafez, K., Fauconnier, P., Moreau, M., Dal Lago, L., Pepersack, T., Donckier, V., & Liberale, G. (2022). The efficacy of the American College of Surgeons Surgical Risk Calculator in the prediction of postoperative complications in oncogeriatric patients after curative surgery for abdominal tumors. Journal of Surgical Oncology, 126, 1359-1366. DOI: 10.1002/jso.27046
Slump, J., Ferguson, P., Wunder, J., Griffin, A., Hoekstra, H., Bagher, S., Zhong, T., Hofer, S., & O’Neill, A. (2016). Can the ACS‐NSQIP surgical risk calculator predict post‐operative complications in patients undergoing flap reconstruction following soft tissue sarcoma resection? Journal of Surgical Oncology, 114. DOI: 10.1002/jso.24357
Vaziri, S., Wilson, J., Abbatematteo, J. M., Kubilis, P., Chakraborty, S., Kshitij, K., & Hoh, D. (2017). Predictive performance of the American College of Surgeons universal risk calculator in neurosurgical patients. Journal of Neurosurgery, 128(3), 942-947. DOI: 10.3171/2016.11.JNS161377
Huang, Z., Dong, W., Duan, H., & Liu, J. (2018). A Regularized Deep Learning Approach for Clinical Risk Prediction of Acute Coronary Syndrome Using Electronic Health Records. IEEE Transactions on Biomedical Engineering, 65, 956-968. DOI: 10.1109/TBME.2017.2731158
Cohen, M. E., Bilimoria, K., Ko, C., & Hall, B. (2009). Development of an American College of Surgeons National Surgery Quality Improvement Program: morbidity and mortality risk calculator for colorectal surgery. Journal of the American College of Surgeons, 208(6), 1009-16. DOI: 10.1016/j.jamcollsurg.2009.01.043
Raymond, B., Wanderer, J., Hawkins, A., Geiger, T., Ehrenfeld, J. M., Stokes, J., & McEvoy, M. (2019). Use of the American College of Surgeons National Surgical Quality Improvement Program Surgical Risk Calculator During Preoperative Risk Discussion: The Patient Perspective. Anesthesia & Analgesia, 128, 643–650. DOI: 10.1213/ANE.0000000000003718