
Standardized Assessment of Automatic Segmentation of White Matter Hyperintensities and Results of the WMH Segmentation Challenge

Kuijf, Hugo J.; Biesbroek, J. Matthijs; de Bresser, Jeroen; Heinen, Rutger; Andermatt, Simon; Bento, Mariana; Berseth, Matt; Belyaev, Mikhail; Cardoso, M. Jorge; Casamitjana, Adria; Collins, D. Louis; Dadar, Mahsa; Georgiou, Achilleas; Ghafoorian, Mohsen; Jin, Dakai; Khademi, April; Knight, Jesse; Li, Hongwei; Llado, Xavier; Luna, Miguel; Mahmood, Qaiser; McKinley, Richard; Mehrtash, Alireza; Ourselin, Sebastien; Park, Bo-Yong; Park, Hyunjin; Park, Sang Hyun; Pezold, Simon; Puybareau, Elodie; Rittner, Leticia; Sudre, Carole H.; Valverde, Sergi; Vilaplana, Veronica; Wiest, Roland; Xu, Yongchao; Xu, Ziyue; Zeng, Guodong; Zhang, Jianguo; Zheng, Guoyan; Chen, Christopher; van der Flier, Wiesje; Barkhof, Frederik; Viergever, Max A.; Biessels, Geert Jan
DGIST Authors
Park, Sang Hyun
Author Keywords
Image segmentation; Three-dimensional displays; Manuals; White matter; Biomedical imaging; Radiology; Magnetic resonance imaging (MRI); brain; evaluation and performance; segmentation
Quantification of cerebral white matter hyperintensities (WMH) of presumed vascular origin is of key importance in many neurological research studies. Currently, measurements are often still obtained from manual segmentations on brain MR images, which is a laborious procedure. Automatic WMH segmentation methods exist, but a standardized comparison of their performance is lacking. We organized a scientific challenge, the WMH Segmentation Challenge, in which developers could evaluate their methods on a standardized multi-center/-scanner image dataset, giving an objective comparison. Sixty T1 + FLAIR images from three MR scanners were released with manual WMH segmentations for training. A test set of 110 images from five MR scanners was used for evaluation. The segmentation methods had to be containerized and submitted to the challenge organizers. Five evaluation metrics were used to rank the methods: 1) Dice similarity coefficient; 2) modified Hausdorff distance (95th percentile); 3) absolute log-transformed volume difference; 4) sensitivity for detecting individual lesions; and 5) F1-score for individual lesions. In addition, the methods were ranked on their inter-scanner robustness. Twenty participants submitted their methods for evaluation. This paper provides a detailed analysis of the results. In brief, there is a cluster of four methods that rank significantly better than the other methods, with one clear winner. The inter-scanner robustness ranking shows that not all the methods generalize to unseen scanners. The challenge remains open for future submissions and provides a public platform for method evaluation. © 2019 IEEE.
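As an illustration only, four of the five metrics named in the abstract can be sketched in plain Python on masks stored as sets of (x, y, z) voxel coordinates. This is not the challenge's evaluation code. The function names, the use of 6-connectivity to separate lesions, the rule that a lesion counts as detected when it overlaps the other mask by at least one voxel, and the formula |log(V_seg) - log(V_ref)| for the volume metric are all assumptions made for this sketch. The modified Hausdorff distance is left out to keep the example short.

```python
import math
from collections import deque

# Assumption: lesions are separated using 6-connectivity (face neighbours).
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def dice(ref, seg):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    if not ref and not seg:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * len(ref & seg) / (len(ref) + len(seg))


def abs_log_volume_difference(ref, seg, voxel_volume=1.0):
    """Assumed form |log(V_seg) - log(V_ref)|; both masks must be non-empty."""
    v_ref = len(ref) * voxel_volume
    v_seg = len(seg) * voxel_volume
    return abs(math.log(v_seg) - math.log(v_ref))


def connected_components(voxels):
    """Split a set of voxels into connected lesions with a breadth-first search."""
    remaining = set(voxels)
    lesions = []
    while remaining:
        seed = remaining.pop()
        lesion = {seed}
        queue = deque([seed])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOURS:
                neighbour = (x + dx, y + dy, z + dz)
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    lesion.add(neighbour)
                    queue.append(neighbour)
        lesions.append(lesion)
    return lesions


def lesion_sensitivity_and_f1(ref, seg):
    """Lesion-wise sensitivity (recall) and F1-score.

    Assumption: a lesion is a hit if it shares at least one voxel
    with the other mask.
    """
    ref_lesions = connected_components(ref)
    seg_lesions = connected_components(seg)
    detected = sum(1 for lesion in ref_lesions if lesion & seg)
    correct = sum(1 for lesion in seg_lesions if lesion & ref)
    recall = detected / len(ref_lesions) if ref_lesions else 0.0
    precision = correct / len(seg_lesions) if seg_lesions else 0.0
    total = precision + recall
    f1 = 2.0 * precision * recall / total if total > 0 else 0.0
    return recall, f1


# Toy masks: the reference has two lesions, the automatic result finds one.
reference = {(0, 0, 0), (1, 0, 0), (5, 5, 5)}
automatic = {(1, 0, 0), (2, 0, 0)}

print(dice(reference, automatic))
print(abs_log_volume_difference(reference, automatic))
print(lesion_sensitivity_and_f1(reference, automatic))
```

In the toy example the automatic mask overlaps one of the two reference lesions, so lesion-wise sensitivity is 0.5 even though the automatic mask's only lesion is a true positive. This shows why the voxel-wise and lesion-wise metrics can disagree.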
Institute of Electrical and Electronics Engineers
Related Researcher
  • Park, Sang Hyun, Department of Robotics and Mechatronics Engineering
  • Research Interests: Computer vision; artificial intelligence; medical image processing

Appears in Collections:
Department of Robotics and Mechatronics Engineering > Medical Image & Signal Processing Lab > 1. Journal Articles


