May 31, 2021
The US FDA is hosting Mohamed Amgad to discuss our work on developing annotation datasets for building and validating computational pathology models. Mohamed's work engaged medical students, residents, fellows, and expert pathologists to collaboratively mark up images and build large, high-quality annotation datasets that are publicly available to the pathology community. High-resolution mapping of cells and tissue structures provides a foundation for developing interpretable computational pathology models. Deep learning algorithms can provide accurate mappings given large numbers of labeled instances for training and validation. However, generating an adequate volume of high-quality labels has emerged as a critical barrier in computational pathology, given the time and effort required from pathologists. In his paper, Mohamed describes an approach for engaging crowds of medical students and pathologists that was used to produce a dataset of over 220,000 annotations of cell nuclei in breast cancers. The paper also shows how suggestions generated by a weak algorithm can improve the quality of annotations produced by non-experts and can yield useful data for training segmentation algorithms without laborious manual tracing. See the paper or the dataset for more details.