On advancing healthcare with the wisdom of the masses

by Lena Maier-Hein

One of the major bottlenecks in applying machine learning algorithms to computer-assisted interventions (CAI) is the lack of training data. Typically, medical experts do not have sufficient resources to generate large amounts of annotated images that fully capture the variance of clinical practice. Similar problems arise when generating ground truth for validating new algorithms. A dead end?

Google’s recent answer to the challenge of producing large-scale image annotations was crowdsourcing, a new trend that involves outsourcing cognitive tasks to anonymous workers from an online community. By providing a computer game (“ESP game”) that asks users to find the same terms for an image as another anonymous player somewhere in the world, Google was able to significantly improve the Google image search. Since then, crowdsourcing has been applied in various areas with spectacular success. One of the most prominent examples in the context of biomedical applications is the “Foldit” game, which was designed for resolving the 3D structures of proteins. After the game had been online for only a few weeks, the crowd discovered the structure of an enzyme that is important in the context of AIDS research and had been the subject of research for 15 years.

How do such success stories apply to us? Could crowdsourcing be the solution to our data annotation problem in CAI? A first answer was presented at MICCAI 2014 by researchers from the German Cancer Research Center, the University of Heidelberg and Karlsruhe Institute of Technology in the scope of the Collaborative Research Center “Cognition-guided surgery”. Pilot studies on medical instrument segmentation and correspondence search in video data have shown that the quality of crowd annotations can compete with that of experts, while the crowd is orders of magnitude faster (and cheaper)!

Still, open research questions remain to be addressed, such as:
How can we optimize the task design?
How can we motivate the crowd to perform best?
Which crowdsourcing platform is optimally suited?
How can we best combine annotations of multiple users?
Which tasks require complementary expert knowledge?

We are just getting started…