In liquid biopsy, detecting circulating tumor cells (CTCs) and distinguishing them from non-CTCs in blood samples from patients with metastatic cancer remains challenging. The current gold standard often involves tedious manual examination of extensive image galleries. While machine learning (ML) offers potential for automation, human expertise remains essential, particularly when ML systems are uncertain or make incorrect predictions because labeled data are limited. We propose a human-in-the-loop approach with a targeted sampling strategy that combines self-supervised deep learning with an easily adaptable conventional ML classifier. By directing human effort toward labeling a limited set of new training samples from high-uncertainty clusters in the latent space, we iteratively reduce the system's uncertainty and improve classification performance, saving time compared to naive sampling approaches. On data from metastatic breast cancer patients, we show the feasibility of our approach and achieve better performance while reducing expert evaluation time compared to the gold standard, the FDA-approved CellSearch system.
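The targeted sampling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each image already has a cluster index (from any clustering of the self-supervised latent space) and a predicted CTC probability (from any classifier), and it uses mean binary entropy per cluster as the uncertainty score, which is one plausible choice among several. The names `binary_entropy` and `select_for_labeling` and all parameters are hypothetical.

```python
import math
import random
from collections import defaultdict


def binary_entropy(p):
    """Shannon entropy (in bits) of a predicted CTC probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def select_for_labeling(cluster_ids, ctc_probs, n_clusters=1, per_cluster=2, seed=0):
    """Pick samples for expert labeling from the most uncertain clusters.

    cluster_ids: cluster index of each sample in the latent space
    ctc_probs:   predicted CTC probability of each sample
    Returns (clusters ranked by uncertainty, chosen sample indices).
    """
    # Group sample indices by cluster.
    members = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        members[cid].append(idx)

    # Assumption: cluster uncertainty = mean predictive entropy of its members.
    score = {
        cid: sum(binary_entropy(ctc_probs[i]) for i in idxs) / len(idxs)
        for cid, idxs in members.items()
    }
    ranked = sorted(score, key=score.get, reverse=True)

    # Draw a small random batch from each of the top-ranked clusters.
    rng = random.Random(seed)
    chosen = []
    for cid in ranked[:n_clusters]:
        idxs = members[cid]
        chosen.extend(rng.sample(idxs, min(per_cluster, len(idxs))))
    return ranked, sorted(chosen)


# Toy data: cluster 0 looks confidently non-CTC, cluster 1 is ambiguous,
# cluster 2 looks confidently CTC.
cluster_ids = [0, 0, 0, 1, 1, 1, 2, 2]
ctc_probs = [0.02, 0.05, 0.01, 0.45, 0.55, 0.60, 0.97, 0.99]
ranked, chosen = select_for_labeling(cluster_ids, ctc_probs, n_clusters=1, per_cluster=2)
```

In an iterative loop, the expert would label the chosen samples, the classifier would be retrained on the enlarged training set, and the selection would be repeated with the updated probabilities.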
Keywords: CTC; circulating tumor cells; clustering; human-in-the-loop; image classification; latent space analysis; liquid biopsy; machine learning; metastatic breast cancer; self-supervision.
© 2025 The Authors.