UCSC-VLAA/MedTrinity-25M A Large-scale Multimodal Dataset
https://huggingface.co/papers/2408.02900
MedTrinity-25M is a large-scale multimodal dataset in the field of medicine.

Key Highlights
Dataset size and coverage: Covers more than 25 million images from 10 modalities with multi-granular annotations for more than 65 diseases.
Richness of annotations: Contains global textual information such as disease/lesion type, modality, region-specific descriptions, and inter-regional relations, as well as detailed local annotations of regions of interest (ROIs) such as bounding boxes and segmentation masks.

Innovative data generation: Developed the first automated pipeline to scale up multimodal data by generating multi-granular visual and textual annotations (in the form of image-ROI-description triplets) without requiring paired image-text data.
Data collection and processing: Collected and preprocessed data from more than 90 different sources, and identified ROIs associated with abnormal regions using domain-specific expert models.
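
The image-ROI-description triplet structure described above can be sketched as a simple record. This is a minimal illustration only: the field names and the bounding-box convention here are assumptions for clarity, not the dataset's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ROIAnnotation:
    """A region of interest with local annotations (illustrative fields)."""
    bounding_box: Tuple[int, int, int, int]  # assumed (x, y, width, height) in pixels
    description: str                         # region-specific textual description

@dataclass
class MedTrinityRecord:
    """One image-ROI-description triplet (hypothetical schema, not the official one)."""
    image_id: str
    modality: str            # e.g. "CT", "MRI", "X-ray"
    disease: str             # disease/lesion type from the global description
    global_description: str  # image-level text, including inter-regional relations
    rois: List[ROIAnnotation] = field(default_factory=list)

# Example record showing how global text and local ROI annotations fit together
record = MedTrinityRecord(
    image_id="example_0001",
    modality="CT",
    disease="lung nodule",
    global_description="Axial chest CT showing a solitary nodule in the right upper lobe.",
    rois=[ROIAnnotation(bounding_box=(120, 80, 40, 40),
                        description="Well-circumscribed nodule, approximately 8 mm.")],
)
```

The point of the triplet layout is that each image carries both a global caption and per-ROI local annotations, which is what enables multi-granular supervision.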