Meta releases SAM2 segmentation model
Last week, Meta continued its push in computer vision and released the Meta Segment Anything Model 2 (SAM2).
SAM2 performs real-time, promptable object segmentation in both images and videos, marking a leap in the video segmentation experience and enabling seamless use across image and video applications.
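As a rough sketch of what "promptable" means in practice, here is the image-prediction pattern from the segment-anything-2 repository; the checkpoint path, config name, image file, and click coordinates below are placeholders, not values from this article.

```python
import numpy as np
import torch
from PIL import Image
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

# Placeholder paths; use the config/checkpoint names shipped with the repo.
checkpoint = "checkpoints/sam2_hiera_large.pt"
model_cfg = "sam2_hiera_l.yaml"

predictor = SAM2ImagePredictor(build_sam2(model_cfg, checkpoint))

# Load any RGB image as an HxWx3 numpy array (file name is a placeholder).
image = np.array(Image.open("frame.jpg").convert("RGB"))

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    predictor.set_image(image)
    # A single point prompt (x, y) with label 1 = foreground; coordinates are illustrative.
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[500, 375]]),
        point_labels=np.array([1]),
    )
```

The prompt can also be a box or multiple points; the model returns candidate masks with confidence scores.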
SAM2 surpasses previous models in image segmentation accuracy and outperforms existing work in video segmentation, while requiring roughly one-third of the interaction time.
SAM2 can also segment any object in any video or image (often described as zero-shot generalization), meaning it can be applied to previously unseen visual content without custom adaptation.
Also released is SA-V, the largest video segmentation dataset to date: it contains an order of magnitude more annotations than existing video object segmentation datasets and about 4.5 times as many videos.
The main features of SA-V: more than 600,000 mask annotations across approximately 51,000 videos; videos collected from 47 countries, showing geographical diversity and real-world scenes; annotations covering whole objects, object parts, and challenging situations such as objects being occluded, disappearing, and reappearing.
The demo is striking: SAM2 can stably track and segment a person in a very blurry yet highly detailed aerial video.
Download the model here: https://github.com/facebookresearch/segment-anything-2
Experience SAM2 here: https://sam2.metademolab.com/
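For video, the repository also exposes a video predictor that propagates prompts into masklets across frames. A minimal sketch of that usage pattern follows; the checkpoint path, config name, frame directory, and click coordinates are placeholders.

```python
import torch
from sam2.build_sam import build_sam2_video_predictor

checkpoint = "checkpoints/sam2_hiera_large.pt"   # placeholder path
model_cfg = "sam2_hiera_l.yaml"                  # placeholder config
predictor = build_sam2_video_predictor(model_cfg, checkpoint)

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    # Initialize tracking state from a directory of extracted video frames (placeholder path).
    state = predictor.init_state("path/to/video_frames")

    # Add a click prompt for one object on one frame (illustrative coordinates).
    frame_idx, object_ids, masks = predictor.add_new_points(
        state,
        frame_idx=0,
        obj_id=1,
        points=[[210, 350]],
        labels=[1],
    )

    # Propagate the prompt through the video to get per-frame masks for the tracked object.
    video_segments = {}
    for frame_idx, object_ids, masks in predictor.propagate_in_video(state):
        video_segments[frame_idx] = (object_ids, masks)
```

The same state can accept additional clicks on later frames to refine or correct the tracked object, which is the interactive workflow the hosted demo builds on.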