general-image-detector-detic_C2_IN_L_SwinB_lvis

Notes

Detecting Twenty-thousand Classes using Image-level Supervision

Detic: A Detector with image classes that can use image-level labels to easily train detectors.

Detecting Twenty-thousand Classes using Image-level Supervision,
Xingyi Zhou, Rohit Girdhar, Armand Joulin, Philipp Krähenbühl, Ishan Misra,
ECCV 2022 (arXiv 2201.02605)

Features

  • Detects any class given class names (using CLIP).

  • The detector is trained on the ImageNet-21K dataset, which has 21K classes.

  • Cross-dataset generalization to OpenImages and Objects365 without finetuning.

  • State-of-the-art results on Open-vocabulary LVIS and Open-vocabulary COCO.
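To illustrate how a detector can score arbitrary class names, here is a minimal sketch of classifying one region by cosine similarity against per-class text embeddings. It is not the Detic implementation: the function names (l2_normalize, classify_region) are hypothetical, and the three-dimensional vectors are toy stand-ins for real CLIP text embeddings and region features.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def classify_region(region_feature, class_embeddings):
    """Score one region feature against every class-name embedding.

    class_embeddings maps a class name to its text embedding. In a real
    system these would come from a CLIP text encoder; here they are
    hand-written toy vectors (an assumption for illustration).
    """
    region = l2_normalize(region_feature)
    scores = {}
    for name, embedding in class_embeddings.items():
        text = l2_normalize(embedding)
        scores[name] = sum(r * t for r, t in zip(region, text))
    best = max(scores, key=scores.get)
    return best, scores


# Toy vocabulary: adding a new class only requires adding a new embedding.
class_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.1, 0.9, 0.0],
    "unicycle": [0.0, 0.1, 0.9],
}
region_feature = [0.05, 0.2, 1.1]

best, scores = classify_region(region_feature, class_embeddings)
print(best, round(scores[best], 3))
```

Because the classifier is just a table of embeddings, the vocabulary can be changed at inference time by swapping the table, without retraining the region features.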

Detic_C2_IN-L_SwinB_896_4x Performance

Open-vocabulary LVIS

Name                              Training time   mask mAP   mask mAP_novel
Box-Supervised_C2_R50_640_4x      17h             30.2       16.4
Detic_C2_IN-L_R50_640_4x          22h             32.4       24.9
Detic_C2_CCimg_R50_640_4x         22h             31.0       19.8
Detic_C2_CCcapimg_R50_640_4x      22h             31.0       21.3
Box-Supervised_C2_SwinB_896_4x    43h             38.4       21.9
Detic_C2_IN-L_SwinB_896_4x        47h             40.7       33.8

Note

  • The open-vocabulary LVIS setup is LVIS without rare class annotations in training. We evaluate rare classes as novel classes in testing.

  • The models with C2 are trained using our improved LVIS baseline (Appendix D of the paper), which includes the CenterNet2 detector, Federated Loss, large-scale jittering, etc.

  • All models use CLIP embeddings as classifiers. This is why the box-supervised models also have non-zero mAP on novel classes.

  • The models with IN-L use the classes shared by ImageNet-21K and LVIS as image-labeled data.

  • The models with CC use Conceptual Captions. CCimg uses image labels extracted from the captions (using a naive text-match) as image-labeled data. CCcapimg additionally uses the raw captions (Appendix C of the paper).

  • The Detic models are finetuned from the corresponding Box-Supervised models above (indicated by MODEL.WEIGHTS in the config files). Please train or download the Box-Supervised model and place it under DETIC_ROOT/models/ before training the Detic models.
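The open-vocabulary LVIS setup described in the notes above (training without rare-class annotations, then evaluating rare classes as novel) can be sketched as a filter over an LVIS-style annotation dictionary. This is an illustrative sketch, not the preprocessing script used by the authors: the function name drop_rare_annotations and the toy data are hypothetical, and it assumes each category carries a "frequency" field with the value "r" for rare classes.

```python
def drop_rare_annotations(lvis):
    """Return a training split without rare-class annotations.

    Assumes an LVIS-style dict whose categories carry a "frequency"
    field ("r" = rare, "c" = common, "f" = frequent). The category
    list itself is left untouched, so rare classes can still be
    scored as novel classes at test time.
    """
    rare_ids = {c["id"] for c in lvis["categories"] if c["frequency"] == "r"}
    kept = [a for a in lvis["annotations"] if a["category_id"] not in rare_ids]
    # Build a new dict so the original annotations stay intact.
    return {**lvis, "annotations": kept}, rare_ids


# Toy data standing in for a real LVIS annotation file.
toy = {
    "categories": [
        {"id": 1, "name": "cup", "frequency": "f"},
        {"id": 2, "name": "kayak", "frequency": "c"},
        {"id": 3, "name": "puffin", "frequency": "r"},
    ],
    "annotations": [
        {"id": 10, "image_id": 100, "category_id": 1},
        {"id": 11, "image_id": 100, "category_id": 3},
        {"id": 12, "image_id": 101, "category_id": 2},
    ],
}

train_split, novel_ids = drop_rare_annotations(toy)
print(sorted(novel_ids), [a["id"] for a in train_split["annotations"]])
```

Under this setup, the mask mAP_novel column in the table is the mAP over the classes whose annotations were dropped from training.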

  • ID
  • Name
    general-image-detector-detic_C2_IN_L_SwinB_lvis
  • Model Type ID
    Visual Detector
  • Description
    --
  • Last Updated
    Aug 29, 2022
  • Privacy
    PUBLIC