## Introduction

V3Det is a Vast Vocabulary Visual Detection Dataset with precise annotations for more than 13,000 object categories, enabling more comprehensive research in object detection.

1) Vast Vocabulary: V3Det contains bounding boxes of objects from more than 13,000 categories on real-world images.
2) Hierarchical Category Organization: V3Det is organized by a hierarchical category tree that annotates the inclusion relationships among categories.
3) Rich Annotations: V3Det comprises precisely annotated objects in 245k images, along with professional descriptions of each category written by human experts and ChatGPT.

### Data

![](https://github.com/ztayty/ztayty.github.io/blob/main/image/%E6%95%B0%E6%8D%AE%EF%BC%88%E8%BF%90%E8%90%A5%E6%89%8B%E5%8A%A8%E4%B8%8A%E6%9E%B6%E5%88%B0%E7%B1%BB%E5%AE%9A%E4%B9%89%EF%BC%89.jpg?raw=true)

## Citation

Please cite the following paper when using V3Det:

```
@misc{wang2023v3det,
  title={V3Det: Vast Vocabulary Visual Detection Dataset},
  author={Jiaqi Wang and Pan Zhang and Tao Chu and Yuhang Cao and Yujie Zhou and Tong Wu and Bin Wang and Conghui He and Dahua Lin},
  year={2023},
  eprint={2304.03752},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```