2025-07-30 09:53:13 +08:00

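# Dataset Introduction
## Overview
VoxCeleb2 is a large-scale speaker recognition dataset obtained automatically from open-source media. It contains over 1 million utterances from more than 6,000 speakers. Because the dataset was collected "in the wild", the speech segments are corrupted by real-world noise, including laughter, cross-talk, channel effects, music, and other sounds. The dataset is also multilingual, with speech from speakers of 145 different nationalities, covering a wide range of accents, ages, ethnicities, and languages. Since the dataset is audio-visual, it is also useful for a number of other applications, such as visual speech synthesis, speech separation, cross-modal transfer from face to voice (and vice versa), and training face recognition from video to complement existing face recognition datasets.
## Citation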
```
@article{chung2018voxceleb2,
title={Voxceleb2: Deep speaker recognition},
author={Chung, Joon Son and Nagrani, Arsha and Zisserman, Andrew},
journal={arXiv preprint arXiv:1806.05622},
year={2018}
}
```