Datasets:
Languages: Hausa
Multilinguality: monolingual
Size Categories: 10K<n<100K
Language Creators: expert-generated
Source Datasets: original
ArXiv: 2207.03546
Tags: bible
License: cc-by-sa-4.0
audio (dict) | sentence (string) | locale (string) | book (string) | verse (string) |
---|---|---|---|---|
{"bytes":"ZkxhQwAAACIQABAAAANTACGyC7gBcAAEqmD6JgL0uALVJRyVVE7qDF2+AwAAEgAAAAAAAAAAAAAAAAAAAAAQAIQAAE(...TRUNCATED) | "Zan miƙa hannuna gāba da Yahuda da kuma dukan mazaunan Urushalima." | "ha" | "ZEP" | "007" |
{"bytes":"ZkxhQwAAACIQABAAAANJACJeC7gBcAAEd8Dqm+DFGZdHvnHJDdbXNfSTAwAAEgAAAAAAAAAAAAAAAAAAAAAQAIQAAE(...TRUNCATED) | "Ku nemi adalci da tawali’u; mai yiwuwa ku sami mafaka a ranar fushin Ubangiji." | "ha" | "ZEP" | "003" |
{"bytes":"ZkxhQwAAACIQABAAAANpACOuC7gBcAACqoAv8UqtELps8PFhkG3fnRRVAwAAEgAAAAAAAAAAAAAAAAAAAAAQAIQAAE(...TRUNCATED) | "Za a washe dukiyarsu, a rurrushe gidajensu." | "ha" | "ZEP" | "019" |
{"bytes":"ZkxhQwAAACIQABAAAAOBACJbC7gBcAAGe2BcXPcIS2QIGjGIv0rOrnsyAwAAEgAAAAAAAAAAAAAAAAAAAAAQAIQAAE(...TRUNCATED) | "A cikin zafin kishinsa za a hallaka dukan duniya gama zai kawo ƙarshen dukan mazaunan duniya nan d(...TRUNCATED) | "ha" | "ZEP" | "028" |
{"bytes":"ZkxhQwAAACIQABAAAANKACLMC7gBcAAE78D7YWC1sy+vftp4sjRpRLFrAwAAEgAAAAAAAAAAAAAAAAAAAAAQAIQAAE(...TRUNCATED) | "Ki yi murna ki kuma yi farin ciki da dukan zuciyarki, Ya Diyar Urushalima!" | "ha" | "ZEP" | "025" |
{"bytes":"ZkxhQwAAACIQABAAAAM3ACKCC7gBcAAEu0C6rU5PlJHiMsJWFMvNXHmZAwAAEgAAAAAAAAAAAAAAAAAAAAAQAIQAAE(...TRUNCATED) | "Ubangiji da yake cikinki mai adalci ne; ba ya yin abin da ba daidai ba." | "ha" | "ZEP" | "007" |
{"bytes":"ZkxhQwAAACIQABAAAAOnACR8C7gBcAAJfgAkhjYXFqZRQ8Tu+8lihKKlAwAAJAAAAAAAAAAAAAAAAAAAAAAQAAAAAA(...TRUNCATED) | "Maganar Ubangiji da ta zo wa Zefaniya ɗan Kushi, ɗan Gedaliya, ɗan Amariya, ɗan Hezekiya, a zam(...TRUNCATED) | "ha" | "ZEP" | "001" |
{"bytes":"ZkxhQwAAACIQABAAAAMYACNGC7gBcAAG3OC0trRDZsZsP7CYwPZy8q1UAwAAEgAAAAAAAAAAAAAAAAAAAAAQAIQAAE(...TRUNCATED) | "Wannan ne zai zama sakamakon girmankansu, saboda zagi da kuma ba’ar da suka yi wa mutanen Ubangij(...TRUNCATED) | "ha" | "ZEP" | "016" |
{"bytes":"ZkxhQwAAACIQABAAAAOWACH1C7gBcAAEpqCSI5w5IZ6TicLDWrTP/pl8AwAAEgAAAAAAAAAAAAAAAAAAAAAQAIQAAE(...TRUNCATED) | "Gargaɗi a kan hallaka mai zuwa Zan hallaka kome a fuskar duniya," | "ha" | "ZEP" | "002" |
{"bytes":"ZkxhQwAAACIQABAAAAEbACQjC7gBcAAOQMCMSQcrUpgj51UN9jbUg726AwAAJAAAAAAAAAAAAAAAAAAAAAAQAAAAAA(...TRUNCATED) | "Zan datse raguwar Ba’al daga wannan wuri, da sunayen firistocin gumaka da na marasa sanin Allah, (...TRUNCATED) | "ha" | "ZEP" | "008" |
Dataset Card for BibleTTS Hausa
Dataset Summary
BibleTTS is a large, high-quality, open text-to-speech dataset with up to 86 hours of single-speaker, studio-quality 48 kHz recordings per language. This is the Hausa portion of the dataset: 86.6 aligned hours and 40,603 aligned verses.
Languages
Hausa
Dataset Structure
Data Fields
audio: audio path
sentence: transcription of the audio
locale: always set to ha
book: 3-char book encoding
verse: verse id
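In the preview rows above, the audio field appears as a dict whose "bytes" value begins with base64 text. As a minimal sketch (not the dataset's own loading code), the Python below decodes the first 32 base64 characters of the first preview row and reads a few fields from the FLAC STREAMINFO header. The function name `parse_flac_streaminfo_prefix` is hypothetical, and the sketch assumes the preview value is standard base64 of a FLAC stream; the rest of the value is truncated in the preview, so only the header is examined.

```python
import base64

# First 32 base64 characters of the "bytes" value in the first preview row.
# The preview truncates the rest, so only the stream header can be decoded.
PREFIX_B64 = "ZkxhQwAAACIQABAAAANTACGyC7gBcAAE"


def parse_flac_streaminfo_prefix(raw: bytes) -> dict:
    """Read sample rate, channels and bit depth from the start of a FLAC stream.

    Hypothetical helper: it only handles streams whose first metadata block
    is STREAMINFO, which the FLAC format requires.
    """
    if len(raw) < 22:
        raise ValueError("need at least 22 bytes of the stream")
    if raw[:4] != b"fLaC":
        raise ValueError("missing FLAC magic marker")
    if raw[4] & 0x7F != 0:
        raise ValueError("first metadata block is not STREAMINFO")

    # STREAMINFO body starts at byte 8: two 16-bit block sizes, two 24-bit
    # frame sizes, then 20 bits of sample rate, 3 bits of (channels - 1)
    # and 5 bits of (bits per sample - 1).
    sample_rate = (raw[18] << 12) | (raw[19] << 4) | (raw[20] >> 4)
    channels = ((raw[20] >> 1) & 0x07) + 1
    bits_per_sample = (((raw[20] & 0x01) << 4) | (raw[21] >> 4)) + 1
    return {
        "sample_rate": sample_rate,
        "channels": channels,
        "bits_per_sample": bits_per_sample,
    }


info = parse_flac_streaminfo_prefix(base64.b64decode(PREFIX_B64))
print(info)
```

For this prefix the header reports a 48000 Hz, single-channel, 24-bit stream, which agrees with the 48 kHz figure in the summary. Decoding full audio would need the complete bytes and a FLAC decoder, which this sketch does not attempt.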
Data Splits
dev: Book of Ezra (264 verses)
test: Book of Colossians (124 verses)
train: all other books (40,215 verses)
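The split rule above is simple enough to express directly. The following is a rough sketch, not the curators' processing code: it assumes the 3-char book codes follow the USFM convention (so Ezra would be "EZR" and Colossians "COL", consistent with "ZEP" in the preview), and the names `split_for_book` and `verse_number` are hypothetical. It also checks that the three split sizes add up to the 40,603 aligned verses quoted in the summary.

```python
# Verse counts per split, as stated in the dataset card.
SPLIT_SIZES = {"dev": 264, "test": 124, "train": 40215}

# Assumption: book codes are USFM identifiers, so Ezra is "EZR" and
# Colossians is "COL". The card itself only says "3-char book encoding".
HELD_OUT_BOOKS = {"EZR": "dev", "COL": "test"}


def split_for_book(book: str) -> str:
    """Return the split a book belongs to: held-out books, otherwise train."""
    return HELD_OUT_BOOKS.get(book.upper(), "train")


def verse_number(verse: str) -> int:
    """Convert a zero-padded verse id such as "007" to an integer."""
    return int(verse)


total_verses = sum(SPLIT_SIZES.values())
print(total_verses)            # sum of the three split sizes
print(split_for_book("ZEP"))   # Zephaniah is not held out
print(verse_number("007"))
```

Holding out whole books rather than random verses keeps the dev and test text disjoint from the training text at the book level.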
Additional Information
See this notebook for the code used to process the dataset.
Dataset Curators
The dataset was uploaded by vpetukhov, who is not affiliated with the dataset authors. Please see the project page for more information.
Licensing Information
The data is released under the commercial-friendly CC BY-SA 4.0 license.
Citation Information
Meyer, Josh, et al. "BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus." arXiv preprint arXiv:2207.03546 (2022).