Common Voice Spontaneous Speech 1.0 - Ushojo
Locale: ush
Size: 102.40 MB
Task: ASR
Format: MP3
License: CC-0
[Ushojo] - Ushojo (ush)
This datasheet is for version 23.0 of the Mozilla Common Voice Spontaneous Speech dataset for Ushojo (ush). The dataset contains 6 hours of recorded speech (5 hours validated) from 10 speakers.
Language
Ushojo is an Indo-Aryan language spoken by about 1,000 to 1,200 people in Bishigram, near Madyan in Swat, Pakistan.
Demographic information
The dataset includes the following distribution of age and gender.
Gender
Self-declared gender information. Frequency refers to the number of clips annotated with this gender.
Age
Self-declared age information. Frequency refers to the number of clips annotated with this age band.
Transcriptions
Speakers recorded spontaneous spoken responses to prompts shown by the system, and the recordings were then transcribed into text.
Writing system
Perso-Arabic script, following the Shina and Torwali orthographies.
Symbol table
The following letters differ from Urdu: ݜ، ڙ، ڇ، أ، نڑ
Questions
There follows a randomly selected sample of questions used in the corpus.
تُو کامیک رونگ خوشاریلا؟ می تا ہر فن خوشاریما کے فن خو فن بینو. تو کدا کدا خارو کئی بجونو خوشاریما؟ پٹوئیا کارے جے کاروبار اِسٹارٹ بینو؟ تو آسو جیب رس بئیلا؟
Responses
There follows a randomly selected sample of transcribed responses from the corpus.
می تہ ہر فن خوشاریما کے فن خو فن بینو۔ می تہ ݜیلو رونگ لالو خوش ہنو۔ آسو شیِدلے موسم در خار کئی بجونو خوشاریما۔ کے گرمی نی بیلو۔ موسم برابر بیلو۔ ہاں۔ مہ تی توسی جیب شنوٹو شنوٹو رز بئیلا۔ مہ تی کامن وائس بارا در تپوس کیلا۔
Recommended post-processing
More datasets needed
Community links
Contact is through the community itself. I have a good network with its members.
Discussions
No
Contribute
NA
Datasheet authors
1. Zubair Torwali, ztorwali@gmail.com
2. Javid Iqbal Torwali, jitorwali@gmail.com
Citation guidelines
1. Javid Iqbal Torwali
2. Ihsan Ullah
3. Tariq Aziz
4. Zubair Torwali
Funding
Yes, we acknowledge.
Licence
This dataset is released under the Creative Commons Zero (CC-0) licence. By downloading this data you agree to not determine the identity of speakers in the dataset.