Datasets

Common Voice

Common Voice Spontaneous Speech 2.0 - Catalan

A collection of spontaneous spoken phrases in Catalan.

License: CC0-1.0
Locale: ca
Task: ASR
Format: MP3
Size: 11.78 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Bukusu

A collection of spontaneous spoken phrases in Bukusu.

License: CC0-1.0
Locale: bxk
Task: ASR
Format: MP3
Size: 258.53 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Sabah Bisaya

A collection of spontaneous spoken phrases in Sabah Bisaya.

License: CC0-1.0
Locale: bsy
Task: ASR
Format: MP3
Size: 219.99 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Bodo

A collection of spontaneous spoken phrases in Bodo.

License: CC0-1.0
Locale: brx
Task: ASR
Format: MP3
Size: 1.29 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Breton

A collection of spontaneous spoken phrases in Breton.

License: CC0-1.0
Locale: br
Task: ASR
Format: MP3
Size: 13.57 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Betawi

A collection of spontaneous spoken phrases in Betawi.

License: CC0-1.0
Locale: bew
Task: ASR
Format: MP3
Size: 213.73 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Basaa

A collection of spontaneous spoken phrases in Basaa.

License: CC0-1.0
Locale: bas
Task: ASR
Format: MP3
Size: 109.37 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Bashkir

A collection of spontaneous spoken phrases in Bashkir.

License: CC0-1.0
Locale: ba
Task: ASR
Format: MP3
Size: 5.08 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Aragonese

A collection of spontaneous spoken phrases in Aragonese.

License: CC0-1.0
Locale: an
Task: ASR
Format: MP3
Size: 2.24 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Gheg Albanian

A collection of spontaneous spoken phrases in Gheg Albanian.

License: CC0-1.0
Locale: aln
Task: ASR
Format: MP3
Size: 200.85 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Adyghe

A collection of spontaneous spoken phrases in Adyghe.

License: CC0-1.0
Locale: ady
Task: ASR
Format: MP3
Size: 107.44 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Arvanitika

A collection of spontaneous spoken phrases in Arvanitika.

License: CC0-1.0
Locale: aat
Task: ASR
Format: MP3
Size: 46.68 MB

Common Voice

Common Voice Scripted Speech 24.0 - Teutila Cuicatec

A collection of scripted spoken phrases in Teutila Cuicatec.

License: CC0-1.0
Locale: cut
Task: ASR
Format: MP3
Size: 209.52 MB

Common Voice

Common Voice Scripted Speech 24.0 - Norwegian Nynorsk

A collection of scripted spoken phrases in Norwegian Nynorsk.

License: CC0-1.0
Locale: nn-NO
Task: ASR
Format: MP3
Size: 33.55 MB

My Cool Organization Changed Again

rm-vallader test

rm-vallader test

License: BSD-3-Clause
Locale: rm-vallader
Task: NLP
Format: MP3
Size: 2.63 MB

Aaron Tello-Wharton

checksum dataset

License: Apache-2.0
Locale: en-US
Task: N/A
Format: Not specified
Size: 914.69 KB

Aaron Tello-Wharton

dawdad

wadaddwa

License: Apache-2.0
Locale: awdad
Task: NLP
Format: awdawd
Size: 34.00 MB

MozFam

Common Voice AZ DF

That's my little test upload. It contains the cv 10 corpus for az.

License: CC0-1.0
Locale: az
Task: ASR
Format: mp3
Size: 3.41 MB

Aaron Tello-Wharton

test

test

License: Apache-2.0
Locale: en-US
Task: N/A
Format: WAV
Size: 2.63 MB

Common Voice

Dataset for API & Python SDK Tests [Do not remove] - Mock Spontaneous Speech English

DO NOT DELETE. E2E tests of the Python SDK depend on this test dataset.

License: CC-BY-4.0
Locale: en-US
Task: NLP
Format: CSV
Size: 119.84 KB

New Test Organization by Liv

test with better name

License: Apache-2.0
Locale: en-US
Task: NLP
Format: Not specified
Size: 7.37 MB

Community

Community Dataset

Community Dataset

License: CC-BY-SA-4.0
Locale: en-US
Task: RAG
Format: MP3
Size: 2.76 MB

My Cool Organization Changed Again

Community Dataset

My Community Dataset

License: BSD-3-Clause
Locale: en-US
Task: MT
Format: MP3
Size: 2.76 MB

Mozilla

Otro dataset bonito

This is a fairly short description to describe my dataset.

License: CC-BY-ND-4.0
Locale: es_MX
Task: NLP
Format: wav
Size: 180.78 MB