Fabrizio

Reputation: 1074

Datasets like "The LJ Speech Dataset"

I am trying to find datasets like the LJ Speech Dataset made by Keith Ito. I need to use them with Tacotron 2 (Link), so I think the datasets need to be structured in a certain way. The LJ dataset is linked directly from the Tacotron 2 GitHub page, so I think it's safe to assume it's made to work with it, and that other datasets should have the same structure as LJ Speech. I downloaded the dataset and found that it's structured like this:

main folder:

    - wavs
        - 001.wav
        - 002.wav
        - etc.
    - metadata.csv: a CSV file that contains the text spoken in each .wav file, in a form like this: **001.wav | hello etc.** (see the example lines just below)
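For reference, the real metadata.csv in LJ Speech is pipe-delimited, omits the .wav extension, and has a third column with a normalized transcription; hypothetical lines in that format would look like this:

    001|Hello world.|Hello world.
    002|Dr. Smith arrived.|Doctor Smith arrived.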

So, my question is: are there other datasets like this one for further training?

But I think there might be problems. For example, the voice in one dataset would be different from the voice in another; would this cause too many problems? And could different slang or regional pronunciations also cause problems?

Upvotes: 2

Views: 3556

Answers (2)

GuyPaddock

Reputation: 2517

Mozilla maintains a collection of several datasets that you can download and use if you don't need your own custom language or voice: https://voice.mozilla.org/data

Alternatively, you could create your own dataset following the structure you outlined in your question. The metadata.csv file needs to contain at least two columns -- the first is the path/name of the WAV file (without the .wav extension), and the second is the text that was spoken.
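As a minimal sketch of generating such a file (the clip IDs and transcripts below are placeholders, and the pipe delimiter follows the LJ Speech convention):

    # generate_metadata.py -- write an LJ Speech-style metadata.csv
    clips = [
        ("001", "Hello, this is the first clip."),
        ("002", "And this is the second clip."),
    ]

    with open("metadata.csv", "w", encoding="utf-8") as f:
        for clip_id, text in clips:
            # One line per clip: file ID (no .wav extension), then the text.
            f.write(f"{clip_id}|{text}\n")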

Unless you are training Tacotron with speaker embedding/a multi-speaker model, you'd want all the recordings to be from the same speaker. Ideally, the audio quality should be very consistent, with a minimal amount of background noise. Some background noise can be removed using RNNoise. There's a script in the Mozilla Discourse group that you can use as a reference. All the recordings need to be short, 22050 Hz, 16-bit audio clips.
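If your source audio isn't already in that format, here's a rough sketch of converting it with librosa and soundfile (the file paths are placeholders, and this assumes both libraries are installed):

    import librosa
    import soundfile as sf

    src = "raw/001.wav"   # placeholder input path
    dst = "wavs/001.wav"  # placeholder output path

    # Load and resample to 22050 Hz, downmixing to mono.
    audio, sr = librosa.load(src, sr=22050, mono=True)

    # Write as 16-bit PCM WAV.
    sf.write(dst, audio, sr, subtype="PCM_16")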

As for slang or local colloquialisms -- I'm not sure, but I suspect that as long as the spoken sounds match what's written (i.e. the phonemes match up), the system should be able to handle it. Tacotron is able to handle/train on multiple languages.

If you don't have the resources to produce your own recordings, you could use audio from a permissively licensed audiobook in the target language. There's a tutorial on this very topic here: https://medium.com/@klintcho/creating-an-open-speech-recognition-dataset-for-almost-any-language-c532fb2bc0cf

The tutorial has you:

  1. Download the audio from the audiobook.
  2. Remove any parts that aren't useful (e.g. the introduction, foreword, etc.) with Audacity.
  3. Use Aeneas to fine-tune and then export a forced alignment between the audio and the text of the e-book, so that the audio can be exported sentence by sentence (see the sketch after this list).
  4. Create the metadata.csv file containing the map from audio to segments. (The format that the post describes seems to include extra columns that aren't really needed for training and are mainly for use by Mozilla's online database.)
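As a rough sketch of step 3 (the file paths and language code here are assumptions, not taken from the tutorial), a forced alignment with Aeneas could look like this:

    from aeneas.executetask import ExecuteTask
    from aeneas.task import Task

    # Assumes a plain-text transcript with one sentence per line.
    config = u"task_language=eng|is_text_type=plain|os_task_file_format=json"
    task = Task(config_string=config)
    task.audio_file_path_absolute = u"/path/to/audiobook.wav"
    task.text_file_path_absolute = u"/path/to/transcript.txt"
    task.sync_map_file_path_absolute = u"/path/to/syncmap.json"

    # Compute the alignment and write the sync map, which records the
    # start/end time of each sentence so the audio can be cut into clips.
    ExecuteTask(task).execute()
    task.output_sync_map_file()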

You can then use this dataset with systems that support LJSpeech, like Mozilla TTS.

Upvotes: 0

dabhand

Reputation: 517

There are a few resources:

The main ones I would look at are Festvox (aka CMU Arctic) http://www.festvox.org/dbs/index.html and LibriVox https://librivox.org/

These folks seem to be maintaining a list: https://github.com/candlewill/Speech-Corpus-Collection

And I am part of a project that is collecting more (shameless self plug): https://github.com/Idlak/Living-Audio-Dataset

Upvotes: 1
