voice cloning on linux

#1 Post by **user101** » Wed Jun 11, 2025 3:47 am

What software are you using to successfully clone voices on Linux?

I have made several attempts & lost countless hours: the overly-complex installations (with too many steps) always fail me at some point. I have just come off wasting several hours trying to install https://github.com/erew123/alltalk_tts under Windows, and I failed there again too.

What are your recommendations?
PS. App must work with no internet after it's been installed.

#2 Post by **DukeComposed** » Wed Jun 11, 2025 6:12 am

user101 wrote: Wed Jun 11, 2025 3:47 am What software are you using to successfully clone voices on Linux?

I have made several attempts & lost countless hours: the overly-complex installations (with too many steps) always fail me at some point.

PS. App must work with no internet after it's been installed.

This is now at least the second thread you've opened in the last couple weeks asking for a standalone, offline, GUI voice-synthesizing application that you don't have to think about or do any reading to learn how to use. I'm not aware of an application that hasn't already been mentioned and none of the suggestions in the other thread seem to have been just right for you. I'm curious to know how many more bowls of porridge you're going to eat here before you start asking about what kinds of beds there are for you in which to sleep.

#3 Post by **Nokkaelaein** » Wed Jun 11, 2025 8:02 am

DukeComposed wrote: Wed Jun 11, 2025 6:12 am I'm not aware of an application that hasn't already been mentioned and none of the suggestions in the other thread seem to have been just right for you.

The other thread was about transcribing speech to text, i.e. generating text from user-provided audio input, and the ones mentioned there work in that direction. What is asked here is a solution for text to speech, i.e. generating synthetic speech audio from user-provided text input.

@user101, I've used the Text generation webUI and Coqui TTS in the past, and still can mount them as a ready to use part of my system when needed. These two are also mentioned in the documentation of the AllTalk project. Despite its name, the webUI is an offline component, using a local client-server model (accessed via your browser).

Anyway, the only concrete suggestion I have is, make sure you use version 2 of the AllTalk project and read the documentation so that you don't follow it blindly, but learn why something is done in a certain way. There really seems to be very good documentation of the installation. There might be a turnkey solution somewhere that I haven't heard about, but in any case, consider that this is all cutting edge software, at the forefront of applying different local AI models in a useful/usable way, and a certain degree of techiness will probably be needed. (I just checked and mounted the local AI tools I have here, and the Text generation webUI alone takes way over four gigabytes, just with the extensive libraries it is using. This is just the installation itself with its libraries, not including any actual models it will access.) The installation instructions provided with tools like this are usually your best resource, and you will most probably just need to take your time and learn the stuff.

#4 Post by **DukeComposed** » Wed Jun 11, 2025 8:14 am

Nokkaelaein wrote: Wed Jun 11, 2025 8:02 am The other thread was about transcribing speech to text, i.e. generating text from user-provided audio input, and the ones mentioned there work in that direction. What is asked here is a solution for text to speech, i.e. generating synthetic speech audio from user-provided text input.

And as I suspect will be the case in this thread as in the other one, OP has secret criteria that well-meaning people trying to be helpful will miss until they proffer a suggestion that fails to meet one of them.

It must be offline. It must be GUI-based. It must not require compiling anything or reading any instructions. Perhaps there are more restrictions, all of which should have been stated entirely upfront instead of teased out piecemeal over the course of three days and fourteen posts.

#5 Post by **Nokkaelaein** » Wed Jun 11, 2025 8:24 am

DukeComposed wrote: Wed Jun 11, 2025 8:14 am It must be offline. It must be GUI-based. It must not require compiling anything or reading any instructions. Perhaps there are more restrictions, all of which should have been stated entirely upfront instead of teased out piecemeal over the course of three days and fourteen posts.

Yeah, if someone somewhere hasn't recently built a turnkey solution like this, it's not going to happen. Anything that is currently cutting edge stuff in this manner... is just not that kind of software.

#6 Post by **DukeComposed** » Wed Jun 11, 2025 8:50 am

Nokkaelaein wrote: Wed Jun 11, 2025 8:24 am Yeah, if someone somewhere hasn't recently built a turnkey solution like this, it's not going to happen. Anything that is currently cutting edge stuff in this manner... is just not that kind of software.

It honestly isn't my place to say if such a utility does or doesn't exist. What bugs me about these threads is the subtle, Platonic iteration through good answers that get rejected for hidden reasons that are only revealed once they are relevant. "I am Socrates. What is the best kitchen utensil?" "Socrates, clearly the best kitchen utensil is a spoon." "Ahh, but a spoon can be slotted, and a slotted spoon cannot hold soup. So a spoon cannot be the best kitchen utensil." Well maybe you should have outlined that soup was a priority when you posed the question, Socrates.

If I can short-circuit that kind of waste of everyone's time so we can get to one or two valid answers that will actually satisfy OP's criteria, it will be worth it.

#7 Post by **user101** » Fri Jun 13, 2025 12:42 am

Thanks for the help.

What is funny is I had https://github.com/erew123/alltalk_tts running (offline, browser GUI) but after install it needed more to download, and it wasn't clear to me how to activate the voice cloning feature. It seems I need to download more bits. I go online the next day and try to 'fix it up' and now it fails to start! I obviously did something and the dominos fell. So I do know how to follow instructions, but this stuff is far from friendly.

Despite DukeComposed's utterly predictable grumpy answers, I am trying my best to get a successful install done. But there comes a point where you spend so much time behind the computer, and you have to get stuff done, despite doing my best to follow instructions to the letter. Voice cloning is something I have tried to get done on multiple occasions, over the past 6 - 12 months. I don't expect anyone to do the work for me, but rather get suggestions people have working successfully. As usual, certain people would be better to ignore the thread if they have no experience.

It just isn't easy to get an install done. You can call me dumb if you want. The first one I tried was Coqui. That failed me in some way after a longgggg download session, though I'm thinking of trying that again too. To get this one done, I tried 3 versions of Python, to name just one troubleshooting step, in an effort to get my AllTalk TTS back on its feet, for example. It takes a heap of time. I thought things might be easier on Windows, but they aren't, at least not in the free open-source software voice cloning department.

Again, I wish people who don't offer anything of value would just keep their misery to themselves.

Thanks Nokkaelaein for your clarification of my question. My older thread just confirms some things for me. Different platforms have different apps, and just use the best tool for the job, even if that means sometimes using a spyware OS!

#8 Post by **DukeComposed** » Sat Jun 14, 2025 11:27 pm

user101 wrote: Fri Jun 13, 2025 12:42 am What is funny is I had https://github.com/erew123/alltalk_tts running (offline, browser GUI) but after install it needed more to download

Despite DukeComposed's utterly predictable grumpy answers, I am trying my best to get a successful install done.

It's one thing to ask people what software they use. Ostensibly that's what this thread is about, but let's be honest. That isn't what you're really asking. You're asking for a specific solution to your exact scenario and you haven't been completely forthcoming about it.

If you're having trouble compiling, it would be advantageous to you to outline what you've done and what errors you've encountered. This isn't an audio/DAW/DSP forum but there are folks here who do that kind of thing and I'm sure they'd be willing to give some valuable advice if you'd ask for it.

user101 wrote: Fri Jun 13, 2025 12:42 am But there comes a point where you spend so much time behind the computer, and you have to get stuff done, despite doing my best to follow instructions to the letter.... I don't expect anyone to do the work for me

Figuring out how to build software is tricky, no one's saying it's not. What I take umbrage with is the defense of "I don't expect people to do the work for me" after systematically shooting down everyone's suggestions to help you, Goldilocks-style, because they aren't juuust right. I think of asking for help the same way as learning how to knit or crochet. If someone offers to show you how to knit, they are showing kindness. If you reject them and say "I only care about learning how to purl!" then there's a serious disconnect between the teacher and the student.

Asking people what software they use for their workflows is fine, and you have. I have evidence from previous posts that you're fishing for more than just opinions. Casually shooting down each suggestion because you have some kind of bespoke agenda that that tool does not suit is ignorant, nasty, and IMO rather selfish. I read this as a setup. A trap you've laid to beg for good suggestions and then shoot each one you don't like in the head. But sure, call me the grumpy one for pointing out the trap and not for having laid it.

Edit: typo

#9 Post by **oops** » Sun Jun 15, 2025 7:12 am

FYI ... Clone voice AI free online (already exists too)

#10 Post by **user101** » Sun Jun 15, 2025 10:15 pm

DukeComposed wrote: Sat Jun 14, 2025 11:27 pm
user101 wrote: Fri Jun 13, 2025 12:42 am What is funny is I had https://github.com/erew123/alltalk_tts running (offline, browser GUI) but after install it needed more to download

Despite DukeComposed's utterly predictable grumpy answers, I am trying my best to get a successful install done.
It's one thing to ask people what software they use. Ostensibly that's what this thread is about, but let's be honest. That isn't what you're really asking.

No, that's exactly what I'm asking: what software they are using for voice cloning, so I can try something I haven't discovered or have tried, but need to try again. I am not sure why you think there is some 'agenda'.

DukeComposed wrote: Sat Jun 14, 2025 11:27 pm If you're having trouble compiling, it would be advantageous to you to outline what you've done and what errors you've encountered. This isn't an audio/DAW/DSP forum but there are folks here who do that kind of thing and I'm sure they'd be willing to give some valuable advice if you'd ask for it.

Never even tried to compile it. It looked too complex for me, and these types of things always take me too long to do, because I always get stuck at some point where the person giving instructions has some knowledge I don't. But I might try it if the path of least resistance turns out to be compiling it.

DukeComposed wrote: Sat Jun 14, 2025 11:27 pm Figuring out how to build software is tricky, no one's saying it's not. What I take umbrage with is the defense of "I don't expect people to do the work for me" after systematically shooting down everyone's suggestions to help you, Goldilocks-style, because they aren't juuust right.

Not shooting anyone down. I have listed requirements. In my previous thread, I specifically asked for something easy to install. Compiling code is not in that list, at least not to me.

DukeComposed wrote: Sat Jun 14, 2025 11:27 pm Asking people what software they use for their workflows is fine, and you have. I have evidence from previous posts that you're fishing for more than just opinions.

You can keep that to your wild imagination. I am after suggestions for software I can try. Nothing else.

DukeComposed wrote: Sat Jun 14, 2025 11:27 pm Casually shooting down each suggestion because you have some kind of bespoke agenda that that tool does not suit is ignorant, nasty, and IMO rather selfish. I read this as a setup. A trap you've laid to beg for good suggestions and then shoot each one you don't like in the head. But sure, call me the grumpy one for pointing out the trap and not for having laid it.

I will always choose the path of least resistance. I haven't dedicated my life to computer science. I have limited time to get things done. Contrary to what you might think, I have spent far too many hours trying to get stuff done, many times unsuccessfully. If you have nothing of value to contribute, it would be great if you just ignored my posts permanently. Because you have given me nothing of value. Forgive me for not wanting to compile code and finding an easier solution more in line with my requirements.

MX Linux Forum
