[ML] Fine-tuning the F5-TTS Model

jungh150c 2024. 11. 18. 19:53

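Last time I thought I had fine-tuned the model properly, but all I got was strange robotic noise, so it ended in failure. I then went through the GitHub issues one by one and came across a shocking post..!!!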

https://github.com/SWivid/F5-TTS/discussions/57#discussioncomment-10980454

Finetune practice · SWivid F5-TTS · Discussion #57

Full finetune is currently supported, lora or adapter not yet. Set checkpoint_path to pretrained model dir in test_train.py, model/trainer.py will load from there to resume. Reuse the vocab.txt und...

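To summarize roughly: what I had done was train from scratch with no base model, whereas fine-tuning means downloading the F5-TTS base model and training on top of it..!!! In other words, I had just been wasting my time.

So, following that post, I'm going to try fine-tuning properly this time.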

Downloading the F5-TTS base model

https://github.com/SWivid/F5-TTS

GitHub - SWivid/F5-TTS: Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

In the README here, the News section near the very top has a link for downloading the F5-TTS & E2 TTS base models.

https://huggingface.co/SWivid/F5-TTS

SWivid/F5-TTS · Hugging Face

Download model_1200000.pt.

Put it in the location shown above.
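Before launching training, it may be worth confirming that the checkpoint really is where you think it is, since a missing or half-downloaded file is easy to overlook. Here is a minimal sketch in Python. The path in the usage part is a placeholder, not the actual location used in this post, and `describe_checkpoint` is a hypothetical helper I made up for illustration, not part of F5-TTS.

```python
from pathlib import Path


def describe_checkpoint(path):
    """Return (exists, size_in_mib) for a checkpoint file.

    Hypothetical helper for illustration; not part of F5-TTS.
    """
    p = Path(path)
    if not p.is_file():
        return False, 0.0
    return True, p.stat().st_size / (1024 * 1024)


if __name__ == "__main__":
    # Placeholder path: replace with wherever you put model_1200000.pt.
    exists, size_mib = describe_checkpoint("path/to/model_1200000.pt")
    if exists:
        print(f"checkpoint found, {size_mib:.1f} MiB")
    else:
        print("checkpoint not found; check the path before training")
```

If the reported size is far smaller than the file listed on Hugging Face, the download was probably interrupted.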

Fine-tuning the F5-TTS base model

accelerate config

Configure it as shown above.

accelerate launch src/f5_tts/train/finetune_cli.py --exp_name F5TTS_Base --learning_rate 0.00001 --batch_size_per_gpu 400 --batch_size_type frame --max_samples 64 --grad_accumulation_steps 4 --max_grad_norm 1 --epochs 10 --num_warmup_updates 500 --save_per_updates 10000 --last_per_steps 20000 --dataset_name my_dataset_char --finetune True

Run it like this. I kept running out of GPU memory, so I repeatedly adjusted the --batch_size_per_gpu and --grad_accumulation_steps values until training fit.
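To see what these two flags trade off, here is a rough sketch of the arithmetic. It assumes that with --batch_size_type frame the batch size counts mel-spectrogram frames, and that the audio is 24 kHz with a hop length of 256 samples. Those numbers are my assumptions about the default F5-TTS settings, so check them against your own config. The function names are made up for this example.

```python
def frames_per_update(batch_size_per_gpu, grad_accumulation_steps, num_gpus=1):
    """Frames that contribute to one optimizer update.

    Gradient accumulation sums gradients over several small batches,
    so the update sees more frames without using more memory at once.
    """
    return batch_size_per_gpu * grad_accumulation_steps * num_gpus


def frames_to_seconds(frames, sample_rate=24000, hop_length=256):
    """Convert mel frames to seconds of audio.

    sample_rate and hop_length are assumed defaults, not verified values.
    """
    return frames * hop_length / sample_rate


if __name__ == "__main__":
    # Values from the command above, on a single GPU.
    total = frames_per_update(batch_size_per_gpu=400, grad_accumulation_steps=4)
    print(total)                                # 1600
    print(f"{frames_to_seconds(400):.2f} s")    # 4.27 s
    print(f"{frames_to_seconds(total):.2f} s")  # 17.07 s
```

Under these assumptions a 400-frame batch holds only a few seconds of audio. Lowering --batch_size_per_gpu saves memory but leaves little room for longer clips, while raising --grad_accumulation_steps keeps the frames per update up without needing more memory at once.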

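The fine-tuning itself ran fine, but for some reason no samples were generated, so I couldn't check the results. According to one teammate, the pre-trained model actually produces more accurate audio than the fine-tuned one :(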
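One thing that might be worth checking, offered as a guess rather than a diagnosis: whether the run ever reached --save_per_updates 10000. If the trainer only writes samples when it saves a checkpoint (an assumption about its behavior that I have not verified), a short run would finish without producing any. The sketch below estimates the number of optimizer updates. The batches-per-epoch figure is an invented example, not a number from my run, and the function names are hypothetical.

```python
def total_updates(batches_per_epoch, epochs, grad_accumulation_steps):
    """Approximate number of optimizer updates in a whole run."""
    return (batches_per_epoch * epochs) // grad_accumulation_steps


def reaches_save_interval(batches_per_epoch, epochs,
                          grad_accumulation_steps, save_per_updates):
    """True if the run is long enough to hit the save interval at least once."""
    updates = total_updates(batches_per_epoch, epochs, grad_accumulation_steps)
    return updates >= save_per_updates


if __name__ == "__main__":
    # batches_per_epoch=2000 is a made-up figure for illustration only.
    print(total_updates(2000, 10, 4))                 # 5000
    print(reaches_save_interval(2000, 10, 4, 10000))  # False
```

If the estimate for your dataset comes out below the save interval, lowering --save_per_updates is a cheap experiment to try.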
