Uzbek language AI speech technologies

Islomov Sardor
AI Enthusiast & Software Engineer | Click JSC

AI Enthusiast

Hey there! 👋 I'm an AI researcher with a background spanning software engineering, information security, and machine learning. While AI started as a hobby, it's grown into my passion project now that I've found time to return to the machine learning industry.

I'm currently working on Uzbek language speech technologies, specifically STT/TTS models. For now, I've decided to take the open-source route, publishing my work right here and on my Hugging Face account.

My goal? To contribute to Uzbekistan's emerging AI landscape. Because sometimes the most meaningful innovations start with a passion project!

Models

I'm developing a suite of speech AI models tailored specifically for the Uzbek language. Here are the current and upcoming models:

NavaiSTT-2v Medium

Available Now

A standard Whisper medium model fine-tuned specifically for the Uzbek language. The training data draws on diverse audio sources: publicly available podcasts, Tashkent-dialect podcasts, news content, Google FLEURS, USC, and Common Voice 17. Transcript quality is mixed: 50% of the data is human-transcribed and 50% is pseudo-transcribed with Gemini 2.5 Pro.
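To make the 50/50 split between human and pseudo transcripts concrete, here is a minimal sketch of how such a balanced training mix could be assembled. It is not the actual NavaiSTT pipeline: the `Utterance` record, the `balanced_mix` helper, and the file paths are all hypothetical, and the sketch balances by utterance count, whereas the real split may well be measured in hours of audio.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """One training example (hypothetical manifest layout)."""
    audio_path: str  # path to the audio clip
    text: str        # transcript for the clip
    source: str      # "human" or "pseudo"


def balanced_mix(human, pseudo, seed=0):
    """Return a shuffled list with equal numbers of human and pseudo items.

    The larger pool is downsampled to the size of the smaller one,
    so the result is split 50/50 by utterance count.
    """
    rng = random.Random(seed)  # fixed seed keeps the mix repeatable
    n = min(len(human), len(pseudo))
    mixed = rng.sample(human, n) + rng.sample(pseudo, n)
    rng.shuffle(mixed)
    return mixed


# Toy data standing in for real manifests (assumption, not real data).
human = [Utterance(f"human/{i}.wav", f"human transcript {i}", "human") for i in range(6)]
pseudo = [Utterance(f"pseudo/{i}.wav", f"pseudo transcript {i}", "pseudo") for i in range(10)]

train = balanced_mix(human, pseudo, seed=42)
share_human = sum(u.source == "human" for u in train) / len(train)
print(len(train), share_human)  # prints "12 0.5"
```

Keeping the `source` field on every record makes it possible to evaluate the two halves separately later, for example to check whether pseudo-transcribed data hurts accuracy on particular domains.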

Key Improvements from v1: This version fixes the problem cases found in v1 and generalizes better.

Fully Reproducible Open Source: Due to conflicts with data partners, v1 was removed and the 500-hour dataset was excluded. In its place, new and different datasets were used, all of which will be open-sourced. The training scripts will be open-sourced as well, making the entire process fully reproducible.

Dialect Coverage: The training data includes some popular Uzbek dialects, which broadens language coverage and improves performance across regional variations.


GapTTS-1v

Later This Year

GapTTS-1v is my upcoming Text-to-Speech project for the Uzbek language. While I have a clear vision and have gathered the necessary data for training, development will begin after I complete the current STT work. I'm planning to make GapTTS-1v open source upon completion, bringing natural Uzbek speech synthesis to the AI community.