A multimodal deep learning project focused on detecting fake or manipulated audio-video content using both facial and voice-based analysis. This system combines computer vision and audio processing ...
Built a dual-branch deep learning model (CNN + Bi-LSTM) to classify emotions from speech — trained on IEMOCAP & MELD datasets. 🔧 Tech: PyTorch · Librosa · MFCC · Mel-spectrogram 📊 55.9% accuracy · ...
CEO@Radiotext - All-terrain texting radios for 3 billion people out of cellphone coverage ...
The self-collected audio dataset was split in an 80:20 ratio. The proposed Multi-QuadEmoNet and the other methods were run using Python 3.10.4. The software packages employed for this ...
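An 80:20 split like the one mentioned in this snippet could be sketched as follows. This is only an illustration under assumptions: the snippet does not say how the split was drawn or which portion was used for training, so the shuffle, the seed, the helper name `split_80_20`, and the file names are all hypothetical.

```python
import random


def split_80_20(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of `items` and cut it into two parts.

    Assumption: the larger part is the training portion. The source
    only states the 80:20 ratio, not how the parts were used.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical clip names standing in for the self-collected dataset.
clips = [f"clip_{i:03d}.wav" for i in range(100)]
train, test = split_80_20(clips)
print(len(train), len(test))
```

For speech data, splitting by speaker rather than by clip is often preferable so that the same voice does not appear in both parts; whether the original work did so is not stated.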
Acoustic feature extraction was carried out using librosa (for pitch, MFCC, mel-spectrogram, and spectral centroid) and openSMILE with the eGeMAPS feature set (for shimmer, harmonics-to-noise ratio, ...