Librosa Spectrogram Python

Multimodal Face & Voice Anti-Spoofing Detection System

A multimodal deep learning project focused on detecting fake or manipulated audio-video content using both facial and voice-based analysis. This system combines computer vision and audio processing ...

BERT vs LSTM for Small Datasets

Built a dual-branch deep learning model (CNN + Bi-LSTM) to classify emotions from speech — trained on IEMOCAP & MELD datasets. 🔧 Tech: PyTorch · Librosa · MFCC · Mel-spectrogram 📊 55.9% accuracy · ...

GitHub

Faster preprocessing when using new Parakeet TDT please #46247

Python DSP Libraries for Signal Processing and Engineering

CEO@Radiotext - All-terrain texting radios for 3 billion people out of cellphone coverage ...

Frontiers

Multi-QuadEmoNet: cat and dog emotion classification model from animal vocalization using multi-stage LSTM-GRU paradigm

The self-collected audio dataset was split in an 80:20 ratio. The proposed Multi-QuadEmoNet and the other methods were evaluated using Python 3.10.4. The software packages employed for this ...

Frontiers

AI-assisted vocal emotion analysis in forensic interview with children: an exploratory study

Acoustic feature extraction was carried out using librosa (for pitch, MFCC, mel-spectrogram, and spectral centroid) and openSMILE with the eGeMAPS feature set (for shimmer, harmonics-to-noise ratio, ...
