I’m an M.Sc. Data Science student with a strong foundation in Statistics, passionate about building intelligent systems that create real impact. With hands-on experience in ML, DL, and Generative AI, I love writing clean code—and more importantly, explaining it in a way that makes sense to others.
I specialize in:
- Predictive Modeling & ML Pipelines
- Multimodal Systems (Text, Image, Audio)
- LLMs, RAG, and GenAI Applications
Skilled in Python, R, and SQL, with expertise in Scikit-learn, XGBoost, and Hugging Face Transformers, as well as tools like Whisper and CLIP. My academic journey in Statistics helped me master complex analytical concepts and apply them in real-world ML workflows. Currently building projects that blend theory with practice, transforming raw data into strategic insights. I thrive on learning, teaching, and collaborating with teams that push the boundaries of AI.
Let’s build something intelligent together.
📖 Latest Blog Articles
📈 My GitHub Stats
My research lies at the intersection of Natural Language Processing, Multimodal Learning, and Mental Health AI, with a focus on weak supervision, representation learning, and foundation-model adaptation under data scarcity.
Identifying Severity of Depression in Forum Posts
Zafar Sarif, Sannidhya Das, Abhishek Das, Md Fahin Parvej, Dipankar Das
· 📘 RANLP 2025 (Workshop on NLP & Language Models for Digital Humanities)
🔗 Paper: https://acl-bg.org/proceedings/2025/LM4DH%202025/pdf/2025.lm4dh-1.12.pdf
- Proposed a two-stage weakly supervised framework for depression severity classification without annotated training data.
- Used BART-MNLI for zero-shot pseudo-label generation and DistilBERT fine-tuning for multi-class prediction (a sketch of the pseudo-labelling step follows this entry).
- Demonstrated the effectiveness of zero-shot learning + weak supervision for low-resource mental health NLP tasks.
- Results: 92% internal accuracy; 28.9% accuracy on the official blind test set.
Keywords: NLP · Weak Supervision · Mental Health AI · Transformers
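
A minimal sketch of the zero-shot pseudo-labelling step described above, using the Hugging Face zero-shot-classification pipeline with `facebook/bart-large-mnli`. The severity labels and the example post are illustrative placeholders and are not taken from the paper.

```python
# Minimal sketch: zero-shot pseudo-labelling with BART-MNLI (labels and text are illustrative).
from transformers import pipeline

# Zero-shot classifier built on an NLI model; no annotated training data required.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Hypothetical severity labels; the paper's actual label set may differ.
severity_labels = ["minimal depression", "moderate depression", "severe depression"]

post = "Lately I can't get out of bed and nothing feels worth doing."
result = classifier(post, candidate_labels=severity_labels)

# The highest-scoring label becomes the pseudo-label later used to fine-tune
# a smaller model (e.g. DistilBERT) for multi-class severity prediction.
pseudo_label = result["labels"][0]
print(pseudo_label, round(result["scores"][0], 3))
```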
From Voice to Vision: A Multimodal Approach to Speech Emotion Recognition
Sannidhya Das, Dipanjan Saha, Subharthi Ray, Sainik Kumar Mahata, Dipankar Das
· 📘 SPELLL 2025 (Accepted)
- Reframed Speech Emotion Recognition using spectrograms as visual surrogates for emotional states.
- Introduced class-wise PCA to preserve emotion-discriminative acoustic representations (sketched after this entry).
- Performed systematic unimodal and multimodal evaluations on the MELD dataset.
- Explored CLIP + Whisper for modality-aware audio–text fusion under class imbalance.
- Results: 0.4975 Macro F1 (text+audio); 0.38 Macro F1 (CLIP–Whisper).
Keywords: Multimodal Learning · SER · CLIP · Whisper · Affective Computing
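
A minimal sketch of class-wise PCA as described above, assuming spectrograms have already been flattened into per-utterance feature vectors; the array shapes, class count, and component count are illustrative, not taken from the paper.

```python
# Minimal sketch: class-wise PCA on flattened spectrogram features (shapes are illustrative).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 1024))   # 300 utterances, 1024-dim flattened spectrogram features
y = rng.integers(0, 3, size=300)   # 3 emotion classes (toy labels)

# Fit one PCA per emotion class so each projection preserves the variance
# structure of that class rather than of the pooled data.
class_pcas = {}
for label in np.unique(y):
    pca = PCA(n_components=32)
    pca.fit(X[y == label])
    class_pcas[label] = pca

# Project each training sample with the PCA fitted on its own class.
X_reduced = np.vstack([class_pcas[label].transform(x[None, :]) for x, label in zip(X, y)])
print(X_reduced.shape)  # (300, 32)
```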
Interests:
Agentic Solutions · RAG-LLM Pipelines · Multimodal Representation Learning · Weakly Supervised Learning · Affective Computing · Foundation Models · Speech · NLP
📝 Want To Know More About Me?





