Ankit Singh

I am currently working as a research engineer, where my focus is on developing multi-modal large language models (LLMs). Previously, I was a research student in the Department of Computer Science & Engineering, Indian Institute of Technology, Madras (IIT Madras), where I worked on computer vision and deep learning.
I did my undergraduate studies in Computer Science at the National Institute of Technology, Silchar (NIT Silchar).

Email / Twitter

Research

My research interests lie mainly in computer vision and deep learning. In particular, my current work focuses on vision-language models and label-efficient (semi-supervised / unsupervised / self-supervised) approaches to deep learning across images and videos.
In addition, I am interested in video understanding, representation learning, domain adaptation, and transfer learning.

Google Scholar / Github

Highlights

Two papers at CVPR 2026
Paper "Vision-Language Models Can't See the Obvious" at ICCV 2025
Paper "Harnessing Frozen Unimodal Encoders for Flexible Multimodal Alignment" at CVPR 2025
Paper on permutation symmetries in Bayesian neural network posteriors at NeurIPS 2023
Institute Research Award 2022, IIT Madras
Student Volunteer for NeurIPS 2021
Paper on Semi-Supervised Domain Adaptation accepted at NeurIPS 2021
Paper on Semi-Supervised Action Recognition accepted at CVPR 2021
Paper on Mitigating Data Imbalance accepted at ECCV-W 2020

Publications

Selected publications are listed here; for the full list, please see the Google Scholar link above.

On permutation symmetries in Bayesian neural network posteriors: a variational perspective
Simone Rossi, Ankit Singh and Thomas Hannagan
Neural Information Processing Systems (NeurIPS), 2023

In this work, we first extend the formalism of marginalized loss barrier and solution interpolation to BNNs, and then propose a matching algorithm to search for linearly connected solutions. This is achieved by aligning the distributions of two independent approximate Bayesian solutions with respect to permutation matrices.

CLDA: Contrastive Learning for Semi-Supervised Domain Adaptation
Ankit Singh
Neural Information Processing Systems (NeurIPS), 2021

We propose a contrastive framework for semi-supervised domain adaptation (SSDA) where we use instance alignment between unlabeled target samples and centroid alignment between source and target domains.

Semi-Supervised Action Recognition with Temporal Contrastive Learning
Ankit Singh*, Omprakash Chakraborty*, Ashutosh Varshney, Rameswar Panda, Rogerio Feris, Kate Saenko, Abir Das
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

We propose a temporal contrastive learning framework for semi-supervised action recognition by using contrastive losses between different videos and groups of videos with similar actions.

Mitigating Dataset Imbalance via Joint Generation and Classification
Aadarsh Sahoo*, Ankit Singh*, Rameswar Panda, Rogerio Feris, Abir Das
ECCV Workshop on Imbalance Problems in Computer Vision (ECCV-W), 2020

We introduce a joint dataset repair strategy that combines a classifier with a GAN, which makes up for the deficit of minority-class training examples by generating additional ones.

Services

Reviewer: CVPR, ICCV, ECCV, ICLR, NeurIPS, AAAI, WACV, ACCV, BMVC, TPAMI

Website template from here

* denotes equal contribution