Sayan Nag

Sayan Nag (সায়ন নাগ)

I am currently a Research Scientist at Adobe Research. I completed my PhD at the University of Toronto and my undergraduate studies in Electrical Engineering at Jadavpur University, India.

We are actively looking for research interns. If you are interested in doing research projects at Adobe or collaborating with me, please reach out with your CV.

Email | Google Scholar | Twitter | Github

Updates

Agentic-DRS is going to AAAI 2026! NEW
Localizing Knowledge in DiTs and MAGNET are accepted at NeurIPS 2025! NEW
AVTrustBench, AURELIA and EgoAdapt are going to ICCV 2025! NEW
Recognized as an outstanding reviewer at CVPR 2025!
SafaRi and Meerkat are accepted at ECCV 2024!
EgoVLPv2 received an EgoVis (Egocentric Vision) 2022/2023 Distinguished Paper award (news)!
VistaLLM and MeLFusion are selected as Highlights (Top 2.8% of submitted papers) at CVPR 2024!
VistaLLM and MeLFusion are accepted at CVPR 2024!
APoLLo has been accepted at EMNLP 2023!
VoLTA has been accepted at TMLR 2023!
EgoVLPv2 has been accepted at ICCV 2023!
Joined Adobe Research as a research intern!
BeAts has been accepted at Interspeech 2023 as an Oral presentation!
DeCAtt has been accepted at CVPRw 2023 as an Oral presentation!
IDEAL has been accepted at ICASSP 2023!
Our paper on SERF: Towards better training of deep neural networks using log-Softplus ERror activation Function has been accepted at WACV 2023!
Our paper on Deciphering Environmental Air Pollution with Large Scale City Data has been accepted at IJCAI 2022 as an Oral Presentation!
Our abstract on Fast and scalable estimation of effective connectivity using Neural Network aided P-DCM has been accepted at OHBM 2022!
My paper on Graph Self Supervised Learning: the BT, the HSIC, and the VICReg was presented at the IJCAI Weakly Supervised Representation Learning Workshop 2021!
Our paper on CDF-Net: Cross-Domain Fusion Network for Accelerated MRI Reconstruction was presented at MICCAI 2020!

Research

My research interests broadly include Computer Vision, Self-supervised Learning, Multimodal Learning, Time-series Modeling and Natural Language Understanding. Previously, I also worked on ML for Climate Change, ML for Health, Approximate Optimization Algorithms and Image Processing. Some recent representative papers are listed below. Other publications can be found on my Google Scholar page.

Agentic Design Review System
Sayan Nag, K J Joseph, Koustava Goswami, Vlad I Morariu, Balaji Vasan Srinivasan
AAAI, 2026
Paper | Project

Localizing Knowledge in Diffusion Transformers
Arman Zarei, Samyadeep Basu, Keivan Rezaei, Zihao Lin, Sayan Nag, Soheil Feizi
NeurIPS, 2025
Paper | Project | Code

MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks
Sanjoy Chowdhury, Mohamed Elmoghany, Yohan Abeysinghe, Junjie Fei, Sayan Nag, Salman Khan, Mohamed Elhoseiny, Dinesh Manocha
NeurIPS, 2025
Paper | Project

AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs
Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta, Yaoting Wang, Mohamed Elhoseiny, Ruohan Gao, Dinesh Manocha
ICCV, 2025
Paper | Project | Code

AURELIA: Test-time Reasoning Distillation in Audio-Visual LLMs
Sanjoy Chowdhury, Hanan Gani, Nishit Anand, Sayan Nag, Ruohan Gao, Mohamed Elhoseiny, Salman Khan, Dinesh Manocha
ICCV, 2025
Paper | Project | Code

EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception
Sanjoy Chowdhury, Subrata Biswas, Sayan Nag, Tushar Nagarajan, Calvin Murdock, Ishwarya Ananthabhotla, Yijun Qian, Vamsi Krishna Ithapu, Dinesh Manocha, Ruohan Gao
ICCV, 2025
Paper | Project | Code

SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation
Sayan Nag, Koustava Goswami, Srikrishna Karanam
ECCV, 2024
Paper | Project

Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time
Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta, Jun Chen, Mohamed Elhoseiny, Ruohan Gao, Dinesh Manocha
ECCV, 2024
Paper | Project | Code

MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models
Sanjoy Chowdhury, Sayan Nag, Joseph KJ, Balaji Vasan Srinivasan, Dinesh Manocha
CVPR, 2024 (Highlight, Top 2.8%)
Paper | Project | Code | Dataset

Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model
Shraman Pramanick, Guangxing Han, Rui Hou, Sayan Nag, Ser-Nam Lim, Nicolas Ballas, Qifan Wang, Rama Chellappa, Amjad Almahairi
CVPR, 2024 (Highlight, Top 2.8%)
Paper | Project | Code (Coming Soon)

APoLLo: Unified Adapter and Prompt Learning for Vision Language Models
Sanjoy Chowdhury, Sayan Nag, Dinesh Manocha
EMNLP, 2023
Paper | Code | Project

EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone
Shraman Pramanick, Yale Song, Sayan Nag, Kevin Qinghong Lin, Hardik Shah, Mike Z. Shou, Rama Chellappa, Pengchuan Zhang
ICCV, 2023
Paper | Code | Project

VoLTA: Vision-Language Transformer with Weakly-Supervised Local-Feature Alignment
Shraman Pramanick, Li Jing, Sayan Nag, Jiachen Zhu, Hardik Shah, Yann LeCun, Rama Chellappa
TMLR, 2023
Paper | Code | Project

BeAts: Bengali Speech Acts Recognition using Multimodal Attention Fusion
Ahana Deb, Sayan Nag, Ayan Mahapatra, Soumitri Chattopadhyay, Aritra Marik, Pijush Kanti Gayen, Shankha Sanyal, Archi Banerjee, Samir Karmakar
Interspeech, 2023 (Oral Presentation)
Paper | Project

IDEAL: Improved DEnse locAL Contrastive Learning for Semi-Supervised Medical Image Segmentation
Hritam Basak, Soumitri Chattopadhyay, Rohit Kundu, Sayan Nag, Rammohan Mallipeddi
ICASSP, 2023
Paper | Code | Project

DeCAtt: Efficient Vision Transformers with Decorrelated Attention Heads
Mayukh Bhattacharyya, Soumitri Chattopadhyay, Sayan Nag*
CVPRW, 2023 (Oral Presentation)
Paper

Exploring Self-Supervised Representation Learning For Low-Resource Medical Image Analysis
Soumitri Chattopadhyay, Soham Ganguly, Sreejit Chaudhury, Sayan Nag, Samiran Chattopadhyay
ICIP, 2023
Paper | Code

Deciphering Environmental Air Pollution with Large Scale City Data
Mayukh Bhattacharyya, Sayan Nag, Udita Ghosh
IJCAI, 2022 (Spotlight & Oral Presentation)
* denotes equal contribution
Paper | Project | Code
Air pollutant forecasting using a novel cosSquareFormer.

SERF: Towards better training of deep neural networks using log-Softplus ERror activation Function
Sayan Nag, Mayukh Bhattacharyya, Anuraag Mukherjee, Rohit Kundu
WACV, 2023
Paper

Graph Self Supervised Learning: the BT, the HSIC, and the VICReg
Sayan Nag
IJCAI Weakly Supervised Representation Learning Workshop (IJCAI-WSRL), 2021
Paper | Poster
Self-Supervised Learning using Graph Neural Networks and a hybrid VICRegHSIC loss function.

CDF-Net: Cross-Domain Fusion Network for Accelerated MRI Reconstruction
Osvald Nitski, Sayan Nag, Chris McIntosh, Bo Wang
MICCAI, 2020
Paper

Hybrid Style Siamese Network: Incorporating style loss in complementary apparels retrieval
Mayukh Bhattacharyya, Sayan Nag
CVPR Workshop on Computer Vision for Fashion, Art and Design, 2020
Paper | Code | Video

On the application of deep learning and multifractal techniques to classify emotions and instruments using Indian Classical Music
Sayan Nag, Medha Basu, Shankha Sanyal, Archi Banerjee, Dipak Ghosh
Physica A: Statistical Mechanics and its Applications, 2022
Paper
A new dataset comprising Indian Classical Music clips is proposed along with a Neural ODE based architecture for MER and MIR tasks.

Some Fun Projects

This section collects some fun projects that I have undertaken in my leisure time.

Colorization of Old Movies using Deep Learning
Sayan Nag
video

This video is a short clip from the movie Nayak, showing the famous money scene. The colorization was done with a self-attention GAN (SAGAN), which colors and restores old images and videos. Nayak is considered one of the most iconic films in the history of Bengali cinema. It is also one of my grandmom's favorites.

Neural Style Transfer with My Paintings
Sayan Nag

In this small fun project, I used my paintings for Neural Style Transfer.

Template borrowed from Jon Barron's website.