Sayan Nag (সায়ন নাগ)

I am a PhD student at the University of Toronto, where I work on Artificial Intelligence and Neuroscience. I completed my undergraduate studies in Electrical Engineering at Jadavpur University, India.

Email   |   Google Scholar   |   Twitter   |   Github

Updates
  • SafaRi and Meerkat have been accepted at ECCV 2024!
  • EgoVLPv2 has been named an EgoVis (Egocentric Vision) 2022/2023 Distinguished Paper (news)!
  • VistaLLM and MeLFusion have been selected as Highlights (top 2.8% of submitted papers) at CVPR 2024!
  • VistaLLM and MeLFusion have been accepted at CVPR 2024!
  • APoLLo has been accepted at EMNLP 2023!
  • VoLTA has been accepted at TMLR 2023!
  • EgoVLPv2 has been accepted at ICCV 2023!
  • Joined Adobe Research as a research intern!
  • BeAts has been accepted at Interspeech 2023 as an Oral presentation!
  • DeCAtt has been accepted at CVPRW 2023 as an Oral presentation!
  • IDEAL has been accepted at ICASSP 2023!
  • Our paper on SERF: Towards better training of deep neural networks using log-Softplus ERror activation Function has been accepted at WACV 2023!
  • Our paper on Deciphering Environmental Air Pollution with Large Scale City Data has been accepted at IJCAI 2022 as an Oral presentation!
  • Our abstract on Fast and scalable estimation of effective connectivity using Neural Network aided P-DCM has been accepted at OHBM 2022!
  • My paper on Graph Self Supervised Learning: the BT, the HSIC, and the VICReg was presented at the IJCAI Weakly Supervised Representation Learning Workshop 2021!
  • Our paper on CDF-Net: Cross-Domain Fusion Network for Accelerated MRI Reconstruction was presented at MICCAI 2020!
Research

My research interests broadly include Computer Vision, Self-supervised Learning, Multimodal Learning, Time-series Modeling, and Natural Language Understanding. I also work on ML for Climate Change and ML for Health. Previously, I worked on Approximate Optimization Algorithms and Image Processing. Some recent representative papers are listed below. Other publications can be found on my Google Scholar profile.

SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation
Sayan Nag, Koustava Goswami, Srikrishna Karanam
ECCV, 2024

Paper | Project | Code (Coming Soon)

Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time
Sanjoy Chowdhury*, Sayan Nag*, Subhrajyoti Dasgupta*, Jun Chen, Mohamed Elhoseiny, Ruohan Gao, Dinesh Manocha
ECCV, 2024

Paper | Project (Coming Soon) | Code (Coming Soon)

MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models
Sanjoy Chowdhury*, Sayan Nag*, Joseph KJ, Balaji Vasan Srinivasan, Dinesh Manocha
CVPR, 2024   (Highlight, Top 2.8%)

Paper | Project | Code | Dataset

Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model
Shraman Pramanick*, Guangxing Han*, Rui Hou, Sayan Nag, Ser-Nam Lim, Nicolas Ballas, Qifan Wang, Rama Chellappa, Amjad Almahairi
CVPR, 2024   (Highlight, Top 2.8%)

Paper | Project | Code (Coming Soon)

APoLLo: Unified Adapter and Prompt Learning for Vision Language Models
Sanjoy Chowdhury*, Sayan Nag*, Dinesh Manocha
EMNLP, 2023

Paper | Code | Project

EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone
Shraman Pramanick, Yale Song, Sayan Nag, Kevin Qinghong Lin, Hardik Shah, Mike Z. Shou, Rama Chellappa, Pengchuan Zhang
ICCV, 2023

Paper | Code | Project

VoLTA: Vision-Language Transformer with Weakly-Supervised Local-Feature Alignment
Shraman Pramanick*, Li Jing*, Sayan Nag*, Jiachen Zhu, Hardik Shah, Yann LeCun, Rama Chellappa
TMLR, 2023

Paper | Code | Project

BeAts: Bengali Speech Acts Recognition using Multimodal Attention Fusion
Ahana Deb*, Sayan Nag*, Ayan Mahapatra*, Soumitri Chattopadhyay*, Aritra Marik*, Pijush Kanti Gayen, Shankha Sanyal, Archi Banerjee, Samir Karmakar
Interspeech, 2023   (Oral Presentation)

Paper | Project

IDEAL: Improved DEnse locAL Contrastive Learning for Semi-Supervised Medical Image Segmentation
Hritam Basak, Soumitri Chattopadhyay*, Rohit Kundu*, Sayan Nag*, Rammohan Mallipeddi
ICASSP, 2023

Paper | Code | Project

DeCAtt: Efficient Vision Transformers with Decorrelated Attention Heads
Mayukh Bhattacharyya*, Soumitri Chattopadhyay*, Sayan Nag*
CVPRW, 2023   (Oral Presentation)

Paper

Exploring Self-Supervised Representation Learning For Low-Resource Medical Image Analysis
Soumitri Chattopadhyay, Soham Ganguly*, Sreejit Chaudhury*, Sayan Nag*, Samiran Chattopadhyay
ICIP, 2023

Paper | Code

Deciphering Environmental Air Pollution with Large Scale City Data
Mayukh Bhattacharyya*, Sayan Nag*, Udita Ghosh
IJCAI, 2022   (Spotlight & Oral Presentation)

* denotes equal contribution

Paper | Project | Code

Air pollutant forecasting using a novel cosSquareFormer.

SERF: Towards better training of deep neural networks using log-Softplus ERror activation Function
Sayan Nag*, Mayukh Bhattacharyya*, Anuraag Mukherjee*, Rohit Kundu*
WACV, 2023
Paper

Graph Self Supervised Learning: the BT, the HSIC, and the VICReg
Sayan Nag
IJCAI Weakly Supervised Representation Learning Workshop (IJCAI-WSRL), 2021
Paper | Poster

Self-Supervised Learning using Graph Neural Networks and a hybrid VICRegHSIC loss function.

CDF-Net: Cross-Domain Fusion Network for Accelerated MRI Reconstruction
Osvald Nitski, Sayan Nag, Chris McIntosh, Bo Wang
MICCAI, 2020
Paper

Hybrid Style Siamese Network: Incorporating style loss in complementary apparels retrieval
Mayukh Bhattacharyya, Sayan Nag
CVPR Workshop on Computer Vision for Fashion, Art and Design, 2020
Paper | Code | Video

On the application of deep learning and multifractal techniques to classify emotions and instruments using Indian Classical Music
Sayan Nag, Medha Basu, Shankha Sanyal, Archi Banerjee, Dipak Ghosh
Physica A: Statistical Mechanics and its Applications, 2022
Paper

A new dataset of Indian Classical Music clips is proposed, along with a Neural ODE-based architecture for MER and MIR tasks.

Some Fun Projects

This section lists some fun projects that I have undertaken in my leisure time.

Colorization of Old Movies using Deep Learning
Sayan Nag
video

This video is a short clip from the movie Nayak, showing the famous money scene. It was produced with a self-attention GAN (SAGAN), which colors and restores old images and videos. The movie is considered one of the most iconic films in the history of Bengali cinema. It is also one of my grandmom's favorites.

Neural Style Transfer with My Paintings
Sayan Nag

In this small fun project, I used my own paintings as style images for Neural Style Transfer.

Template borrowed from Jon Barron's website.