I am a Pre-Doctoral Researcher at Google Research India, where I work with Dr. Pradeep Shenoy on cognitively inspired deep neural networks.

I am interested in working on problems at the intersection of Deep Learning and Neuroscience, with applications in Vision. My research goal is a comparative understanding of primate and machine vision, in order to build robust and interpretable cognitively inspired ML models.
I am also interested in applying ML to help people with communication disorders and to promote and ensure inclusivity.

At Google, I have had the opportunity to work on projects in collaboration with Prof. Ravi Kiran Sarvadevabhatla (IIIT Hyderabad), Prof. Virginia De Sa (UC San Diego), Dr. Mike Mozer (Google Research), and Prof. Wieland Brendel (University of Tübingen).

I was fortunate to pursue my master's thesis at the Indian Institute of Technology, Delhi under the guidance of Prof. Sumeet Agrawal. Before that, I spent the summers of 2017 and 2018 at Google, Mountain View as a Software Engineering Intern, and the fall semester of 2016 as an exchange student at KTH Royal Institute of Technology.

I was awarded the Summer Undergraduate Research Award for parallelizing INSPECT, a parallel-program verification engine, achieving up to 5x gains in runtime, under Prof. Subodh Sharma during the summer and fall of 2016.

I graduated with a Dual Degree (Bachelor and Master of Technology) in Computer Science and Engineering from the Indian Institute of Technology, Delhi in 2019. For more details, check my CV or send me an email.

Publications

Robustifying Deep Vision Models Through Shape Sensitization
Aditay Tripathi, Rishubh Singh, Anirban Chakraborty, Pradeep Shenoy
arXiv preprint
abstract | pdf | cite

FLOAT: Factorized Learning of Object Attributes for Improved Multi-object Multi-part Scene Parsing
Rishubh Singh, Pranav Gupta, Pradeep Shenoy, Ravi Kiran Sarvadevabhatla
Proceedings of CVPR 2022
project page | pdf | cite

How much complexity does an RNN architecture need to learn syntax-sensitive dependencies?
Gantavya Bhatt*, Hritik Bansal*, Rishubh Singh*, Sumeet Agrawal (* = Equal Contribution)
ACL'20: Student Research Workshop | Annual Conference of the Association for Computational Linguistics
abstract | pdf | cite

Ongoing Projects

Understanding the transfer gap between models learned from videos and images

  • Humans learn from video-style data, which is believed to provide richer learning cues and to generalize well to images.
  • In contrast, models learned from video data do not transfer well to image data.
  • Recent work has tried to close this gap by reducing distribution shift, but still needs additional artificial augmentations to compete with image-based methods.
  • Our initial experiments show a larger gap for video-to-image transfer than for image-to-image transfer under distribution shift.
  • We are categorising the underlying causes and experimenting with remedies (see the illustrative sketch after this list).
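
As a rough illustration of how such a transfer gap can be quantified, the sketch below fits a linear probe on frozen features from a pretrained backbone and compares test accuracy across backbones. The backbones, data loaders, and hyperparameters here are placeholders, not the actual setup used in this project.

```python
# Hedged sketch: measuring a video-to-image vs. image-to-image transfer gap
# with linear probes. Backbones and datasets are hypothetical placeholders.
import torch
import torch.nn as nn

def linear_probe_accuracy(backbone, train_loader, test_loader,
                          feat_dim, num_classes, epochs=10, lr=1e-3):
    """Freeze `backbone`, fit a linear classifier on its features,
    and report test accuracy -- a standard proxy for transferability."""
    backbone.eval()
    probe = nn.Linear(feat_dim, num_classes)
    opt = torch.optim.Adam(probe.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()

    for _ in range(epochs):
        for images, labels in train_loader:
            with torch.no_grad():
                feats = backbone(images)           # frozen features
            loss = loss_fn(probe(feats), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()

    correct = total = 0
    with torch.no_grad():
        for images, labels in test_loader:
            preds = probe(backbone(images)).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / total

# The "transfer gap" is then the accuracy difference between an
# image-pretrained and a video-pretrained backbone on the same image task:
# gap = linear_probe_accuracy(image_backbone, ...) \
#     - linear_probe_accuracy(video_backbone, ...)
```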

Object Centric Learning for Robust and Interpretable Image Classification

  • Designing object-centric bottlenecks that produce robust latent representations for classification (a minimal sketch follows this list).
  • Creating segmentation sub-tasks to make the model's predictions more interpretable.
  • Applying part-based feature generation and contrastive learning via graph matching for improved generalization.
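
The sketch below shows one possible form of such a bottleneck: learned slot queries cross-attend to a convolutional feature map, and the classifier only sees the resulting slot vectors, whose attention maps double as coarse per-object masks. The encoder, slot count, and attention pooling are illustrative assumptions, not this project's actual architecture.

```python
# Hedged sketch of an object-centric bottleneck for image classification.
import torch
import torch.nn as nn

class ObjectCentricClassifier(nn.Module):
    def __init__(self, num_classes, num_slots=8, dim=128):
        super().__init__()
        self.encoder = nn.Sequential(              # toy convolutional encoder
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.slots = nn.Parameter(torch.randn(num_slots, dim))  # learned object queries
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        feats = self.encoder(x)                    # (B, D, H, W)
        B, D, H, W = feats.shape
        tokens = feats.flatten(2).transpose(1, 2)  # (B, H*W, D) spatial tokens
        queries = self.slots.unsqueeze(0).expand(B, -1, -1)
        slots, attn_maps = self.attn(queries, tokens, tokens)  # (B, K, D), (B, K, H*W)
        logits = self.head(slots.mean(dim=1))      # classify from slot summary only
        # attn_maps can be reshaped to (B, K, H, W) and inspected as coarse
        # per-slot masks, which is where a segmentation sub-task could hook in.
        return logits, attn_maps

logits, attn = ObjectCentricClassifier(num_classes=10)(torch.randn(2, 3, 64, 64))
```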

Explicit Orientation Learning for Object-Part Segmentation and Spatial Pose Understanding

  • Humans and other primates have an inherent 3D understanding of the world that is built around the orientation of objects.
  • Explicitly forcing the model to learn the orientation of objects and object parts through architectural changes improves spatial understanding and part segmentation (an illustrative sketch follows below).
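
The sketch below shows one way such an explicit orientation signal could be attached: an auxiliary head regresses a (sin, cos) encoding of object orientation alongside a part-segmentation head, with both losses sharing the encoder. The heads, the encoding, and the loss weighting are assumptions for illustration, not the project's exact design.

```python
# Hedged sketch: joint part segmentation and explicit orientation prediction.
import torch
import torch.nn as nn

class SegmentationWithOrientation(nn.Module):
    def __init__(self, num_parts, dim=64):
        super().__init__()
        self.encoder = nn.Sequential(              # toy fully convolutional encoder
            nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
        )
        self.seg_head = nn.Conv2d(dim, num_parts, 1)      # per-pixel part logits
        self.orient_head = nn.Sequential(                 # global (sin, cos) of orientation
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(dim, 2)
        )

    def forward(self, x):
        feats = self.encoder(x)
        return self.seg_head(feats), self.orient_head(feats)

def joint_loss(seg_logits, part_labels, orient_pred, orient_angle, w=0.1):
    """Cross-entropy on part masks plus a regression loss that ties the
    shared features to the object's orientation (encoded as sin/cos)."""
    seg_loss = nn.functional.cross_entropy(seg_logits, part_labels)
    target = torch.stack([torch.sin(orient_angle), torch.cos(orient_angle)], dim=1)
    orient_loss = nn.functional.mse_loss(orient_pred, target)
    return seg_loss + w * orient_loss

model = SegmentationWithOrientation(num_parts=5)
seg, orient = model(torch.randn(2, 3, 32, 32))
loss = joint_loss(seg, torch.zeros(2, 32, 32, dtype=torch.long),
                  orient, torch.rand(2) * 3.1416)
```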

Timeline

  • Indian Institute of Technology, Delhi: 2014 - 2019
  • KTH Royal Institute of Technology, Stockholm: Fall 2016
  • Google, Mountain View: Summers of 2017 and 2018
  • Graviton Research Capital LLP: 2019 - 2020
  • Google Research India: 2020 - Present