I am a
Pre-Doctoral Researcher
at
Google Research India, where I work with
Dr. Pradeep Shenoy on cognitively inspired deep neural networks.
I am interested in working on problems at the intersection of Deep Learning and Neuroscience
with applications in Vision. My research goal is the comparative understanding of primate and machine vision to create robust
and interpretable
cognitively-inspired ML models.
I am also interested in the application of ML to help people with communication
disorders and to promote and ensure inclusivity.
At Google, I have had the opportunity to work on projects in collaboration with
Prof. Ravi Kiran Sarvadevabhatla (IIIT Hyderabad),
Prof. Virginia De Sa (UC San Diego),
Dr. Mike Mozer (Google Research) and
Prof. Wieland Brendel (University of Tübingen).
I was fortunate to pursue my master's thesis at the
Indian Institute of Technology, Delhi under
the guidance of
Prof. Sumeet
Agrawal. Before that, I spent the summers of 2018 and 2017 at Google, Mountain View as a Software Engineering Intern.
I spent the fall semester of 2016 as an exchange student at
KTH Royal Institute of Technology.
I was awarded the
Summer Undergraduate Research Award for successfully parallelizing INSPECT, a parallel-program verification engine, with up to 5x gains in runtime
under
Prof. Subodh Sharma during the summer and fall of 2016.
I graduated with a Dual Degree, Bachelor and Master of Technology in Computer Science and Engineering from
Indian Institute of Technology, Delhi, India in 2019. For more
details, check my
CV or send me
an
email.
Robustifying Deep Vision Models Through Shape Sensitization
Aditay Tripathi, Rishubh Singh, Anirban Chakraborty, Pradeep Shenoy
ArXiv Preprint
abstract |
pdf |
cite
FLOAT: Factorized Learning of Object Attributes for Improved Multi-object Multi-part Scene Parsing
Rishubh Singh, Pranav Gupta, Pradeep Shenoy, Ravi Kiran Sarvadevabhatla
Proceedings of CVPR 2022
project page |
pdf |
cite
How much complexity does an RNN architecture need to learn syntax-sensitive dependencies?
Gantavya Bhatt*, Hritik Bansal*, Rishubh Singh*, Sumeet Agrawal
(* = Equal Contribution)
ACL'20 :
Student Research Workshop |
Annual Conference of the Association for Computational Linguistics
abstract |
pdf |
cite
Understanding the transfer gap between models learned from videos and images
Humans learn from video-style data, which is believed to offer richer learning cues and to generalize well to images.
In contrast, models trained on video data do not transfer well to image data.
Recent work has tried to close this gap by reducing distribution shift, but such methods still need additional artificial augmentations to compete with image-based methods.
Our initial experiments demonstrate a larger gap for video-to-image transfer than for image-to-image transfer under distribution shifts.
We are categorizing the underlying reasons and experimenting with remedies.
Object Centric Learning for Robust and Interpretable Image Classification
Designing object-centric bottlenecks to create robust latent representations for classification.
Creating segmentation sub-tasks to increase the interpretability of the model's predictions.
Applying part-based feature generation and contrastive learning via graph matching for improved generalization.
Explicit Orientation Learning for Object-Part Segmentation and Spatial Pose Understanding
Humans and primates have an inherent 3D understanding of the world that is built around the orientation of objects.
Explicitly forcing the model to learn the orientation of objects and object parts through design changes improves spatial understanding and part segmentation.