I am a Pre-Doctoral Researcher at Google Research India, where I work with Dr. Pradeep Shenoy on cognitively inspired deep neural networks.
I am interested in problems at the intersection of Deep Learning and Neuroscience, with applications in Vision. My research goal is a comparative understanding of primate and machine vision, used to create robust and interpretable, cognitively inspired ML models.
I am also interested in applying ML to help people with communication disorders and to promote and ensure inclusivity.
At Google, I have had the opportunity to work on projects in collaboration with Prof. Ravi Kiran Sarvadevabhatla,
Prof. Virginia de Sa (UC San Diego),
Dr. Mike Mozer (Google Research), and
Prof. Wieland Brendel (University of Tübingen).
I was fortunate to pursue my master's thesis at the Indian Institute of Technology, Delhi, under the guidance of Prof. Sumeet Agrawal. Before that, I spent the summers of 2017 and 2018 at Google, Mountain View, as a Software Engineering Intern.
I spent the fall semester of 2016 as an exchange student at KTH Royal Institute of Technology.
I was awarded the Summer Undergraduate Research Award for parallelizing INSPECT, a verification engine for parallel programs, achieving up to 5x gains in runtime, under Prof. Subodh Sharma during the summer and fall of 2016.
I graduated with a Dual Degree (Bachelor and Master of Technology) in Computer Science and Engineering from the Indian Institute of Technology, Delhi, India, in 2019. For more details, check my CV or send me an email.
Robustifying Deep Vision Models Through Shape Sensitization
Aditay Tripathi, , Anirban Chakraborty, Pradeep Shenoy
FLOAT: Factorized Learning of Object Attributes for Improved Multi-object Multi-part Scene Parsing
, Pranav Gupta, Pradeep Shenoy, Ravi Kiran Sarvadevabhatla
Proceedings of CVPR 2022
How much complexity does an RNN architecture need to learn syntax-sensitive dependencies?
Gantavya Bhatt*, Hritik Bansal*, , Sumeet Agrawal
(* = Equal Contribution)
Student Research Workshop |
Annual Conference of the Association for Computational Linguistics
Understanding the transfer gap between models learned from videos and images
Humans learn from video-style data, which is believed to provide richer learning cues and to generalize well to images. In contrast, models learned from video data do not transfer well to image data.
Recent work has tried to close this gap by reducing distribution shift, but such methods still need additional artificial augmentations to compete with image-based methods.
Our initial experiments demonstrate a larger gap for video-to-image transfer than for image-to-image transfer under distribution shifts. We are currently categorizing the underlying causes and experimenting with remedies.
Object-Centric Learning for Robust and Interpretable Image Classification
Designing object-centric bottlenecks that create robust latent representations for classification.
Creating segmentation sub-tasks to increase the interpretability of the model's predictions.
Applying part-based feature generation and contrastive learning via graph matching for improved generalization.
Explicit Orientation Learning for Object-Part Segmentation and Spatial Pose Understanding
Humans and primates have an inherent 3D understanding of the world, built around the orientation of objects. Explicitly forcing the model to learn the orientation of objects and object parts through design changes improves spatial understanding and part segmentation.