Manas Sahni

Hi!

I’m a Software Engineer working on Deep Learning Libraries at NVIDIA.

I completed my MS in CS at Georgia Tech, where I was advised by Prof. Alexey Tumanov in the Systems for Artificial Intelligence Lab (SAIL), working on making machine learning applications, tools, and algorithms cheaper and more accessible.

In Summer 2020, I interned at NVIDIA with the TensorRT team, optimizing systems for deep-learning inference on GPUs.

Prior to starting at Georgia Tech, I was an ML Software Engineer at Samsung India R&D, where I helped bring flagship vision applications to low-power devices through the Samsung Neural SDK.
I completed my Bachelor’s degree in Math & Computing at Delhi Technological University.

News

Jul 2021: Joined NVIDIA as a Software Engineer on the Deep Learning Libraries team

May 2021: Completed my MS in CS (Machine Learning) at Georgia Tech

Jan 2021: Our paper was accepted to ICLR 2021! CompOFA: Compound Once-For-All Networks for Faster Multi-Platform Deployment

Jan 2021: I will be serving as Head TA for CS7643 Deep Learning in Spring 2021

Interests

Systems for Deep Learning
Neural Architecture Search
Computer Vision
High-Performance Computing

Education

M.S. in Computer Science, 2019-2021
Georgia Institute of Technology

B.Tech. in Mathematics & Computing, 2013-2017
Delhi Technological University

Manas Sahni, Shreya Varshini, Alind Khare, Alexey Tumanov || Systems for AI Lab (SAIL), Georgia Institute of Technology

Apr 28, 2021 5 min read

CompOFA: Compound Once-For-All Networks for Faster Multi-Platform Deployment

CompOFA improves the speed, cost, and usability of jointly training models for many deployment targets. By highlighting insights on model design and system deployment, we try to address an important problem for real-world usability of DNNs.

Aug 26, 2019 16 min read

Anatomy of a High-Speed Convolution

It’s no surprise that modern deep-learning libraries have production-level, highly optimized implementations of most operations. But just what is the black magic that these libraries use that we mere mortals don’t? What exactly does one do to “optimize” or accelerate neural network operations?

Last updated on Dec 7, 2018 13 min read

Making Neural Nets Work With Low Precision

Deploying efficient neural nets on mobiles is becoming increasingly important. This post explores the concept of quantized inference, and how it works in TensorFlow Lite.
