Siddharth Sharma – Medium

AWS Blog In Collaboration With Nvidia — Optimizing Inference For Seq2Seq And Encoder Only Models…

Posted on November 22, 2023 by Siddharth Sharma

Compressing LLMs With Low Rank Decomposition Of Attention Matrices

Colab Link To Reproduce Experiment: LLM Compression Via Low Rank Decomposition.ipynb

Nov 22, 2023

Summary Of Adapter Based Performance Efficient Fine Tuning (PEFT) Techniques For Large Language…

The two most common transfer learning techniques in NLP were feature-based transfer (generating input text embedding from a pre-trained…

Apr 21, 2023

Neural Ranking Architectures

Glimpses On Implicit/Explicit, Dense/Sparse, Gated/Non Gated, Low Rank And Many More Layered Interactions

Jan 19, 2023

Anatomy Of A Model Inference Service

Context:

Jan 14, 2023

Feature Fusion For The Uninitiated

Consider a typical e-commerce product. It would have a variety of content-specific features such as product title, brand, thumbnail, etc. and…

Jan 13, 2023

Search Query Understanding

Introduction:

Jan 25, 2021

Of Bandits And Bidding

Real-time bidding (RTB) refers to the buying and selling of online ad impressions through real-time auctions that occur in the time it takes…

May 9, 2016

Siddharth Sharma

Machine Learning Tech Lead at Amazon. LinkedIn: https://www.linkedin.com/in/siddharth-sharma-31140210/
