As an Inference Optimization Engineer, you will improve the speed and efficiency of large language models at the GPU kernel level, within the inference engine, and across distributed architectures.
BentoML provides inference platforms that help AI teams run large language models and generative AI workloads efficiently at scale. Backed by investors such as DCM, the company serves enterprises globally, supporting consistent scalability and performance in production environments. Its portfolio includes both open-source and commercial products, with a mission to help teams use AI to build competitive advantages.
Join Augmodo as an ML Optimization Engineer to optimize and deploy cutting-edge computer vision algorithms on edge devices.
Join Waabi as a PnP Optimization Engineer to develop and optimize self-driving technology.
Join Ubiminds as a Cloud Optimization Engineer and leverage your skills in Azure to drive cost-effective cloud solutions.
Join Sully.ai as a Senior AI Systems Engineer to optimize and deploy large language models on advanced GPU infrastructure.
Join iPullRank as a Search Engine Optimization Engineer to enhance Organic Search visibility through technical SEO and innovative strategies.