India

Distributed Training Salem

Description
Overview

Join an advanced AI infrastructure team focused on building and optimizing large-scale machine learning systems. The team uses cutting-edge technologies to enable high-performance experimentation, scalable model deployment, and efficient processing of large datasets.

The team operates globally, bringing together engineers and researchers to push the boundaries of deep learning, distributed systems, and next-generation compute platforms.

About the Role

This position is centered on maximizing the efficiency and scalability of GPU-based machine learning workloads, particularly for large language models (LLMs) and generative AI systems.

You will work on improving both training performance and inference efficiency, ensuring optimal utilization of hardware resources, reduced latency, and faster model iteration cycles. The role requires hands-on expertise in deep learning frameworks, distributed systems, and performance optimization.

Key Responsibilities

Enhance performance of distributed training frameworks such as PyTorch, DeepSpeed, or similar systems

Identify and resolve bottlenecks in large-scale training pipelines (e.g., memory usage, communication overhead, GPU utilization)

Optimize inference systems using techniques like quantization, caching, and batching to achieve low latency and high throughput

Collaborate with infrastructure and platform teams to improve resource orchestration, scheduling, and system reliability

Design benchmarking tools and metrics to measure training efficiency, system throughput, and latency performance

Apply advanced optimization techniques (e.g., mixture-of-experts, speculative decoding, model parallelism) to improve large model performance

Continuously evaluate new approaches to hardware acceleration and model execution efficiency

Required Qualifications

3+ years of hands-on experience optimizing GPU-based machine learning workloads

Solid expertise in deep learning frameworks such as PyTorch…

Apply on Kit Job: kitjob.in/job/4mzgc7

Distributed Training Salem has been posted in the Salem Transportation & Logistics category on Locanto.