Overview
On Site
USD 300,000.00 per year
Full Time
Skills
Trading
Artificial Intelligence
Real-time
Research
SAN
Computer Hardware
Machine Learning (ML)
SASS
Debugging
GDB
CUDA
InfiniBand
Optimization
Computer Networking
GPU
Training
Algorithms
Health Insurance
Job Details
Salary: up to $300,000 USD base + annual bonus
Summary
Exciting opportunity to work at one of the world's leading proprietary trading firms with offices across the globe. You will be working with an elite global team of top minds in machine learning and engineering to optimize the performance of their models - both training and inference.
Seeking a machine learning engineer with expertise in low-level systems programming and optimization to join a world-class ML team. The team optimizes large-scale AI systems, from low-latency real-time inference to high-throughput research. This involves whole-system tuning, including CUDA, storage, networking, and detailed hardware analysis.
Requirements
- Strong understanding of modern ML techniques and toolsets.
- Experience debugging training performance.
- Deep understanding of low-level GPU concepts e.g., PTX, SASS, warps, cooperative groups, Tensor Cores, and the memory hierarchy.
- Proficiency in CUDA debugging and optimization tools e.g., CUDA-GDB, Nsight Systems, and Nsight Compute.
- Familiarity with core GPU libraries e.g., Triton, CUTLASS, CUB, Thrust, cuDNN, and cuBLAS.
- Strong intuition about CUDA performance characteristics.
- Background in GPU networking technologies e.g., InfiniBand, RoCE, GPUDirect, PXN, rail optimization, and NVLink, and in using them to interconnect GPU clusters.
- Understanding of distributed GPU training algorithms.
- Ability to question and innovate.
NB: Please only apply if you meet the above criteria.
Benefits and Incentives
- Annual bonus.
- Health insurance and other benefits.
- Paid time off and parental leave.
- A retirement plan with a company match.
- And more!
Whilst we carefully review every application for every job, the high volume we receive means we cannot respond to applicants who have not been successful.
Contact
If this sounds like you or you'd like to know more, please get in touch:
Clouie Anareta
linkedin.com/in/clouieanareta