AI/ML Operations or Deployment Engineer in Warren, NJ (onsite from day 1)

Overview

On Site
Accepts corp to corp applications
Contract - W2
Contract - Independent

Skills

JSON
Python
Docker
Kubernetes
PyTorch
AI
ML
Helm
SLURM
CUDA
TensorRT
Jupyter

Job Details

Title: AI/ML Operations/Deployment Engineer

Location: Warren, NJ (fully onsite)

Duration: 12-month contract

Must have: AI/ML, Docker containerization and orchestration, Kubernetes, Helm charts, Python and JSON, AI frameworks (PyTorch), Linux resource management (SLURM), the NVIDIA software stack (CUDA, TensorRT, Triton), Jupyter Notebook, and strong communication skills.

Job Description:

Candidates with 10+ years of experience preferred.

1. Docker containerization and orchestration, with a strong understanding of pods, services, and deployments

2. Collaborate with AI/ML teams to understand their requirements and translate them into scalable Kubernetes-based infrastructure solutions.

3. Docker, Operators and Helm charts

4. Understanding of Kubernetes security best practices (e.g., RBAC, network policies, and pod security policies)

5. Ability to set up monitoring, logging, and alerting for Kubernetes clusters using Prometheus and Grafana

6. Optimize Kubernetes cluster performance and resource utilization.

7. Python, JSON

Desired Skills: Working-level understanding of the following:

1. Desired outcomes of AI Platforms required to support the Data Scientist community

2. AI Frameworks like PyTorch

3. Linux resource management tools such as SLURM

4. NVIDIA software stack: CUDA, TensorRT, Triton Inference Server

5. Jupyter Notebook
