Applied AI/ML Specialist

Overview

On Site
Hybrid
Depends on Experience
Full Time

Skills

Artificial Intelligence
TensorFlow
PyTorch
Python
vSphere
Machine Learning
LLMs
ESXi
Nutanix

Job Details

Our client 8bit.AI is a dynamic startup in the Bay Area, CA, hiring full-time employees and focused on developing a high-performance, multi-technology, vendor-independent, xPU-based accelerated cloud computing platform. The company builds massive clusters purpose-built for high-performance parallel computing and aims to launch a global accelerated cloud solution. Additionally, the firm will focus on broader Artificial General Intelligence (AGI) products, supercomputing services, and end-to-end AI engineering services.

About the Role:

In this dynamic role, you'll bridge the gap between theoretical AI research and practical implementation. You'll leverage your expertise in various AI domains like natural language processing (NLP) and machine learning (ML) to design, develop, and deploy AI models for real-world applications. You will be responsible for the entire ML lifecycle, from model training and optimization to production deployment and monitoring. You will also have the opportunity to contribute to the development of our internal MLOps infrastructure, explore the potential of RAG technology, and leverage your OpenStack expertise to manage the underlying infrastructure for our AI workloads.

Responsibilities:

  • Train and fine-tune large language models (LLMs) using various techniques, including transfer learning and prompt engineering.
  • Leverage LangChain technology to integrate LLMs with other AI systems and applications.
  • Develop AI solutions that eliminate up to 70% of manual work in routine IT operations, using agent-based frameworks and emerging large action models (LAMs).
  • Develop end-to-end roadmaps for customers on their AI journeys.
  • Develop and maintain a model factory, including building domain-specific models.
  • Develop and manage RAG and data pipelines.
  • Develop and implement MLOps pipelines for efficient model training, deployment, and monitoring.
  • Conduct research and experimentation with Retrieval-Augmented Generation (RAG) technology.
  • Port existing models to new hardware and software platforms for optimal performance.
  • Collaborate with researchers, engineers, and product managers to define and implement AI solutions.
  • Leverage your OpenStack skills to manage and optimize the infrastructure for AI workloads, including scaling resources and ensuring high availability.
  • Stay up to date on the latest advancements in LLM, LangChain, MLOps, and OpenStack technologies.

Qualifications:

  • Minimum of 3 to 5 years of experience, with a PhD or Master's degree in Computer Science, Artificial Intelligence, or a related field.
  • Proven experience in training and fine-tuning large language models (LLMs).
  • Strong understanding of natural language processing (NLP) techniques.
  • Experience with MLOps tools and methodologies such as MLflow and BentoML.
  • Proven experience with virtualization platforms, preferably including VMware (vSphere, ESXi, etc.) or Nutanix (AHV, AOS, etc.).
  • Familiarity with LangChain and Retrieval-Augmented Generation (RAG) technologies (a plus).
  • Experience with model porting and optimization for different hardware and software platforms.
  • In-depth knowledge of AI/ML technologies, frameworks (e.g., TensorFlow, PyTorch), and programming languages (e.g., Python, R).
  • A passion for innovation and a desire to push the boundaries of what's possible with AI.
  • Experience from companies like CoreWeave, Vultr, Lambda, Nvidia, and Broadcom is preferred.

Please send your resumes to srini at zaspar dot com

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.