Customer Solutions Engineer, Clustered Systems

Overview

On Site
Full Time

Skills

Embedded systems
Innovation
Art
Artificial intelligence
Machine Learning (ML)
Training
Systems design
Workflow
Presentations
Data
Effective communication
Design
WINS
Software deployment
HPC
Data center design
SAN
Cloud computing
IaaS
Performance analysis
Scheduling
Resource management
Ansible
Git
Docker
Kubernetes
DevOps
Linux
Remote direct memory access
Computer networking
InfiniBand
Ethernet
Network
Routing protocols
Network security
Communication
Teaching
Collaboration
Knowledge sharing
Standard operating procedure
Knowledge base
Management
Research
High performance computing
Deep learning
Computer science
Physics
Mathematics
CISSP
Amazon Web Services
Google Cloud
Google Cloud Platform
Microsoft Azure
Sales
Purchasing
Military
Law
Recruiting

Job Details

WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences - the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world's most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.

AMD together we advance_

Customer Solutions Engineer, Clustered Systems

THE ROLE:

We are looking for a Solutions Architect with experience designing and building clustered systems. You will play a key role in supporting the design and deployment of state-of-the-art AI/ML training and inferencing systems, and you will provide insights on at-scale system design and tuning mechanisms for large-scale compute runs. You should be excited to work with the latest accelerated computing and deep learning platforms and to help customers craft improved workflows and develop new solutions. You will work cross-functionally with multiple organizations within AMD, as well as with the customer, to ensure a successful and trouble-free deployment.

THE PERSON:

Provide solutions to deploy large-scale clustered systems, ensure strong technical relationships with internal and external engineering teams, and build creative solutions based on AMD technology. Develop essential collateral such as white papers, guides, presentations, and test data to facilitate effective communication with customers and internal teams regarding the deployment and scaling of clustered systems.

KEY RESPONSIBILITIES:
  • Provide solutions to deploy large-scale clustered systems.
  • Collaborate with multi-functional teams made up of customers, external partners, and internal teams, from concept to prototype to deployment.
  • Solve complex problems involving multi-site deployments of AMD products.
  • Work with OEM partners and AMD Engineering, Product, and Sales teams to secure design wins for customers.
  • Enable development and growth of AMD product features through customer feedback and deployment evaluations.


PREFERRED EXPERIENCE:
  • 5+ years of experience in accelerated computing for datacenter/HPC solutions or related experience.
  • Strong background in performance analysis, system profiling, and high-performance computing.
  • Deep understanding of dense data center design and architecture, including compute, storage, networking, cloud APIs, and IaaS.
  • Experience conducting system profiling and performance analysis with tools such as perftest and rccl_test to ensure systems operate at peak efficiency.
  • Solid understanding of accelerated computing scheduling and I/O stacks.
  • Experience with modern automation, development, and resource management tooling such as Ansible, Git, containers (Docker), Kubernetes, etc.
  • Knowledge of container networking, particularly Kubernetes, and experience with DevOps practices.
  • Proficient in Linux-based networking technologies and protocols such as RDMA, RoCE, CNI-based container networking, InfiniBand, Ethernet, and NVLink, and familiar with various network topologies, routing protocols, and network security practices.
  • Clear verbal and written communication skills, capable of effectively teaching others and contributing to a team's success through collaboration and open information sharing.
  • (desired) A networker who collaborates with both intra-team and inter-team members and who promotes knowledge sharing (and is able to turn that knowledge into standard operating procedures).
  • (desired) Skilled in the development of SOPs and team knowledge base management.
  • (desired) Experience working with engineering or research communities supporting high-performance computing or deep learning.


ACADEMIC CREDENTIALS:
  • BS (or equivalent experience) in Computer Science, Engineering, Physics or Mathematics
  • (desired) Professional credentials such as CISSP, CSP (AWS, Google Cloud Platform, Azure) SA-pro, RHCA, CKA, CKS, or other industry-recognized certification


LOCATION:

Austin, Texas, but open to the possibility of remote work.

AMD makes extensive use of conferencing tools, but occasional travel is required for on-site customer visits and conferences.

At AMD, your base pay is one part of your total rewards package. Your base pay will depend on where your skills, qualifications, experience, and location fit into the hiring range for the position. You may be eligible for incentives based upon your role, such as either an annual bonus or sales incentive. Many AMD employees have the opportunity to own shares of AMD stock, as well as a discount when purchasing AMD stock if voluntarily participating in AMD's Employee Stock Purchase Plan. You'll also be eligible for competitive benefits described in more detail here.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.