Job Details
Job Title: VLM Data Science Expert
Location: San Jose, CA
Experience Required: 10+ Years
Education: Master's or Ph.D. in Computer Science, Data Science, Machine Learning, or a related field
Join us as a Senior Data Scientist specializing in Vision-Language Models (VLMs) to drive the development and deployment of cutting-edge, cost-efficient multimodal AI solutions. If you are passionate about AI innovation and have experience with frameworks like VILA, Isaac Sim, and VSS, this is your opportunity to apply your skills in a real-world healthcare context, especially within medical device applications. You'll work on impactful projects using cloud platforms such as AWS or Azure, helping shape the future of multimodal intelligence.
Key Responsibilities: VLM Development & Deployment
- Design, train, and deploy Vision-Language Models (e.g., VILA, Isaac Sim) for real-world multimodal AI applications
- Implement efficient training and inference using methods like knowledge distillation, LoRA fine-tuning, and modal-adaptive pruning (see the LoRA sketch after this list)
- Build scalable pipelines for training/testing VLMs on AWS SageMaker or Azure ML (see the SageMaker sketch after this list)
- Create AI solutions that combine vision and language for tasks such as (see the image-text matching sketch after this list):
  - Image-text matching
  - Visual question answering (VQA)
  - Document data extraction
- Enhance model performance using interleaved image-text datasets and cross-attention layers (see the cross-attention sketch after this list)
- Apply VLMs to healthcare-focused use cases like:
  - Medical imaging analysis
  - Position and motion detection
  - Measurement extraction
- Ensure compliance with healthcare data privacy and industry regulations
- Evaluate trade-offs between performance, cost, and latency using elastic visual encoders and lightweight architectures
- Benchmark leading VLMs (e.g., GPT-4V, Claude 3.5) across multiple parameters
- Collaborate with engineers and domain experts to define project goals and technical specifications
- Mentor junior team members and lead technical discussions across teams
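To make the LoRA fine-tuning mentioned above concrete, here is a minimal sketch using the Hugging Face peft library. GPT-2 stands in for an actual VLM backbone, and the rank, alpha, and target-module values are illustrative assumptions, not details from this posting.

```python
# Minimal LoRA fine-tuning setup (sketch). GPT-2 is a stand-in backbone;
# a real VLM checkpoint and its attention module names would replace it.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder backbone
config = LoraConfig(
    r=16,                       # rank of the low-rank update matrices
    lora_alpha=32,              # scaling applied to the LoRA update
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the small adapter weights train
```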
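Similarly, launching such a training script as a managed job on AWS SageMaker might look like the sketch below; the entry-point script, IAM role ARN, instance type, hyperparameters, and S3 URI are all placeholders.

```python
# Hypothetical SageMaker training-job launch (sketch); names are placeholders.
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train_vlm.py",      # hypothetical training script
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
    instance_type="ml.g5.12xlarge",  # GPU instance suited to multimodal training
    instance_count=1,
    framework_version="2.1",
    py_version="py310",
    hyperparameters={"epochs": 3, "lora_rank": 16},
)
estimator.fit({"train": "s3://my-bucket/vlm-train/"})  # placeholder S3 URI
```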
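For the image-text matching task, a short sketch using the publicly available CLIP checkpoint via transformers; the image path and candidate captions are invented for illustration.

```python
# Image-text matching with CLIP (sketch); image path and captions are made up.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("sample_image.png")  # placeholder path; any RGB image works
texts = ["a chest X-ray", "an MRI of the knee", "a photo of a cat"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
logits = model(**inputs).logits_per_image  # similarity score per caption
probs = logits.softmax(dim=1)              # which caption best matches the image
print(dict(zip(texts, probs[0].tolist())))
```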
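And for the cross-attention bullet, a bare-bones PyTorch layer in which text tokens attend to visual features; the dimensions and dummy shapes are arbitrary examples.

```python
# Cross-attention sketch: text queries attend over image keys/values.
import torch
import torch.nn as nn

class VisionTextCrossAttention(nn.Module):
    def __init__(self, dim: int = 512, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_tokens, image_tokens):
        # queries come from text; keys and values from the visual encoder
        attended, _ = self.attn(text_tokens, image_tokens, image_tokens)
        return self.norm(text_tokens + attended)  # residual connection + norm

text = torch.randn(2, 16, 512)   # (batch, text_len, dim) dummy text features
image = torch.randn(2, 49, 512)  # (batch, patches, dim) dummy image features
print(VisionTextCrossAttention()(text, image).shape)  # torch.Size([2, 16, 512])
```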
Required Qualifications:
- 10+ years in Machine Learning or Data Science with a focus on Vision-Language Models
- Proven success in deploying multimodal AI solutions in production environments
- Healthcare or medical device experience is highly desirable
- Proficient in Python and ML frameworks like PyTorch or TensorFlow
- Hands-on with VILA, Isaac Sim, or VSS
- Skilled in deploying models on AWS SageMaker or Azure ML Studio
- Familiarity with medical imaging datasets and healthcare compliance standards
- Strong analytical and problem-solving capabilities
- Excellent communication skills to articulate complex ideas to diverse stakeholders
Skills:
- VLM Frameworks: VILA, Isaac Sim, EfficientVLM
- Cloud Platforms: AWS SageMaker, Azure ML
- Optimization Techniques: LoRA fine-tuning, modal-adaptive pruning
- Multimodal Techniques: Cross-attention layers, interleaved image-text datasets
- MLOps Tools: Docker, MLflow (see the MLflow sketch below)
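As one illustration of the MLOps tooling listed above, a minimal MLflow tracking sketch; the experiment name, parameters, metric, and artifact path are placeholders, not values from this posting.

```python
# Hypothetical MLflow tracking for a VLM fine-tuning run (sketch).
import mlflow

mlflow.set_experiment("vlm-lora-finetune")  # placeholder experiment name
with mlflow.start_run():
    mlflow.log_params({"lora_rank": 16, "lr": 2e-4, "epochs": 3})
    mlflow.log_metric("val_vqa_accuracy", 0.78)  # placeholder metric value
    mlflow.log_artifact("adapter_config.json")   # placeholder artifact path
```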