Research Engineer

AI EVALUATION
Malaysia

Entry level / Mid-Senior level


About the Role 

We’re looking for a Research Engineer to join our AI Evaluation Team and lead the development of scalable, real-world evaluation frameworks for large AI models. You’ll help design benchmarks, set up evaluation pipelines, and drive how we measure, compare, and improve model performance across LLMs, vision models, and multimodal systems. 

 

What You’ll Do: 

  • Design and maintain evaluation benchmarks grounded in real-world use cases 
  • Build scalable, reproducible pipelines for testing LLMs and multimodal systems using tools like vLLM, TGI, and LMDeploy 
  • Analyze existing benchmarks and identify gaps, blind spots, or product misalignment 
  • Propose datasets, metrics, and protocols for meaningful evaluation 
  • Partner with annotation, data, and engineering teams to align benchmark design with data quality and deployment needs 
  • Standardize and preprocess datasets to ensure consistency and traceability 
  • Communicate insights and performance metrics clearly across technical and non-technical teams 

 

What We’re Looking For: 

  • Ph.D., Master’s, or Bachelor’s degree in Computer Science, Artificial Intelligence, Engineering, Statistics, or a related technical field.
  • 1–3 years of practical experience in evaluating machine learning or AI models across different stages of development.  
  • Experience building or working with AI model evaluation pipelines.
  • Proficiency with Python, Pandas, and regex, plus inference tools such as vLLM, TGI, LMDeploy, or similar.
  • Strong analytical and problem-solving skills, with the ability to interpret evaluation results and contribute to actionable insights.  
  • Effective communication and collaboration skills, with experience working cross-functionally with research, engineering, or product teams.  
  • Familiarity with AI evaluation frameworks, benchmark design principles, and quality assurance methodologies.  
  • Bonus: Experience contributing to research publications, internal technical reports, or open-source benchmarking tools related to model evaluation, metrics, or dataset design. 

About the Company

YTL AI Labs Sdn Bhd

At YTL AI Labs, we build sovereign AI models that perform on par with the world’s best—while staying grounded in local needs, values, and context. Our flagship model, Ilmu, is designed to be culturally aware, contextually intelligent, and fluent in Bahasa Melayu, delivering cutting-edge solutions that empower Malaysian businesses with intelligence that truly understands the market and the people they serve.

As pioneers of sovereign AI, we believe every nation should have the power to shape its own intelligence—guided by its people, priorities, and principles.