- Professional
- Office in Morrisville
Lenovo is seeking an experienced Software Engineer to lead the integration and implementation of Small Language Model (SLM) inferencing for our next-generation AI systems on our Windows laptop and desktop computers. If you are passionate about making Smarter Technology For All, come help us realize our Hybrid AI vision!

Responsibilities:
- Design, implement, and maintain core agent runtimes responsible for:
  - Dynamic model loading and lifecycle management
  - Scheduling, prioritization, and queuing of model inference requests
  - Information retrieval, data preprocessing, and context preparation pipelines
- Develop system integrations to support interoperability between Windows applications, services, and AI runtime components
- Implement security and privacy controls, including process isolation, sandboxing, audit logging, and compliance with enterprise-grade software security standards
- Optimize runtime performance for latency, throughput, and memory footprint across heterogeneous compute platforms (CPU, GPU, NPU) and across vendor AI frameworks (such as OpenVINO, Ryzen AI, and QNN)
- Embed safety and interpretability features into the stack, including guardrails, telemetry, and explainability mechanisms
- Contribute to platform reliability, including fault tolerance, error handling, and automated recovery strategies
- Collaborate with cross-functional teams (AI researchers, product managers, QA, and DevOps) to deliver robust, production-ready solutions
- Mentor junior engineers by providing technical guidance, conducting code reviews, and supporting onboarding
- Author technical documentation and best practices, contributing to internal knowledge bases, design proposals, and architecture reviews

Required Qualifications:
- Expertise in Windows development, including Win32 APIs, DLL development, and system-level programming
- Strong understanding of Windows software security best practices (e.g., secure coding, privilege management, sandboxing)
- Experience with GGML, GGUF, and llama.cpp for local model inference
- Proficiency in C/C++ for client and systems development

Preferred Qualifications:
- Familiarity with ONNX, OpenVINO, Ryzen AI, and QNN runtimes
- Working knowledge of Python and deep learning frameworks such as PyTorch
- Strong problem-solving and debugging skills in complex, multi-threaded environments
- Experience with Kotlin Multiplatform (KMP) or other cross-platform development frameworks
- Contributions to open-source AI runtimes, libraries, or Windows utilities
- Background in performance optimization, compiler toolchains, or low-level hardware acceleration

#AAITC