Lenovo is seeking an experienced Software Engineer to lead the integration and implementation of Small Language Model (SLM) inferencing for our next-generation AI systems onto our Windows laptop and desktop computers. If you are passionate about making Smarter Technology For All, come help us realize our Hybrid AI vision!

Responsibilities:

- Design, implement, and maintain core agent runtimes responsible for:
  - Dynamic model loading and lifecycle management
  - Scheduling, prioritization, and queuing of model inference requests
  - Information retrieval, data preprocessing, and context preparation pipelines
- Develop system integrations to support interoperability between Windows applications, services, and AI runtime components
- Implement security and privacy controls, including process isolation, sandboxing, audit logging, and compliance with enterprise-grade software security standards
- Optimize runtime performance for latency, throughput, and memory footprint across heterogeneous compute platforms (CPU, GPU, NPU) and across various vendor AI Frameworks (such as OpenVino, RyzenAI, and QNN)
- Embed safety and interpretability features into the stack, including guardrails, telemetry, and explainability mechanisms
- Contribute to platform reliability, including fault tolerance, error handling, and automated recovery strategies
- Collaborate with cross-functional teams (AI researchers, product managers, QA, and DevOps) to deliver robust, production-ready solutions
- Mentor junior engineers by providing technical guidance, conducting code reviews, and supporting onboarding
- Author technical documentation and best practices, contributing to internal knowledge bases, design proposals, and architecture reviews

Required Qualifications:

- Expertise in Windows development, including Win32 APIs, DLL development, and system-level programming
- Strong understanding of Windows software security best practices (e.g., secure coding, privilege management, sandboxing)
- Experience with GGML, GGUF, and llama.cpp for local model inference
- Proficiency in C/C++ for client and systems development

Preferred Qualifications:

- Familiarity with ONNX, OpenVino, RyzenAI, and QNN runtimes
- Working knowledge of Python and deep learning frameworks such as PyTorch
- Strong problem-solving and debugging skills in complex, multi-threaded environments
- Experience with Kotlin Multiplatform (KMP) or other cross-platform development frameworks
- Contributions to open-source AI runtimes, libraries, or Windows utilities
- Background in performance optimization, compiler toolchains, or low-level hardware acceleration

#AAITC