We are looking for a proactive and process-driven Operations Center Manager to lead our command center team. In this role, you will be the "pulse" of our IT environment, ensuring high availability and seamless performance of our critical infrastructure. You aren't just watching screens—you are an Incident Commander and a Process Architect. We need someone who lives and breathes the ITIL framework to bring structure to chaos, and who has deep experience with infrastructure monitoring to detect issues before they impact the business.
Key Responsibilities
1. Operations Center Leadership
Manage the daily activities of the Operations Center (NOC), ensuring 24/7 coverage and rapid response to alerts.
Lead, mentor, and train a team of System Administrators and L1/L2 Support Engineers.
Manage shift schedules, handovers, and on-call rotations to ensure zero coverage gaps.
2. Infrastructure Monitoring & Tooling
Oversee the health of the entire IT estate: Servers (Windows/Linux), Networks (LAN/WAN), Cloud (AWS/Azure), and Virtualization (VMware/Hyper-V).
Tool Ownership: Administer and tune monitoring platforms (e.g., SolarWinds, Nagios, Datadog, Zabbix, LogicMonitor, Elastic, Splunk).
Refine alert thresholds to reduce "alert fatigue" and ensure the team focuses on actionable signals.
Design and maintain real-time dashboards for leadership, visualizing uptime, latency, and system health.
Ensure patching schedules are executed on time and in compliance with security policies.
Qualifications
Required Experience
12+ years of experience in IT Operations, Infrastructure Support, or NOC environments.
2+ years of experience in a leadership or team lead role.
Deep understanding of the ITIL Framework (Certification is highly preferred).
Hands-on experience with Monitoring Tools: Proficiency in configuring and managing tools like LogicMonitor, Elastic, SolarWinds, PRTG, Nagios, Datadog, or New Relic.
Solid technical background in Server Administration (Windows/Linux) and basic Networking concepts (DNS, TCP/IP, Firewalls).
Soft Skills
Crisis Management: Ability to stay calm and decisive during high-pressure outages.
Communication: Ability to translate complex technical issues into clear business updates for executives.
Analytical Thinking: A data-driven approach to identifying trends and inefficiencies.
Preferred (Bonus Points)
ITIL v3 or v4 Foundation/Intermediate Certification.
Experience with ITSM tools like ServiceNow, Jira Service Management, or BMC Remedy.
Basic scripting skills (PowerShell, Bash, or Python) for automation.
Experience in a Hybrid Cloud environment (On-prem + Azure/AWS).