We are looking for a proactive and process-driven Operations Center Manager to lead our command center team. In this role, you will be the "pulse" of our IT environment, ensuring high availability and seamless performance of our critical infrastructure. You aren't just watching screens; you are an Incident Commander and a Process Architect. We need someone who lives and breathes the ITIL framework to bring structure to chaos, and who has deep experience with infrastructure monitoring to detect issues before they impact the business.
Key Responsibilities
1. Operations Center Leadership
Manage the daily activities of the Operations Center (NOC), ensuring 24/7 coverage and rapid response to alerts.
Lead, mentor, and train a team of System Administrators and L1/L2 Support Engineers.
Manage shift schedules, handovers, and on-call rotations to ensure zero coverage gaps.
2. Infrastructure Monitoring & Tooling
Oversee the health of the entire IT estate: Servers (Windows/Linux), Networks (LAN/WAN), Cloud (AWS/Azure), and Virtualization (VMware/Hyper-V).
Tool Ownership: Administer and tune monitoring platforms (e.g., SolarWinds, Nagios, Datadog, Zabbix, LogicMonitor, Elastic, Splunk).
Refine alert thresholds to reduce "alert fatigue" and ensure the team focuses on actionable signals.
Design and maintain real-time dashboards for leadership, visualizing uptime, latency, and system health.
Ensure patching schedules are executed on time and comply with security policies.
Qualifications
Required Experience
12+ years of experience in IT Operations, Infrastructure Support, or NOC environments.
2+ years of experience in a leadership or team lead role.
Deep understanding of the ITIL Framework (Certification is highly preferred).
Hands-on experience with Monitoring Tools: Proficiency in configuring and managing tools like LogicMonitor, Elastic, SolarWinds, PRTG, Nagios, Datadog, or New Relic.
Solid technical background in Server Administration (Windows/Linux) and basic Networking concepts (DNS, TCP/IP, Firewalls).
Soft Skills
Crisis Management: Ability to stay calm and decisive during high-pressure outages.
Communication: Ability to translate complex technical issues into clear business updates for executives.
Analytical Thinking: A data-driven approach to identifying trends and inefficiencies.
Preferred (Bonus Points)
ITIL v3 or v4 Foundation/Intermediate Certification.
Experience with ITSM tools such as ServiceNow, Jira Service Management, or BMC Remedy.
Basic scripting skills (PowerShell, Bash, or Python) for automation.
Experience in a Hybrid Cloud environment (On-prem + Azure/AWS).