Critical Incident & Problem Management Specialist
HRBrain
We’re looking for a Critical Incident & Problem Management Specialist! Reach out if you’re interested and feel free to refer friends/colleagues!
Type of Employment: Contract
Title: Critical Incident & Problem Management Specialist
Term: 12-month Contract
Location: Remote for now – Toronto, Winnipeg and London
Job ID number: C13413
Brief description of duties:
- Manage critical incidents and problems with an emphasis on minimizing production, financial and reputational impact (24×7 rotating on-call support)
– Initiate protocol for any critical technology incident that impacts a critical business application or core infrastructure service
– Assemble and provide technical leadership to the Incident Response Team as required to coordinate troubleshooting approach and service restoration activities
– Communicate appropriate updates to all key stakeholders with progress until resolution
- Identify and analyze incident and problem trends, thoroughly documenting resolutions and/or workarounds
– Identify trends leveraging business feedback and incident analysis
– Perform root cause analysis using a proven problem analysis methodology (Kepner-Tregoe, Ishikawa/Fishbone, etc.)
– Maintain inventory of Problems under analysis, their current progress/status, and coordinate the resolution with the Problem owner
- Monitor progress on the resolution of Known Errors and ensure the best available workarounds for Incidents are documented and communicated
- Manage stakeholder communications and provide seamless escalations
- Create and publish various metrics and reports, and conduct post-incident reviews
MUST haves:
- Professional technology support experience in a large application or infrastructure environment
- Critical incident management experience in a large enterprise
- Demonstrated breadth of understanding of multiple technology areas (i.e. one or more of Unix, Intel, enterprise storage, networking, database, IT security, web infrastructure, application support, Cloud computing)
- Demonstrated ability to problem solve using a structured, data-centric methodology (e.g. Kepner-Tregoe)
- Strong analytical, critical thinking, and problem-solving skills and an ability to present clear, concise, and effective solutions
- Ability to work and adapt in a high-pressure, fast-paced environment and remain calm during stressful situations
- Solid written and verbal communication skills, with the ability to translate technical information into business language
- Effective influencer without authority, able to drive work and direction across multiple teams, technology disciplines, and geographical locations; ability to build successful relationships across a wide range of IT teams
- Proven ability to effectively prioritize and execute tasks with multiple competing priorities
Job Features
Technology support experience in a large application or infrastructure environment: 8-10 years
Critical incident management experience in a large enterprise: 5+ years