Critical Incident & Problem Management Specialist


Posted 2 years ago

We’re looking for a Critical Incident & Problem Management Specialist! Reach out if you’re interested and feel free to refer friends/colleagues!


Type of Employment: Contract
Title: Critical Incident & Problem Management Specialist
Term: 12-month Contract
Location: Remote for now – Toronto, Winnipeg and London
Job ID number: C13413


Brief description of duties:

  • Manage critical incidents and problems with an emphasis on minimizing production, financial and reputational impact (24×7 rotating on-call support)

– Initiate protocol for any critical technology incident that impacts a critical business application or core infrastructure service

– Assemble and provide technical leadership to the Incident Response Team as required to coordinate troubleshooting approach and service restoration activities

– Communicate appropriate updates to all key stakeholders with progress until resolution

  • Identify and analyze incident and problem trends, thoroughly documenting resolutions and/or workarounds

– Identify trends leveraging business feedback and incident analysis

– Perform root cause analysis using a proven problem analysis methodology (Kepner-Tregoe, Ishikawa/Fishbone, etc.)

– Maintain inventory of Problems under analysis, their current progress/status, and coordinate the resolution with the Problem owner

  • Monitor progress on the resolution of Known Errors and ensure the best available workarounds for Incidents are documented and communicated
  • Manage stakeholder communications and provide seamless escalations
  • Create and publish metrics and reports, and conduct post-incident reviews


MUST haves:

  • Professional technology support experience in a large application or infrastructure environment
  • Critical incident management experience in a large enterprise
  • Demonstrated breadth of understanding of multiple technology areas (i.e. one or more of Unix, Intel, enterprise storage, networking, database, IT security, web infrastructure, application support, Cloud computing)
  • Demonstrated ability to solve problems using a structured, data-centric methodology (e.g. Kepner-Tregoe)
  • Strong analytical, critical thinking, and problem-solving skills and an ability to present clear, concise, and effective solutions
  • Ability to work and adapt in a high-pressure, fast-paced environment and remain calm during stressful situations
  • Solid written and verbal communication skills, with the ability to translate technical information into business language
  • Ability to influence without authority to drive work and direction across multiple teams, technology disciplines, and geographical locations; ability to build successful relationships across a wide range of IT teams
  • Proven ability to effectively prioritize and execute tasks with multiple competing priorities


Job Features

Technology support experience in a large application or infrastructure environment: 8-10
Critical incident management experience in a large enterprise: 5+
