Being part of Air Canada means becoming part of an iconic Canadian symbol, recently ranked the best airline in North America. Let your career take flight by joining our diverse and vibrant team at the leading edge of passenger aviation.
Air Canada is seeking a Specialist in IT Operations to act as a subject matter expert in enterprise observability and AIOps, enabling proactive, data-driven operations across critical IT services. This role is responsible for the advanced use and continuous maturation of monitoring platforms including Dynatrace, Splunk, and Glassbox, leveraging telemetry, analytics, and automation to detect anomalies, accelerate root cause analysis, and reduce service impact. The Specialist will work closely with IT Operations, Incident and Problem Management, and application teams to transform monitoring insights into preventative actions, driving improved resilience, reduced mean time to resolution (MTTR), and higher operational maturity.
Responsibilities
- Act as a subject matter expert for enterprise observability and AIOps platforms, including Dynatrace, Splunk, and Glassbox, supporting the reliability and performance of critical IT services.
- Own the operational configuration, tuning, and ongoing optimization of monitoring and digital experience tools to improve signal quality, reduce noise, and increase actionable insights.
- Leverage telemetry, analytics, and automation capabilities to enable proactive detection of anomalies, service degradation, and emerging risks before customer or operational impact occurs.
- Partner with IT Operations, Incident Management, and application teams to accelerate incident detection, triage, and root cause analysis, contributing to reduced MTTR.
- Translate monitoring data into clear, actionable insights that support decision-making during major incidents and high-severity operational events.
- Support post-incident and problem management activities by providing monitoring insights, trends, and evidence to identify systemic issues and preventative opportunities.
- Contribute to the evolution of Air Canada’s AIOps and observability maturity, including standardization of metrics, dashboards, alerts, and service views.
- Collaborate with application, infrastructure, and platform teams to ensure monitoring coverage is aligned with service criticality, customer journeys, and business outcomes.
- Drive continuous improvement initiatives that enhance service resilience, stability, and operational predictability through improved observability practices.
- Maintain accurate documentation, standards, and playbooks related to monitoring configurations, alerting strategies, and operational workflows.
- Provide guidance and coaching to team members on effective use of observability tools and interpretation of monitoring data.
- Stay current on industry trends, emerging observability practices, and AIOps capabilities, recommending enhancements aligned to Air Canada’s operational goals.
Qualifications
- University degree in Information Technology, Computer Science, Engineering, or a related field, or an equivalent combination of education and directly relevant technical experience.
- 5+ years of hands-on experience in IT Operations, Monitoring, Observability, or Reliability Engineering within a large, complex enterprise environment.
- Hands-on experience with web and mobile Application Performance Monitoring (APM), including real user monitoring (RUM), synthetic monitoring, and distributed tracing in large-scale enterprise environments.
- Proven, practical experience implementing, configuring, and maintaining enterprise monitoring platforms, including Dynatrace, Splunk, and/or Glassbox, in a production environment.
- Strong technical understanding of observability data types (metrics, logs, traces, digital experience telemetry) and how to onboard, correlate, and operationalize them effectively.
- Demonstrated ability to design, tune, and sustain alerting, dashboards, service views, and analytics to improve signal quality and operational usefulness.
- Experience integrating monitoring tools with upstream and downstream systems (e.g., applications, infrastructure, cloud platforms, incident workflows) to enable end-to-end visibility.
- Solid working knowledge of IT Operations and Incident Management concepts, with practical experience supporting incident detection, triage, and diagnosis using monitoring data (formal ITIL certification is an asset but not required).
- Strong analytical and troubleshooting skills, with the ability to independently investigate complex performance or availability issues using monitoring platforms.
- Ability to work autonomously as a technical specialist while collaborating effectively with application, infrastructure, and operations teams.
- Comfortable operating in a high-availability, high-impact environment, demonstrating accountability for the reliability and accuracy of monitoring solutions.
- Demonstrated punctuality and dependability to support overall team success in a fast-paced environment.
Nice to Have
- Experience onboarding and monitoring cloud-native and hybrid environments (e.g., AWS, Azure, Kubernetes, containerized applications).
- Familiarity with synthetic monitoring, real user monitoring (RUM), and customer journey instrumentation, particularly for digital and customer-facing platforms.
- Experience integrating observability tools with incident and workflow platforms (e.g., ServiceNow, APIs, webhooks, automation pipelines).
- Working knowledge of log enrichment, tagging strategies, and data normalization to improve analytics and correlation across monitoring platforms.
- Exposure to alert automation, event correlation, or noise reduction techniques supporting AIOps capabilities.
Conditions of Employment
Candidates must be eligible to work in the country of interest at the time any offer of employment is made and are responsible for obtaining any required work permits, visas, or other authorizations necessary for employment. Prior to their start date, candidates will also need to provide proof of their eligibility to work in the country of interest.
Linguistic Requirements
Where qualifications are equal, preference will be given to bilingual candidates.
Diversity and Inclusion
Air Canada is strongly committed to Diversity and Inclusion and aims to create a healthy, accessible and rewarding work environment which highlights employees’ unique contributions to our company’s success.
As an equal opportunity employer, we welcome applications from all to help us build a diverse workforce that reflects the diversity of our customers and the communities in which we live and serve.
Air Canada thanks all candidates for their interest; however, only those selected to continue in the process will be contacted.