Job Purpose
Ensures the availability and scalability of technology platforms, in collaboration with internal and external teams, to meet current and future business needs in line with prescribed IT measures and ISA's adopted policies and procedures.
Key Result Responsibilities
Core Linux Administration Skills
- Administer, monitor, and support Linux servers (RHEL-based systems – versions 6/7/8/9) across Production, Staging, and DR environments.
- Perform OS installation, system hardening, patching, upgrades, and vulnerability remediation.
- Manage file systems, LVM, RAID, multipathing, and conduct performance tuning and capacity planning.
- Hands-on experience with package management systems such as RPM, YUM/DNF, APT, and DPKG.
- Strong understanding of Linux networking concepts including TCP/IP, DNS, DHCP, routing, and firewalls.
- Strong hands-on experience with load balancers, especially HAProxy and Nginx.
- Experience managing web servers such as Apache and Nginx.
Automation & Infrastructure as Code (Highly Emphasized)
- Strong automation experience using Ansible for configuration management and orchestration.
- Hands-on expertise in Terraform for Infrastructure as Code (IaC), provisioning, and lifecycle management.
- Advanced scripting skills in Bash and Python to automate operational tasks and improve system reliability.
Virtualization & Cloud
- Support and manage virtualization platforms including VMware, KVM, and Proxmox.
- Manage and support cloud infrastructure services including EC2 Instances, Load Balancers, VPC, S3, IAM, and DNS.
- Assist with cloud migrations, disaster recovery planning, and hybrid infrastructure design.
Monitoring & Logging
- Experience with enterprise monitoring and observability tools including Datadog.
- Hands-on experience with metrics visualization using Grafana.
- Knowledge of time-series monitoring using Prometheus.
- Experience implementing infrastructure and service monitoring using Zabbix.
- Exposure to application performance monitoring tools such as Dynatrace.
Containerization & Orchestration
- Proficiency with container technologies such as Docker and Podman in Linux environments.
Operations & Incident Management
- Act as Level 3 support for complex Linux, automation, and infrastructure-related issues.
- Participate in incident handling, root cause analysis (RCA), and preventive action planning.
- Ensure SLA compliance and high availability for mission-critical airline applications.
- Provide on-call and shift support as per operational requirements.
Qualifications (Academic, training, languages)
- RHCSA / RHCE Linux certification is preferred.
- Cloud certifications such as AWS, OCI, Azure, or GCP are an added advantage.
- Fluent in English.
Work Experience
- 3 to 8 years in IT infrastructure design and management of 24x7 critical operations, preferably with airlines
- IaC (Ansible, Terraform), cloud experience (OCI/AWS/Azure), load balancers (Nginx/HAProxy)
- Working experience in implementing infrastructure projects, solution designs, and architecture
- Holistic IT knowledge of heterogeneous technology environments, with experience across different types of end-to-end technology stacks
- Operations and management of technology platforms, both internally and externally hosted
- Hands-on technical leadership, technical solution design, and architecture
- Proven skills in analyzing data, identifying pitfalls, and recommending cost-effective solutions
- Capable of conducting cost-benefit analysis for IT investments
- Cost-oriented, with effective problem-solving and decision-making skills
- Employs technical expertise and interpersonal relations to execute new initiatives and achieve the company's objectives
- Demonstrates the ability to contribute and successfully deliver against business strategy and set KPIs