Senior Systems Engineer Virtualization Job Vacancy in UAE Dubai
A unique technology group with a very human purpose, on a journey, to invent visionary artificial intelligence for a better everyday.
The opportunity
The Senior Systems Engineer engages in the design, leads implementation, and provides Level 3 expert support for large-scale private cloud computing infrastructure, with a specific emphasis on computing technologies including hardware layer, operating system, hypervisor, and orchestration services.
G42 is an Abu Dhabi based artificial intelligence and cloud computing company with a global footprint delivering holistic and scalable AI solutions to a variety of commercial and government clients. The Group’s business operations cover a wide range of industry verticals including Healthcare, Government, Smart City & Smart Mobility, Oil & Gas, Fintech, Geospatial, Aviation, Cloud Computing, Big Data Analytics and Sports.
Responsibilities
Co-design, implement, and manage hybrid virtualization and containerized platforms based on OpenStack, VMware VCF and/or Red Hat OpenShift, ensuring platform stability, performance, and compliance with industry standards and best practices.
Collaborate with architecture and engineering teams on the evaluation and selection of technology stack components, ensuring solutions follow best practices and are optimized from both functional and non-functional perspectives.
Conduct regular capacity planning exercises to anticipate and accommodate the growing demands on the virtualized environment, ensuring it meets current and future requirements.
Develop and implement plans to enhance the reliability of the computing infrastructure, addressing potential points of failure and ensuring high availability of services.
Explore, analyze, and implement performance optimization strategies for the cloud computing environment, ensuring optimal resource utilization and responsiveness.
Collaborate with relevant teams to conduct regular performance assessments and implement improvements based on findings.
Prepare and participate in complex changes to production environments in support of operational teams.
Develop automated testing and automation solutions for the cloud platform using tools such as Jenkins and Selenium, along with infrastructure-as-code, configuration management, and CI/CD tools such as Terraform, Ansible, Puppet, Chef, and GitLab CI/CD.
Provide L3 expert support, including on-call shifts, with a focus on immediate management and resolution of incidents such as outages, breaches, and system failures.
Write and maintain relevant documentation ensuring completeness and quality.
Prepare and provide training for operational teams in the related technical domains.
Collaborate with security management teams to ensure that systems are safe and secure against cybersecurity threats.
Work closely with process management and operational teams and contribute to process development, standardizing the collaboration framework and improving collaboration efficiency.