Site Reliability Senior Engineer

A. PROFILE

Role Title: Site Reliability Senior Engineer
Reporting to: Engineering Manager - DevOps
Division: Information & Communication Technology
Department / Section: Technology & Information

B. CONTEXT

Purpose: This role contributes to the ICT planning team. This includes strategic planning, solution roadmaps, capacity planning, and innovation.

Context: The Technology Unit within OM is the backbone of the organization, providing all technology services that enable OM to deliver its services to customers across all technology platforms, 24/7/365. The quality of the customer experience sits within this BU, and it therefore plays a significant role in delivering revenue and satisfaction targets.

ICT Planning plays a vital role in this context by ensuring that ICT systems meet demand and that ICT strategy is aligned with U9 business strategy and vision.

C. ROLE ACCOUNTABILITIES

  • Lead the design, development, and maintenance of a robust and efficient DevOps pipeline to enable continuous integration and delivery of software products.
  • Configure and manage automation tools such as Ansible to streamline deployment and configuration management processes.
  • Containerize applications using Docker and orchestration tools to enable scalability and portability.
  • Maintain and enhance version control systems, primarily Git, to ensure smooth code collaboration and change tracking.
  • Plan and implement integration with multiple third-party systems such as infrastructure, core, ICT, and public cloud.
  • Develop and maintain microservices using Python, adhering to best practices and coding standards.
  • Apply expertise in Oracle Linux and SQL to optimize database performance and troubleshoot issues.
  • Collaborate closely with software developers, providing timely support and deploying microservice solutions in the IT environment.
  • Plan for and scale multiple applications, ensuring efficient development, maintenance, and performance tuning.
  • Monitor system performance, analyse metrics, and implement proactive measures to ensure high availability and scalability.
  • Conduct application performance analysis and reporting for environment-related matters.
  • Participate in incident management and root cause analysis, identifying and resolving issues to minimize downtime and improve system reliability.
  • Work with industry collaborators or research institutes to explore potential new business streams in areas such as automation and process efficiency.
  • Undertake any other related or ancillary duties and responsibilities assigned based on U9 business and operational needs.

D. KEY PERFORMANCE INDICATORS

  • Time to market for IT application environments
  • Elastic scalability of IT applications, able to scale up and down with demand
  • IT application uptime of >=99% after go-live
  • All systems updated with the latest security patches

E. WORKING RELATIONSHIPS & DECISION MAKING

Interacts with:
Internal: 

  1. Infrastructure team, IT/Network team
  2. Software development team
  3. ICT demand team
  4. ICT Operation team

External: 

  1. Infrastructure vendor
  2. Security vendor

Decision Making

  1. Impact analysis approval
  2. Solution design approval
  3. Security patch and assurance approval

F. EXPERIENCE AND QUALIFICATIONS

Minimum Experience & Essential Knowledge

  • Proven ability to translate business requirements into operating technologies.
  • 3 to 5 years of relevant experience in the telecom industry.
  • Solid experience in system administration.

Minimum Entry Qualifications

  • Bachelor's degree in Telecommunications Engineering, Computer Science, or equivalent