Site Reliability Engineers (SREs)

Bangkok, Bangkok, Thailand

Hiring Position: Site Reliability Engineers (SREs): Open to All Nationalities

Working Condition: 100% On-Site (BTS Accessible)

Location: Bangkok, Thailand

Pay Rate: THB 75,000 to THB 100,000

__________________________________________________________________

About the Role

Our client is looking for a Site Reliability Engineer (SRE) to improve the reliability, performance, and efficiency of their software and IT services. This role acts as a bridge between development and operations, ensuring seamless deployment, automation, and monitoring of critical systems.

Key Responsibilities

- Monitor and Improve System Performance: Track system health, identify bottlenecks, and enhance service availability.
- Automate Processes: Reduce manual work by writing scripts and implementing automation tools.
- Service Reliability & Risk Mitigation: Define key performance metrics (SLIs, SLOs) and manage error budgets to reduce risks.
- Incident Management: Take ownership of platform-related incidents, ensure quick resolution, and enhance long-term stability.
- Collaboration with Development Teams: Work closely with engineers to streamline deployments and improve system efficiency.

Qualifications

- Junior Level: 3 to 5 years of experience as a Software Engineer or System Administrator, with a strong interest in becoming an SRE.

- Senior Level: Minimum 5 years of experience as an SRE.

- Proficiency in at least one scripting or programming language (Bash, Python, PowerShell, etc.).
- Experience with monitoring tools like Datadog, Grafana, Elasticsearch, or Kibana.
- Familiarity with cloud services such as AWS.
- Strong communication skills in English (spoken & written).

Nice to Have (Bonus Skills)

- Knowledge of IT operations and best practices for high-availability systems.
- Experience with CI/CD automation tools (GitHub Actions, Jenkins, Ansible, Terraform, etc.).
- Understanding of containerization (Docker, Kubernetes, Helm).
- Familiarity with IT service management (incident, problem, and change management).
