About the job: Site Reliability Engineers (SREs)
Hiring Position: Site Reliability Engineers (SREs): Open to All Nationalities
Working Condition: 100% On-Site: BTS Accessible
Location: Bangkok, Thailand
Pay Rate: THB 75,000 to THB 100,000
_______________________________________________________________________
About the Role:
Our client is looking for Site Reliability Engineers (SREs) to join their team and play a critical role in improving the quality of software processes and services in production. SREs act as a bridge between development and operations, writing code to automate processes and make delivery more efficient. They are responsible for the availability, performance, efficiency, monitoring, capacity planning, and overall reliability of the services they manage.
________________________________________________________________________
Responsibilities:
- Monitor the health of services and collaborate with developers to ship changes faster, using the monitoring support built into each service.
- Define metrics for SLIs (Service Level Indicators), set SLOs (Service Level Objectives), and track error budgets to mitigate service risks.
- Use dashboards that aggregate metrics and logs, including the golden signals (latency, traffic, errors, and saturation), to assess service health quickly and reduce MTTR (Mean Time to Recovery).
- Take ownership of platform-related incident management and resolution, ensuring timely communication and effective problem-solving.
- Automate provisioning and maintenance tasks using scripts and automation tools.
________________________________________
Qualifications:
- Junior Level: 3 to 5 years of experience as a software engineer or systems administrator, and a willingness to transition into an SRE role.
- Senior Level: A minimum of 5 years of experience as an SRE.
- Proficiency in at least one programming or scripting language (e.g., Bash, Python, PowerShell).
- Hands-on experience with observability tools such as Datadog, Grafana, Elasticsearch, and Kibana.
- Familiarity with cloud services (e.g., AWS).
- Strong command of English, both spoken and written.
________________________________________
Nice to Have:
- Knowledge of best practices and IT operations for always-available, highly scalable services.
- Experience with automation tools and CI/CD platforms (e.g., GitHub Actions, Jenkins, Ansible, Terraform).
- Familiarity with containerization and container orchestration tools, including Docker, Kubernetes (K8s), and Helm.
- Understanding of IT Service Management (ITSM) processes, such as incident, problem, and change management.