About the job: DevOps Engineer
Do you like it when sites stay up, like all the time? Do you enjoy finding new and
creative ways of surfacing poor quality code in production? Do you think that
Netflix’s Simian Army of chaos is the best thing ever? You may well be exactly
who we are looking for! We at Lalamove are now looking for a highly trained SRE
who can help us prevent downtime and give our engineers visibility into
what might go wrong.
You’d be a part of the infrastructure team here at Lalamove and work closely
with production engineers, security experts and tooling software engineers to
achieve the ultimate goal of plentiful 9s. You’d also work on outreach and
education within the greater engineering organisation.
What We Imagine You’d Be Doing
· Plan, set up and manage the monitoring infrastructure
· Educate developers on what kinds of application-level metrics could be valuable
· Ensure we have a relevant RED (rate, errors, duration) dashboard for our business
· Help out with the most challenging root cause analyses
· Find and fix bugs deep inside the systems we are using
· Plan and implement reliable recovery processes
· Ensure our applications are resilient to infrastructure-level failures (chaos testing)
What We’re Looking For
· At least 4 years of experience
· Experience in cloud (AWS) infrastructure
· Strong programmer in systems languages (Go, Node.js, etc.)
· Experience with different kinds of monitoring tools, and the ability to tell which ones are good for what
· Someone who thinks an open-source-first policy makes sense