Data Operations Engineer
EDUCATION / QUALIFICATIONS / EXPERIENCE
- B.S. in computer science or information systems required, or 5+ years of related work experience
- Strong analytical and critical thinking skills applied to solving complex problems
- Strong technical background with a mix of development and automation skills
- Outstanding attention to detail and consistently meets deadlines
- Exceptional communication and interpersonal skills
- Ability to work within a highly collaborative team, while also being a self-starter able to work independently with little guidance
- Experience in troubleshooting, performance tuning, and optimization
- Proficient in shell scripting, Python, Scala or other programming languages
- Knowledge of Spark/PySpark
- Excellent SQL knowledge, ability to read/write SQL queries
- Skilled in Hive (HQL) and HDFS
- Experience working with both unstructured and structured data sets, including flat files, JSON, XML, ORC, Parquet and AVRO
- Comfortable working with big data environments and dealing with large diverse data sets
- Proficient in Linux environments
- Familiarity with source code management/versioning tools such as GitHub
- Understanding of CI/CD principles and best practices in data processing
- Experience building data visualization dashboards to capture data quality metrics using tools such as Tableau or Big Data Studio
- Understanding of public cloud technologies such as AWS, GCP and Azure is a plus
MAJOR JOB RESPONSIBILITIES
- Contribute to the maintenance, documentation, and monitoring of supported data pipelines
- Continuously analyze supported data workflows for opportunities to improve reliability and timeliness against established SLAs
- Conceive, develop, and apply improvements to workflows and monitoring to minimize the occurrence and impact of defects
- Communicate with stakeholders when data is in error or delayed, with clear plans and timelines for recovery and future prevention
- Develop modifications to workflows using Git and GitHub
- Assist with production support tickets and inquiries from consumers of supported data pipelines
- Facilitate the onboarding of new products and pipelines into our suite of supported production processes
ABOUT YOU
You are passionate about improving the integrity, accuracy and reliability of data across the organization. You are a highly motivated individual with excellent analytical, critical thinking and problem-solving skills. You bring substantial value to the team with your prior experience in building and supporting production data and reporting pipelines. Strong verbal, written and interpersonal skills give you the flexibility to work collaboratively with a team or independently with minimal supervision. You're a detail-oriented self-starter with the ability to multitask and thrive in a dynamic environment. Furthermore, your familiarity with techniques for automating, cleansing and standardizing data at rest and in motion makes you a great fit for this role.
WHAT YOU'D BE DOING
As a Data Operations Engineer, you will be responsible for monitoring and maintaining multiple data ETL pipelines, which power hundreds to thousands of business-critical applications and reports used every day by many teams throughout the organization as well as by external customers. Due to the scale, variety, and complexity of the processes we support, standardized and automated practices and tools are needed for monitoring and maintaining the pipelines. It is your duty to ensure that monitoring alerts Data Operations to any issues with data pipelines with appropriate timeliness and sensitivity, and that any alerts are dealt with in a timely and appropriate manner. This can involve a range of tasks, including job regeneration, communication to stakeholders, definition and development of process enhancements via code, and updates to process documentation. In addition, you will assist with production support inquiries from stakeholders about the quality or timeliness of data in reporting. Finally, you will work with other teams to facilitate the onboarding of new data pipelines and products into our suite of supported production processes. Strong analytical and problem-solving skills are needed throughout to model, monitor, and troubleshoot the production processes.
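To give a flavor of the monitoring work described above, below is a minimal, illustrative sketch of a partition freshness check in Python. The dataset path, the SLA value, and the alerting behavior are assumptions made for the example only; they are not a description of this team's actual pipelines or tooling.

```python
#!/usr/bin/env python3
"""Minimal, illustrative freshness check for a daily-partitioned dataset.

Hypothetical example only: the path layout, SLA, and alert behavior are
assumptions, not this team's actual tooling.
"""
from __future__ import annotations

import sys
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Assumed landing location for a daily-partitioned dataset (dt=YYYY-MM-DD).
DATA_ROOT = Path("/data/warehouse/orders")   # hypothetical path
SLA = timedelta(hours=6)                     # hypothetical freshness SLA


def latest_partition(root: Path) -> Path | None:
    """Return the most recent dt= partition directory, if any exist."""
    partitions = sorted(root.glob("dt=*"))
    return partitions[-1] if partitions else None


def check_freshness(root: Path, sla: timedelta) -> bool:
    """Return True if the newest partition was written within the SLA window."""
    newest = latest_partition(root)
    if newest is None:
        print(f"ALERT: no partitions found under {root}")
        return False
    written = datetime.fromtimestamp(newest.stat().st_mtime, tz=timezone.utc)
    age = datetime.now(tz=timezone.utc) - written
    if age > sla:
        # In practice this would page on-call or post to the team's alerting channel.
        print(f"ALERT: {newest} is {age} old, exceeding the {sla} SLA")
        return False
    print(f"OK: {newest} written {age} ago, within SLA")
    return True


if __name__ == "__main__":
    sys.exit(0 if check_freshness(DATA_ROOT, SLA) else 1)
```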