Data Engineer (Hybrid Set-up)
Data Engineers are mainly tasked with transforming data into a format that can be easily analyzed. The Data Engineer will be responsible for designing and developing Sprout's data system, processing data and extracting features, and deploying the data science team's machine learning models. The Data Engineer will support our software engineers, architects and data scientists on data initiatives and will ensure that an optimal data delivery architecture is applied consistently across ongoing projects.
Responsibilities:
Build, optimize and maintain conceptual, logical and physical database models
Develop database solutions to store and retrieve information
Assemble datasets that meet functional/non-functional business requirements
Build the infrastructure for optimal ETL from a wide variety of data sources
Monitor data integrity and adopt appropriate tools
Improve system performance
Optimize or re-design data architecture to support Sprout's next generation of products and data initiatives
Design, develop, test and deploy web service APIs
Work with Data Scientists to identify future needs and requirements
Deploy models and algorithms developed by the Data Science team
Requirements:
Extensive knowledge of databases (SQL and/or NoSQL) and data engineering best practices
Expertise in SQL and other programming languages (e.g. Python, Java, Scala, shell scripting)
Experience with data modeling (data warehouse, data lake) and designing data storage schemes
Familiarity with data engineering and ETL software tools (e.g. Hadoop, Spark, Talend, SSAS) is also helpful
Experience building and optimizing data pipelines, architecture and datasets
Experience in software development
Experience with Azure
A successful history of manipulating, processing and extracting value from large disconnected datasets
Familiarity with data visualization tools (e.g. Power BI)
Familiarity with agile development as a project management methodology is a plus
Strong problem-solving and analytical skills
Must be self-motivated and comfortable supporting the data needs of multiple teams, systems and products
A good team player with a willingness to learn
Strong innate desire and proven track record of continuous self-improvement (in learning, job expansion, extracurricular activities, etc.)