GCP Cloud Data Engineer
About the job
The Cloud Data Engineer is a hands-on role responsible for the performance, availability and scalability of the public cloud environment. You will collaborate with other architects and engineers to enable enterprise applications to meet the organization's business goals.
Role and Responsibilities:
- Work closely with application development and data engineers on day-to-day tasks.
- Participate in project planning and implementation.
- Coordinate with Data Scientists, Product Managers and business leaders to understand data needs and deliver on those needs.
- Create and maintain optimal data pipeline architecture.
- Assemble large, complex data sets that meet functional and non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and other technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Identify opportunities to improve and optimize software development processes.
- Help achieve milestones per the sprint plan while prioritizing and managing ad-hoc requests in parallel with ongoing sprints.
Candidate Profile:
Required Qualifications:
- 5+ years of hands-on experience building data pipelines (ETL/ELT) on a cloud platform
- GCP experience strongly preferred; other cloud experience such as AWS or Azure is acceptable
- 5+ years of hands-on experience building and operationalizing data processing systems
- Strong Python scripting experience is a key requirement
- 2+ years of experience with NoSQL databases and close familiarity with technologies/languages such as Python/R, Scala, Java, Hive, Spark, Kafka
- 2+ years of experience working with data platforms (data warehouse, data lake, ODS)
- 2+ years of experience with tools to automate CI/CD pipelines (e.g., Jenkins, Git, Control-M)
- Must have working experience with clinical data
Preferred Qualifications:
- GCP (Google Cloud Platform) experience
- 3+ years of experience working with healthcare/clinical data
- Data analysis / data mapping skills
- Python
- Cloud Dataflow / Dataproc / Cloud Functions
- Whistle map SDK
- Google Cloud Healthcare API / FHIR store