Oracle + PySpark Data Engineer (Remote)

Job Description: 

Experience: 5 to 7 years

We are seeking a highly skilled and motivated Oracle + PySpark Data Engineer/Analyst to join our team. The ideal candidate will use Oracle databases and PySpark to manage, transform, and analyze data in support of business decision-making. This role plays a crucial part in maintaining data integrity, optimizing data processes, and enabling data-driven insights.

Key Responsibilities:

1. Data Integration: Integrate data from various sources into Oracle databases and design PySpark data pipelines to enable data transformation and analytics.
2. Data Transformation: Develop and maintain data transformation workflows using PySpark to clean, enrich, and structure data for analytical purposes (see the sketch after this list).
3. Data Modeling: Create and maintain data models within Oracle databases, ensuring data is structured and indexed for optimal query performance.
4. Query Optimization: Write complex SQL queries and PySpark transformations for efficient data retrieval and processing.
5. Data Analysis: Collaborate with data analysts and business teams to provide insights through data analysis and reporting.
6. Data Quality: Implement data quality checks, error handling, and validation processes to ensure data accuracy and reliability.
7. Performance Tuning: Optimize Oracle databases and PySpark jobs to improve overall data processing and analysis performance.
8. Documentation: Create and maintain comprehensive documentation for data models, ETL processes, and the codebase.
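
As a rough, hypothetical sketch of the kind of pipeline this role owns (all connection details, schema names, table names, and columns below are placeholders, not details of our environment), a PySpark job might read an Oracle table over JDBC, clean it, run a data quality check, and write the curated result back:

```python
from pyspark.sql import SparkSession, functions as F

# Placeholder connection details; assumes the Oracle JDBC driver
# (e.g. ojdbc8.jar) is on the Spark classpath.
JDBC_OPTS = {
    "url": "jdbc:oracle:thin:@//db-host:1521/ORCLPDB1",
    "user": "etl_user",
    "password": "********",
    "driver": "oracle.jdbc.OracleDriver",
}

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# 1. Integrate: read a source table from Oracle over JDBC.
orders = spark.read.format("jdbc").options(dbtable="SALES.ORDERS", **JDBC_OPTS).load()

# 2. Transform: drop rows missing the key and standardize a column.
cleaned = (
    orders
    .dropna(subset=["ORDER_ID"])
    .withColumn("STATUS", F.upper(F.col("STATUS")))
)

# 3. Quality check: fail the job if any missing or non-positive amounts remain.
bad = cleaned.filter(F.col("AMOUNT").isNull() | (F.col("AMOUNT") <= 0)).count()
if bad > 0:
    raise ValueError(f"data quality check failed: {bad} rows with bad AMOUNT")

# 4. Load: write the curated result back to a target table.
(
    cleaned.write.format("jdbc")
    .options(dbtable="SALES.ORDERS_CURATED", **JDBC_OPTS)
    .mode("append")
    .save()
)
```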

Requirements

  • Proven experience working with Oracle databases and PySpark (see the partitioned-read example at the end of this posting).
  • Strong proficiency in SQL, PL/SQL, Python, and PySpark.
  • Familiarity with Oracle database administration, data warehousing, and ETL concepts.
  • Understanding of big data technologies and distributed computing principles.
  • Strong analytical and problem-solving skills.
  • Excellent communication and teamwork abilities.
  • Knowledge of data security and compliance standards.
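
Likewise, as a hedged illustration of the query optimization and performance tuning work described above (again, every name here is a placeholder), a JDBC read can push filtering down to Oracle and parallelize the fetch across partitions:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("oracle-partitioned-read").getOrCreate()

# The subquery pushes the WHERE clause down to Oracle; the partition
# options split the read across 8 parallel JDBC connections bounded
# on a numeric key.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@//db-host:1521/ORCLPDB1")
    .option("dbtable", "(SELECT order_id, amount FROM sales.orders WHERE amount > 0) o")
    .option("user", "etl_user")
    .option("password", "********")
    .option("driver", "oracle.jdbc.OracleDriver")
    .option("partitionColumn", "ORDER_ID")  # numeric, date, or timestamp column
    .option("lowerBound", "1")              # rough minimum of ORDER_ID
    .option("upperBound", "10000000")       # rough maximum of ORDER_ID
    .option("numPartitions", "8")
    .load()
)

print(df.rdd.getNumPartitions())  # 8 partitions, fetched in parallel
```

Note that the bounds only control how the key range is split; rows outside them are still read, landing in the first and last partitions.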