Job Opening: System Administrator

About the job

Skills: Big Data, PySpark, BigQuery, cloud platforms (Google Cloud, AWS, or Azure), Linux, Shell/Python, Ansible

Overview:
This role involves managing and optimizing Big Data environments (PySpark, BigQuery, Airflow) in Google/AWS/Azure cloud, ensuring efficient, secure, and cost-effective operations. Responsibilities include 24x7 support, data pipeline optimization, automation, and troubleshooting, with a focus on DevOps, CI/CD, and disaster recovery.

Roles and Responsibilities:
(Google/AWS/Azure public cloud, PySpark, BigQuery, and Google Airflow)

  • Participate in rotational 24x7x365 shift support and operations for the SAP environment.
  • As a team lead, you will be responsible for maintaining the upstream Big Data environment, consisting of PySpark, BigQuery, Dataproc, and Google Airflow, through which millions of financial transactions flow daily.
  • You will be responsible for streamlining and tuning existing Big Data systems and pipelines and for building new ones; ensuring the systems run efficiently and at minimal cost is a top priority.
  • Manage the operations team in your shift and make changes to the underlying systems.
  • This role involves providing day-to-day support, enhancing platform functionality through DevOps practices, and collaborating with application development teams to optimize database operations.
  • Architect and optimize data warehouse solutions using BigQuery to ensure efficient data storage and retrieval.
  • Install, build, patch, upgrade, and configure Big Data applications.
  • Manage and configure BigQuery environments, datasets, and tables.
  • Ensure data integrity, accessibility, and security in the BigQuery platform.
  • Implement and manage partitioning and clustering for efficient data querying (a short sketch follows this list).
  • Define and enforce access policies for BigQuery datasets.
  • Implement query usage caps and alerts to avoid unexpected expenses (a cost-control sketch follows this list).
  • Troubleshoot issues and failures on Linux-based systems, with a strong grasp of the Linux command line.
  • Create and maintain dashboards and reports to track key metrics like cost and performance.
  • Integrate BigQuery with other Google Cloud Platform (GCP) services like Dataflow, Pub/Sub, and Cloud Storage.
  • Enable access to BigQuery through tools such as Jupyter Notebook, Visual Studio Code, and command-line clients.
  • Implement data quality checks and data validation processes to ensure data integrity.
  • Manage and monitor data pipelines using Airflow and CI/CD tools (e.g., Jenkins, Screwdriver) for automation; a minimal Airflow DAG sketch follows this list.
  • Collaborate with data analysts and data scientists to understand data requirements and translate them into technical solutions.
  • Provide consultation and support to application development teams for database design, implementation, and monitoring.
  • Apply proficiency in Unix/Linux OS fundamentals, Shell/Perl/Python scripting, and Ansible for automation.
  • Apply Disaster Recovery and High Availability expertise, including backup/restore operations.
  • Draw on experience with geo-redundant databases and Red Hat clustering.
  • Be accountable for ensuring that delivery stays within the defined SLAs and agreed project milestones, following best practices and processes for continuous service improvement.
  • Work closely with other Support Organizations (DB, Google, PySpark data engineering, and Infrastructure teams).
  • Handle Incident Management, Change Management, Release Management, and Problem Management.
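
As a rough illustration of the partitioning and clustering work referenced above, the sketch below creates a day-partitioned, clustered BigQuery table with the Python client library. The project, dataset, table, and field names are hypothetical placeholders, not part of this posting.

```python
# Minimal sketch: create a partitioned, clustered BigQuery table.
# Assumes google-cloud-bigquery is installed and Application Default
# Credentials are configured; all names below are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="example-project")

schema = [
    bigquery.SchemaField("txn_id", "STRING"),
    bigquery.SchemaField("account_id", "STRING"),
    bigquery.SchemaField("amount", "NUMERIC"),
    bigquery.SchemaField("txn_ts", "TIMESTAMP"),
]

table = bigquery.Table("example-project.finance.transactions", schema=schema)

# Partition by day on the txn_ts column so queries can prune partitions,
# and cluster by account_id to co-locate related rows.
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field="txn_ts",
)
table.clustering_fields = ["account_id"]

client.create_table(table)
```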
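
For the query usage caps mentioned above, one common guardrail is the maximum_bytes_billed setting on a query job. This is a hedged sketch only; the query and table names are assumptions for illustration.

```python
# Minimal sketch: cap how many bytes a query may bill before it runs.
# The project, dataset, and query below are illustrative assumptions.
from google.cloud import bigquery

client = bigquery.Client(project="example-project")

job_config = bigquery.QueryJobConfig(
    maximum_bytes_billed=10 * 1024**3,  # fail queries that would bill > 10 GiB
    use_query_cache=True,               # serve repeated queries from cache at no cost
)

query = """
    SELECT account_id, SUM(amount) AS total_amount
    FROM `example-project.finance.transactions`
    WHERE txn_ts >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
    GROUP BY account_id
"""

job = client.query(query, job_config=job_config)
for row in job.result():
    print(row.account_id, row.total_amount)
```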
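
As a sketch of the Airflow-based pipeline management noted above, the DAG below wires two placeholder tasks into a daily schedule. It assumes Airflow 2.4 or later; the DAG ID, task IDs, and bash commands are hypothetical stand-ins for real extract and load steps.

```python
# Minimal sketch of a daily pipeline DAG (assumes Airflow 2.4+).
# Task names and bash commands are placeholders for real extract/load steps.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "retries": 2,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="daily_transactions_load",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    extract = BashOperator(
        task_id="extract_from_gcs",
        bash_command="echo 'pull raw files from Cloud Storage'",
    )
    load = BashOperator(
        task_id="load_to_bigquery",
        bash_command="echo 'load staged files into BigQuery'",
    )

    extract >> load  # run the extract step first, then the load step
```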