Data Engineer – Job code: DE-39
Posted 3 years ago
Responsibilities:
- Develop and deploy ETL (Extract/Transform/Load) pipelines, design database systems, and develop tools for real-time and offline analytical processing using Python and SQL.
- Automate ETL pipelines to dynamically capture the correct paths for logs and ingest data into data lakes for individual reports and pipelines by leveraging APIs and configuration files.
- Develop and release pipelines using Continuous Integration, Continuous Delivery, and Continuous Deployment (CI/CD) methodology.
- Contribute to the internal PyPI (Python) utilities package to enhance the capabilities of ETL pipelines based on Cyber and Enterprise initiatives. Provide production support and fix bugs in the utility package.
- Leverage Amazon Web Services (AWS) technologies, configuring AWS Glue and IAM roles to correctly access the Data Catalog, i.e., configure the source and target tables.
- Configure job processing environments such as packages, arguments, dependencies, et cetera.
- Generate PySpark scripts and ensure annotations are configured correctly. Test and implement new pipeline releases through regression testing.
- Add data quality rules such as record counts, error-tolerance thresholds, and sophisticated regular expressions for new, existing, and migrated pipelines, following enterprise-recommended standards to ensure compliance.
- Address data quality issues when pipelines fail to meet standards or data quality rules.
- Troubleshoot software and pipeline processes for data consistency and integrity.
- Partner with internal clients to gain an enhanced understanding of business functions and informational needs.
- Review the end product with the client to ensure alignment with intent and deliverables.
- Provide job aids to team members and business users.
- Leverage the AWS Software Development Kit (boto3) instead of GUI applications to automate and deploy applications and pipelines using Python.
- Build and enhance Tableau dashboards to support data and metric visualization for process improvement and to facilitate business decisions.
Minimum Educational Requirements: Minimum of a Bachelor’s degree in computer science, computer information systems, information technology, or a combination of education and experience equating to the U.S. equivalent of a Bachelor’s degree in one of the aforementioned subjects.
Work Location: Sritek Inc, 3120 Hudson Crossing, Suite # D1, McKinney TX 75070
Note: Frequent travel across the USA is required based on client needs. Interested candidates may forward their resume to our office or email it to [email protected]