Director Application Development
- Tech lead of an Agile team maintaining and enhancing a cloud-based data lake built on multiple ETL data pipelines. Work included:
- Extracted enterprise financial data from multiple on-prem and cloud data sources (such as Oracle Financials, Oracle DB, SQL Server, DB2, Hadoop file system, CSV files, AWS RDS DB, AWS Redshift data share, and the Anaplan platform).
- Applied ETL transformations to the imported raw financial data on S3 per business requirements, then saved the curated data to AWS S3/Athena/Glue Data Catalog and a Redshift cluster.
- Dynamically generated customized data sets from the data lake to meet reporting and ETL needs, for consumption by other data lakes or BI tools (such as Power BI or Tableau).
- Components used in the above data lake system include AWS Lambda functions, Glue crawlers, Glue Data Catalog and ETL jobs (PySpark-based), S3, Athena databases, Redshift clusters/data shares, and AWS RDS databases (Postgres-based).