Sr. Data Engineer
Cleburne, Texas, United States
- Designed and implemented data pipelines on the Azure cloud platform (HDInsight, Data Lake, Databricks, Blob Storage, Data Factory, Synapse, SQL, SQL DB, DWH and Data Storage Explorer).
- Developed custom ETL, batch-processing and real-time data ingestion pipelines to move data to and from the Hadoop cluster using PySpark and shell scripts.
- Integrated on-premises data (MySQL, Cassandra) with the cloud (Blob Storage, Azure SQL Database) and applied transformations to load it into Azure Synapse using Azure Data Factory.
- Built Docker container images, published them to Azure Container Registry and deployed them to Azure Kubernetes Service (AKS).
- Processed change data capture (CDC) data using Spark and saved it as Parquet in HDFS for later analysis.