I have been coding primarily in Python for about 10 years, working with large data systems such as the Hadoop Distributed File System (HDFS) and Spark clusters, application server code in Django and Flask, statistical analysis libraries such as Pandas and NumPy, and efficient SQL queries for data retrieval. Alongside code development, I have been a principal architect for data storage, ETL, and warehousing solutions in the AWS public cloud using Glue, S3, data lakes, and Redshift Spectrum. These data infrastructure solutions have been implemented with HashiCorp’s Terraform, with ETL jobs scheduled and managed through Airflow DAGs. For managing code, infrastructure, and deployments, I am well versed in Jenkins for CI/CD and Git for version control. For deployment, I use Docker images, and for more complex applications I have used Kubernetes.
Listed skills include Python, Git, Repository Management, Technical Design, and 7 others.