Data Engineer
Current
- Analyzed data from sources such as Oracle and MySQL and imported it into Hive using Sqoop.
- Identified customers from unstructured data using PySpark.
- Wrote shell scripts to launch jobs with the required features and environment.
- Moved large volumes of data between Hadoop servers using compression techniques.
- Created real-time data pipelines and frameworks with Kafka and Spark Streaming that load data into HBase and feed APIs.
- Developed ETL jobs to automate the processing of student data from enrollment through graduation, along with the service fee calculation for weekly finance payments to Confidential.