Data Analyst
Raleigh, North Carolina, United States
- Wrote Hive queries for data analysis to meet business requirements.
- Migrated an existing on-premises application to AWS.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Used Hive to analyze partitioned and bucketed data and compute metrics for reporting.
- Created many Spark UDFs and UDAFs in Hive for functions not already available in Hive and Spark SQL.
- Contributed to converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.