Akhil K

Data Engineer @ American Express
United States
Akhil K's Location
United States
About Akhil K

I have over 10 years of experience in IT, specializing in Big Data technologies. My expertise spans tools and frameworks such as Hadoop (MapReduce, Pig, Hive, ZooKeeper, HBase, Sqoop, Oozie, Flume, Drill, Spark) and Snowflake for data sharing and security. I've built data pipelines using Kafka and Spark, and I'm skilled in generating insights from large datasets by applying machine learning, mathematical modeling, and operations research. I'm proficient in Apache Flume and Kafka for data processing, and I've developed PySpark workflows to handle large-scale data from diverse sources. My IT skills also cover programming languages, databases, operating systems, analytics tools, data warehousing, ETL tools, and cloud platforms such as AWS and Azure.

Akhil K's Current Company Details
American Express
Data Engineer
United States
Employees: 79,797
Akhil K Work Experience Details
  • American Express
    Data Engineer
    United States
  • American Express
    Big Data Engineer
    May 2023 - Present
    Phoenix, Arizona, United States
    • Responsible for estimating cluster size and for monitoring and troubleshooting the Spark Databricks cluster.
    • Automated resulting scripts and workflows using Apache Airflow and shell scripting to ensure daily execution in production.
    • Analyzed, designed, and built modern data solutions using Azure PaaS services to support data visualization; assessed the current production state of the application and the impact of new implementations on existing business processes.
    • Integrated machine learning models with PySpark to enhance fraud detection accuracy and responsiveness.
    • Implemented continuous integration / continuous delivery best practices using Azure DevOps, ensuring code versioning.
    • Created and managed tables, schemas, and databases in Snowflake.
    • Created Power BI reports to identify process bottlenecks as Lean Champion.
    • Developed custom Kafka producers and consumers for publishing to and subscribing from different Kafka topics.
    • Designed and implemented a PySpark-based ETL pipeline to process and analyze customer interaction data, resulting in improved customer segmentation and targeted marketing strategies.
    • Extracted, transformed, and loaded data from source systems to Azure Data Storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics); ingested data into Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed it in Azure Databricks.
    • Migrated MapReduce jobs to Spark jobs to achieve better performance.
    • Wrote MapReduce programs and Hive UDFs in Java.
    • Extracted and updated data in HDFS using Sqoop import and export.
    • Developed a Spark job in Java that indexes data into Elasticsearch from external Hive tables in HDFS.
    • Worked extensively on Azure Databricks.
    • Configured an on-premises Power BI gateway for different source matching analysis criteria and further modeling for Power Pivot and Power View.
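The PySpark-based customer segmentation pipeline described above can be sketched in miniature. The field names, event shape, and count-based segmentation rule below are hypothetical illustrations; in production the same logic would run as PySpark DataFrame transformations over much larger data, not plain Python.

```python
from collections import defaultdict

def segment_customers(interactions, high_value_threshold=5):
    """Group raw interaction events by customer and assign a segment.

    `interactions` is an iterable of (customer_id, channel) tuples --
    a stand-in for rows read from the source system. The threshold-based
    rule is illustrative only.
    """
    counts = defaultdict(int)
    for customer_id, _channel in interactions:
        counts[customer_id] += 1
    return {
        cid: "high_value" if n >= high_value_threshold else "standard"
        for cid, n in counts.items()
    }

# Example: six web touches for c1, two phone touches for c2.
events = [("c1", "web")] * 6 + [("c2", "phone")] * 2
segments = segment_customers(events)
```

In PySpark the equivalent would be a `groupBy("customer_id").count()` followed by a conditional column, but the aggregation-then-classify shape is the same.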
  • State Of Wisconsin
    Data Engineer
    Dec 2021 - Apr 2023
    Madison, Wisconsin, United States
    • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
    • Implemented a data lake architecture using PySpark and AWS S3, enabling centralized storage and processing of diverse data sources.
    • Developed Kafka producers and consumers streaming millions of events per second.
    • Extracted large datasets from AWS using SQL queries to create reports.
    • Designed and maintained CI/CD pipelines.
    • Involved in HBase setup and in storing data in HBase for further analysis.
    • Wrote Hadoop jobs for analyzing data using Spark, Hive, Pig, and MapReduce.
    • Knowledgeable in job workflow scheduling and locking tools/services such as Oozie, ZooKeeper, Airflow, and Apache NiFi.
    • Created a continuous integration and continuous delivery (CI/CD) pipeline on AWS that automates steps in the software delivery process.
    • Developed a data pipeline with AWS to extract data from weblogs and store it in HDFS; worked extensively with Sqoop to import metadata from Oracle.
    • Implemented sentiment analysis and text analytics on Twitter social media feeds and market news using Scala and Python.
    • Configured ZooKeeper to coordinate servers in clusters and maintain the data consistency needed for decision making.
    • Designed a data analysis pipeline in Python using AWS services such as S3, EC2, Lambda, Auto Scaling, CloudWatch, IAM, Security Groups, CloudFormation, and Elastic MapReduce (EMR).
    • Developed ETL jobs in Spark/Scala to migrate data from Oracle to new Hive tables.
    • Set up DevOps pipelines for CI/CD with Git, Jenkins, and a Nexus repository.
    • Developed Oozie workflows to automate loading data into NiFi and pre-processing it with Pig.
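The sentiment analysis on Twitter feeds and market news mentioned above can be sketched with a minimal lexicon-based scorer. The word lists below are hypothetical stand-ins; the production work used Scala and Python at streaming scale, and real systems typically use trained models rather than fixed lexicons.

```python
# Illustrative positive/negative lexicons -- not from the original project.
POSITIVE = {"good", "great", "up", "gain", "strong"}
NEGATIVE = {"bad", "weak", "down", "loss", "drop"}

def sentiment(text):
    """Score a text by counting lexicon hits and label it.

    Positive-minus-negative word count decides the label; ties are
    labeled neutral. A toy version of the streaming classifier.
    """
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

In a Kafka + Spark setting this function would be applied per message inside a streaming map stage, with the labels aggregated downstream.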
  • CNA Insurance
    Data Engineer
    Aug 2019 - Nov 2021
    Chicago, Illinois, United States
    • Developed Hive UDFs to incorporate external business logic into Hive scripts, and developed join dataset scripts using Hive join operations.
    • Created Linked Services for multiple source systems (Azure SQL Server, ADLS, Blob Storage, REST API).
    • Performed data visualization and designed dashboards with Tableau; generated complex reports including charts, summaries, and graphs to interpret findings for the team and stakeholders.
    • Developed Spring Boot applications to read data from Kafka in an event-driven manner, deployed as microservices.
    • Loaded data from a BDW Oracle database and Teradata into HDFS using Sqoop.
    • Implemented AJAX, JSON, and JavaScript to create interactive web screens.
    • Configured and implemented Azure Data Factory triggers and scheduled the pipelines; monitored the scheduled pipelines and configured alerts to get notified of pipeline failures.
    • Wrote Hadoop jobs for analyzing data using HiveQL (queries), Pig Latin (data-flow language), and custom MapReduce programs in Java.
    • Implemented bucketing and partitioning in Hive to assist users with data analysis.
    • Used Oozie scripts for application deployment and Perforce as the secure versioning software.
    • Performed statistical analysis using SQL, Python, R, and Excel.
    • Used Python and SAS to extract, transform, and load source data from transaction systems; generated reports, insights, and key conclusions.
    • Developed storytelling dashboards in Tableau Desktop and published them to Tableau Server, allowing end users to explore the data on the fly using quick filters for on-demand information.
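The bucketing mentioned above can be sketched by the core idea Hive uses for `CLUSTERED BY ... INTO N BUCKETS`: hash the clustering key and take it modulo the bucket count, so rows with the same key always land in the same bucket file. The sketch below uses CRC32 as a stable stand-in hash; Hive's actual hash function differs, so this is an analogy, not a reimplementation.

```python
import zlib

def hive_style_bucket(key: str, num_buckets: int = 8) -> int:
    """Assign a row to a bucket by hashing its clustering key.

    CRC32 is used here because Python's built-in hash() is salted
    per-process; Hive computes its own hash over the column value,
    but the modulo-N assignment principle is the same.
    """
    return zlib.crc32(key.encode("utf-8")) % num_buckets

# Same key always maps to the same bucket, enabling bucketed joins
# and sampling without scanning every file.
bucket = hive_style_bucket("policy-2041")
```

Partitioning, by contrast, routes rows into directories by a column's literal value (e.g. `dt=2021-06-01`), while bucketing hashes within a partition; the two are commonly combined.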
  • Tufts Health Plan
    Data Engineer
    Feb 2017 - Jul 2019
    Watertown, Massachusetts, United States
    • Created HBase tables to load large sets of structured data.
    • Stored data in AWS S3 (analogous to HDFS) and ran EMR programs over the stored data.
    • Used the AWS CLI to suspend an AWS Lambda function and to automate backups of ephemeral data stores to S3 buckets and EBS.
    • Worked extensively with Hive DDLs and Hive Query Language (HQL).
    • Developed solutions to process data into HDFS.
    • Experienced in dimensional modeling (star schema, snowflake schema), transactional modeling, and SCD (slowly changing dimensions).
    • Devised PL/SQL stored procedures, functions, triggers, views, and packages; used indexing, aggregation, and materialized views to optimize query performance.
    • Analyzed data using MapReduce, Pig, and Hive, and produced summary results from Hadoop for downstream systems.
    • Implemented Sqoop for large dataset transfers between Hadoop and RDBMSs.
    • Created components such as Hive UDFs to fill functionality gaps in Hive for analytics.
    • Developed scripts and batch jobs to schedule Oozie bundles (groups of coordinators).
    • Developed logistic regression models in Python to predict subscription response rates based on customer variables such as past transactions, responses to prior mailings, promotions, demographics, interests, and hobbies.
    • Developed near-real-time data pipelines using Spark.
    • Connected Tableau to AWS Redshift to extract live data for real-time analysis.
    • Provided cluster coordination services through ZooKeeper.
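The slowly changing dimension (SCD) modeling noted above can be sketched as a type-2 upsert: when an incoming value differs from the current dimension row, close the old row and append a new current one, preserving history. The row fields (`key`, `value`, `start_date`, `end_date`, `is_current`) are hypothetical; in practice this runs as SQL `MERGE` or Spark jobs, not in-memory Python.

```python
def scd2_upsert(dimension, key, new_value, load_date):
    """Apply a type-2 slowly-changing-dimension update.

    `dimension` is a list of row dicts. If the current row for `key`
    already holds `new_value`, nothing changes; otherwise the current
    row is end-dated and a new current row is appended.
    """
    for row in dimension:
        if row["key"] == key and row["is_current"]:
            if row["value"] == new_value:
                return dimension  # no change -> keep history as-is
            row["is_current"] = False
            row["end_date"] = load_date
            break
    dimension.append({
        "key": key, "value": new_value,
        "start_date": load_date, "end_date": None, "is_current": True,
    })
    return dimension
```

The payoff of type 2 over type 1 (overwrite) is that queries can reconstruct the dimension as it looked on any past date by filtering on the start/end dates.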
  • Digital Ignite
    Data Engineer
    Jan 2014 - Oct 2016
    Hyderabad, Telangana, India
    • Designed, implemented, and deployed, within a customer's existing Hadoop/Cassandra cluster, a series of custom parallel algorithms for various customer-defined metrics and unsupervised learning models.
    • Installed and configured Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.
    • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
    • Implemented Avro and Parquet data formats for Apache Hive computations to handle custom business requirements.
    • Developed simple to complex MapReduce jobs using Hive and Pig.
    • Developed MapReduce programs for data analysis and data cleaning.
    • Performed data cleansing, enrichment, and mapping tasks, and automated data validation processes to ensure meaningful and accurate data was reported efficiently.
    • Implemented Apache Pig scripts to load data from and store data into Hive.
    • Extensively used SSIS transformations such as Lookup, Derived Column, Data Conversion, Aggregate, Conditional Split, SQL Task, Script Task, and Send Mail Task.
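The MapReduce jobs for data analysis and cleaning mentioned above follow a two-phase shape that can be sketched in pure Python: a map phase that cleans and emits key/value pairs, and a reduce phase that groups by key and aggregates. The counting job and the cleaning rules below are illustrative, not the original jobs.

```python
from itertools import groupby

def map_phase(records):
    """Map step: normalize each record and emit (key, 1) pairs.

    Blank/whitespace-only records are dropped here -- a stand-in for
    the data-cleaning logic a real mapper would carry.
    """
    for rec in records:
        field = rec.strip().lower()
        if field:
            yield field, 1

def reduce_phase(pairs):
    """Reduce step: sort by key (the framework's shuffle), group, sum."""
    out = {}
    for key, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        out[key] = sum(v for _, v in group)
    return out

counts = reduce_phase(map_phase(["Hive", "pig", " ", "hive"]))
```

In Hadoop the sort-and-group between the phases is the shuffle performed by the framework; here `sorted()` plays that role for a single process.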

Akhil K Education Details
Jntuh College Of Engineering Hyderabad

Frequently Asked Questions about Akhil K

What company does Akhil K work for?

Akhil K works for American Express

What is Akhil K's role at the current company?

Akhil K's current role is Data Engineer.

What schools did Akhil K attend?

Akhil K attended Jntuh College Of Engineering Hyderabad.

Who are Akhil K's colleagues?

Akhil K's colleagues are Sutapa Samanta, Erik Alvarado, Jingjin L., Vikas Anand, Wilson Murillo, Afsana Shaikh, Hugo Miguel Robel Orozco.
