Deepak A


Sr Scala Developer @ AT&T
Alpharetta, GA, US
Deepak A's Location
Alpharetta, Georgia, United States
About Deepak A

• Strong knowledge of distributed computing concepts and parallel processing techniques using frameworks such as MapReduce and Spark.
• Strong hands-on experience with Hadoop ecosystem components including Spark, MapReduce, HDFS, HBase, ZooKeeper, Hive, Sqoop, Flume, Pig, and Oozie.
• Experience includes requirements gathering/analysis, design, development, versioning, integration, documentation, testing, build, and deployment.
• Strong experience with SQL solutions such as Hive and Impala for performing data analysis on large data sets.
• Familiar with NoSQL big-data databases such as HBase, MongoDB, and Cassandra.
• Experience working with MapReduce programs, Pig scripts, and Hive to deliver the best results.
• Good knowledge of building event-processing data pipelines using Kafka and Spark Streaming.
• Good knowledge of and experience with Hive query optimization and performance tuning.
• Hands-on experience writing Pig Latin scripts and custom implementations using Hive and Pig UDFs.
• Experience supporting data analysis projects using Elastic MapReduce on the Amazon Web Services (AWS) cloud; performed export and import of data into S3.
• Imported and exported data into HDFS and Hive using Sqoop.
• Experience using Flume to load log files into HDFS and Oozie for workflow design and scheduling.
• Developed, monitored, and scheduled jobs using UNIX shell scripting.
• Hands-on experience with Hadoop operations, including administration, configuration management, monitoring, debugging, and performance tuning.
• Experience tuning Hadoop clusters to achieve good processing performance.
• Well versed in the installation, configuration, support, and management of big data workloads and the underlying Hadoop cluster infrastructure.
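The MapReduce pattern referenced above can be sketched in plain Python as a conceptual illustration (not cluster code): map each record to key/value pairs, shuffle by key, then reduce each group.

```python
from collections import defaultdict

def map_phase(record):
    # Emit a (word, 1) pair for every word in the input line.
    return [(word, 1) for word in record.split()]

def shuffle(mapped):
    # Group values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Sum the counts for one word.
    return key, sum(values)

def word_count(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

print(word_count(["spark and hive", "spark and hdfs"]))
# → {'spark': 2, 'and': 2, 'hive': 1, 'hdfs': 1}
```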

Deepak A's Current Company Details
AT&T

Sr Scala Developer
Alpharetta, GA, US
Deepak A Work Experience Details
  • AT&T
    Sr Scala Developer
    AT&T Jan 2022 - Present
    Dallas, TX, US
    • Developed custom data ingestion adapters to extract log data and clickstream data from external systems and load it into HDFS using Spark/Scala.
    • Developed PySpark code for AWS Glue jobs and for EMR.
    • Worked on a scalable distributed data system using the Hadoop ecosystem on AWS EMR and the MapR distribution.
    • Developed Spark programs using the Scala API to compare the performance of Spark with HQL.
    • Implemented Spark jobs using Scala and Spark SQL for faster testing and processing of data.
    • Created Hive tables, loaded data, and wrote Hive queries for building analytical datasets.
    • Worked on real-time data ingestion and processing using Spark Streaming and HBase.
    • Used Spark and Spark SQL to read data and create Hive tables through the Scala API.
    • Implemented Spark using Scala, utilizing DataFrames and the Spark SQL API for faster processing of data.
    • Worked with the Play framework and Akka for parallel processing.
    • Worked with the Spark ecosystem, using Spark SQL and Scala queries on different formats such as text and CSV files.
    • Developed a Kafka producer and Spark Streaming consumer to read the stream of events per business rules.
    • Designed and developed job flows using TWS.
    • Developed Sqoop commands to pull data from Teradata.
    • Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data.
    • Used Avro and Parquet file formats and Snappy compression throughout the project.
    • Analyzed large data sets to determine the optimal way to aggregate and report on them.
    • Configured Spark Streaming to receive ongoing information from the source and store the stream data in HDFS.
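A Spark Streaming consumer of the kind described above aggregates an unbounded stream in small micro-batches. The idea can be sketched in plain Python (illustrative only; the event names and batch size are made up):

```python
def micro_batches(events, batch_size):
    # Split an event stream into fixed-size micro-batches, the way
    # Spark Streaming discretizes a stream into per-interval batches.
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

def count_by_key(batch):
    # Per-batch aggregation, analogous to a reduceByKey on each micro-batch.
    counts = {}
    for event in batch:
        counts[event] = counts.get(event, 0) + 1
    return counts

stream = ["click", "view", "click", "click", "view"]
results = [count_by_key(b) for b in micro_batches(stream, 3)]
print(results)  # [{'click': 2, 'view': 1}, {'click': 1, 'view': 1}]
```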
  • AT&T
    Sr. Big Data/AWS Developer
    AT&T Apr 2020 - Dec 2021
    Dallas, TX, US
    • Worked on AWS pipelines for data migration, creating and monitoring multiple services such as EC2, S3, AWS Lambda, Step Functions, EMR, Spark, and Hadoop.
    • Designed, developed, and implemented ETL solutions on AWS and in the big data environment, and migrated existing objects from on-premises systems, Redshift, and HDFS to an S3 data lake and Snowflake.
    • Developed a reusable framework, to be leveraged for future migrations, that automates ETL from RDBMS systems to the data lake using Spark Data Sources and Hive data objects.
    • Used SQL, NumPy, Pandas, Boto3, and Hive for data analysis and model building.
    • Created Python scripts in Spark for data aggregation, queries, and writing data back into an OLTP (online transaction processing) system using DataFrames/SQL/Datasets and RDD/MapReduce.
    • Involved in the design and analysis of issues, providing solutions and workarounds to users and end clients.
    • Used Python to create multiple Spark Streaming and Spark SQL jobs on AWS.
    • Created a job monitoring tool to monitor jobs scheduled using Data Pipeline, Step Functions, and EMR.
    • Created a data pipeline using processor groups and numerous processors in Apache NiFi for flat-file and RDBMS sources as part of a proof of concept (POC) on Amazon EC2.
    • Provisioned EC2 instances, as well as transient and long-running EMR (Elastic MapReduce) clusters, to handle petabytes of data.
    • Configured access to RDS database services, DynamoDB tables, and EBS volumes for inbound and outbound traffic, and set alarms for notifications or automated actions on AWS.
    • Developed a data pipeline for ingestion, aggregation, and loading of consumer response data from an AWS S3 bucket into Hive external tables in HDFS to feed Tableau dashboards.
    • Wrote scripts to collect high-frequency log data from various sources and integrate it into AWS using Kinesis, staging the data in the data lake for later analysis.
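Loading S3 data into Hive external tables, as described above, hinges on laying rows out under partition directories. A stdlib-only sketch of that partitioning step (the column names and path layout here are assumptions, not the actual pipeline):

```python
import csv
import io
from collections import defaultdict

def partition_rows(csv_text, partition_col):
    # Group rows by the partition column and build Hive-style partition
    # paths such as dt=2022-01-01/part-00000.csv, so that a Hive external
    # table partitioned on that column can discover each group.
    rows = csv.DictReader(io.StringIO(csv_text))
    parts = defaultdict(list)
    for row in rows:
        parts[row[partition_col]].append(row)
    return {f"dt={value}/part-00000.csv": group
            for value, group in parts.items()}

raw = "dt,user,amount\n2022-01-01,a,10\n2022-01-02,b,5\n2022-01-01,c,7\n"
layout = partition_rows(raw, "dt")
print(sorted(layout))
# → ['dt=2022-01-01/part-00000.csv', 'dt=2022-01-02/part-00000.csv']
```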
  • AT&T
    Sr Spark/Scala Developer
    AT&T Sep 2018 - Apr 2020
    Dallas, TX, US
    • Developed custom data ingestion adapters to extract log data and clickstream data from external systems and load it into HDFS using Spark/Scala.
    • Developed PySpark code for AWS Glue jobs and for EMR.
    • Worked on a scalable distributed data system using the Hadoop ecosystem on AWS EMR and the MapR distribution.
    • Developed Spark programs using the Scala API to compare the performance of Spark with HQL.
    • Implemented Spark jobs using Scala and Spark SQL for faster testing and processing of data.
    • Created Hive tables, loaded data, and wrote Hive queries for building analytical datasets.
    • Worked on real-time data ingestion and processing using Spark Streaming and HBase.
    • Used Spark and Spark SQL to read data and create Hive tables through the Scala API.
    • Implemented Spark using Scala, utilizing DataFrames and the Spark SQL API for faster processing of data.
    • Worked with the Play framework and Akka for parallel processing.
    • Worked with the Spark ecosystem, using Spark SQL and Scala queries on different formats such as text and CSV files.
    • Developed a Kafka producer and Spark Streaming consumer to read the stream of events per business rules.
    • Designed and developed job flows using TWS.
    • Developed Sqoop commands to pull data from Teradata.
    • Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data.
    • Used Avro and Parquet file formats and Snappy compression throughout the project.
    • Analyzed large data sets to determine the optimal way to aggregate and report on them.
    • Configured Spark Streaming to receive ongoing information from the source and store the stream data in HDFS.
    • Used various Spark transformations and actions to cleanse the input data.
    • Developed shell scripts to generate Hive CREATE statements from the data and load the data into tables.
    • Optimized HiveQL and Pig scripts by using execution engines such as Tez and Spark.
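Structuring raw logs into tabular rows for Hive querying, as the bullets above describe, amounts to a parse step applied per line. A minimal regex sketch in plain Python (the log line format shown is an assumption for illustration):

```python
import re

# Assumed log format: "2022-03-01 12:00:01 INFO user=42 action=login"
LOG_RE = re.compile(
    r"(?P<date>\S+) (?P<time>\S+) (?P<level>\w+) "
    r"user=(?P<user>\d+) action=(?P<action>\w+)"
)

def parse_log(line):
    # Return one tabular row (tuple of columns), or None for malformed lines,
    # so downstream queries see a clean, fixed schema.
    m = LOG_RE.match(line)
    return m.groups() if m else None

rows = [parse_log(l) for l in [
    "2022-03-01 12:00:01 INFO user=42 action=login",
    "garbage line",
]]
print(rows[0])  # ('2022-03-01', '12:00:01', 'INFO', '42', 'login')
print(rows[1])  # None
```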
  • AT&T
    Data Engineer
    AT&T Feb 2017 - Aug 2018
    Dallas, TX, US
    • Involved in the entire project life cycle, from design discussions to production deployment.
    • Monitored, managed, and reviewed Hadoop clusters using Cloudera Manager.
    • Performed cluster analysis using big data analytic tools such as MapReduce and Hive.
    • Supported the Hadoop architect team in developing a database design in HDFS based on the HBase architecture.
    • Used Scala to create Spark applications to ease the transition to Hadoop.
    • Developed scripts and batch jobs for scheduling various Hadoop programs, and was involved in the maintenance and review of Hadoop log files.
    • Used Sqoop for data import and export between RDBMS and HDFS.
    • Extracted the required data from the server into HDFS and bulk-loaded cleaned data into HBase.
    • Built a data pipeline for a compliance report proof of concept consisting of Sqoop, Hadoop (HDFS), Spark SQL, Scala, Elasticsearch, and Kibana.
    • Used an open-source Python web scraping framework to crawl and extract data from web pages, with conversion performed using Hadoop, Hive, and MapReduce.
    • Created Airflow scheduling scripts in Python.
    • Used Apache Spark to ingest Kafka data, loading and transforming large volumes of structured, semi-structured, and unstructured data.
    • Used Oozie and data pipeline operational services to coordinate clusters and plan workflows.
    • Used Cloudera Manager to continuously monitor and manage the Hadoop cluster.
    • Built mappings with reusable components such as worklets and mapplets, as well as other transformations.
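Airflow scheduling scripts like those mentioned above order tasks by their dependencies. The core idea, topological ordering of a task DAG, can be sketched with the standard library alone (the task names are hypothetical, not the actual pipeline):

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical pipeline: ingest -> clean -> {aggregate, index} -> report.
# Each key maps a task to the set of tasks it depends on.
dag = {
    "clean": {"ingest"},
    "aggregate": {"clean"},
    "index": {"clean"},
    "report": {"aggregate", "index"},
}

# static_order() yields every task after all of its dependencies,
# which is the order a scheduler would dispatch them in.
order = list(TopologicalSorter(dag).static_order())
print(order)
```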
  • GE
    Hadoop Developer
    GE Jan 2014 - Aug 2015
    Boston, MA, US
    • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
    • Installed, configured, and used Hadoop ecosystem components.
    • Imported and exported data into HDFS and Hive using Sqoop.
    • Experienced in defining job flows.
    • Performed performance troubleshooting and tuning of Hadoop clusters.
    • Managed and reviewed Hadoop log files.
    • Participated in the development and implementation of a Cloudera Hadoop environment.
    • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
    • Responsible for managing data coming from different sources.
    • Gained good experience with NoSQL databases.
    • Supported MapReduce programs running on the cluster.
    • Involved in loading data from the UNIX file system to HDFS.
    • Installed and configured Hive and wrote Hive UDFs.
    • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
    • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
    • Created HBase tables to store variable data formats of PII data coming from different portfolios.
    • Implemented best-income logic using Pig scripts.
    • Provided cluster coordination services through ZooKeeper.
    • Exported the analyzed data to relational databases using Sqoop for visualization and for generating reports for the BI team.
    • Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
    • Managed version control for the deliverables by streamlining and rebasing the development streams in SVN.
    Environment: Hadoop, MapReduce, HDFS, Hive, Java, SQL, Pig, ZooKeeper, Python, Sqoop, Java/JDK, XML, Web Services, SVN, JUnit, Log4J, Windows, Oracle.
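A Hive UDF like the ones mentioned above is just a function applied to one column value per row. Given the PII data also noted in these bullets, a plain-Python sketch of a masking UDF (the masking rule itself is an assumption for illustration):

```python
def mask_pii(value, keep=4):
    # Mask all but the last `keep` characters of a sensitive value,
    # the way a Hive UDF would be applied to each row's column value.
    # NULLs and short values pass through unchanged.
    if value is None or len(value) <= keep:
        return value
    return "*" * (len(value) - keep) + value[-keep:]

rows = ["4111111111111111", "555-12-9876", None]
print([mask_pii(r) for r in rows])
# → ['************1111', '*******9876', None]
```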

Deepak A Education Details

  • Northwest Missouri State University
    Computer Science

Frequently Asked Questions about Deepak A

What company does Deepak A work for?

Deepak A works for AT&T.

What is Deepak A's role at the current company?

Deepak A's current role is Sr Scala Developer.

What schools did Deepak A attend?

Deepak A attended Northwest Missouri State University.
