Deepika R.
Passionate Sr. Data Engineer | Cloud Enthusiast

Data Engineer with 7 years of software engineering expertise, specializing in Big Data technologies. Proficient in Hadoop/Spark development, I excel in cloud engineering on AWS and Azure, contributing to the end-to-end software design lifecycle. My commitment to staying abreast of the latest IT skills and industry knowledge is unwavering.

Key skills:
- Big Data: Hadoop, HDFS, MapReduce, Pig, Hive, Spark, Kafka, Flume, Sqoop, Impala, Oozie, Zookeeper, YARN, Hue
- Distributions: Cloudera (CDH4, CDH5), Hortonworks, EMR
- Languages: Python, Java, Scala
- Databases: NoSQL (HBase, Cassandra, MongoDB), MySQL, Oracle, DB2, MS SQL Server
- Cloud: AWS (Lambda, EMR, S3, Athena, Redshift, CloudWatch), Azure (ADF, Databricks, Data Lake)
- Frameworks: Spring, Hibernate, Struts
- Scripting: Python, Shell scripting
- Java/J2EE: Servlets, JavaBeans, JSP, JDBC, EJB
- Application servers: Apache Tomcat, WebSphere, WebLogic, JBoss
- ETL: Ab Initio, SAS, Informatica
- Reporting tools: Power BI, Tableau

Expertise:
- Developed applications for large-scale distributed data processing using Hadoop ecosystem tools: HDFS, YARN, Sqoop, Flume, Kafka, MapReduce, Pig, Hive, Spark, PySpark, Spark SQL, Spark Streaming, HBase, Cassandra, MongoDB, Mahout, Oozie, and AWS.
- Proficient in various Hadoop distributions: Hortonworks, Cloudera, EMR.
- Skilled in data ingestion tools: Kafka, Sqoop, Flume.
- Expert in in-memory, real-time data processing with Apache Spark.
- Developed Kafka producers and consumers per business requirements.
- Extensive work on Spark components: Spark SQL, MLlib, Spark Streaming.
- Configured Spark Streaming for real-time data processing from Kafka to HDFS using Spark and Scala (a minimal sketch follows this list).
- In-depth understanding and practical implementation of AWS cloud-specific technologies.
- Worked with Azure, setting up big data clusters using Azure Databricks.
- Experience with Impala for data analysis.
- Worked with NoSQL databases: HBase, Cassandra, MongoDB, DynamoDB.
- Extended Hive and Pig core functionality with custom UDFs and UDAFs.
- Expertise in relational databases: MySQL, SQL Server, DB2, Oracle.
- Collaborated on data center migration to Azure using Cosmos DB and ARM templates.
- Extensive work on AWS services: EC2, S3, EMR, CloudFormation, CloudWatch, Lambda.
- Knowledgeable in Hadoop security requirements, integrating with Kerberos authentication.
- Identified job dependencies and designed workflows for Oozie and YARN resource management.
- Experience in Core Java, J2EE, JDBC, ODBC, JSP, Java Eclipse, EJB, Servlets.
- Strong experience in Data Warehousing ETL concepts using Informatica and Ab Initio.
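The Kafka-to-HDFS streaming bullet above is the kind of pipeline that fits in a few lines. Here is a minimal sketch using PySpark Structured Streaming (the profile mentions Scala; the PySpark API is equivalent). Broker, topic, and path names are hypothetical placeholders, and the job assumes the spark-sql-kafka connector package is on the classpath.

```python
# Minimal sketch: consume a Kafka topic with Spark Structured Streaming
# and land the records in HDFS as Parquet. Requires the
# spark-sql-kafka-0-10 package; all names below are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

# Read the raw Kafka stream; each row carries binary key/value columns.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")  # hypothetical broker
    .option("subscribe", "events")                      # hypothetical topic
    .load()
    .selectExpr("CAST(value AS STRING) AS payload", "timestamp")
)

# Write the stream to HDFS in Parquet; the checkpoint directory lets the
# query recover its sink state after a restart.
query = (
    events.writeStream
    .format("parquet")
    .option("path", "hdfs:///data/events")              # hypothetical path
    .option("checkpointLocation", "hdfs:///chk/events")
    .trigger(processingTime="1 minute")
    .start()
)

query.awaitTermination()
```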
Verizon
Website: verizon.com
Employees: 151,940
Sr. Data Engineer
Verizon · Jan 2024 - Present · United States
- Utilized PySpark, AWS Glue, and DBeaver to develop ETL jobs for migrating eligibility processes to the AWS Cloud.
- Implemented customized Glue Crawlers and Data Catalogs to ensure efficient data ingestion with consistent metadata.
- Migrated an on-premises application to AWS, utilizing EC2 and S3 for small-dataset processing and storage.
- Executed short-term ad-hoc queries and jobs on S3-stored data using AWS EMR.
- Worked with S3, EMR, Redshift, Athena, and the Glue metastore.
- Validated data from SQL Server to Snowflake for consistency and accuracy, utilizing Snowflake multi-cluster warehouses.
- Processed real-time data streams from AWS Kinesis using Spark Structured Streaming, storing data in Snowflake in Parquet format.
- Implemented multi-table full-load and incremental-load data ingestion pipelines.
- Loaded and transformed large sets of structured and semi-structured data using AWS Glue and PySpark (a minimal Glue job sketch follows this list).
- Experienced with Snowflake, Azure Cloud, Azure Databricks, and Azure Data Factory, showcasing proficiency in cloud-based data analytics and processing.
- Developed, deployed, and supported fault-tolerant data pipelines leveraging distributed data-movement technologies.
- Established data governance frameworks using tools such as Azure Purview and Collibra, ensuring data integrity and compliance.
- Leveraged the Azure data tool stack, including Azure SQL, Synapse, and Fabric, for comprehensive data management and analysis.
- Set up CI/CD pipelines using Jenkins, Maven, GitHub, and AWS for automated deployment.
- Utilized GitHub for version control and Jira for issue tracking.
- Stored data output in Avro and Parquet file formats for optimized performance.
- Utilized Azure services such as Azure Databricks and Azure Data Factory for scalable and cost-effective data processing solutions.
- Developed and maintained data pipelines in Azure Data Factory, orchestrating data movement and transformation tasks across diverse data sources.
- Implemented a continuous delivery pipeline with Docker and GitHub.
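As a hedged illustration of the Glue-plus-PySpark bullets above, here is a minimal sketch of a Glue ETL job that reads a crawled Data Catalog table, filters incomplete rows, and writes Parquet to S3. The database, table, column, and bucket names are hypothetical, not taken from the profile.

```python
# Minimal AWS Glue PySpark job sketch: catalog source -> filter -> S3 Parquet.
# All database/table/column/bucket names are hypothetical placeholders.
import sys
from awsglue.transforms import Filter
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Source table registered by a Glue Crawler in the Data Catalog.
eligibility = glue_context.create_dynamic_frame.from_catalog(
    database="eligibility_db",      # hypothetical catalog database
    table_name="raw_eligibility",   # hypothetical crawled table
)

# Drop obviously incomplete rows before landing the curated copy.
cleaned = Filter.apply(
    frame=eligibility,
    f=lambda row: row["member_id"] is not None,  # hypothetical column
)

glue_context.write_dynamic_frame.from_options(
    frame=cleaned,
    connection_type="s3",
    connection_options={"path": "s3://curated-bucket/eligibility/"},
    format="parquet",
)
job.commit()
```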
Sr. Data Engineer
Trimble Inc. · Oct 2021 - Dec 2023
- Utilized AWS (Lambda, Glue, EMR) for data ingestion, transformation, and real-time monitoring.
- Employed AWS EMR for seamless data transfer across AWS stores, with storage in S3 and DynamoDB.
- Utilized CloudWatch for application monitoring and log analysis.
- Stored and analyzed data in Amazon Redshift before loading it into the end database.
- Designed Azure Data Factory pipelines for extensive data ingestion and Spark job execution on Databricks.
- Explored Spark for performance improvement in Hadoop, utilizing SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Implemented Kafka and Spark Streaming for real-time data processing, saving results in Parquet format in HDFS.
- Transformed relational database models to the Hadoop ecosystem, working with Spark and Spark SQL via the PySpark API (see the sketch after this list).
- Managed Linux systems and RDBMS databases, ingesting data with Sqoop.
- Reviewed Hadoop and HBase logs, imported metadata into Hive, and migrated applications to Hive and the AWS cloud.
- Developed and implemented complex ETL processes, utilizing Azure Databricks and other compute services.
- Possess an understanding of Teradata MPP architecture.
- Created HBase tables, partitions, and buckets for efficient processing.
- Developed scripts for data modeling and mining, facilitating access to Azure Logs and App Insights.
- Engineered Kafka data pipelines, utilized the Spark API over Hadoop YARN, and managed Hadoop clusters through Cloudera Manager.
- Involved in the review of functional and non-functional requirements; developed ETL processes using Hive and HBase, prepared technical specifications, and managed diverse data sources.
- Loaded CDRs from a relational DB using Sqoop, processed large volumes of data in parallel using Talend, and installed/configured Apache Hadoop, Hive, and Pig environments.
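To illustrate the "relational models to the Hadoop ecosystem" bullet, here is a minimal PySpark sketch that joins two Sqoop-landed tables and saves the result as a partitioned, Parquet-backed Hive table. All paths, table names, and columns are hypothetical placeholders.

```python
# Minimal PySpark sketch: denormalize two relational tables (as Sqoop
# typically lands them on HDFS as delimited files) into one Hive table.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("relational-to-hive")
    .enableHiveSupport()
    .getOrCreate()
)

# Sqoop-landed source tables; paths and headers are placeholders.
customers = spark.read.option("header", "true").csv("hdfs:///landing/customers")
orders = spark.read.option("header", "true").csv("hdfs:///landing/orders")

# Do the join once here so downstream Hive queries can avoid it.
flat = (
    orders.join(customers, on="customer_id", how="inner")
          .select("order_id", "customer_id", "region", "order_total")
)

# Write as a partitioned, Parquet-backed Hive table (the "analytics"
# database is assumed to exist).
(
    flat.write
        .mode("overwrite")
        .partitionBy("region")
        .format("parquet")
        .saveAsTable("analytics.orders_flat")  # hypothetical Hive table
)
```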
Data Engineer
Bank of America · Mar 2020 - Oct 2021
- Built scalable distributed data solutions using Spark and Hadoop, with a solid understanding of HDFS, MapReduce, and ecosystem projects.
- Analyzed the Hadoop cluster with tools such as Kafka, Pig, Hive, and MapReduce.
- Practiced story-driven agile development, participating in daily scrum meetings.
- Worked on batch and streaming data processing, ingesting into NoSQL stores and HDFS in Parquet and Avro formats.
- Developed Kafka producers and consumers, customizing partitions for optimization (a minimal producer sketch follows this list).
- Configured the Hadoop environment on AWS (EC2, EMR, Redshift, CloudWatch, Route 53).
- Wrote batch scripts in Scala using Spark for AWS S3 data transformations.
- Configured Spark Streaming to receive real-time data from Kafka and store it in HDFS using Scala.
- Developed data pipelines using Flume, Sqoop, and Pig for weblog extraction and storage in HDFS.
- Used Spark for interactive queries, streaming data processing, and integration with NoSQL databases.
- Employed the Spark-Cassandra Connector for loading data to and from Cassandra.
- Imported data into HDFS using Sqoop, performed transformations using Hive and MapReduce, and exported analyzed data to relational databases.
- Collected and aggregated log data with Flume, staging it in HDFS for analysis.
- Conducted analysis with Hive queries and Pig scripts to study customer behavior.
- Extracted, transformed, and loaded large volumes of data into various targets.
- Developed data-formatted web applications using HTML5, XHTML, CSS, and JavaScript.
- Assisted in cluster maintenance, monitoring, and troubleshooting, and managed data backups and log files.
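The custom-partitioning bullet above can be illustrated with a short producer. This sketch uses the kafka-python client (the profile does not name a client library, so that choice is an assumption), with hypothetical broker, topic, and key names.

```python
# Minimal sketch of a Kafka producer with a custom partitioner, using
# the kafka-python client. Broker, topic, and key format are placeholders.
from kafka import KafkaProducer

def by_account_prefix(key_bytes, all_partitions, available_partitions):
    """Route records with the same account prefix to the same partition,
    preserving per-account ordering while spreading load."""
    if key_bytes is None:
        return available_partitions[0]
    return all_partitions[hash(key_bytes[:4]) % len(all_partitions)]

producer = KafkaProducer(
    bootstrap_servers="broker1:9092",   # hypothetical broker
    partitioner=by_account_prefix,
    key_serializer=lambda k: k.encode("utf-8"),
    value_serializer=lambda v: v.encode("utf-8"),
)

producer.send("transactions", key="ACCT-0042", value='{"amount": 12.5}')
producer.flush()
```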
Data Engineer
Allstate · Jan 2016 - Mar 2020
- Managed Apache Hadoop clusters, configuring tools such as Hive, Pig, HBase, Flume, Oozie, Zookeeper, and Sqoop.
- Installed and configured Hadoop MapReduce and HDFS; developed Java MapReduce jobs for data cleaning.
- Integrated logs into HDFS using Flume.
- Utilized DataFrames and Spark SQL for efficient querying and analysis.
- Implemented a Hadoop cluster on Azure for a proof of concept.
- Used Sqoop to migrate data from MySQL to HDFS and Hive.
- Developed MapReduce jobs on YARN for daily and monthly reports.
- Employed the Spark API on Hortonworks Hadoop YARN for analytics in Hive.
- Developed Spark and Spark SQL/Streaming jobs for faster data processing.
- Extensive experience in HDFS, Pig, Hive, Sqoop, Flume, Oozie, MapReduce, Zookeeper, Kafka, Spark, and HBase.
- Integrated Apache Storm with Kafka for web analytics and clickstream data processing.
- Migrated a MongoDB sharded/replica cluster between data centers without downtime.
- Managed and monitored large MongoDB sharded cluster environments.
- Imported/exported RDBMS data into HDFS with Hive and Pig using Sqoop.
- Experienced in agile development for diverse requirements.
- Conducted advanced procedures such as text analytics using Spark with Scala and Python (see the sketch after this list).
- Deployed Hadoop YARN, Spark, and Storm integration with Cassandra, Ignite, and Kafka.
- Implemented a log producer in Scala for a Kafka- and Zookeeper-based log collection platform.
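As a sketch of the text-analytics bullet, here is a minimal PySpark term-frequency job over raw text in HDFS. The input path and the assumption that the data is line-oriented claim notes are hypothetical.

```python
# Minimal PySpark text-analytics sketch: tokenize raw text and count
# term frequencies. The input path is a hypothetical placeholder.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("text-analytics").getOrCreate()

notes = spark.read.text("hdfs:///data/claim_notes")  # hypothetical path

# Lowercase, strip punctuation, and split each line into tokens.
tokens = F.split(
    F.regexp_replace(F.lower(F.col("value")), r"[^a-z\s]", ""), r"\s+"
)

term_counts = (
    notes.select(F.explode(tokens).alias("term"))
         .where(F.col("term") != "")
         .groupBy("term")
         .count()
         .orderBy(F.desc("count"))
)

term_counts.show(20)
```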
J2EE Developer
Cisco · Jun 2014 - Dec 2015
- Designed and developed Java backend batch jobs for updating product offer details, utilizing Core Java with multithreading and design patterns.
- Employed the Spring MVC framework and dependency injection for architecture development.
- Created screens, controller classes, business services, and a DAO layer for modules, implementing business logic using POJOs.
- Developed graphical user interfaces with HTML, JSPs, and AngularJS.
- Utilized the DAO pattern to decouple business logic and data; implemented Hibernate in the data access object layer for SQL Server database interaction.
- Applied Core Java concepts such as multithreading, exception handling, and the Collections API.
- Wrote JUnit test cases for unit testing.
- Interfaced with the Oracle backend using Hibernate and XML configuration files.
- Created dynamic HTML pages, implemented client-side validations with JavaScript, and used AJAX for an interactive front-end GUI.
- Consumed web services and RESTful web services for data transfer.
- Coded, maintained, and administered Servlet and JSP components on Spring Boot.
- Wrote PL/SQL queries, stored procedures, and triggers for back-end database operations.
- Used Maven for building the J2EE application.
- Developed code modules in the Eclipse IDE, established connectivity with the SQL database using JDBC, and implemented logging with the Log4j framework.
Frequently Asked Questions about Deepika R.
What company does Deepika R. work for?
Deepika R. works for Verizon.
What is Deepika R.'s role at the current company?
Deepika R.'s current role is Sr. Data Engineer | Scala Developer | Big Data Engineer | Expert in Hadoop/Spark, Cloud (AWS, Azure), Java, Automation, and the Full Software Lifecycle.
Who are Deepika R.'s colleagues?
Deepika R.'s colleagues are Ben And Marsha Butler, Debra Johnson, Carol Do, Ashrit Mathew, Lisa Nguyen, Edel Moore-O'brien, Dakota Davis.
Deepika R.
McKinney, TX