Neha T is a Spark developer at Lumen Technologies.
- Spark Developer, Lumen Technologies, Georgetown, TX, US
- Senior Data Engineer, Optum, Apr 2023 - Present, Eden Prairie, MN, US
- Spark Developer, Lumen Technologies, Jan 2022 - Present, Monroe, Louisiana, US
Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive. Used Hive to create tables, load data, and write to Hive using UDFs. Designed and developed ETL data pipelines using Spark, Spark Streaming, Sqoop, and Scala for open-source Hadoop applications to ingest, transform, and analyze customer data, and maintained ETL and technical documentation. Developed ETL processes (Data Stage Open Studio) to load data from multiple data sources to HDFS using Flume and Sqoop, and performed structural modifications using MapReduce and Hive. Created GCP firewall rules to allow or deny traffic to and from VM instances based on the specified configuration, and configured GCP Cloud CDN (content delivery network) to deliver content from GCP cache locations, improving user experience and latency. Developed, built, and architected multiple data pipelines and end-to-end ETL and ELT processes for data ingestion and transformation in GCP, and coordinated team tasks. Used Azure Data Factory, the SQL API, and the Mongo API to integrate data from MongoDB, MS SQL, and cloud sources (Blob, Azure SQL DB, Azure Cosmos DB). Took part in the custom process design of transformations via Azure Data Factory. Extensively used Azure services such as Azure Data Factory and Logic Apps for ETL, moving data between databases and Blob storage, HDInsight (HDFS), and Hive tables. Monitored the Hive Metastore and the cluster nodes in GCP with the help of Hue. Managed resources and scheduling across the cluster using Azure Kubernetes Service. Used Pub/Sub topics and subscriptions so that when a file is dropped in the GCS bucket, the topic is triggered and the subscribers of that topic start executing the scripts. Participated in, ran, and validated the ETL interfaces in system testing, executed positive/negative/minus test cases, and documented the results for future reference.
- Hadoop Developer, Argano, Jan 2019 - Dec 2020, Plano, Texas, US
Ingested data from various data sources into Hadoop HDFS/Hive tables using Sqoop, Flume, and Kafka. Extended Hive core functionality by writing custom UDFs in Java. Developed Hive queries to meet user requirements. Worked on multiple POCs implementing a data lake for multiple data sources, ranging from Teamcenter, SAP, and Workday to machine logs. Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data. Ingested data into Azure Blob storage and processed the data using Databricks. Wrote Spark Scala scripts and UDFs to perform transformations on large datasets. Developed data pipeline features to process incoming healthcare information quickly and reliably. Planned, scheduled, and implemented Oracle to MS SQL Server migrations for AMAT in-house applications and tools. Modeled complex ETL jobs that transform data visually with data flows or by using compute services: Azure Databricks, Azure Blob Storage, and Azure SQL Database. Designed and wrote the entire ETL/ELT process to support a data warehouse with complex dependencies in a hybrid business intelligence environment (Azure and SQL Server). Designed and implemented an Azure Data Factory (V2) framework with error logging to populate data in Azure SQL Data Warehouse from Azure Blob storage. Worked on the Solr search engine to index incident report data and developed dashboards in the Banana reporting tool. Integrated Tableau with a Hadoop data source to build a dashboard providing various insights into the organization's sales. Built BI reports in Tableau on top of Spark, integrating Tableau with Spark through Spark-SQL. Developed Spark jobs using Scala and Python on top of YARN/MRv2 for interactive and batch analysis. Created multi-node Hadoop and Spark clusters on AWS instances to generate terabytes of data and stored it in HDFS on AWS. Developed workflows in LiveCompare to analyze SAP data and reporting.
- Software Engineer, Tricon Infotech, Feb 2017 - Dec 2018, Bangalore, Karnataka, IN
Designed and implemented Java classes, interfaces, model design, and interface layer design with other team members. Developed JSPs and Servlets to dynamically generate HTML and display the data on the client side. Extensively used JSP tag libraries. Designed and developed web-based software using the Struts MVC framework. Created and modified stored procedures, functions, triggers, and complex SQL commands for the application using PL/SQL. Responsible for optimizing and improving the performance of existing algorithms in Hadoop using Spark Context, Spark-SQL, DataFrames, and Pair RDDs. Performed data operations such as text analytics and data processing, using the in-memory computing capabilities of Spark with Scala. Extracted real-time data feeds using Kafka and processed them with Spark Streaming into Resilient Distributed Datasets (RDDs), which were then processed as DataFrames and saved in Parquet format in HDFS and NoSQL databases. Used Spark-SQL to read and process the Parquet data and create tables using the Scala API. Monitored Spark applications to capture the logs generated by Spark jobs. Designed and developed ETL processes using AWS Glue to migrate data collected from external sources such as S3, SQL Server, MongoDB, and an SFTP server into AWS Redshift. Configured Zookeeper to manage Kafka cluster nodes and coordinate the broker/cluster topology. Used Kafka features such as its distributed, partitioned, replicated commit log service for messaging systems by maintaining feeds, and created applications that monitor consumer lag within Apache Kafka clusters. Implemented a data interface to get customer information using a REST API. Wrote Oozie workflows to run the Sqoop and HQL scripts in Amazon EMR. Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs. Created custom UDFs for Pig and Hive to incorporate Python methods and functionality into Pig Latin and HQL (HiveQL).
Neha T Education Details
- Sacred Heart University, Computer Science
Frequently Asked Questions about Neha T
What company does Neha T work for?
Neha T works for Lumen Technologies.
What is Neha T's role at the current company?
Neha T's current role is Spark developer.
What schools did Neha T attend?
Neha T attended Sacred Heart University.
Who are Neha T's colleagues?
Neha T's colleagues are David Mulatz, Damon Simon, Doyle Remington, Kolaboina Akshitha, Shubham Shrestha, Gary Bundage, Marc Weinmann.