• Experience in developing custom UDFs for Pig and Hive to incorporate Python/Java methods and functionality into Pig Latin and HQL (HiveQL); a minimal sketch follows this list.
• Expertise in core Java and JDBC; proficient in using Java APIs for application development.
• Expertise in JavaScript, JavaScript MVC patterns, object-oriented JavaScript design patterns, and AJAX calls.
• Leveraged and integrated Google Cloud Storage and BigQuery applications, connected to Tableau for end-user web-based dashboards and reports.
• Good working experience with application and web servers such as JBoss and Apache Tomcat.
• Good knowledge of Amazon Web Services (AWS) offerings such as Athena, EMR, and EC2, which provide fast and efficient processing for big data analytics, including Teradata workloads.
• Expertise in Big Data architectures: Hadoop distributions (Azure, Hortonworks, Cloudera), MongoDB and other NoSQL stores, HDFS, and the MapReduce parallel-processing framework.
• Developed Spark-based applications to load streaming data with low latency using Kafka and PySpark.
• Hands-on experience with Hadoop/Big Data technologies for storage, querying, processing, and analysis of data.
• Experience developing Big Data projects using the open-source tools Hadoop, Hive, HDP, Pig, Flume, Storm, and MapReduce.
• Experience in installation, configuration, support, management, and monitoring of Hadoop clusters using Apache and Cloudera distributions and AWS.
• Experience writing MapReduce programs on Apache Hadoop for working with Big Data.
• Experience developing, supporting, and maintaining ETL (Extract, Transform, and Load) processes using Talend Integration Suite.
• Strong hands-on experience with AWS services, including but not limited to EMR, S3, EC2, Route 53, RDS, ELB, DynamoDB, and CloudFormation.
• Hands-on experience in the Hadoop ecosystem, including Spark, Kafka, HBase, Scala, Pig, Impala, Sqoop, Oozie, Flume, Storm, and related big data technologies.
• Experienced with scripting technologies such as Python and UNIX shell scripts.
• Successfully loaded files to HDFS from Oracle, SQL Server, Teradata, and Netezza using Sqoop.
• Expert in Amazon EMR, Spark, Kinesis, S3, ECS, ElastiCache, DynamoDB, and Redshift.
• Experience in installation, configuration, support, and management of the Cloudera Hadoop platform, including CDH4 and CDH5 clusters.
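For illustration, a minimal sketch of the Python-side Hive pattern referenced above: a streaming script wired into HiveQL via TRANSFORM. The script and column names are hypothetical, not from any specific project.

```python
#!/usr/bin/env python
# clean_names.py - hypothetical Hive streaming transform ("Python UDF").
# Hive pipes tab-separated rows to stdin; we emit tab-separated rows on stdout.
# Registered in HiveQL with:
#   ADD FILE clean_names.py;
#   SELECT TRANSFORM(id, name) USING 'python clean_names.py' AS (id, name) FROM users;
import sys

for line in sys.stdin:
    fields = line.rstrip("\n").split("\t")
    if len(fields) >= 2:
        fields[1] = fields[1].strip().upper()  # normalize the name column
    print("\t".join(fields))
```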
Sr. Big Data Engineer, Jefferson Bank, Jefferson City, MO (Jul 2021 - Present)
• Evaluated business requirements and prepared detailed specifications following project guidelines for the programs to be developed.
• Responsible for Big Data initiatives and engagement, including analysis, brainstorming, POCs, and architecture.
• Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data concepts.
• Installed and configured Apache Hadoop clusters and Hadoop tools for application development.
• Installed and configured Hive, wrote Hive UDFs, and maintained a repository of UDFs for Pig Latin.
• Developed a data pipeline using Pig and Sqoop to ingest cargo data and customer histories into HDFS for analysis.
• Migrated the existing on-prem code to an AWS EMR cluster.
• Installed and configured Hadoop ecosystem components and Cloudera Manager using the CDH distribution.
• Created automated pipelines in AWS CodePipeline to deploy Docker containers to AWS ECS using S3.
• Used the HBase NoSQL database for real-time read/write access to huge volumes of data.
• Extracted real-time feeds using Spark Streaming, converted them to RDDs, processed the data into DataFrames, and loaded it into HBase.
• Developed an AWS Lambda function to invoke a Glue job as soon as a new file lands in the inbound S3 bucket (a minimal sketch follows this entry).
• Created Spark jobs to apply data-cleansing and data-validation rules to new source files in the inbound bucket and route rejected records to a reject-data S3 bucket.
• Developed AWS CloudFormation templates, set up Auto Scaling for EC2 instances, and automated provisioning of the AWS cloud environment using Jenkins.
• Created HBase tables to load large sets of semi-structured data coming from various sources.
• Responsible for loading customer data and event logs from Kafka into HBase using a REST API.
• Created tables with sort and distribution keys in AWS Redshift.
• Created shell and Python scripts to automate daily tasks, including production tasks.
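A minimal sketch of the Lambda-to-Glue trigger described above, assuming an S3 event notification is configured on the inbound bucket; the Glue job name and argument key are illustrative, not from the project.

```python
# Hypothetical AWS Lambda handler: starts a Glue job for each new object
# that lands in the inbound S3 bucket. Job name and argument are assumptions.
import boto3

glue = boto3.client("glue")

def lambda_handler(event, context):
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # Hand the new file's location to the Glue job as a job argument.
        glue.start_job_run(
            JobName="inbound-file-etl",  # illustrative job name
            Arguments={"--source_path": f"s3://{bucket}/{key}"},
        )
```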
Azure Data Engineer, Thomson Reuters, Eagan, MN (Sep 2019 - Jun 2021)
• Migrated SQL databases to Azure Data Lake, Azure Data Lake Analytics, Azure SQL Database, Databricks, and Azure SQL Data Warehouse; controlled and granted database access; and migrated on-premises databases to Azure Data Lake Store using Azure Data Factory.
• Experience with GCP Dataproc, GCS, Cloud Functions, and BigQuery.
• Developed Spark applications using Spark/PySpark SQL in Databricks for data extraction, transformation, and aggregation across multiple file formats, analyzing and transforming the data to uncover insights into customer usage, consumption patterns, and behavior (a minimal sketch follows this entry).
• Skilled in dimensional modeling (star schema, snowflake schema), forecasting with large-scale datasets, transactional modeling, and SCDs (slowly changing dimensions).
• Developed scripts to transfer data from an FTP server to the ingestion layer using Azure CLI commands.
• Created Azure HDInsight clusters using PowerShell scripts to automate the process.
• Used Stored Procedure, Lookup, Execute Pipeline, Data Flow, Copy Data, and Azure Function activities in ADF.
• Used Azure Data Lake Storage Gen2 to store Excel and Parquet files and retrieved user data using the Blob API.
• Worked with Azure Databricks, PySpark, Spark SQL, Azure SQL Data Warehouse, and Hive to load and transform data.
• Used Azure Data Lake as a source and pulled data using PolyBase.
• Used Azure Data Lake and Azure Blob for storage and performed analytics in Azure Synapse Analytics.
• 1+ years of experience with Azure Cloud: Azure Data Factory, Azure Data Lake Storage, Azure Synapse Analytics, Azure Analysis Services, Azure Cosmos DB (NoSQL), Azure HDInsight Big Data technologies (Hadoop and Apache Spark), and Databricks.
• Experience designing Azure cloud architectures and implementation plans for hosting complex application workloads on MS Azure.
• Ingested data from RDBMSs, performed data transformations, and exported the transformed data to Cassandra.
• Used JIRA for bug tracking and CVS for version control.
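A minimal PySpark sketch of the Databricks extraction/aggregation step described above; the storage path, column names, and output table are illustrative assumptions, not project specifics.

```python
# Hedged sketch: read Parquet from ADLS Gen2, aggregate usage per customer
# per day, and persist the result as a table. All names are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("usage-aggregation").getOrCreate()

raw = spark.read.parquet("abfss://ingest@account.dfs.core.windows.net/usage/")

daily_usage = (
    raw.withColumn("day", F.to_date("event_ts"))
       .groupBy("customer_id", "day")
       .agg(
           F.count("*").alias("events"),
           F.sum("bytes_used").alias("total_bytes"),
       )
)

daily_usage.write.mode("overwrite").saveAsTable("analytics.daily_usage")
```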
Big Data Developer, Broadcom, San Jose, CA (Mar 2018 - Aug 2019)
• Contributed to the development of key data integration and advanced analytics solutions leveraging Apache Hadoop.
• Developed a data pipeline using Kafka, HBase, Spark, and Hive to ingest, transform, and analyze customer behavioral data; also developed Spark and Hive jobs to summarize and transform data (a sketch of the Kafka-to-Spark leg follows this entry).
• Worked extensively on importing metadata into Hive and migrated existing tables and applications to Hive and PySpark.
• Implemented Sqoop imports from Oracle and MongoDB to Hadoop and loaded the data back in Parquet format.
• Used Spark for interactive queries, processing of streaming data, and integration with a popular NoSQL database for huge volumes of data; worked with the MapR distribution and familiar with HDFS.
• Imported data from different data sources into HDFS using Sqoop, performed transformations using Hive and MapReduce, and loaded the transformed data back into HDFS.
• Collected and aggregated large amounts of log data using Flume and staged the data in HDFS for further analysis.
• Designed and maintained test workflows to manage the flow of jobs in the cluster.
• Worked with the testing teams to fix bugs and ensure smooth, error-free code.
• Designed dimensional data models using star and snowflake schemas.
• Prepared documents such as functional specifications and deployment instructions.
• Built DevOps pipelines using OpenShift and Kubernetes for the microservices architecture.
• Fixed defects during the QA phase, supported QA testing, and troubleshot defects to identify their source.
• Involved in installing Hadoop ecosystem components (Hadoop, MapReduce, Spark, Pig, Hive, Sqoop, Flume, Zookeeper, and HBase).
• Leveraged and integrated Google Cloud Storage and BigQuery applications, connected to Tableau for end-user web-based dashboards and reports.
• Worked collaboratively with all levels of business stakeholders to architect, implement, and test Big Data analytical solutions drawing on disparate sources.
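A hedged sketch of the Kafka-to-Spark leg of that pipeline using Spark Structured Streaming (requires the spark-sql-kafka package on the classpath); the brokers, topic, and paths are placeholders.

```python
# Illustrative Structured Streaming job: consume behavioral events from
# Kafka and land them as Parquet for downstream Hive/Spark jobs.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("behavior-stream").getOrCreate()

events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker1:9092")  # placeholder
         .option("subscribe", "customer-behavior")           # placeholder topic
         .load()
         .select(F.col("value").cast("string").alias("payload"))
)

query = (
    events.writeStream.format("parquet")
          .option("path", "/warehouse/behavior/raw")
          .option("checkpointLocation", "/checkpoints/behavior")
          .start()
)
query.awaitTermination()
```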
Hadoop Developer, Maisa Solutions Private Limited, Hyderabad, India (Apr 2015 - Dec 2017)
• Implemented solutions for ingesting data from various sources and processing data-at-rest using Big Data technologies such as Hadoop, MapReduce frameworks, HBase, and Hive.
• Used Sqoop to efficiently transfer data between databases and HDFS, and used Flume to stream log data from servers.
• Developed the full SDLC of an AWS Hadoop cluster based on the client's business needs.
• Loaded and transformed large sets of structured, semi-structured, and unstructured data from relational databases into HDFS using Sqoop imports.
• Implemented an enterprise-grade platform (MarkLogic) for ETL from mainframe to NoSQL (Cassandra).
• Responsible for importing log files from various sources into HDFS using Flume.
• Analyzed data using HiveQL to generate payer reports and payment summaries for transmission to payers.
• Imported millions of structured records from relational databases using Sqoop, processed them with Spark, and stored the data in HDFS in CSV format.
• Used the DataFrame API in Scala to convert distributed collections of data into named columns (an illustrative PySpark version follows this entry).
• Performed data profiling and transformation on raw data using Pig, Python, and Java.
• Developed predictive analytics using Apache Spark Scala APIs.
• Worked on big data analysis using Pig and user-defined functions (UDFs).
• Created Hive external tables, loaded data into them, and queried the data using HQL.
• Implemented a Spark Graph application to analyze guest behavior for data science segments.
• Enhanced a traditional star-schema data warehouse, updated data models, and performed data analytics and reporting using Tableau.
• Migrated data from existing RDBMSs (Oracle and SQL Server) to Hadoop using Sqoop for processing.
• Developed Shell, Perl, and Python scripts to automate and provide control flow to Pig scripts.
• Developed a prototype for Big Data analysis using Spark, RDDs, DataFrames, and the Hadoop ecosystem with CSV, JSON, and HDFS files.
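The original work used the Scala DataFrame API; as a hedged illustration, here is the same RDD-to-named-columns step in PySpark. The schema and input path are assumptions for the example only.

```python
# Illustrative PySpark version of converting a distributed collection of
# delimited records into a DataFrame with named columns.
from pyspark.sql import SparkSession, Row

spark = SparkSession.builder.appName("rdd-to-dataframe").getOrCreate()

# e.g. CSV files landed in HDFS by a Sqoop import (path is a placeholder)
lines = spark.sparkContext.textFile("hdfs:///ingest/orders/*.csv")

orders = lines.map(lambda l: l.split(",")).map(
    lambda f: Row(order_id=int(f[0]), customer=f[1], amount=float(f[2]))
)

df = spark.createDataFrame(orders)
df.createOrReplaceTempView("orders")
spark.sql(
    "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"
).show()
```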
Big Data Developer, Ceequence Technologies, Hyderabad, India (Jun 2013 - Mar 2015)
• Involved in the complete SDLC of a big data project, including requirement analysis, design, coding, testing, and production.
• Extensively used Sqoop to import/export data between RDBMSs and Hive tables, including incremental imports, and created Sqoop jobs keyed on the last saved value.
• Built custom MapReduce programs to analyze data and used Pig Latin to clean unwanted data.
• Installed and configured Hive and wrote Hive UDFs to implement business requirements.
• Created Hive tables, loaded data into them, and wrote Hive queries that run as MapReduce jobs.
• Used various compression techniques, such as LZO and Snappy, to save space and optimize data transfer over the network for Hive tables.
• Implemented custom interceptors for Flume to filter data and defined channel selectors to multiplex the data into different sinks.
• Worked with Spark SQL to process data in Hive tables.
• Developed scripts and Tidal jobs to schedule Oozie bundles (groups of coordinators) consisting of various Hadoop programs.
• Wrote and implemented unit test cases.
• Installed the Oozie workflow engine to run multiple Hive and Pig jobs independently based on time and data availability.
• Hands-on experience performing CRUD operations against HBase data using the Java API.
• Analyzed data by running Hive queries and Pig scripts to understand user behavior.
• Implemented a POC to migrate MapReduce jobs to Spark RDD transformations using Scala (a PySpark sketch of this pattern follows this entry).
• Developed Spark applications using Scala for easy Hadoop transitions.
• Extensively used Hive queries to query data according to business requirements.
• Used Pig for analysis of large datasets and loaded the results back into HBase using Pig.
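A PySpark sketch of the MapReduce-to-Spark migration pattern named above (the POC itself was in Scala); the input and output paths are placeholders.

```python
# Classic map/reduce word-count pattern re-expressed as Spark RDD
# transformations: flatMap/map play the map phase, reduceByKey the reduce.
from pyspark import SparkContext

sc = SparkContext(appName="mr-to-spark-poc")

counts = (
    sc.textFile("hdfs:///data/clickstream/")      # placeholder input
      .flatMap(lambda line: line.split())         # map phase: tokenize
      .map(lambda token: (token, 1))
      .reduceByKey(lambda a, b: a + b)            # reduce phase: sum counts
)

counts.saveAsTextFile("hdfs:///output/token_counts")  # placeholder output
```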