Dinesh R
• Big Data Engineer with 6 years of experience designing and implementing complete end-to-end Hadoop infrastructure using MapReduce, Pig, Hive, Sqoop, Oozie, Flume, Spark, HBase, and ZooKeeper.
• 4+ years of experience as a Data Analyst, with a strong understanding of client/server applications, visualization tools, and data warehouses; data-driven quantitative analysis, data integration, and resource utilization in the Big Data ecosystem.
• Experience with multiple Hadoop distributions, including Hortonworks, AWS, Cloudera, and MapR.
• Hands-on experience in the Analysis, Design, Coding, and Testing phases of the Software Development Life Cycle (SDLC).
• Hands-on experience with AWS (Amazon Web Services): Elastic MapReduce (EMR), S3 storage, EBS, EC2 instances, and data warehousing.

Technical Acumen:
• Big Data ecosystem: HDFS, MapReduce, Pig, Hive, YARN, Impala, Sqoop, Flume, Oozie, ZooKeeper, Spark, Scala, Storm, Kafka, Spark SQL, Azure SQL
• Hadoop technologies: Apache Hadoop 1.x, Apache Hadoop 2.x, Cloudera CDH4/CDH5, Hortonworks
• Programming languages: Java, MATLAB, Python, Scala, shell scripting, HiveQL
• Operating systems: Windows (XP/7/8/10), Linux (Ubuntu, CentOS)
• NoSQL databases: HBase, Cassandra, MongoDB
• Databases: RDBMS, MySQL, Teradata, DB2, Oracle
• BI tools: Tableau, Power BI
• Cloud: AWS (EC2, S3, SimpleDB, ECS, ELB, IAM, CloudWatch, Lambda), Azure
• Build and version-control tools: Git, Maven, Jenkins, SVN
• IDE tools: Eclipse, Anaconda, PyCharm, Jupyter, IntelliJ
• Application servers and connectivity: Apache Tomcat, J2EE, JDBC, ODBC
• Machine learning and analytical tools: supervised learning (linear regression, logistic regression, decision trees, random forests, SVM, classification), unsupervised learning (clustering, KNN, factor analysis, PCA), neural networks, natural language processing, Google Analytics, Fiddler, Tableau
Big Data Engineer
Fidelity Investments | Jun 2020 - Present | Boston, MA, US
• Worked with Spark to improve performance and optimize existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, RDDs, and Spark on YARN (a minimal sketch follows this entry).
• Designed, developed, and implemented ETL objects, extracting data with Sqoop from source systems into the Hadoop Distributed File System (HDFS).
• Designed and developed ETL code using Informatica mappings to load data from heterogeneous source systems (flat files, XML, MS Access files, Oracle) into an Oracle staging area, then into the data warehouse, and then into data mart tables for reporting.
• Developed Hive scripts to transform data and load it into target systems for reporting.
• Collected and aggregated large amounts of log data using Flume and tagged the data in HDFS for further analysis.
• Worked extensively with importing metadata into Hive using Python, and migrated existing tables and applications to the AWS cloud (S3).
• Involved in the complete Big Data flow of the application, from ingesting data from upstream sources into HDFS through processing and analyzing the data.
• Automated the extraction of data from warehouses and weblogs by developing workflows and coordinator jobs in Oozie.
• Performance-tuned Apache Kafka workflows to optimize data ingestion speeds.
• Involved in a full life cycle ETL implementation using Informatica and Oracle; helped design the data warehouse by defining facts, dimensions, and the relationships between them, applying corporate naming-convention standards.
• Implemented ETL pipelines on AWS EMR.
• Worked with AWS Lambda to run code in response to events, such as changes to data in an Amazon S3 bucket.
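As a hedged illustration of the Spark SQL / DataFrame work this entry describes, a minimal PySpark job might look like the sketch below. The paths, the events data set, and the event_date/event_type columns are hypothetical stand-ins; the resume does not name the actual tables.

```python
# Minimal PySpark sketch: read raw records from HDFS, aggregate with the
# DataFrame API and Spark SQL, and write results back to HDFS.
# All paths and column names here are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("event-aggregation")
         .getOrCreate())

# Read data previously landed in HDFS (e.g., by Sqoop or Flume).
events = spark.read.parquet("hdfs:///data/raw/events")

# Cache the DataFrame since it feeds two downstream computations.
events.cache()

# DataFrame-API aggregation.
daily_counts = (events
                .groupBy("event_date", "event_type")
                .agg(F.count("*").alias("event_count")))

# Equivalent work expressed in Spark SQL.
events.createOrReplaceTempView("events")
top_types = spark.sql("""
    SELECT event_type, COUNT(*) AS n
    FROM events
    GROUP BY event_type
    ORDER BY n DESC
    LIMIT 10
""")

# Write results back to HDFS for downstream Hive/reporting layers.
daily_counts.write.mode("overwrite").parquet("hdfs:///data/curated/daily_counts")
top_types.write.mode("overwrite").parquet("hdfs:///data/curated/top_event_types")
```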
Big Data Developer
Verizon | Oct 2019 - Jun 2020 | Basking Ridge, NJ, US
• Developed Spark scripts and UDFs using both the Spark DSL and Spark SQL for data aggregation and querying, and wrote data back into RDBMS through Sqoop.
• Designed and developed a data lake on Hadoop for processing raw and processed claims via Hive and Informatica.
• Used PolyBase for ETL/ELT with Azure Data Warehouse, keeping data in Blob Storage with almost no limitation on data volume.
• Imported data from sources such as HDFS/HBase into Spark RDDs.
• Built Azure Data Warehouse table datasets for Power BI reports.
• Worked on BI reporting with AtScale OLAP for Big Data.
• Implemented Kafka for streaming data, and filtered and processed the data.
• Created ETL/Talend jobs, both design and code, to process data into target databases.
• Worked with Hadoop infrastructure to store data in HDFS and used Spark/Hive SQL to migrate the underlying SQL codebase to Azure.
• Wrote Pig scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS.
• Loaded real-time data into a NoSQL database (Cassandra).
• Generated metadata and created Talend jobs and mappings to load the data warehouse and data lake.
• Used Talend for Big Data integration with Spark and Hadoop.
• Developed Pig scripts for change data capture and delta record processing between newly arrived data and data already in HDFS (the same idea is sketched in PySpark below).
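The resume implements change data capture in Pig; purely as a hedged sketch of the same delta-processing idea in PySpark (the paths, the claim_id key, and the record_hash comparison column are all hypothetical):

```python
# Sketch of change data capture between a newly arrived batch and the
# existing data in HDFS: unseen keys are inserts, matching keys with a
# changed payload are updates. Paths and columns are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-processing").getOrCreate()

existing = spark.read.parquet("hdfs:///data/master/claims")
incoming = spark.read.parquet("hdfs:///data/incoming/claims")

# Records whose key is not present yet: pure inserts.
inserts = incoming.join(existing, on="claim_id", how="left_anti")

# Records whose key exists but whose content changed: updates.
updates = (incoming.alias("n")
           .join(existing.alias("o"), on="claim_id", how="inner")
           .where("n.record_hash <> o.record_hash")
           .select("n.*"))

# Merge: keep unchanged old rows, then apply updates and inserts.
untouched = existing.join(incoming, on="claim_id", how="left_anti")
merged = untouched.unionByName(updates).unionByName(inserts)

merged.write.mode("overwrite").parquet("hdfs:///data/master/claims_next")
```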
Data Analyst
Flipkart | Jul 2017 - Jul 2019 | Bangalore, Karnataka, IN
• Implemented and supported strategic data sourcing to automate the population of annual, monthly, and quarterly CCAR reports on the Big Data platform.
• Developed statistical models in R to forecast market variables under stress scenarios within financial models.
• Created queries using Hive, SAS (PROC SQL), and PL/SQL to load large amounts of data from MongoDB and SQL Server into HDFS to spot data trends.
• Wrote HiveQL to retrieve, query, and process raw data.
• Utilized the K-means clustering technique to classify unlabeled data (see the sketch after this entry).
• Worked on data pattern recognition and data cleaning, and produced data visualizations such as scatter plots, box plots, and histograms using Matplotlib and Seaborn in Python, ggplot in R, and SAS.
• Used LDA, PCA, and factor analysis to perform dimensionality reduction.
• Modified and applied machine learning algorithms such as neural networks, SVM, bagging, gradient boosting, and K-means, using PySpark and MLlib, to detect target customers.
• Performed customer segmentation based on customer similarity using an unsupervised learning technique, cluster analysis.
• Used pandas, NumPy, SciPy, scikit-learn, and NLTK in Python for scientific computing and data analysis.
• Applied cross-validation to evaluate and compare the performance of different models; validated machine learning classifiers using ROC curves and lift charts.
• Configured Spark Streaming with Kafka to clean and aggregate real-time data.
• Involved in text analytics: analyzing text, language syntax, structure, and semantics.
• Generated weekly and monthly reports, and maintained and manipulated data using SAS macros, Tableau, and D3.js.
• Used Sqoop to load historical data from SQL Server into HDFS.
• Used Git for version control.
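As a hedged sketch of the K-means segmentation and cross-validation steps above, using scikit-learn as the entry does; synthetic data stands in for the actual customer features, which the resume does not describe:

```python
# Sketch: K-means customer segmentation, plus a cross-validated ROC-AUC
# evaluation of a classifier, mirroring the steps described above.
# All data here is synthetic; real feature names are not in the resume.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# --- Unsupervised: segment "customers" into clusters. ---
X_customers = rng.normal(size=(1000, 4))        # stand-in feature matrix
X_scaled = StandardScaler().fit_transform(X_customers)
segments = KMeans(n_clusters=5, n_init=10, random_state=42).fit_predict(X_scaled)
print("customers per segment:", np.bincount(segments))

# --- Supervised: cross-validate a classifier with ROC AUC. ---
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
clf = LogisticRegression(max_iter=1000)
auc_scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print("mean ROC AUC over 5 folds:", auc_scores.mean().round(3))
```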
Data Engineer
Airtel | May 2015 - Jun 2017 | Gurgaon, Haryana, IN
• Analyzed the Hadoop cluster using different big data analytic tools, including Pig, Hive, and MapReduce.
• Loaded data from the Linux file system into HDFS.
• Designed ETL processes using Talend to load data from sources to targets through transformations.
• Translated business requirements into working logical and physical data models for OLTP and OLAP systems.
• Reviewed stored procedures for reports and wrote test queries against the source system (SQL Server/SSRS) to match the results with the actual report against the data mart (Oracle).
• Owned and managed all changes to the data models; created data models, solution designs, and data architecture documentation for complex information systems.
• Developed advanced PL/SQL packages, procedures, triggers, functions, indexes, and collections to implement business logic using SQL Navigator.
• Worked on OLAP for data warehouse and data mart development using OLTP models, interacting with all involved stakeholders and SMEs to derive the solution.
• Created the best-fit physical data model based on discussions with DBAs and ETL developers.
• Identified the required dimensions and facts for the dimensional model using the Erwin tool (a PySpark sketch of the idea follows this entry).
• Implemented ETL techniques for data conversion, data extraction, and data mapping across different processes and applications.
• Validated and updated the appropriate models to process mappings, screen designs, use cases, business object models, and system object models as they evolved and changed.
• Created model reports, including data dictionaries and business reports.
• Generated SQL scripts and implemented the relevant databases with related properties, from keys and constraints to indexes and sequences.
• Created conceptual, logical, and physical data models, data dictionaries, DDL, and DML to deploy and load database table structures in support of system requirements.
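As a hedged sketch of the facts-and-dimensions modeling this entry describes, rendered in PySpark rather than the Talend/PL/SQL/Erwin tooling the resume names; the call-records schema and every column name here are hypothetical:

```python
# Sketch: derive a customer dimension with surrogate keys from staged
# records, then build a fact table keyed by that dimension. Star-schema
# idea only; the actual tooling and schemas differ.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("star-schema-load").getOrCreate()

staged = spark.read.parquet("hdfs:///stage/call_records")  # hypothetical path

# Dimension: one row per natural key, with a generated surrogate key.
# (An unpartitioned window is fine for a sketch, though it funnels the
# data through a single partition.)
dim_customer = (staged
                .select("customer_id", "customer_name", "city")
                .dropDuplicates(["customer_id"])
                .withColumn("customer_sk",
                            F.row_number().over(Window.orderBy("customer_id"))))

# Fact: measures keyed by the dimension's surrogate key.
fact_calls = (staged
              .join(dim_customer.select("customer_id", "customer_sk"),
                    on="customer_id")
              .groupBy("customer_sk", "call_date")
              .agg(F.count("*").alias("call_count"),
                   F.sum("duration_sec").alias("total_duration_sec")))

dim_customer.write.mode("overwrite").parquet("hdfs:///dw/dim_customer")
fact_calls.write.mode("overwrite").parquet("hdfs:///dw/fact_calls")
```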
Dinesh R Education Details
University at Buffalo, Industrial Engineering
Frequently Asked Questions about Dinesh R
What company does Dinesh R work for?
Dinesh R works for Fidelity Investments.
What is Dinesh R's role at the current company?
Dinesh R's current role is Data Engineer | Hadoop | SQL | AWS | Data Analytics | Machine Learning | Actively looking for C2C roles.
What schools did Dinesh R attend?
Dinesh R attended University At Buffalo.
Who are Dinesh R's colleagues?
Dinesh R's colleagues are Owen Findley, Matthew Olson, Michael Bombara, Josh Peña, CFP®, Prem Kumar, Tori Fetner, and Ratnesh Singh.