Arun Kumar
I'm a Big Data Developer with 6 years of experience, skilled in the Hadoop ecosystem, Apache Spark, Scala, and Python. I specialize in building data pipelines on AWS, ETL testing, and Spark development. I'm adept at diagnosing performance bottlenecks in Hive and tuning Hive queries to resolve them. I work well with teams to deliver solutions that drive growth and improve decision-making. I'm ready for new challenges and opportunities.
Morgan Stanley
Website: morganstanley.com
Employees: 78669
Big Data Developer
Morgan Stanley
Oct 2020 - Present
• Developed and deployed Spark applications for distributed data processing, using Scala and Python.
• Integrated Spark with Hadoop ecosystem components such as HDFS, Hive, and HBase, using Hadoop's storage and processing capabilities to enhance data processing workflows.
• Developed Spark Streaming applications for real-time data processing, enabling timely insights from streaming data sources.
• Used Hive for data warehousing and SQL-like querying, optimizing Hive queries and managing the Hive metastore for efficient data retrieval and analysis.
• Used Sqoop for data ingestion and integration, transferring data between Hadoop and external sources such as relational databases, data warehouses, and cloud storage.
• Integrated Spark with cloud storage platforms such as Amazon S3, Azure Blob Storage, and Google Cloud Storage.
• Designed and implemented ETL processes using Spark and cloud storage, extracting, transforming, and loading data between different sources and destinations.
• Developed and optimized big data processing workflows using AWS services such as Glue and Athena.
• Implemented Spark SQL queries for data querying and aggregation, enabling complex analytics and reporting on large-scale datasets.
• Used the AWS Glue Data Catalog to maintain metadata about the data.
• Used Amazon Athena as an interactive query service to analyze data in Amazon S3 with standard SQL.
• Monitored applications, system performance, and environment health with Amazon CloudWatch, and set up alarms and notifications for any discrepancies.
• Fine-tuned AWS Glue ETL jobs and Lambda functions to improve performance and reduce processing time.
• Developed serverless applications using AWS Lambda to automatically trigger functions in response to events.
ETL Tester / Data Engineer
FGF Brands
Apr 2018 - Sep 2020
Greater Toronto Area, Canada
• Designed and executed comprehensive test plans and test cases for ETL workflows, ensuring the accuracy and completeness of data transformations.
• Conducted end-to-end testing of ETL processes, including data extraction, transformation, and loading, to validate compliance with business requirements and data integrity standards.
• Collaborated with development teams to understand ETL design specifications and identify potential improvements in data transformation rules and logic.
• Implemented data quality checks and validation routines to identify and resolve data inconsistencies and anomalies during the ETL process.
• Used ETL testing tools such as Informatica PowerCenter, Talend, or SSIS to automate testing procedures and streamline testing efforts.
• Documented test results, defects, and resolution strategies, ensuring traceability and providing insights for process improvement initiatives.
• Worked closely with stakeholders to communicate testing progress, identify risks, and prioritize testing activities based on project timelines and objectives.
• Gained proficiency in Scala and Python within the Databricks environment.
• Transitioned to a role as a big data developer working with Azure and Azure Databricks.
• Used Azure Databricks, an Apache Spark-based analytics platform, for scalable data processing.
• Designed and orchestrated ETL pipelines using Azure Data Factory to move data between various sources and destinations.
• Used DataFrames for structured data manipulation and analysis.
• Designed and implemented Spark jobs using Python.
• Performed data cleansing and preprocessing using Spark transformations.
• Designed and implemented data processing solutions using Databricks notebooks.
• Used Databricks clusters for both batch and real-time data processing.
• Integrated Databricks with Azure Data Lake Storage for scalable and secure data storage.
Frequently Asked Questions about Arun Kumar
What company does Arun Kumar work for?
Arun Kumar works for Morgan Stanley.
What is Arun Kumar's role at the current company?
Arun Kumar's current role is Big Data Developer at Morgan Stanley.
Who are Arun Kumar's colleagues?
Arun Kumar's colleagues include Megha K., Steven R. Miller, Joyce Feuille, Steve Evanchik (CFP®), Jennifer Reid, Edgar Marita, and Thomas Carlyle.