Kiran K. is a Senior Data Engineer at Capgemini.
Capgemini
- Website: capgemini.com
- Employees: 232,507
Senior Data Engineer, Capgemini
Aug 2022 - Present | Vancouver, British Columbia, Canada
Responsibilities:
• Lead the migration of data from Google Cloud Platform (GCP) to Azure, ensuring a seamless transition and minimal downtime.
• Collaborate with cross-functional teams to understand data migration requirements, including data sources, schemas, and transformation logic.
• Design and implement efficient data migration pipelines using Azure Data Factory (ADF), Dataflow, and other relevant Azure services.
• Extract data from GCP sources such as BigQuery and Cloud Storage and transform it into Azure-compatible formats.
• Develop custom scripts and tools to automate extraction, transformation, and loading (ETL) during the migration.
• Perform data quality checks and validation to ensure accurate and reliable migration results.
• Optimize migration processes for performance and scalability, considering factors such as network bandwidth, data volume, and latency.
• Collaborate with Azure cloud architects and administrators to ensure the target Azure environment is properly provisioned and optimized for data migration.
• Monitor the migration process, track progress, and resolve any issues or errors that arise.
• Document the migration process, including the overall architecture, data mappings, transformation rules, and any custom scripts or tools used.
• Provide post-migration support: address data-related issues, troubleshoot performance bottlenecks, and optimize data workflows in the Azure environment.
• Stay current with the latest Azure services, features, and best practices for data migration and integration, continuously improving the organization's data engineering capabilities.
• Actively participate in agile ceremonies, pair programming, and demos.
Technologies: GCP, BigQuery, Google Cloud Storage, Azure Data Factory, Azure Data Lake Storage, Azure Data Factory Dataflow, Azure Functions, Azure SQL Database, Azure Blob Storage, Azure Databricks, Azure Monitor, SQL, Python, ETL
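The post-migration data quality checks described above can be sketched in plain Python. Everything here is a hypothetical simplification: the `validate_migration` helper, the stat fields, and the table names are illustrative, assuming row counts and checksums have already been collected from the GCP source and the Azure target.

```python
# Hypothetical post-migration validation sketch. Assumes per-table
# row counts and column checksums were already pulled from BigQuery
# (source) and Azure SQL (target) into plain dicts keyed by table name.

def validate_migration(source_stats: dict, target_stats: dict) -> list:
    """Compare per-table stats and return a list of mismatch descriptions."""
    issues = []
    for table, src in source_stats.items():
        tgt = target_stats.get(table)
        if tgt is None:
            issues.append(f"{table}: missing in target")
            continue
        if src["row_count"] != tgt["row_count"]:
            issues.append(
                f"{table}: row count {src['row_count']} != {tgt['row_count']}")
        if src["checksum"] != tgt["checksum"]:
            issues.append(f"{table}: checksum mismatch")
    return issues

source = {"orders": {"row_count": 1000, "checksum": "abc"}}
target = {"orders": {"row_count": 999, "checksum": "abc"}}
print(validate_migration(source, target))
```

An empty result means the migrated tables match on the chosen stats; any entries flag tables needing a re-copy or investigation.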
Data Engineer, EA Sports
Oct 2021 - Oct 2022 | Vancouver, British Columbia, Canada
Responsibilities:
• Designed, developed, maintained, and led the implementation of data pipelines on Azure Delta Lake.
• Productionized complex machine learning algorithms for identifying IP bad blocks, IP phishing, IP domains, and risk assessment.
• Ingested data into Delta Lake from sources including databases, Salesforce, Kafka topics, APIs, and files (flat files, XML files).
• Migrated data for multiple functional areas from an existing Oracle warehouse to Delta Lake and created ETL from source systems to feed Delta Lake directly.
• Loaded data into Azure Synapse using Azure Data Factory (ADF) and PolyBase.
• Wrote complex SQL queries for data analysis, reverse engineering, and analytics.
• Deployed pipelines to different environments using CI/CD in Azure DevOps.
• Created wrapper scripts in Unix shell for spark-submit and Python script execution.
• Validated data between source and target to confirm ingestion completed as expected.
• Wrote PySpark and SQL scripts on Databricks to analyze large volumes of data.
• Performed SCD1, SCD2, lookups, aggregations, joins, and data cleansing on initial and incremental pipelines using ADF.
• Created Databricks jobs to ingest data, derive required fields, and load the results into Snowflake.
• Scheduled ETL jobs at required intervals and set up alerts in ADF.
• Created ETL pipelines to egress data from Delta Lake to Snowflake and to files.
• Derived and converted data on an ad hoc basis using PySpark, Synapse, and Snowflake, and provided it to the data science team.
Technologies: Azure Data Factory (ADF), Azure Delta Lake, Databricks, Azure Functions, Azure Blob Storage, Azure DevOps, Azure Synapse, SQL Server, Oracle, SQL, PostgreSQL, Python, PySpark, HDFS, Hive, Snowflake, Kafka, Unix, PowerShell, GitLab
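The SCD Type 2 handling mentioned above can be illustrated in plain Python. The real pipelines used ADF; the row layout and the `apply_scd2` helper here are simplified, hypothetical stand-ins showing the core idea: expire the current version of a changed key and insert a new current row stamped with the load date.

```python
# Minimal SCD Type 2 sketch (hypothetical row shape; real pipelines ran in ADF).

def apply_scd2(dim_rows, updates, load_date):
    """For each changed key: close out the current row (set end_date,
    is_current=False) and append a new current version effective from
    load_date. Unchanged rows pass through untouched."""
    changes = {u["key"]: u["value"] for u in updates}
    out = []
    for row in dim_rows:
        key = row["key"]
        if row["is_current"] and changes.get(key) not in (None, row["value"]):
            out.append({**row, "is_current": False, "end_date": load_date})
            out.append({"key": key, "value": changes[key],
                        "start_date": load_date, "end_date": None,
                        "is_current": True})
        else:
            out.append(row)
    return out

dim = [{"key": "C1", "value": "Gold", "start_date": "2021-01-01",
        "end_date": None, "is_current": True}]
print(apply_scd2(dim, [{"key": "C1", "value": "Platinum"}], "2022-03-01"))
```

After the merge, the dimension keeps both versions of the row, so history is preserved while queries on `is_current` still see one row per key.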
Data Engineer, Rogers Communications
Oct 2020 - Oct 2021 | Ontario, Canada
Responsibilities:
• Deployed ETL applications as Docker containers in EKS pods using Bamboo.
• Designed and implemented data models for relational (SQL) databases and NoSQL databases (e.g., MongoDB, Cassandra).
• Loaded data into S3 and then into Snowflake from different data sources.
• Ingested data from multiple source systems into the data lake using Airflow.
• Productionized an ML algorithm that identifies duplicate customer accounts, and provided production support for it.
• Triaged production issues and provided short-term and long-term solutions.
• Wrote unit test cases to improve code coverage.
• Performed code reviews and suggested performance improvements.
• Created external tables and partitioned, bucketed managed tables in Hive.
• Wrote Lambda functions that act as triggers to perform actions on data in S3.
Technologies: Snowflake, AWS, S3, EKS, Lambda, Python, PySpark, Airflow, Docker, Oracle, SQL Server, SQL, PL/SQL, Git, Bitbucket, Bamboo, MongoDB, Redshift, data lake, HDFS, Hive, Unix, Agile
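The S3-triggered Lambda functions mentioned above follow a standard shape: the handler receives an event whose `Records` list carries the bucket and object key of each new file. This is a minimal, hypothetical sketch; the bucket name and the choice to return the handled URIs are illustrative, not the actual production code.

```python
# Hypothetical sketch of an AWS Lambda handler fired by an S3
# ObjectCreated notification.

def lambda_handler(event, context=None):
    """Pull each (bucket, key) pair out of the S3 event. A real handler
    would act on the object (e.g. kick off an ETL step); here we just
    return the S3 URIs that were handled."""
    handled = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        handled.append(f"s3://{bucket}/{key}")
    return handled

# Trimmed-down shape of the event AWS delivers for ObjectCreated:
sample_event = {"Records": [{"s3": {"bucket": {"name": "raw-zone"},
                                    "object": {"key": "accounts/2021/01/part-0.csv"}}}]}
print(lambda_handler(sample_event))
```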
Associate Data Engineer, Microsoft
Nov 2016 - Aug 2018 | Hyderabad, Telangana, India
Responsibilities:
• Created data pipelines to load data from different source systems into the data lake.
• Created and maintained a PySpark framework to load data from databases into HDFS and then into Hive.
• Loaded dimension and fact tables into the data lake from data warehouses and databases.
• Migrated licensing data from an on-premises Hadoop big data system to Azure.
• Wrote PySpark SQL to analyze data and provide the requested insights.
• Ingested data from source systems as Parquet files onto HDFS and wrote it to Hive as ORC files after transformations.
• Created ETL jobs for both initial and incremental data ingestion.
• Used Sqoop to ingest data from Oracle into HDFS.
• Performed joins and other transformations on data in PySpark and loaded the results into dynamically partitioned Hive tables.
• Identified the delta load and implemented it programmatically on Hive tables.
Technologies: Azure, Azure Blob Storage, Hadoop, HDFS, Hive, Sqoop, Python, PySpark, PyCharm, SQL, PL/SQL, Oracle, PostgreSQL, Linux, SQL Server, Docker, Git
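Delta-load identification of the kind described above typically relies on a high-watermark: only rows modified after the last persisted watermark are picked up, and the watermark then advances. A minimal sketch in plain Python, with hypothetical field names (the real implementation ran in PySpark against Hive tables):

```python
# High-watermark incremental load sketch (field names are illustrative).

def incremental_rows(source_rows, last_watermark):
    """Return the rows modified after last_watermark, plus the new
    watermark to persist for the next run (unchanged if no new rows)."""
    delta = [r for r in source_rows if r["modified"] > last_watermark]
    new_watermark = max((r["modified"] for r in delta), default=last_watermark)
    return delta, new_watermark

rows = [{"id": 1, "modified": "2018-01-05"},
        {"id": 2, "modified": "2018-02-10"}]
delta, wm = incremental_rows(rows, "2018-01-31")
print(delta, wm)
```

ISO-formatted date strings compare correctly lexicographically, which is why plain string comparison works for the watermark here.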
Kiran K. Education Details
• Master of Business Administration (MBA)
Frequently Asked Questions about Kiran K.
What company does Kiran K. work for?
Kiran K. works for Capgemini.
What is Kiran K.'s role at the current company?
Kiran K.'s current role is Senior Data Engineer.
What schools did Kiran K. attend?
Kiran K. attended Vancouver Island University.
Who are Kiran K.'s colleagues?
Kiran K.'s colleagues are Gopinath Rountla Kasiviswanathan (PMP®, PMI-ACP®, LSSBB), Khaja Moidheen, Marcelo Reffatti, Ankita Dubey Zarkariya, Mariamicheal Raj S, Vishnu Vardhan V, Conceicao Peres Da Silva.