Thiago Pauli

Data Engineer at Qubika | Big Data | Python | SQL | Spark | AWS | Airflow
Thiago Pauli's Location
Jaraguá do Sul, Santa Catarina, Brazil

About Thiago Pauli

I am a dedicated Data Engineer with extensive experience in architecting and optimizing data pipelines. My expertise lies in integrating information from various sources, ensuring data accuracy and accessibility. I specialize in designing and implementing end-to-end data solutions that transform raw data into valuable insights. Leveraging my proficiency in Python, SQL, and Spark, I craft data pipelines that gather, clean, and transform data for loading into Data Lakehouses / Data Warehouses. These streamlined ETL/ELT processes not only accelerate data accessibility but also enhance data accuracy, ultimately driving more informed decision-making.

Additionally, I have hands-on experience with leading cloud platforms, AWS and GCP. My expertise extends to a diverse set of tools including Airflow, Databricks, Git, Docker, Metabase, and Power BI, which I use to create data-driven solutions that align with broader business goals.

With a background in Mechanical Engineering and six years of experience in the manufacturing industry, I have honed my problem-solving and process-optimization skills. This blend of expertise lets me approach data engineering from a multidisciplinary perspective, facilitating creative and effective solutions.

Website: https://www.datascienceportfol.io/thiagopauli
GitHub: https://github.com/ThiPauli

Thiago Pauli's Current Company Details
Qubika
Data Engineer at Qubika | Big Data | Python | SQL | Spark | AWS | Airflow
Thiago Pauli Work Experience Details
  • Qubika
    Senior Data Engineer
    Qubika Aug 2024 - Present
    Austin, Texas, United States
  • Lyncas
    Senior Data Engineer
    Lyncas May 2023 - Aug 2024
    Jaraguá Do Sul, Santa Catarina, Brazil
    • Collaborated on a comprehensive on-premise data project built on open-source technologies: MinIO, Apache Airflow, Apache Spark, and ClickHouse.
    • Contributed to the implementation and maintenance of a robust Delta Lake, using MERGE-based upsert transactions to incorporate new data incrementally.
    • Translated stakeholder business rules into effective data processing workflows with Apache Spark (PySpark), Python, and SQL, producing the Key Performance Indicators (KPIs) that key users rely on.
    • Used Apache Airflow to monitor the data pipeline, implementing email alerts for proactive issue detection and resolution.
    • Automated web scraping for targeted information with Apache Airflow and Python, applying dynamic logic to choose the API data-collection strategy based on the scraped responses.
    • Extracted data from diverse databases such as MySQL, PostgreSQL, and SQL Server, first ingesting large historical tables and then layering in daily incremental updates.
    • Used GitLab for version control and CI/CD practices to build and deploy code to staging (homolog) and production environments.
    • Spearheaded the initial setup of a core project on Google Cloud Platform (GCP), combining Google Cloud Storage, Composer, Dataproc, and BigQuery to establish a robust Data Lake medallion architecture.
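The MERGE-based upsert mentioned above follows simple semantics: rows whose key matches an existing target row are updated, and rows with a new key are inserted. A minimal pure-Python illustration of those semantics (the `order_id` key and sample rows are invented; in Delta Lake itself this is expressed as `MERGE INTO target USING updates ON …`):

```python
def merge_upsert(target, updates, key):
    """Delta-style MERGE semantics: update rows whose key already
    exists in the target, insert rows whose key is new."""
    by_key = {row[key]: dict(row) for row in target}
    for row in updates:
        by_key[row[key]] = dict(row)  # matched -> update, not matched -> insert
    return sorted(by_key.values(), key=lambda r: r[key])

# Invented sample data: a "silver" table and one incremental batch.
silver = [
    {"order_id": 1, "status": "open"},
    {"order_id": 2, "status": "open"},
]
batch = [
    {"order_id": 2, "status": "closed"},  # existing key -> update
    {"order_id": 3, "status": "open"},    # new key -> insert
]
merged = merge_upsert(silver, batch, "order_id")
# merged now holds orders 1 (open), 2 (closed), and 3 (open)
```

Running the incremental batch through the same function again is a no-op, which is what makes this pattern safe to re-run after a pipeline retry.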
  • B2B Stack
    Data Engineer
    B2B Stack Aug 2022 - Jun 2023
    São Paulo, Brazil
    • Implemented end-to-end data pipelines on AWS for effective Data Lake management, with Apache Airflow orchestrating the ETL (Extract, Transform, Load) process for seamless, automated workflows.
    • Ingested data from diverse sources, including databases and APIs such as Google Analytics, Mixpanel, and ActiveCampaign; used AWS Glue to crawl a MySQL database on AWS RDS and move data into AWS S3 for subsequent processing.
    • Applied advanced data transformation and aggregation techniques to convert raw data into meaningful insights, using Apache Spark (PySpark), Python, and SQL on AWS EMR (Elastic MapReduce) for accurate, efficient large-scale processing.
    • Loaded processed data into ClickHouse, serving as the Data Warehouse for real-time analytical reports, and queried it with SQL to enable timely, informed decision-making.
    • Implemented Metabase as the Business Intelligence (BI) tool, building charts and dashboards that give stakeholders a user-friendly interface to explore and interpret data trends.
    • Ran Airflow, ClickHouse, and Metabase in Docker containers on AWS EC2 instances; this containerized infrastructure streamlined deployment, maintenance, and scalability.
    • Used GitLab for efficient version control and streamlined deployment, ensuring seamless code transitions from development to production on AWS EC2 instances.
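An extract-transform-load flow like the one above reduces to three staged steps, each of which an orchestrator such as Airflow would wrap as a task. A framework-free sketch of that shape (the sample rows and the in-memory "warehouse" are invented stand-ins for the real API sources and ClickHouse):

```python
def extract():
    # Stand-in for pulling raw rows from an API or RDS source;
    # a real extraction would hit Google Analytics, MySQL, etc.
    return [{"event": "signup", "count": "3"}, {"event": "login", "count": "7"}]

def transform(rows):
    # Stand-in for the Spark/EMR step: cast string counts to ints
    # and drop rows that fail validation.
    out = []
    for r in rows:
        try:
            out.append({"event": r["event"], "count": int(r["count"])})
        except (KeyError, ValueError):
            pass  # a real job would route bad rows to a quarantine area
    return out

def load(rows, warehouse):
    # Stand-in for an INSERT into the ClickHouse warehouse.
    warehouse.extend(rows)
    return len(rows)

warehouse = []
loaded = load(transform(extract()), warehouse)  # loads 2 rows
```

Keeping the three stages as separate functions with plain data in between is what lets an orchestrator retry or backfill one stage without re-running the others.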
  • Neomove
    Data Engineer
    Neomove Jan 2022 - Sep 2022
    São Paulo, Brazil
    • Developed and implemented a robust ETL pipeline using AWS and Databricks, effectively extracting, transforming, and loading diverse data sources.
    • Managed data stored in various formats, including CSV and Parquet, within Amazon S3, keeping information accessible and well organized.
    • Conducted comprehensive data engineering tasks with Python, SQL, and Spark to clean and transform data, enhancing its quality and usability for analytics.
    • Leveraged Databricks Jobs Compute to establish a robust framework for orchestrating and executing data pipelines.
    • Implemented external tables in the Databricks Unity Catalog and used Databricks SQL Serverless to integrate data with Power BI.
    • Orchestrated the setup and optimization of data in the Amazon Redshift Data Warehouse, contributing to efficient storage and retrieval of processed data.
    • Built insightful Power BI dashboards and reports, empowering stakeholders with visually appealing, interactive tools for data-driven decision-making.
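Organizing raw CSV and Parquet files in S3 as described usually means writing them under date-partitioned prefixes so downstream Spark jobs can prune partitions. A small sketch of the common Hive-style key convention (the bucket prefix and filename are invented for illustration):

```python
from datetime import date

def partitioned_key(prefix, event_date, filename):
    """Build a Hive-style partitioned S3 key, e.g.
    raw/orders/year=2022/month=03/day=05/part-0000.parquet"""
    return (
        f"{prefix}/year={event_date.year}"
        f"/month={event_date.month:02d}"
        f"/day={event_date.day:02d}/{filename}"
    )

key = partitioned_key("raw/orders", date(2022, 3, 5), "part-0000.parquet")
# key == "raw/orders/year=2022/month=03/day=05/part-0000.parquet"
```

Because Spark recognizes `key=value` path segments as partition columns, a query filtered on a date only reads the matching prefixes instead of scanning the whole bucket.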
  • Limerick Blow Moulding Ltd.
    General Operative
    Limerick Blow Moulding Ltd. Nov 2019 - May 2021
    Limerick, County Limerick, Ireland
    Period dedicated to improving my English as a part-time student while working for Limerick Blow Moulding Limited as a General Operative. The role gave me a better understanding of the plastics industry and the injection moulding process, and strengthened my skills in time management, teamwork, communication, and problem-solving.
  • Jaraguá Cnc
    Sales Consultant
    Jaraguá Cnc Sep 2018 - Apr 2019
    Jaraguá Do Sul, Santa Catarina, Brazil
    • Analyzed customers' needs to provide appropriate solutions, and created budgets and technical proposals for CNC machinery (router, machining centre, and laser).
    • Used Excel to store, track, and analyze sales data, working closely with the company to identify customer needs and demand; this optimized product stock and increased sales for customers who needed shorter lead times.
    • Demonstrated the functionality and performance of CNC machinery to customers through simulated tests in CAM software.
  • Bosch Rexroth
    Engineering Intern
    Bosch Rexroth Mar 2017 - Aug 2018
    Pomerode, Santa Catarina, Brazil
    Role: Cost Analyst
    • Created budget datasheets for standard and special projects in Excel and updated their costs via the TOTVS ERP system.
    • Managed third-party suppliers to budget materials, services, and accessories.
    Role: Mechanical Designer
    • Designed ball screw and linear guide products in AutoCAD according to catalogs, standards, and customer requirements such as tolerances, sizes, and their respective applications.
    • Managed the bill of materials used in the designs through the TOTVS ERP system.
    Accomplishments:
    • Reduced budget elaboration time for standard projects by creating an automated Excel tool.
    • Developed a curricular internship project on the costs and main differences (components, manufacturing, and assembly) between specific products, presented to the Centro Universitário - Católica de Santa Catarina.
  • Weg
    Machine Operator
    Weg Jan 2013 - Jan 2016
    Jaraguá Do Sul, Santa Catarina, Brazil
    • Operated a manual lathe to manufacture parts according to technical drawings.
    • Controlled component stock via the SAP ERP system for dispatch to the assembly sector.

Thiago Pauli Education Details
  • Centro Universitário - Católica de Santa Catarina
  • Centro de Treinamento Weg

Frequently Asked Questions about Thiago Pauli

What company does Thiago Pauli work for?

Thiago Pauli works for Qubika.

What is Thiago Pauli's role at the current company?

Thiago Pauli's current role is Data Engineer at Qubika | Big Data | Python | SQL | Spark | AWS | Airflow.

What is Thiago Pauli's email address?

Thiago Pauli's email address is th****@****ail.com

What schools did Thiago Pauli attend?

Thiago Pauli attended Centro Universitário - Católica de Santa Catarina and Centro de Treinamento Weg.
