Summary
Highly motivated big data engineer with strong experience in developing end-to-end ETL pipelines for high-impact data science projects on the Hadoop platform, designing optimized SQL queries, and automating processes with Python.
Core Competencies
• Languages: Python, SQL (Presto, Hive, Spark, Impala), Shell (Bash), R
• Big Data Ecosystems: Hadoop, HDFS, Hive, Oozie, Airflow, AWS (S3), MapReduce, REST API
Data Engineer
Meta, Feb 2021 - Present
Menlo Park, CA, US
Data Engineer
Meta, Mar 2020 - Feb 2021
Menlo Park, CA, US
Junior Big Data Engineer
Samsung Electronics America, Jul 2019 - Mar 2020
Ridgefield Park, NJ, US
• Implemented multiple end-to-end ETL processes in shell scripts that extract batch data from AWS S3 buckets or SFTP to HDFS (Hadoop Distributed File System) and load it into target tables using HiveQL, with both static and dynamic partitioning for efficient data access and usage
• Developed a data quality dashboard pipeline that monitors 100+ core data science ETL jobs by using the Oozie workflow scheduler's command-line APIs to extract workflow run status daily; it strengthened production issue tracking and was recognized by leadership as a best practice for the data quality initiative
• Optimized SQL query design for the highest-priority data science ETL pipeline, migrated its HiveQL queries to PySpark, and redesigned intermediate table loading processes, reducing runtime by 30%
• Redesigned a critical incoming data pipeline with the upstream web development team, migrating data transmission from multiple REST API calls to ingestion of a single batch CSV file in an AWS S3 bucket, which improved pipeline efficiency and stability
• Maintained and troubleshot critical data quality Tableau dashboards for data science model pipelines so that they correctly reflect and help debug model run issues, ranging from platform resource limitations to table loading errors
• Automated ad hoc business requests, such as downstream table dependency checks, using Python and shell scripting so that changes to existing tables cause no unexpected downstream impact
• Aggregated and analyzed various data sets to provide actionable platform insights
Julia Lin Skills
Julia Lin Education Details
UC San Diego, Business Analytics
National Taiwan University, Bachelor's Degree
Frequently Asked Questions about Julia Lin
What company does Julia Lin work for?
Julia Lin works for Meta
What is Julia Lin's role at the current company?
Julia Lin's current role is Data Engineer at Meta (formerly Facebook).
What schools did Julia Lin attend?
Julia Lin attended UC San Diego and National Taiwan University.
What are some of Julia Lin's interests?
Julia Lin is interested in 好色龍, 旅英黑特少女, Coey的職場叩叩門, 標註自由, Community, Model Apec Taiwan, Home Decor, 5 Couture, 何景窗, and The Brush Bar.
What skills is Julia Lin known for?
Julia Lin has skills such as REST API, HQL, Data Pipelines, scikit-learn, R, ETL, Amazon Web Services, Shell, Data Engineering, Web Scraping, Data Analysis, and Chinese.