Jaime Kaufman Email and Phone Number
Jaime Kaufman is a data professional with over ten years of experience concentrated in data lakes / big data, batch and stream processing, software engineering / tooling, data analytics, and data integrations across the technology, retail, e-commerce, automotive, healthcare, insurance, and consulting industries. His primary areas of impact are reducing complexity, generalizing similar codebases, reducing cost, and improving scalability and performance. He is creative in finding new ways to solve complex challenges. He is a quick study and takes pride in continuous improvement, both personally and professionally.
Lucid Motors
- Website: lucidmotors.com
- Employees: 201
Staff Data Engineer, Big Data
Lucid Motors
Dec 2023 - Present
Newark, California, US
Stack: Apache Spark, Iceberg, Kafka, Airflow, Python, Scala, Trino SQL, Spark Streaming, AWS S3, Redshift
• Developed a Spark Structured Streaming application that incrementally consumes an Iceberg table and produces to Kafka, resulting in an 80% reduction in costs for a total annual savings of over $200k
• Rewrote preexisting batch processes as Spark streaming pipelines that consume from Kafka and append to an Iceberg table, resulting in up to 99% improved read performance
• Designed, benchmarked, and developed supplemental higher-grain layers of machine data, resulting in significant cost reductions to downstream pipelines, including up to 95% improved read performance and reduced development time
• Performed extensive Apache Iceberg research and benchmarking; documented and presented on best practices with a focus on partitioning and sorting recommendations
• Released a configurable Iceberg table maintenance DAG, implemented in Airflow, to maintain the company's largest data sets
• Created a configurable Iceberg metadata stats collection DAG for efficient, proactive monitoring of large, critical data sets

Lead Data Engineer, Apache Spark
Carbon Arc
Mar 2023 - Aug 2023
New York, New York, US
Stack: Spark, Python, Athena/Presto SQL, Airflow, AWS S3, Glue, Redshift, Postgres, MIRO
• Brought Apache Spark to the company; provided documentation, best practices, training, and templates
• Designed and developed a PySpark-based util package with over 25 configurable functions to provide an easy and standardized mechanism to interface with S3, Postgres, Redshift, and more
• Converted and performance-tuned over 20 data science pipelines using Spark on AWS Glue, saving the company tens of thousands of dollars per month
• Created functionality to leverage Hive table metadata in order to build enforceable upstream table dependencies into data pipelines
• Utilized Apache Spark and Apache Sedona to design and develop a three-phase process and data model to efficiently and accurately clean up, deduplicate, and roll up a massive data set containing 1 trillion rows (over 150 TB) of data, which ultimately provided the capability to reprocess in minutes and read in seconds

Lead Data Engineer, Client Features
Stitch Fix
Sep 2021 - Jan 2023
San Francisco, CA, US
Stack: Spark, Python, Presto/Trino SQL, Hive, Airflow, AWS S3, EMR, Kafka
• Owned, maintained, and incrementally improved canonical client features: a set of streaming applications, batch processes, API, and CLI
• Designed, developed, and popularized a set of eight debugging pipelines and self-service tables, providing upstream and downstream teams with a simpler and more efficient way to connect the dots and analyze complex, multi-nested JSON data sets
• Redesigned and implemented a new mechanism to accurately parse raw questions and answers into client features, resulting in a significant reduction of complexity, deleting 2,000 lines of code, and reducing risk and time spent on ongoing support
• Performed a large upgrade from Python 3.6 to Python 3.9 (Spark 3.2), including significant package dependency changes, completing it with no issues
• Bridged the disconnect between teams by functioning as the primary partner collaborating with other SE and DS teams, including client onboarding, qualifications, and styling support
• Optimized Spark jobs, significantly reducing overall execution time and cost

Senior Data Engineer, Channel Integrations
Aetna, a CVS Health Company
Oct 2018 - Sep 2021
Hartford, Connecticut, US
Stack: Spark, Python, Hive SQL, HDFS, bash shell, JSON, Airflow
• Served as the tech lead for the channel integrations (adapters) team
• Leveraged Python and Spark to design and develop a new data processing framework, enabling the capability to generically configure channel adapters for member outreach
• Developed an out-of-the-box channel adapter leveraging Python, Spark, and an API to integrate campaign data, including dynamic content, for provider outreach via fax
• Developed a reusable PySpark utility class providing a generic approach to enforcing journey/channel-level derived permissions and enriching with additional columns associated with the member
• Developed a reusable PySpark utility class for configuring automatic generation of UAT lists containing all combinations of cohorts for each internal member, including mocked-up PII columns
• Refactored PySpark-based channel integration adapters to utilize generalized, reusable class functions to process, enrich, and send data for member outreach
• Engineered a proof of concept: an automated process utilizing Spark to process and clean up large (multi-terabyte) tables into a more efficient compressed, partitioned, columnar format with significantly improved query performance and decreased storage size, resulting in significant cost savings
• Collaborated with cross-functional teams (data science and non-technical members) to test and productionize marketing campaign data pipelines

Consultant, Analytics Data Engineer
Fusion Alliance
Feb 2017 - Oct 2018
Carmel, Indiana, US
• Engineered a database and complex Impala SQL pipeline pulling over 3 years of online shopping and order item-level out-of-stock metrics into a Hive/Impala table; tuned for sub-3-second ad hoc query performance; leveraged as a general-purpose working set to provide key insights to senior leadership, positively impacting eCommerce, merchandising, operations, and in-store decision making and profitability
• Modeled a database and utilized Impala SQL to build a multi-step, complex data pipeline to load over 3 years of heavily aggregated and curated household-level engagement information (from shopping, clickstream, loyalty, and marketing data sources) into a Cloudera Big Data environment; configured final tables for sub-5-second performance; leveraged the new capability to present key insights to senior leadership
• Collaborated with external teams by supplying working set logic and metric definitions so the metrics could be owned, automated, and made available via canned reports and dashboards
• Developed reusable, parameterized SQL-based Alteryx modules that provide key metrics (on an ad hoc, weekly, and period basis) across the organization, including Digital, Customer Experience, Operations, Merchandising, and Marketing
• Significantly improved the speed of the automated reporting process by converting Alteryx modules to leverage massively parallel processing on a Hadoop cluster instead of running locally
• Hosted numerous weekly SQL training sessions to assist five analysts in developing more complete SQL skills in support of team reporting needs; mentored team members in simpler approaches to complex SQL
• Collaborated with an external team, supplying working set logic and metric definitions, to make crucial out-of-stock metrics available and easily accessible to the enterprise

Consultant, Hadoop Ingest Developer
Illumination Works
Sep 2014 - Feb 2017
Beavercreek, OH, US
• Named 2016 Most Valuable Consultant
• Was key in winning a million-dollar consulting contract by building a POC Cloudera Hadoop cluster on Amazon EC2, designing Hive/Impala tables and views, manually ingesting data into HDFS, and presenting to the client on the advantages of Big Data
• Developed, tested, and supported automation of over 80 import/export jobs for over 10 data sets of structured and semi-structured data
• Provided primary production support through daily monitoring and troubleshooting/resolving job and dependency data failures
• Improved speed, efficiency, and reliability of jobs with the Spring XD framework, utilizing technologies including Apache Spark, Hive SQL, Impala, and the Sqoop import/export tool
• Redesigned the ingest process to better utilize the cluster in parallel, reducing daily processing time of clickstream data from 8-20 hours down to 1-2 hours
• Developed custom Spark Scala and Bash shell scripts to correct data quality issues across several large data sets
• Designed and implemented over thirty reusable infrastructure alerts (Nagios) to monitor environment and job-specific data import and export schedules
• Detected over 300 production errors, incompatibilities, and other issues and deployed hotfixes to resolve them
• Streamlined the ingest/egress process for the team using an Agile tool (JIRA)
• Served as technical lead, providing training to internal customers and addressing their questions, concerns, and issues
• Hosted numerous training sessions and developed over twenty pages of training documentation using the company’s wiki (Confluence)
• Mentored ten developers across several teams; provided direction and technical expertise

Database Administrator / Executive Assistant
Jewish Community Foundation of North East Florida, Inc.
Mar 2011 - Aug 2014
Jacksonville, FL, US
• Managed help-desk technical support for all desktop users
• Set up / maintained user accounts for all employees
• Implemented technology upgrades and provided user training
• Redesigned relational database for more efficient processing of households
• Customized SQL queries / Access forms and formatted reports in Excel
• Coordinated migration to new web hosting and email providers
• Saved company over $3,000 by re-engineering processes for daily operations

Navy Musician / Webmaster
United States Navy
Oct 2003 - Mar 2011
Washington, DC, US
• Performed in over 1,000 engagements, including ceremonies for U.S. Presidents
• Deployed on a six-month overseas humanitarian mission throughout Southeast Asia
• Designed / maintained the organization website, featuring articles and public events
• As brass quintet unit leader, rehearsed the ensemble and led it in various performances
• As trumpet section leader, mentored a team of eight musicians, improving musicality, cohesiveness, sense of time, and self-confidence
Jaime Kaufman Skills
Jaime Kaufman Education Details
- University of North Florida, Computer and Information Sciences: Software Engineering
- Ohio University, Music Performance: Trumpet
- Wright State University, Music Performance: Trumpet
Frequently Asked Questions about Jaime Kaufman
What company does Jaime Kaufman work for?
Jaime Kaufman works for Lucid Motors.
What is Jaime Kaufman's role at the current company?
Jaime Kaufman's current role is Experienced Data Engineer | Creative Problem Solver | Professional Musician.
What is Jaime Kaufman's email address?
Jaime Kaufman's email address is jk****@****fix.com
What schools did Jaime Kaufman attend?
Jaime Kaufman attended the University of North Florida, Ohio University, and Wright State University.
What skills is Jaime Kaufman known for?
Jaime Kaufman has skills including SQL, Java, Microsoft Office, Music, Software Development, Trumpet, Database Design, Jazz, Leadership, Big Band, Hadoop, and Big Data.
Who are Jaime Kaufman's colleagues?
Jaime Kaufman's colleagues are Rohit Patel, Venezia De France, Jeremiah Johnson, Jesus Garcia, Tianshi Li, Destiney Benson, Reza Soleimani.