Tony Werner

Tony Werner Email and Phone Number

Distinguished Engineer @ Tsavorite Scalable Intelligence
Tony Werner's Location
Denton, Texas, United States
Tony Werner's Contact Details

Tony Werner personal email

n/a
About Tony Werner

Experienced ASIC designer

Tony Werner's Current Company Details
Tsavorite Scalable Intelligence
Distinguished Engineer
Tony Werner Work Experience Details
  • Tsavorite Scalable Intelligence
    Principal Engineer - Deep Learning Architect/Design
    Tsavorite Scalable Intelligence Oct 2023 - Present
    Milpitas, California, US
    Powering Enterprise AI @Zettascale
  • Marvell Technology
    Distinguished Engineer
    Marvell Technology May 2022 - Oct 2023
    Santa Clara, CA, US
  • Tanzanite Silicon Solutions
    Principal Engineer
    Tanzanite Silicon Solutions Jul 2020 - May 2022
    Milpitas, California, US
    Developed a CXL-based data center ASIC.
    Tanzanite was acquired by Marvell Technology in May 2022.
  • Intel
    Principal Engineer - Deep Learning Architect/Design
    Intel Aug 2016 - May 2020
    Accelerating convolution in deep learning ASICs.
    Taped out the Nervana Deep Learning ASIC after the acquisition. The ASIC supported a peak performance of 40 TeraOPs.
    Continued the architectural development of the Nervana Crest architecture. Taped out the follow-on ASIC, called Spring Crest, which had a peak performance of 110 TOPs.
      * Implemented custom logic to accelerate normalization layers (Batch Norm and Group Norm) required as neural networks grow deeper
      * Added support for filter dilation for all CNN-related tasks
      * Added features to reduce data movement
    Wrote a software compiler tool to improve performance of the software. The tool distributes the convolution layers across the ASIC processing cores. It determines the optimum division of the workload and generates the software instructions required to achieve that distribution. This compiler is integrated into the software kernels. The tool also estimates performance and was used to evaluate proposed architectural features.
    Researched deep neural network features and performance enhancement algorithms.
      * Worked with and analyzed different numeric formats related to machine learning technologies, including integer and floating point
      * Investigated the hardware implementation of several new deep learning technologies, including sparsity, deformable convolution, and depth-wise separable convolution
      * Developed a hardware implementation of the Winograd algorithm, which supports the full theoretical speedup of 2.25x provided by a 2x2 output tiling
      * Analyzed options for reducing memory accesses (minimizing data movement)
  • Nervana Systems
    Principal Engineer - Deep Learning Architect/Design
    Nervana Systems Oct 2014 - Aug 2016
    Santa Clara, California, US
    Developed a convolution engine architecture for Nervana's Deep Learning Training ASIC. I was responsible for the development of the architecture and RTL implementation. Software assigned workloads to the convolution engine by identifying tensors for processing. The convolution engine generated all load, store, and computation micro-instructions to perform the required task on the identified tensor.
      * The architecture supported the 3 primary CNN operations of forward propagation, backward propagation, and update
      * The architecture also supported forward and backward pooling
    Nervana was acquired by Intel in August 2016.
  • Cisco
    Tech Lead
    Cisco Jul 2007 - Oct 2014
    ASIC and emulator development.
    Increased the read/write bandwidth of a large packet buffer memory block. This block reads and writes packet data and maintains the corresponding linked lists for several sources, which are active simultaneously.
    Implemented a top-level netlist connection tool that builds the port list for each of the major blocks, builds physical design routable subchips that consist of one or more major blocks, and builds the ASIC core (which instantiates all of the subchips). The tool also assisted the design verification team by building interface interconnection files, and the physical design team by inserting inter-block pipeline stages to assist routing and by building the CPU, reset, and halt trees, feedthrough paths, etc. The tool was also used to route the ASIC interconnections through dozens of FPGAs in our emulator. The tool has been used for four tape-outs thus far.
    Designed a hierarchical scheduler that scheduled packets either at a defined data rate or as a fraction of available bandwidth. The algorithm, which was used in previous schedulers, was optimized to provide 100% overspeed to support high drop rates or provide higher data rates for a given system clock frequency.
    Designed a packet data scheduler that guarantees no under-run (no intra-packet gaps) and no scheduler-induced under-utilization (no inter-packet gaps).
    I supported all aforementioned activities through the development of the initial ASIC and multiple derivatives. Currently working on significant performance updates for the next derivative.
  • Cisco
    Hardware Engineer
    Cisco Aug 2001 - Jul 2007
    Designed components of the packet scheduler for the Quantum Flow Processor. I was a member of the team throughout the development of the micro-architecture for the scheduler. I completed both a C++ reference model and the RTL for my components.
    Lead logic designer of a Fabric Controller ASIC for a 20Gbps linecard in the Cisco GSR Router.
    Designed major sub-blocks of a Fabric Interface ASIC for a 20Gbps linecard in the Cisco GSR Router. Development involved doubling the throughput of the existing 10Gbps linecard architecture.
  • Auroranetics
    Hardware Engineer
    Auroranetics Oct 2000 - Aug 2001
    Designed components of a Resilient Packet Ring (RPR) ASIC, including JTAG and BIST.
    Developed Perl-based tools to identify the individual scan chains in the synthesized ASIC and then built automated Verilog testbenches for testing BIST and scan-related logic.
    The company was acquired by Cisco in August 2001.
  • ATI Research
    Hardware Engineer
    ATI Research Apr 2000 - Oct 2000
    Santa Clara, California, US
    Joined during the development of an x86-compatible architecture. Debugged RTL code in the instruction decode and translation stages of the pipeline. The Conversion stage decoded the x86 instructions, while the Translation stage generated equivalent native instructions. The project was canceled before completion.
  • Hitachi
    Hardware Engineer
    Hitachi Apr 1997 - Apr 2000
    Chiyoda-ku, Tokyo, JP
    SH-5 Embedded Processor – Participated in initial architecture analyses, which included the instruction set architecture (instruction types, SIMD operations, and instruction length), the use of predicated execution and branch target buffers, the number of register file ports, and pipeline length. Developed a C simulation model to evaluate various configurations of the branch target buffer for the SH-5.
    Cache Study – Optimized a filtering approach that controls page replacement in a prediction cache, which utilizes both victim and stream caching techniques.
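
The 2.25x Winograd figure quoted in the Intel entry above can be checked with a small worked example. This is a minimal sketch, assuming the standard F(2x2, 3x3) formulation with 3x3 filters (the filter size is an assumption; the profile only states the 2x2 output tiling). Computing a 2x2 output tile directly takes 2 * 2 * 9 = 36 multiplications, while the Winograd form takes 16 element-wise multiplications, so the ratio is 36 / 16 = 2.25. The function and constant names are illustrative and say nothing about how the hardware was actually built.

```python
# Winograd F(2x2, 3x3): Y = A^T [ (G g G^T) * (B^T d B) ] A, with * element-wise.
# Pure-Python sketch; matrices are lists of rows.

BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]   # input transform
G = [[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]]      # filter transform
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]                                # output transform

DIRECT_MULTIPLIES = 2 * 2 * 3 * 3   # 4 outputs, 9 multiplies each
WINOGRAD_MULTIPLIES = 4 * 4         # one multiply per transformed element


def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def winograd_2x2_3x3(tile, filt):
    """2x2 output from a 4x4 input tile and a 3x3 filter via Winograd."""
    u = matmul(matmul(G, filt), transpose(G))     # 4x4 transformed filter
    v = matmul(matmul(BT, tile), transpose(BT))   # 4x4 transformed input
    m = [[u[i][j] * v[i][j] for j in range(4)] for i in range(4)]  # 16 multiplies
    return matmul(matmul(AT, m), transpose(AT))


def direct_2x2_3x3(tile, filt):
    """Reference: the same 2x2 output computed as a sliding dot product."""
    return [[sum(tile[i + r][j + c] * filt[r][c]
                 for r in range(3) for c in range(3))
             for j in range(2)] for i in range(2)]


if __name__ == "__main__":
    tile = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    filt = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
    print(winograd_2x2_3x3(tile, filt))
    print(direct_2x2_3x3(tile, filt))
    print(DIRECT_MULTIPLIES / WINOGRAD_MULTIPLIES)
```

The multiply count ignores the additions in the transforms, and the filter transform can be computed once and reused across tiles, which is why the 2.25x figure is described as a theoretical maximum.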

Tony Werner Skills

Verilog, ASIC, RTL Design, Perl, Logic Design, Integrated Circuit Design, Processors, Application Specific Integrated Circuits, C, Functional Verification, Hardware, System on a Chip, Debugging, SystemVerilog, Field Programmable Gate Arrays, Embedded Systems, Simulations

Tony Werner Education Details

  • University of California, Davis
    Electrical Engineering
  • University of Illinois Urbana-Champaign
    Computer Engineering
  • University of Arizona
    Electrical Engineering

Frequently Asked Questions about Tony Werner

What company does Tony Werner work for?

Tony Werner works for Tsavorite Scalable Intelligence.

What is Tony Werner's role at the current company?

Tony Werner's current role is Distinguished Engineer.

What is Tony Werner's email address?

Tony Werner's email address is tw****@****esi.com

What schools did Tony Werner attend?

Tony Werner attended University of California, Davis; University of Illinois Urbana-Champaign; and University of Arizona.

What skills is Tony Werner known for?

Tony Werner has skills like Verilog, ASIC, RTL Design, Perl, Logic Design, Integrated Circuit Design, Processors, Application Specific Integrated Circuits, C, Functional Verification, Hardware, and System on a Chip.
