Experienced ASIC designer
-
Principal Engineer - Deep Learning Architect/Design
Tsavorite Scalable Intelligence
Oct 2023 - Present
Milpitas, California, US
Powering Enterprise AI @Zettascale
-
Distinguished Engineer
Marvell Technology
May 2022 - Oct 2023
Santa Clara, CA, US
-
Principal Engineer
Tanzanite Silicon Solutions
Jul 2020 - May 2022
Milpitas, California, US
Developed a CXL-based data center ASIC.
Tanzanite was acquired by Marvell Technology in May 2022.
-
Principal Engineer - Deep Learning Architect/Design
Intel
Aug 2016 - May 2020
Accelerating convolution in deep learning ASICs
Taped out the Nervana Deep Learning ASIC after acquisition. The ASIC supported a peak performance of 40 TOPS.
Continued the architectural development of the Nervana Crest architecture. Taped out the follow-on ASIC called Spring Crest. This ASIC had a peak performance of 110 TOPS.
* Implemented custom logic to accelerate normalization layers (Batch Norm and Group Norm) required as neural networks grow deeper
* Added support for filter dilation for all CNN-related tasks
* Added features to reduce data movement
Wrote a software compiler tool to improve performance of the software. The tool distributes the convolution layers across the ASIC processing cores. The tool determines the optimum division of the workload and generates the software instructions required to achieve that distribution. This compiler is integrated into the software kernels. This tool also estimates performance and was used to evaluate proposed architectural features.
Researched Deep Neural Network features and performance enhancement algorithms.
* Worked with and analyzed different numeric formats related to machine learning technologies, including integer and floating point
* Investigated the hardware implementation of several new deep learning technologies, including sparsity, deformable convolution, and depth-wise separable convolution
* Developed a hardware implementation of the Winograd algorithm, which supports the full theoretical speedup of 2.25x provided by a 2x2 output tiling
* Analyzed options for reducing memory accesses (minimizing data movement)
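For readers unfamiliar with where the 2.25x Winograd figure comes from, the sketch below illustrates the arithmetic. It assumes a 3x3 filter (the profile states only the 2x2 output tiling), and the function names are hypothetical. It shows the standard 1D F(2,3) transform in plain Python and is not a description of the hardware implementation mentioned above.

```python
def direct_f23(d, g):
    """Direct 1D filtering: 2 outputs from a 3-tap filter, 6 multiplies."""
    return [d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
            d[1] * g[0] + d[2] * g[1] + d[3] * g[2]]


def winograd_f23(d, g):
    """Winograd F(2,3): the same 2 outputs using only 4 data multiplies."""
    # Filter transform; it can be precomputed once per filter.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # The four element-wise multiplies on transformed input data.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    # Output transform (additions only).
    return [m1 + m2 + m3, m2 - m3 - m4]


# Nesting the 1D transform in both dimensions gives the 2D case.
# Assumption: 3x3 filter, 2x2 output tile.
direct_mults_2d = (2 * 2) * (3 * 3)   # 36 multiplies per output tile
winograd_mults_2d = 4 * 4             # 16 multiplies per output tile
speedup = direct_mults_2d / winograd_mults_2d
```

Under these assumptions, `speedup` evaluates to 36 / 16 = 2.25, which matches the theoretical figure quoted in the profile.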
-
Principal Engineer - Deep Learning Architect/Design
Nervana Systems
Oct 2014 - Aug 2016
Santa Clara, California, US
Developed a convolution engine architecture for Nervana's Deep Learning Training ASIC. I was responsible for the development of the architecture and RTL implementation. Software assigned workloads to the convolution engine by identifying tensors for processing. The convolution engine generated all load, store, and computation micro-instructions to perform the required task on the identified tensor.
* The architecture supported the 3 primary CNN operations of forward propagation, backward propagation, and update
* The architecture also supported forward and backward pooling
Nervana was acquired by Intel in August 2016.
-
Tech Lead
Cisco
Jul 2007 - Oct 2014
ASIC and emulator development
Increased the read/write bandwidth of a large packet buffer memory block. This block reads and writes packet data and maintains the corresponding linked lists for several sources, which are active simultaneously.
Implemented a top-level netlist connection tool that builds the port list for each of the major blocks, builds physical design routable subchips that consist of one or more major blocks, and builds the ASIC core (which instantiates all of the subchips). The tool also assisted the design verification team by building interface interconnection files, and the physical design team by inserting inter-block pipeline stages to assist routing and by building the CPU, reset, and halt trees, feedthrough paths, etc. The tool was also used to route the ASIC interconnections through dozens of FPGAs in our emulator. The tool has been used for four tape-outs thus far.
Designed a hierarchical scheduler that scheduled packets at either a defined data rate or a fraction of available bandwidth. The algorithm, which was used in previous schedulers, was optimized to provide 100% overspeed to support high drop rates or provide higher data rates for a given system clock frequency.
Designed a packet data scheduler that guarantees no under-run (no intra-packet gaps) and no scheduler-induced under-utilization (no inter-packet gaps).
I supported all aforementioned activities through the development of the initial ASIC and multiple derivatives. Currently working on significant performance updates for the next derivative.
-
Hardware Engineer
Cisco
Aug 2001 - Jul 2007
Designed components of the packet scheduler for the Quantum Flow Processor. I was a member of the team throughout the development of the micro-architecture for the scheduler. I completed both a C++ reference model and the RTL for my components.
Lead logic designer of a Fabric Controller ASIC for a 20Gbps linecard in the Cisco GSR Router.
Designed major sub-blocks of a Fabric Interface ASIC for a 20Gbps linecard in the Cisco GSR Router. Development involved doubling the throughput of the existing 10Gbps linecard architecture.
-
Hardware Engineer
Auroranetics
Oct 2000 - Aug 2001
Designed components of a Resilient Packet Ring (RPR) ASIC, including JTAG and BIST.
Developed Perl-based tools to identify the individual scan chains in the synthesized ASIC and then built automated Verilog testbenches for testing BIST and scan-related logic.
Company was acquired by Cisco in August 2001.
-
Hardware Engineer
ATI Research
Apr 2000 - Oct 2000
Santa Clara, California, US
Joined during the development of an X86-compatible architecture. Debugged RTL code in the instruction decode and translation stages of the pipeline. The conversion stage decoded the X86 instructions, while the translation stage generated equivalent native instructions. Project canceled before completion.
-
Hardware Engineer
Hitachi
Apr 1997 - Apr 2000
Chiyoda-Ku, Tokyo, JP
SH-5 Embedded Processor – Participated in initial architecture analyses, which included the instruction set architecture (instruction types, SIMD operations, and instruction length), the use of predicated execution and branch target buffers, number of register file ports, and pipeline length.
Developed C simulation model to evaluate various configurations of the branch target buffer for the SH-5.
Cache Study – Optimized filtering approach that controls page replacement in a prediction cache, which utilizes both victim and stream caching techniques.
Tony Werner Skills
Tony Werner Education Details
-
University of California, Davis
Electrical Engineering
-
University of Illinois Urbana-Champaign
Computer Engineering
-
University of Arizona
Electrical Engineering
Frequently Asked Questions about Tony Werner
What company does Tony Werner work for?
Tony Werner works for Tsavorite Scalable Intelligence.
What is Tony Werner's role at the current company?
Tony Werner's current role is Principal Engineer - Deep Learning Architect/Design.
What is Tony Werner's email address?
Tony Werner's email address is tw****@****esi.com
What schools did Tony Werner attend?
Tony Werner attended University of California, Davis; University of Illinois Urbana-Champaign; and University of Arizona.
What skills is Tony Werner known for?
Tony Werner has skills like Verilog, ASIC, RTL Design, Perl, Logic Design, Integrated Circuit Design, Processors, Application Specific Integrated Circuits, C, Functional Verification, Hardware, and System on a Chip.