SC15 Emerging Technologies

Emulating Future HPC SoC Architectures

Current HPC ecosystems rely on Commercial Off-the-Shelf (COTS) building blocks to enable cost-effective design by sharing costs across a larger ecosystem. Modern HPC nodes use commodity chipsets and processor chips integrated on custom motherboards. Commodity HPC is heading into a new era where the chip acts as the “silicon motherboard” that interconnects commodity Intellectual Property (IP) circuit building blocks to create a complete integrated System-on-a-Chip (SoC). These SoC designs have the potential for higher performance at better power efficiency than current COTS solutions. Going further, a custom SoC could contain a diverse collection of accelerators that enhance performance and reliability at all levels of the software stack, from applications to runtimes. Exploring this broad design space properly will require a true co-design effort involving hardware architects, application developers, OS and runtime researchers, and domain scientists.
We demonstrate how the tools provided by the Co-Design for Exascale (CoDEx) project, running on a cloud-based FPGA system, can provide insight into future SoC architectures tailored to the needs of HPC. The large-scale emulation environment shown here demonstrates how we are building the tools needed to evaluate novel architectures at speeds fast enough to measure whole-application performance.

IP Shopping List: What do we need to build a semi-custom SoC?

An alternative model for commodity HPC is emerging where the chip acts as the “silicon motherboard” that interconnects commodity Intellectual Property (IP) circuit building blocks to create a complete integrated SoC, a common practice in the fast-growing and innovative embedded processing market. By leveraging the enormous commodity IP market for design tools, processors, memory controllers, and I/O circuit designs, a chip designer can focus effort and non-recurring engineering (NRE) costs on a handful of essential features that the commodity ecosystem does not cover, allowing the rapid creation of semi-custom designs. This presents a new design paradigm and architecture for HPC, cloud, and high performance embedded systems.

Emergence of Open Source IP

Rather than relying exclusively on commercial IP, we have focused on bringing together multiple emerging open source technologies in novel ways to create new tools and techniques that enable advanced architectural exploration.

Chisel
Chisel, developed at UC Berkeley, is a hardware description language (HDL) based on Scala that significantly decreases development time and promotes the sharing and reuse of components. Chisel provides both the software and hardware models from a single code base and is the language used to describe both the processor and the on-chip network for this system.
RISC-V ISA and the Z-scale Core
The processor in this emulated system is a Z-scale core implementing the RISC-V ISA. In addition to being a completely open source and well-tested ISA, RISC-V comes with a well-developed software stack, including full compiler support.
The particular implementation that we are using is Z-scale, a tiny 32-bit RISC-V core suited to microcontrollers and embedded systems, in the same vein as the ARM Cortex-M0/M0+/M3/M4.
OpenSoC Fabric
OpenSoC Fabric is an on-chip network generation infrastructure that provides a powerful, parameterizable on-chip network generator for evaluating future high performance computing architectures based on SoC technology. OpenSoC Fabric is written in Chisel, which enables rapid development and significant code reuse. Currently, our model can generate networks with multiple topologies, concentrations, virtual channels, etc., and it has been verified against other network-on-chip (NoC) models to ensure accuracy.
[Figure: OpenSoC Fabric block diagram]
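To make the idea of a parameterizable network generator concrete, the following is a minimal Python sketch of how topology, concentration, and virtual-channel parameters determine the shape of a mesh network. It is not OpenSoC Fabric code (which is written in Chisel) and does not reflect its API; the `MeshConfig` class, its field names, and the default of two virtual channels are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class MeshConfig:
    """Hypothetical parameter record; names are illustrative, not OpenSoC Fabric's."""
    dims: tuple = (4, 4)        # routers along each dimension
    concentration: int = 1      # endpoints attached to each router
    virtual_channels: int = 2   # assumed default, not taken from the source

    def routers(self):
        """Coordinates of every router in the mesh."""
        return list(product(*(range(d) for d in self.dims)))

    def endpoint_count(self):
        """Total injection/ejection ports offered by the network."""
        return len(self.routers()) * self.concentration

    def links(self):
        """Bidirectional links between neighbouring routers (no wraparound)."""
        found = []
        for router in self.routers():
            for axis, size in enumerate(self.dims):
                if router[axis] + 1 < size:
                    neighbour = list(router)
                    neighbour[axis] += 1
                    found.append((router, tuple(neighbour)))
        return found


mesh = MeshConfig(dims=(4, 4), concentration=1)
print(len(mesh.routers()), mesh.endpoint_count(), len(mesh.links()))
```

Changing `dims` or `concentration` yields a differently shaped network from the same description, which is the property a generator-based approach is meant to provide.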

A Large Scale SoC Exploration Platform

The FPGA system is a collection of PCIe backplanes, each with six FPGA modules. The FPGAs communicate through a PCIe switch on each backplane, and the backplanes communicate with one another through the PCIe root complex on the server’s motherboard. Inside each FPGA is a 4×4 single-concentration mesh network generated using OpenSoC Fabric, with the network ports divided among Z-scale RISC-V cores, the FPGA module’s 8 GB DRAM, and the off-chip network between the FPGA backplanes. The large-scale emulation system demonstrated here provides an early test bed and realistic proving ground for research on advanced hardware architectures, programming models, runtimes, and applications.
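The port budget described above can be worked through with a short Python sketch. The six FPGAs per backplane and the 4×4 single-concentration mesh come from the description; the number of backplanes and the split of ports among cores, DRAM, and the off-chip network are not stated here, so the values used below (two backplanes, 14 cores, one DRAM port, one off-chip port per FPGA) are hypothetical.

```python
FPGAS_PER_BACKPLANE = 6                      # from the system description
MESH_ROWS, MESH_COLS, CONCENTRATION = 4, 4, 1
PORTS_PER_FPGA = MESH_ROWS * MESH_COLS * CONCENTRATION


def emulated_cores(backplanes, cores_per_fpga, dram_ports=1, offchip_ports=1):
    """Total emulated cores, checking that the port split fills the mesh.

    The default of one DRAM port and one off-chip port is an assumption.
    """
    assigned = cores_per_fpga + dram_ports + offchip_ports
    if assigned != PORTS_PER_FPGA:
        raise ValueError(
            f"{assigned} ports assigned, but the mesh has {PORTS_PER_FPGA}"
        )
    return backplanes * FPGAS_PER_BACKPLANE * cores_per_fpga


# Hypothetical configuration: 2 backplanes, 14 cores per FPGA.
print(emulated_cores(backplanes=2, cores_per_fpga=14))
```

Under these assumed values the system would emulate 168 cores; a different port split or backplane count changes the total accordingly.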