5.6. Using Hammer To Place and Route a Custom Block
Important
In order to use the Hammer VLSI flow, you need access to Hammer tools and technology plugins. You can obtain these by emailing hammer-plugins-access@lists.berkeley.edu with a request stating which plugin(s) you would like access to. Make sure your email includes your GitHub ID and proof (through affiliation or otherwise) that you have licensed access to the relevant tools.
5.6.1. Initialize the Hammer Plug-ins
In the Chipyard root, ensure that you have the Chipyard conda environment activated. Then, depending on whether you are using a technology plugin included with Hammer (ASAP7, Sky130) or a separate plugin, run one of the two command sequences below.
For Hammer-provided plugins (<tech-plugin-name> is asap7 or sky130):
./scripts/init-vlsi.sh <tech-plugin-name>
For separate technology plugins (a typical use case for proprietary process technologies, which require NDAs and secure servers), submodule them directly into the vlsi directory with the name hammer-<tech-plugin-name>-plugin before calling the init-vlsi.sh script.
For example, for an imaginary process technology called tsmintel3:
cd vlsi
git submodule add git@my-secure-server.berkeley.edu:tsmintel3/hammer-tsmintel3-plugin.git
cd -
./scripts/init-vlsi.sh tsmintel3
If submoduled plugins need to be updated, call the upgrade-vlsi.sh script. This will check out and pull the latest master branch.
Note
Some VLSI EDA tools are supported only on RHEL-based operating systems. We recommend using Chipyard on RHEL7 and above. However, many VLSI servers still run old operating systems such as RHEL6, whose software packages are older than the basic Chipyard requirements. In order to build Chipyard on RHEL6, you will likely need to use tool packages such as devtoolset (for example, devtoolset-8) and/or build from source gcc, git, gmake, make, dtc, cc, bison, libexpat and liby.
5.6.2. Setting up the Hammer Configuration Files
The first configuration file that needs to be set up is the Hammer environment configuration file env.yml. In this file you need to set the paths to the EDA tools and license servers you will be using. You do not have to fill in every field in this configuration file; you only need to fill in the paths for the tools that you will be using.
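Before launching a flow, it can be useful to confirm that the installation paths entered in env.yml actually exist. The following is a minimal sketch and not part of Hammer: the key names (such as cadence.cadence_home) and the license string are assumptions chosen for illustration, and the settings are given as a plain dictionary rather than parsed from YAML so that the example has no dependencies.

```python
import os
import tempfile


def missing_tool_paths(env_settings):
    """Return the keys whose non-empty installation path does not exist.

    Empty values are skipped, since only the tools that are actually
    used need to be filled in.
    """
    missing = []
    for key, value in env_settings.items():
        # Assumption of this sketch: only keys ending in "_home" hold
        # installation directories; license entries are not checked.
        if not key.endswith("_home") or not value:
            continue
        if not os.path.isdir(value):
            missing.append(key)
    return sorted(missing)


# Hypothetical settings: one valid path, one wrong path, one unused tool.
existing_dir = tempfile.gettempdir()
example_env = {
    "cadence.cadence_home": existing_dir,
    "synopsys.synopsys_home": os.path.join(existing_dir, "hammer-sketch-missing-dir"),
    "mentor.mentor_home": "",
    "cadence.CDS_LIC_FILE": "5280@license-server.example.com",
}
print(missing_tool_paths(example_env))
```

A check like this only catches typos in paths; it says nothing about whether the tool version is appropriate for the process technology.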
If you are working within a shared server farm environment with an LSF cluster setup (for example, the Berkeley Wireless Research Center), please note the additional possible environment configuration listed in the Advanced Environment Setup segment of this documentation page.
Hammer relies on YAML-based configuration files. While these configurations can be consolidated within a single file (as is the case in the ASAP7 Tutorial and the Sky130 + OpenROAD Tutorial), the generally suggested way to work with an arbitrary process technology or tool plugin is to use three configuration files, matching the three Hammer concerns: tools, tech, and design.
The vlsi directory includes three such example configuration files matching the three concerns: example-tools.yml, example-tech.yml, and example-design.yml.
The example-tools.yml file configures which EDA tools Hammer will use. This example file uses Cadence Innovus, Genus and Voltus, Synopsys VCS, and Mentor Calibre (which are likely the tools you will use if you’re working in the Berkeley Wireless Research Center). Note that tool versions are highly sensitive to the process technology in use. Hence, tool versions that work with one process technology may not work with another.
The example-design.yml file contains basic build system information (how many cores/threads to use, etc.), as well as configurations that are specific to the design we are working on, such as clock signal name and frequency, power modes, floorplan, and additional constraints that we will add later on.
Finally, the example-tech.yml file is a template file for a process technology plugin configuration. We will copy this file, and replace its fields with the appropriate process technology details for the tech plugin that we have access to. For example, for the asap7 tech plugin, we will replace the <tech_name> field with “asap7” and fill in the path to the installation directory of the process technology files. The technology plugin (which for ASAP7 is within Hammer) will define the technology node and other parameters.
We recommend copying these example configuration files and customizing them under a different name, so you can have different configuration files for different process technologies and designs (e.g. create tech-tsmintel3.yml from example-tech.yml).
5.6.3. Building the Design
After we have set the configuration files, we will now elaborate our Chipyard Chisel design into Verilog, while also performing the required transformations in order to make the Verilog VLSI-friendly.
Additionally, we will automatically generate another set of Hammer configuration files matching this design, which will be used to configure the physical design tools.
We will do so by calling make buildfile with appropriate Chipyard configuration variables and Hammer configuration files.
As in the rest of the Chipyard flows, we specify our SoC configuration using the CONFIG make variable.
However, unlike the rest of the Chipyard flows, in the case of physical design we might be interested in working in a hierarchical fashion and therefore we would like to work on a single module.
Therefore, we can also specify a VLSI_TOP make variable with the name of a specific Verilog module (which should also match the name of the equivalent Chisel module) which we would like to work on.
The makefile will automatically call tools such as Tapeout-Tools and the MacroCompiler (Tapeout-Tools) in order to make the generated Verilog more VLSI friendly.
By default, the MacroCompiler will attempt to map memories into the SRAM options within the Hammer technology plugin. However, if you are working with a new process technology and prefer to work with flip-flop arrays, you can configure the MacroCompiler using the TOP_MACROCOMPILER_MODE make variable. For example, if your technology plugin does not have an SRAM compiler ready, you can use the TOP_MACROCOMPILER_MODE='--mode synflops' option (note that synthesizing a design with only flip-flops is very slow and will often fail to meet constraints).
We call the make buildfile command while also specifying the name of the process technology we are working with (the same tech_name used for the configuration files and the plugin name) and the configuration files we created. Note that in the ASAP7 Tutorial these configuration files are merged into a single file called example-asap7.yml.
Hence, if we want to monolithically place and route the entire SoC, the relevant command would be:
make buildfile CONFIG=<chipyard_config_name> tech_name=<tech_name> INPUT_CONFS="example-design.yml example-tools.yml example-tech.yml"
In a more typical scenario of working on a single module, for example the Gemmini accelerator within the GemminiRocketConfig Chipyard SoC configuration, the relevant command would be:
make buildfile CONFIG=GemminiRocketConfig VLSI_TOP=Gemmini tech_name=tsmintel3 INPUT_CONFS="example-design.yml example-tools.yml example-tech.yml"
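Since the same set of make variables recurs for every target in this flow, a small wrapper can help keep them consistent across invocations. The sketch below is a hypothetical helper (hammer_make_command is not part of Chipyard or Hammer) that assembles the argument list from the variables described above.

```python
import shlex


def hammer_make_command(target, config, tech_name, input_confs, vlsi_top=None):
    """Build the argument list for a VLSI make target such as buildfile."""
    args = ["make", target, f"CONFIG={config}"]
    if vlsi_top is not None:
        # Only set when working on a single module instead of the full SoC.
        args.append(f"VLSI_TOP={vlsi_top}")
    args.append(f"tech_name={tech_name}")
    # INPUT_CONFS is a single make variable holding a space-separated list.
    args.append("INPUT_CONFS=" + " ".join(input_confs))
    return args


confs = ["example-design.yml", "example-tools.yml", "example-tech.yml"]
cmd = hammer_make_command(
    "buildfile", "GemminiRocketConfig", "tsmintel3", confs, vlsi_top="Gemmini"
)
# shlex.join quotes the INPUT_CONFS argument so it can be pasted into a shell.
print(shlex.join(cmd))
```

The returned list can be handed directly to subprocess.run, which avoids shell quoting issues with the space-separated INPUT_CONFS value.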
5.6.4. Running the VLSI Flow
Running a basic VLSI flow using the Hammer default configurations is fairly simple, and consists of a simple make command with the previously mentioned Make variables.
5.6.4.1. Synthesis
In order to run synthesis, we run make syn
with the matching Make variables.
Post-synthesis logs and collateral will be saved in build/<config-name>/syn-rundir. The raw QoR data (area, timing, gate counts, etc.) will be found in build/<config-name>/syn-rundir/reports.
Hence, if we want to monolithically synthesize the entire SoC, the relevant command would be:
make syn CONFIG=<chipyard_config_name> tech_name=<tech_name> INPUT_CONFS="example-design.yml example-tools.yml example-tech.yml"
In a more typical scenario of working on a single module, for example the Gemmini accelerator within the GemminiRocketConfig Chipyard SoC configuration, the relevant command would be:
make syn CONFIG=GemminiRocketConfig VLSI_TOP=Gemmini tech_name=tsmintel3 INPUT_CONFS="example-design.yml example-tools.yml example-tech.yml"
It is worth checking the final-qor.rpt report to make sure that the synthesized design meets timing before moving to the place-and-route step.
5.6.4.2. Place-and-Route
In order to run place-and-route, we run make par
with the matching Make variables.
Post-PnR logs and collateral will be saved in build/<config-name>/par-rundir. Specifically, the resulting GDSII file will be in that directory with the suffix *.gds, and timing reports can be found in build/<config-name>/par-rundir/timingReports.
Place-and-route requires more design details than synthesis. For example, place-and-route requires some basic floorplanning constraints. The default example-design.yml configuration file template allows the tool (specifically, the Cadence Innovus tool) to use its automatic floorplanning capability within the top level of the design (ChipTop). However, if we choose to place-and-route a specific block which is not the SoC top level, we need to change the top-level path name to match the VLSI_TOP make parameter we are using.
Hence, if we want to monolithically place-and-route the entire SoC with the default tech plug-in parameters for power-straps and corners, the relevant command would be:
make par CONFIG=<chipyard_config_name> tech_name=<tech_name> INPUT_CONFS="example-design.yml example-tools.yml example-tech.yml"
In a more typical scenario of working on a single module, for example the Gemmini accelerator within the GemminiRocketConfig Chipyard SoC configuration, the floorplan constraint in example-design.yml would be changed so that its top-level path matches the module:
vlsi.inputs.placement_constraints:
  - path: "Gemmini"
    type: toplevel
    x: 0
    y: 0
    width: 300
    height: 300
    margins:
      left: 0
      right: 0
      top: 0
      bottom: 0
The relevant make
command would then be:
make par CONFIG=GemminiRocketConfig VLSI_TOP=Gemmini tech_name=tsmintel3 INPUT_CONFS="example-design.yml example-tools.yml example-tech.yml"
Note that the width and height specification can vary widely between different modules and levels of the module hierarchy. Make sure to set sane width and height values.
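One way to catch obviously unreasonable values early is a small sanity check on the constraint before starting a long place-and-route run. This is a minimal sketch, assuming the constraint has already been loaded into a Python dictionary with the same field names as the floorplan example; check_toplevel_constraint is a hypothetical helper, not a Hammer API.

```python
def check_toplevel_constraint(constraint):
    """Return a list of problems found in a toplevel placement constraint."""
    problems = []
    width, height = constraint["width"], constraint["height"]
    margins = constraint.get("margins", {})
    if width <= 0 or height <= 0:
        problems.append("width and height must be positive")
    # Margins are taken off each side, so they must leave some core area.
    if margins.get("left", 0) + margins.get("right", 0) >= width:
        problems.append("left + right margins consume the full width")
    if margins.get("top", 0) + margins.get("bottom", 0) >= height:
        problems.append("top + bottom margins consume the full height")
    return problems


gemmini = {
    "path": "Gemmini", "type": "toplevel", "x": 0, "y": 0,
    "width": 300, "height": 300,
    "margins": {"left": 0, "right": 0, "top": 0, "bottom": 0},
}
# A deliberately broken variant: the side margins exceed the block width.
too_tight = dict(gemmini, margins={"left": 200, "right": 150, "top": 0, "bottom": 0})

print(check_toplevel_constraint(gemmini))
print(check_toplevel_constraint(too_tight))
```

The check cannot tell whether the area is large enough for the logic; that still has to be judged from the synthesis area report.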
Place-and-route generally requires more fine-grained input specifications regarding power nets, clock nets, pin assignments and floorplanning. While the template configuration files fall back on the tools' automatic defaults, these will usually result in very poor QoR, and therefore it is recommended to specify better-informed floorplans, pin assignments and power nets. For more information about customizing these parameters, please refer to the Customizing Your VLSI Flow in Hammer sections or to the Hammer documentation.
Additionally, some Hammer process technology plugins do not provide default values for required settings such as tool paths and pin assignments (for example, ASAP7). In those cases, these constraints will need to be specified manually in the top-level configuration yml files, as is the case in the example-asap7.yml
configuration file.
Place-and-route tools are very sensitive to process technologies (significantly more sensitive than synthesis tools), and different process technologies may work only with specific tool versions. It is recommended to check which tool version is appropriate for the specific process technology you are working with.
Note
If you edit the yml configuration files in between synthesis and place-and-route, the make par
command will automatically re-run synthesis. If you would like to avoid that and are confident that your configuration file changes do not affect synthesis results, you may use the make redo-par
command instead with the variable HAMMER_EXTRA_ARGS='-p <your-changed.yml>'
.
5.6.4.3. Power Estimation
Power estimation in Hammer can be performed in one of two stages: post-synthesis (post-syn) or post-place-and-route (post-par). The most accurate power estimation is post-par, since it includes finer-grained details of the placed instances and wire lengths. Post-par power estimation can be based on static average signal toggle rates (also known as “static power estimation”), or on simulation-extracted signal toggle data (also known as “dynamic power estimation”).
Warning
In order to run post-par power estimation, make sure that a power estimation tool (such as Cadence Voltus) has been defined in your example-tools.yml
file. Make sure that the power estimation tool (for example, Cadence Voltus) version matches the physical design tool (for example, Cadence Innovus) version, otherwise you will encounter a database mismatch error.
Simulation-extracted power estimation often requires a dedicated test harness for the block under evaluation (DUT). While the Hammer flow supports such configurations (further details can be found in the Hammer documentation), Chipyard’s integrated flows support an automated full digital SoC simulation-extracted post-par power estimation through the integration of software RTL simulation flows with the Hammer VLSI flow. As such, full digital SoC simulation-extracted power estimation can be performed by specifying a simple binary executable with the associated make command.
make power-par BINARY=/path/to/baremetal/binary/rv64ui-p-addi.riscv CONFIG=<chipyard_config_name> tech_name=tsmintel3 INPUT_CONFS="example-design.yml example-tools.yml example-tech.yml"
The simulation-extracted power estimation flow implicitly uses Hammer’s gate-level simulation flow (in order to generate the saif activity data file). This gate-level simulation flow can also be run independently of the power estimation flow using the make sim-par command.
Note
The gate-level simulation flow (and therefore the simulation-extracted power estimation) is currently integrated only with the Synopsys VCS simulator (Verilator does not support gate-level simulation; support for Cadence Xcelium is a work in progress).
5.6.4.4. Signoff
During chip tapeout, you will need to perform signoff checks to make sure the generated GDSII can be fabricated as intended. This is done using dedicated signoff tools that perform design rule checking (DRC) and layout versus schematic (LVS) verification. In most cases, placed-and-routed designs will not pass DRC and LVS on the first attempts due to nuanced design rules and subtle/silent failures of the place-and-route tools. Passing DRC and LVS will often require adding manual placement constraints to “force” the EDA tools into certain patterns. If you have placed-and-routed a design with the goal of getting area and power estimates, DRC and LVS are not strictly necessary and the results will likely be quite similar. If you are intending to tape out and fabricate a chip, DRC and LVS are mandatory and will likely require multiple iterations of refining manual placement constraints. Having a large number of DRC/LVS violations can have a significant impact on the runtime of the place-and-route procedure (since the tools will try to fix each of them several times). A large number of DRC/LVS violations may also be an indication that the design is not necessarily realistic for this particular process technology, which may have power/area implications.
Since signoff checks are required only for a complete chip tapeout, they are currently not fully automated in Hammer, and often require some additional manual inclusion of custom Makefiles associated with specific process technologies. However, the general steps for running signoff within Hammer (under the assumption of a fully automated tech plug-in) are Make commands similar to the previous steps.
In order to run DRC, the relevant make
command is make drc
. As in the previous stages, the make command should be accompanied by the relevant configuration Make variables:
make drc CONFIG=GemminiRocketConfig VLSI_TOP=Gemmini tech_name=tsmintel3 INPUT_CONFS="example-design.yml example-tools.yml example-tech.yml"
DRC does not emit easily audited reports, as the names of the violated rules can be quite esoteric. It is often more productive to use the scripts generated by Hammer to open the DRC error database within the appropriate tool. These generated scripts can be called from ./build/<config-name>/drc-rundir/generated-scripts/view_drc.
In order to run LVS, the relevant make
command is make lvs
. As in the previous stages, the make command should be accompanied by the relevant configuration Make variables:
make lvs CONFIG=GemminiRocketConfig VLSI_TOP=Gemmini tech_name=tsmintel3 INPUT_CONFS="example-design.yml example-tools.yml example-tech.yml"
LVS does not emit easily audited reports, as the violations are often cryptic when read as text. As a result, it is often more productive to inspect the LVS issues visually, using the generated scripts that open the LVS error database within the appropriate tool. These generated scripts can be called from ./build/<config-name>/lvs-rundir/generated-scripts/view_lvs.
5.6.5. Customizing Your VLSI Flow in Hammer
5.6.5.1. Advanced Environment Setup
If you have access to a shared LSF cluster and you would like Hammer to submit its compute-intensive jobs to the LSF cluster rather than your login machine, you can add the following code segment to your env.yml file (completing the relevant values for the bsub binary path, the number of CPUs requested, and the requested LSF queue):
#submit command (use LSF)
vlsi.submit:
  command: "lsf"
  settings: [{"lsf": {
    "bsub_binary": "</path/to/bsub/binary/bsub>",
    "num_cpus": <N>,
    "queue": "<lsf_queue>",
    "extra_args": ["-R", "span[hosts=1]"]
    }
  }]
  settings_meta: "append"
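To make the role of each field concrete, the sketch below shows how settings of this shape could map onto a bsub command line. It is only an illustration and not Hammer's actual submission code, which may add further options: bsub_prefix is a hypothetical helper, and the binary path, CPU count and queue name are made-up values. The -q (queue), -n (number of job slots) and -R (resource requirement) options are standard bsub options.

```python
def bsub_prefix(lsf_settings):
    """Assemble an illustrative bsub command prefix from the LSF settings."""
    args = [lsf_settings["bsub_binary"]]
    args += ["-q", lsf_settings["queue"]]          # queue to submit to
    args += ["-n", str(lsf_settings["num_cpus"])]  # number of job slots
    # Extra arguments are passed through unchanged, e.g. to keep all
    # requested slots on a single host.
    args += list(lsf_settings.get("extra_args", []))
    return args


# Hypothetical values standing in for the placeholders in env.yml.
example_lsf = {
    "bsub_binary": "/opt/lsf/bin/bsub",
    "num_cpus": 8,
    "queue": "normal",
    "extra_args": ["-R", "span[hosts=1]"],
}
print(" ".join(bsub_prefix(example_lsf)))
```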
5.6.5.2. Composing a Hierarchical Design
For large designs, a monolithic VLSI flow may take the EDA tools a very long time to process and optimize, to the extent that it may sometimes not be feasible. Hammer supports a hierarchical physical design flow, which decomposes the design into several specified sub-components and runs the flow on each sub-component separately. Hammer is then able to assemble these blocks together into a top-level design. This hierarchical approach speeds up the VLSI flow for large designs, especially designs in which there may be multiple instantiations of the same sub-component (since the sub-component can simply be replicated in the layout). While hierarchical physical design can be performed in multiple ways (top-down, bottom-up, abutment, etc.), Hammer currently supports only the bottom-up approach. The bottom-up approach traverses a tree representing the hierarchy, starting from the leaves and moving towards the root (the “top level”), and runs the physical design flow on each node of the hierarchy tree using the previously laid-out child nodes. As nodes get closer to the root of the hierarchy, larger sections of the design get laid out.
The Hammer hierarchical flow relies on a manually specified description of the desired hierarchy tree. The hierarchy tree is specified using the instance names in the generated Verilog, which sometimes makes this specification challenging due to inconsistent instance names. Additionally, the specification of the hierarchy tree is intertwined with the manual specification of a floorplan for the design.
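The bottom-up order described above is a post-order traversal of the hierarchy tree. The following sketch illustrates that ordering for a small tree built from the module names used in this section; it is not Hammer's scheduling code, and Hammer may order independent siblings differently.

```python
def bottom_up_order(hierarchy, top):
    """Return modules in an order where children precede their parents."""
    order = []
    seen = set()

    def visit(module):
        if module in seen:
            # A module instantiated in several places is laid out once.
            return
        seen.add(module)
        for child in hierarchy.get(module, []):
            visit(child)
        # A node is appended only after all of its children.
        order.append(module)

    visit(top)
    return order


# Parent -> children, using the module names from this section.
hierarchy = {
    "ChipTop": ["RocketTile", "InclusiveCache"],
    "RocketTile": ["Gemmini"],
}
print(bottom_up_order(hierarchy, "ChipTop"))
```

In this ordering Gemmini is processed before RocketTile, and ChipTop is always last, since it depends on every other block.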
For example, if we choose to specify the previously mentioned GemminiRocketConfig configuration in a hierarchical fashion in which the Gemmini accelerator and the last-level cache are run separately from the top-level SoC, we would replace the floorplan example in example-design.yml from the Place-and-Route section with the following specification:
vlsi.inputs.hierarchical.top_module: "ChipTop"
vlsi.inputs.hierarchical.mode: "manual"
vlsi.inputs.hierarchical.manual_modules:
  - ChipTop:
    - RocketTile
    - InclusiveCache
  - RocketTile:
    - Gemmini
vlsi.inputs.hierarchical.manual_placement_constraints:
  - ChipTop:
    - path: "ChipTop"
      type: toplevel
      x: 0
      y: 0
      width: 500
      height: 500
      margins:
        left: 0
        right: 0
        top: 0
        bottom: 0
  - RocketTile:
    - path: "chiptop.system.tile_prci_domain.tile"
      type: hierarchical
      master: ChipTop
      x: 0
      y: 0
      width: 250
      height: 250
      margins:
        left: 0
        right: 0
        top: 0
        bottom: 0
  - Gemmini:
    - path: "chiptop.system.tile_prci_domain.tile.gemmini"
      type: hierarchical
      master: RocketTile
      x: 0
      y: 0
      width: 200
      height: 200
      margins:
        left: 0
        right: 0
        top: 0
        bottom: 0
  - InclusiveCache:
    - path: "chiptop.system.subsystem_l2_wrapper.l2"
      type: hierarchical
      master: ChipTop
      x: 0
      y: 0
      width: 100
      height: 100
      margins:
        left: 0
        right: 0
        top: 0
        bottom: 0
In this specification, vlsi.inputs.hierarchical.mode
indicates the manual specification of the hierarchy tree (which is the only mode currently supported by Hammer), vlsi.inputs.hierarchical.top_module
sets the root of the hierarchical tree, vlsi.inputs.hierarchical.manual_modules
enumerates the tree of hierarchical modules, and vlsi.inputs.hierarchical.manual_placement_constraints
enumerates the floorplan for each module.
For more information about the Hammer hierarchical flow and specifying the hierarchy and constraints, visit the Hammer documentation.
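Because the module tree and the floorplan are specified separately, it is easy for the two to drift apart. The sketch below shows one possible consistency check, assuming the two settings have been loaded into Python lists with the same nesting as the YAML; check_hierarchy and box are hypothetical helpers, not Hammer APIs. Treating the first constraint of each module as its own size is also an assumption of this sketch.

```python
def box(path, kind, width, height):
    """Build a constraint with only the fields this sketch inspects."""
    return {"path": path, "type": kind, "x": 0, "y": 0,
            "width": width, "height": height}


def module_sizes(placement_constraints):
    """Map each module to the (width, height) of its first constraint."""
    sizes = {}
    for entry in placement_constraints:
        for module, constraints in entry.items():
            sizes[module] = (constraints[0]["width"], constraints[0]["height"])
    return sizes


def check_hierarchy(manual_modules, placement_constraints):
    """Report modules without a floorplan and children larger than parents."""
    sizes = module_sizes(placement_constraints)
    problems = []
    for entry in manual_modules:
        for parent, children in entry.items():
            for module in [parent] + list(children):
                if module not in sizes:
                    problems.append(f"{module}: no placement constraints")
            for child in children:
                if parent in sizes and child in sizes:
                    (pw, ph), (cw, ch) = sizes[parent], sizes[child]
                    if cw > pw or ch > ph:
                        problems.append(f"{child}: larger than parent {parent}")
    # Remove duplicates while keeping the order of first appearance.
    return list(dict.fromkeys(problems))


manual_modules = [
    {"ChipTop": ["RocketTile", "InclusiveCache"]},
    {"RocketTile": ["Gemmini"]},
]
constraints = [
    {"ChipTop": [box("ChipTop", "toplevel", 500, 500)]},
    {"RocketTile": [box("chiptop.system.tile_prci_domain.tile", "hierarchical", 250, 250)]},
    {"Gemmini": [box("chiptop.system.tile_prci_domain.tile.gemmini", "hierarchical", 200, 200)]},
    {"InclusiveCache": [box("chiptop.system.subsystem_l2_wrapper.l2", "hierarchical", 100, 100)]},
]
print(check_hierarchy(manual_modules, constraints))
```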
Note
You must define the physical hierarchy BEFORE running the make buildfile target. This is because Hammer encodes its hierarchical flow graph in a generated Makefile in $(OBJ_DIR)/hammer.d. If you modify your physical hierarchy, you must wipe and regenerate this Makefile. Finally, you must always override the VLSI_TOP variable to be the hierarchical block that you are working on. This is required for hierarchical simulation and power flows.
5.6.5.3. Customizing Generated Tcl Scripts
The example-vlsi Python script is the Hammer entry script with placeholders for hooks. Hooks are additional snippets of Python and TCL (via x.append()) that extend the Hammer APIs. Hooks can be inserted using the make_pre/post/replacement_hook methods, as shown in the example-vlsi entry script. In this particular example, a list of hooks is passed in the get_extra_par_hooks function in the ExampleDriver class. Refer to the Hammer documentation on hooks for a detailed description of how these are injected into the VLSI flow.
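The general shape of a hook body can be sketched without Hammer installed by using a stand-in for the tool object. In the example below, StubTool is a hypothetical class that only records what is appended, and example_report_hook is an illustrative hook name; in a real flow the function would receive the Hammer tool instance and be registered through the hook methods mentioned above. Returning a boolean success flag is an assumption here and should be checked against the Hammer documentation.

```python
class StubTool:
    """Stand-in for a Hammer tool object; it only records appended TCL."""

    def __init__(self):
        self.tcl_lines = []

    def append(self, tcl):
        # Collects TCL text in the order it would be written to the script.
        self.tcl_lines.append(tcl)


def example_report_hook(x):
    """Hook body: add TCL to the generated script via x.append()."""
    x.append("# TCL inserted by example_report_hook")
    x.append('puts "custom hook reached"')
    return True  # assumed convention: signal that the hook succeeded


tool = StubTool()
ok = example_report_hook(tool)
print(ok)
print("\n".join(tool.tcl_lines))
```

Keeping the hook body as a plain function of the tool object makes it straightforward to exercise in isolation like this before running it inside a full place-and-route job.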