
Design for Testability: Scan Design Implementation and Enhancement
Technology mapping is where your abstract, optimized Boolean network is methodically covered by the actual cells in your target library. At this stage, synthesis tools convert each node (AND, OR, MUX, adder, etc.) into one or more library primitives — choosing among cell variants, complex macros, and buffer trees to satisfy timing, area, and power budgets.
1. Boolean Network Preparation and Levelization
After elaboration and pre‑mapping optimizations, synthesis represents your design as a Directed Acyclic Graph (DAG) of logic nodes — ANDs, ORs, XORs, MUXes, adders, etc. Each node has fanins/fanouts and no physical characteristics attached yet.
The mapping engine’s job is to cover this DAG with mapping patterns. A mapping pattern is a small subgraph shape that corresponds to one or more library cells.
Key Steps:
- Levelization: Assign each node a level equal to the longest path distance from any primary input. This guides topological traversal and ensures timing calculation order.
- Cone Identification: For reconvergent logic, identify cones (subgraphs) that share fan‑in to avoid exponential duplication during cut enumeration.
- Cut Enumeration Parameters: Define maximum cut size (number of inputs) and depth to bound computational complexity. Typical values: 4–6 inputs.
For example, y = ~(a & b)
can be covered either by:
- Two cells: AND2_X1 + INV_X1
- One cell: NAND2_X1
Mapping picks the combination that minimizes your chosen cost (delay, area, or power) while respecting constraints.
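The levelization step listed under Key Steps can be sketched as a topological traversal. This is a minimal illustration, not how any particular synthesis tool implements it; the `levelize` function, the dictionary-based network format, and the node names are all assumptions made for the example.

```python
from collections import deque

def levelize(fanins):
    """Return {node: level}. Nodes without fanins (primary inputs) get level 0.

    `fanins` maps every node name to the list of nodes that drive it
    (a hypothetical format chosen for this sketch).
    """
    fanouts = {n: [] for n in fanins}
    pending = {n: len(set(srcs)) for n, srcs in fanins.items()}
    for node, srcs in fanins.items():
        for s in set(srcs):
            fanouts[s].append(node)

    level = {n: 0 for n in fanins}
    ready = deque(n for n, count in pending.items() if count == 0)
    visited = 0
    while ready:
        n = ready.popleft()
        visited += 1
        for out in fanouts[n]:
            # Longest-path rule: a node sits one level above its deepest fanin.
            level[out] = max(level[out], level[n] + 1)
            pending[out] -= 1
            if pending[out] == 0:
                ready.append(out)
    if visited != len(fanins):
        raise ValueError("network contains a cycle; expected a DAG")
    return level

# Illustrative network: y depends on n2 and a, n2 on n1 and c, n1 on a and b.
network = {
    "a": [], "b": [], "c": [],
    "n1": ["a", "b"],
    "n2": ["n1", "c"],
    "y": ["n2", "a"],
}
levels = levelize(network)
print(levels)
```

Processing nodes in increasing level order guarantees that every fanin has been visited before the node itself, which is the property arrival-time propagation relies on.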
2. Cut Enumeration & Pattern Matching Algorithms
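Cut Enumeration: For each node, the mapper enumerates all k-input cuts: sets of at most k leaf nodes such that every path from a primary input to the node passes through a leaf, so the node's function can be expressed entirely in terms of those leaves. Tools use: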
- Recursive Traversal with memoization to avoid re‑computing cuts for shared subgraphs.
- Batched Enumeration to prune dominated cuts early (if one cut’s cost always exceeds another’s across all metrics).
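A minimal sketch of memoized cut enumeration is shown here. It is an illustration under assumptions, not the algorithm of any specific tool: the `FANINS` network, the limit `K = 3`, and the function names are hypothetical, and dominance is approximated by the common set-inclusion rule (a cut is dropped when another cut is a strict subset of it) rather than by the multi-metric cost comparison described above.

```python
from functools import lru_cache
from itertools import product

# Hypothetical network: node -> tuple of fanin nodes (empty for primary inputs).
FANINS = {
    "a": (), "b": (), "c": (), "d": (),
    "x": ("b", "c"),
    "w": ("x", "d"),
    "o": ("w", "a"),
}
K = 3  # assumed maximum cut size for this example

def prune_dominated(cuts):
    """Drop any cut that strictly contains another cut (set-inclusion dominance)."""
    return {c for c in cuts if not any(other < c for other in cuts)}

@lru_cache(maxsize=None)  # memoization: shared subgraphs are enumerated once
def enumerate_cuts(node):
    trivial = frozenset([node])
    srcs = FANINS[node]
    if not srcs:
        return frozenset([trivial])
    merged = set()
    # Combine one cut from each fanin; keep the union if it has at most K leaves.
    for combo in product(*(enumerate_cuts(s) for s in srcs)):
        leaves = frozenset().union(*combo)
        if len(leaves) <= K:
            merged.add(leaves)
    return frozenset(prune_dominated(merged) | {trivial})

cuts_o = enumerate_cuts("o")
for cut in sorted(sorted(c) for c in cuts_o):
    print(cut)
```

With `K = 3`, the four-leaf cut {a, b, c, d} of node `o` is discarded, while {a, d, x} and {a, w} survive alongside the trivial cut {o}.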
Pattern Matching: Each cut is tested against library cell signatures and multi‑level macros. Matching engines in Genus/DC maintain a pre‑built pattern database:
- Atomic Patterns: Single‑level gates (INV, NAND2, NOR3).
- Multi‑level Macros: AOI/OAI gates (AOI21, OAI33), MUXes, small arithmetic units.
- Custom Macros: User‑defined complex patterns via pattern files.
Example: Pattern Matching in Technology Mapping
The leftmost column shows a Boolean network built from logic gates like AND, INV, and AOI. This is converted into a subject graph (middle) representing logic dependencies and cut candidates.
Each vertex (node) in the subject graph is evaluated for possible matches from the standard cell library, shown in the “Match” column. For example:
- Vertex x matches a NAND2(b,c) using match pattern t2.
- Vertex o can be implemented in two ways:
‣ Using basic gates (3 NAND2 + 2 INV), or
‣ As a single AOI21(x,d,a) macro (t6B), which is more efficient.
The Gate column lists the mapped standard cells, and the Cost column estimates implementation effort in terms of number of gates.
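The comparison between the two covers of vertex o can be sketched as a small dynamic program over the subject graph, with cost measured in gate count as in the Cost column. The intermediate vertices `y`, `ia`, and `t` and the `MATCHES` table are assumptions introduced for this sketch (one plausible NAND2/INV decomposition of the AOI21 function); they are not taken from the original figure.

```python
from functools import lru_cache

# vertex -> list of (gate, input vertices). Vertices absent from the table
# are treated as primary inputs. The decomposition is hypothetical.
MATCHES = {
    "x":  [("NAND2", ("b", "c"))],
    "y":  [("NAND2", ("x", "d"))],
    "ia": [("INV", ("a",))],
    "t":  [("NAND2", ("y", "ia"))],
    "o":  [("INV", ("t",)), ("AOI21", ("x", "d", "a"))],
}

@lru_cache(maxsize=None)
def best_cover(vertex):
    """Return (gate_count, gate, inputs) for the cheapest cover rooted at vertex."""
    if vertex not in MATCHES:          # primary input: nothing to implement
        return (0, None, ())
    options = []
    for gate, inputs in MATCHES[vertex]:
        # One gate for this match plus the best cover of each of its inputs.
        cost = 1 + sum(best_cover(v)[0] for v in inputs)
        options.append((cost, gate, inputs))
    return min(options)

def collect_gates(vertex):
    """List the gates of the chosen cover, root first."""
    cost, gate, inputs = best_cover(vertex)
    if gate is None:
        return []
    gates = [f"{gate}({','.join(inputs)})"]
    for v in inputs:
        gates += collect_gates(v)
    return gates

total_cost = best_cover("o")[0]
chosen = collect_gates("o")
print(total_cost, chosen)
```

Under these assumptions the all-NAND2/INV cover needs five gates, whereas the AOI21 match needs two (the AOI21 itself plus the NAND2 for x), so the macro wins. The simple sum assumes a tree; in a DAG with shared vertices it would count shared logic more than once.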
3. Drive‑Strength Selection & Buffer Tree Synthesis
After function coverage, fan‑out-driven buffering and cell upsizing are required:
Load Estimation: For each mapped cell output, estimate the downstream net capacitance from the Liberty pin capacitance attributes of the driven inputs and from wire-load models.
Use heuristics or threshold-based decisions:
- Upsizing a single cell (e.g., INV_X1 → INV_X4) if it meets both drive and area constraints.
- Buffer Trees when a single cell cannot handle high fanout or when wire slew constraints demand distributed buffering.
Cadence Genus will then generate a minimal buffer tree or upsize cells, re-evaluate slew constraints, and iterate.
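The threshold-based decision between upsizing and buffering can be sketched as follows. All numbers (maximum loads, areas, pin and wire capacitances) and the names `INV_VARIANTS`, `estimate_load`, and `choose_drive` are made up for illustration; real values come from the Liberty library, and real tools also weigh slew, area, and placement, which this sketch ignores.

```python
import math

# Hypothetical inverter variants, sorted from weakest to strongest:
# (cell name, maximum load it can drive in fF, area in arbitrary units)
INV_VARIANTS = [
    ("INV_X1", 10.0, 1.0),
    ("INV_X2", 20.0, 1.5),
    ("INV_X4", 40.0, 2.5),
]
BUF_MAX_LOAD_FF = 40.0  # assumed load limit for a single buffer

def estimate_load(pin_caps_ff, wire_cap_ff):
    """Total load = sum of driven input-pin capacitances + estimated wire capacitance."""
    return sum(pin_caps_ff) + wire_cap_ff

def choose_drive(load_ff):
    """Pick the smallest variant that can drive the load, else fall back to buffering."""
    for name, max_load, _area in INV_VARIANTS:
        if load_ff <= max_load:
            return ("single_cell", name)
    # Simplification: one level of buffers, each taking an equal share of the load.
    return ("buffer_tree", math.ceil(load_ff / BUF_MAX_LOAD_FF))

light = choose_drive(estimate_load([2.0] * 6, wire_cap_ff=3.0))     # 15 fF
heavy = choose_drive(estimate_load([2.0] * 40, wire_cap_ff=20.0))   # 100 fF
print(light, heavy)
```

In this toy setup a 15 fF load is handled by upsizing to INV_X2, while a 100 fF load exceeds every variant and is split across three buffers.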
4. Macro Exploitation & Arithmetic Cell Mapping
Libraries often include highly optimized macros for arithmetic and multiplexing:
- Adders: CLA4_X1, ADDH_X2, multi-bit blocks
- Multipliers: MUL2_X2, Wallace tree macros
- Comparators & Shifters: CMP_EQ, SHF_L32
Macro Matching Flow:
- Pattern Recognition: Identify arithmetic patterns in the Boolean network (e.g., carry propagate chains).
- Macro Preference: Adjust the mapping engine to prefer macros via cost weighting.
- Structural Replacement: Entire subgraphs are replaced with single macro instances, which can reduce logic depth and power.
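One way to picture macro preference is as a multiplier applied to a candidate's cost before candidates are compared. The sketch below is purely illustrative: the area figures, the `MACRO_WEIGHT` value, and the `pick_implementation` helper are assumptions, and actual tools expose this kind of preference through their own settings.

```python
MACRO_WEIGHT = 0.8  # assumed bias: a macro's cost counts as 80% of its raw value

def weighted_cost(candidate):
    """Scale the raw cost of macro candidates so the mapper leans toward them."""
    weight = MACRO_WEIGHT if candidate["is_macro"] else 1.0
    return candidate["area"] * weight

def pick_implementation(candidates):
    """Return the candidate with the lowest weighted cost."""
    return min(candidates, key=weighted_cost)

# Two hypothetical covers of the same 4-bit adder subgraph.
candidates = [
    {"name": "discrete_gates", "area": 30.0, "is_macro": False},
    {"name": "CLA4_X1",        "area": 32.0, "is_macro": True},
]
picked = pick_implementation(candidates)
print(picked["name"])
```

Here the macro has a slightly larger raw area, but the weighting tips the decision in its favor; with `MACRO_WEIGHT = 1.0` the discrete implementation would be chosen instead.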
5. Timing‑Driven Remapping and STA Integration
Mapping is inherently timing-driven. After initial mapping:
- Write out a provisional netlist (.v).
- Invoke STA (e.g., Tempus or PrimeTime) with the same .sdc constraints.
- Identify critical paths with negative or marginal slack.
- Selective remapping: remap the critical cones into faster cells.
- Repeat map → STA until timing goals are met or no further improvement is possible.
This iterative loop ensures your final map meets both local and global timing requirements.
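The control flow of the map → STA loop can be sketched with a toy model. Nothing here calls a real STA engine: `run_sta` simply subtracts path delay from an assumed clock period, and `remap_critical` models "faster cells" as a fixed 10% delay reduction with an assumed lower bound. The point is only the loop structure and its two exit conditions.

```python
CLOCK_PERIOD_NS = 2.0  # assumed constraint for the example

def run_sta(path_delays):
    """Toy STA: slack = clock period - path delay."""
    return {path: CLOCK_PERIOD_NS - delay for path, delay in path_delays.items()}

def remap_critical(path_delays, slacks, speedup=0.9, floor_ns=1.9):
    """Toy remap: speed up violating paths only, never below an assumed floor."""
    remapped = dict(path_delays)
    for path, slack in slacks.items():
        if slack < 0:
            remapped[path] = max(floor_ns, path_delays[path] * speedup)
    return remapped

def timing_closure(path_delays, max_iters=10):
    """Return (final delays, number of STA runs, whether timing was met)."""
    for sta_runs in range(1, max_iters + 1):
        slacks = run_sta(path_delays)
        if min(slacks.values()) >= 0:
            return path_delays, sta_runs, True       # timing goals met
        remapped = remap_critical(path_delays, slacks)
        if remapped == path_delays:
            return path_delays, sta_runs, False      # no further improvement
        path_delays = remapped
    return path_delays, max_iters, False

final, sta_runs, met = timing_closure({"p1": 2.4, "p2": 1.5})
print(final, sta_runs, met)
```

Only the violating path `p1` is touched on each pass; `p2` already has positive slack and keeps its original delay, which mirrors the selective nature of the remapping step.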