Chapter 11: Advanced Core Semantics

Deep Dive: Fixed-Point Datatypes (sc_fixed) and Quantization

Master the IEEE 1666-2023 fixed-point types, exploring quantization modes, overflow mechanics, and the underlying math of sc_fixed and sc_ufixed.

How to Read This Lesson

These core semantics are where experienced SystemC engineers earn their calm. We will name the rule from the standard, then show how the source enforces it.


In hardware design, floating-point arithmetic (like C++ float or double) is typically avoided due to massive silicon area, high power consumption, and long propagation delays. Instead, designers rely on Fixed-Point Arithmetic. The IEEE 1666 standard provides dedicated classes for this: sc_fixed (signed) and sc_ufixed (unsigned), along with their fast equivalents sc_fixed_fast and sc_ufixed_fast.

This tutorial breaks down the anatomy of a fixed-point number, explores the detailed quantization (sc_q_mode) and overflow (sc_o_mode) mechanics, and examines the Accellera source code to understand the performance overhead of these types.

Source and LRM Trail

Advanced core behavior should always be checked against Docs/LRMs/SystemC_LRM_1666-2023.pdf before source details. For implementation, read .codex-src/systemc/src/sysc/kernel and .codex-src/systemc/src/sysc/communication, especially the scheduler, events, object hierarchy, writer policy, report handler, and async update path.

Anatomy of a Fixed-Point Type

To use a fixed-point type, you must define its geometry:

sc_fixed<wl, iwl, q_mode, o_mode, n_bits>
  • wl (Word Length): Total number of bits.
  • iwl (Integer Word Length): Number of bits located to the left of the binary point (including the sign bit for signed types).
  • q_mode (Quantization Mode): How to handle bits that are discarded on the right (fractional bits) when casting to a type with fewer fractional bits.
  • o_mode (Overflow Mode): How to handle bits that overflow on the left (integer bits) when casting to a type with fewer integer bits.
  • n_bits: Number of saturated bits. It defaults to 0 and is only used by the wrap-around overflow modes (SC_WRAP and SC_WRAP_SM).

The number of fractional bits is simply wl - iwl. Note that iwl can be greater than wl (implying trailing zeros) or negative (implying leading fractional zeros), though typical use cases have 0 < iwl <= wl.
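
The geometry above fixes the representable range and resolution of a signed format. The following is a minimal sketch in plain C++ (not SystemC); FixedRange and signed_fixed_range are hypothetical names introduced only for illustration.

```cpp
#include <cmath>

// Hypothetical helper type: value range of a signed two's-complement
// format with wl total bits and iwl integer bits.
struct FixedRange {
    double min;  // most negative representable value
    double max;  // most positive representable value
    double lsb;  // weight of the least significant bit (resolution)
};

// Assumes wl > 0. iwl may be negative or larger than wl.
inline FixedRange signed_fixed_range(int wl, int iwl) {
    const double lsb = std::ldexp(1.0, iwl - wl);       // 2^-(wl - iwl)
    const double min = -std::ldexp(1.0, iwl - 1);       // -2^(iwl - 1)
    const double max = std::ldexp(1.0, iwl - 1) - lsb;  // one LSB below 2^(iwl - 1)
    return {min, max, lsb};
}
```

For wl = 8 and iwl = 4 this gives a range of -8.0 to 7.9375 in steps of 0.0625.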

Source Code Mechanics: sc_fxnum vs sc_fxval

If you read the Accellera source code in sysc/datatypes/fx/, you will see that sc_fixed is merely a template wrapper around the base class sc_fxnum.

When you perform arithmetic on an sc_fixed, the library does not operate directly on the bit array. Instead, the operands are converted into an intermediate representation called sc_fxval.

  • sc_fxval holds a pointer (m_rep) to a dynamically allocated representation object, which stores an arbitrary-precision mantissa as an array of 32-bit words together with the position of the binary point.
  • The arithmetic operation (addition, multiplication) is performed in this high-precision sc_fxval space.
  • The result is then cast back into the target sc_fxnum. During this cast, the target's sc_fxtype_params are applied, which triggers the Quantization (sc_q_mode) and Overflow (sc_o_mode) logic.
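
The compute-then-cast flow in the steps above can be modelled in plain C++. This is only a sketch, not the library code: a double stands in for the high-precision intermediate, the cast hard-codes truncation and wrap-around (the default modes), and cast_trn_wrap and fixed_mul are hypothetical names. It assumes a signed target with 0 < wl < 63.

```cpp
#include <cmath>
#include <cstdint>

// Cast a value into a signed format <wl, iwl> using truncation
// (like SC_TRN) and two's-complement wrap-around (like SC_WRAP).
inline double cast_trn_wrap(double value, int wl, int iwl) {
    const int fwl = wl - iwl;  // number of fractional bits
    // Quantization: scale so the target LSB has weight 1, then floor.
    std::int64_t raw =
        static_cast<std::int64_t>(std::floor(std::ldexp(value, fwl)));
    // Overflow: keep the low wl bits and reinterpret them as signed.
    const std::int64_t modulus = std::int64_t{1} << wl;
    raw = ((raw % modulus) + modulus) % modulus;
    if (raw >= modulus / 2) raw -= modulus;
    return std::ldexp(static_cast<double>(raw), -fwl);
}

// Compute at full precision first, then cast into the target format.
inline double fixed_mul(double a, double b, int wl, int iwl) {
    const double exact = a * b;  // stand-in for the high-precision result
    return cast_trn_wrap(exact, wl, iwl);
}
```

With a <6, 4> target, 1.5 * 1.75 = 2.625 is truncated to 2.5. With a <5, 3> target, 3.5 * 3.5 = 12.25 wraps around to -3.75.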

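Furthermore, every sc_fxnum contains a pointer to an sc_fxnum_observer. This hook lets external code be notified when a fixed-point object is constructed, destroyed, read, or written; in the Accellera implementation the callbacks are compiled in only when SC_ENABLE_OBSERVERS is defined. The pointer itself adds memory overhead to every single fixed-point variable.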

Quantization Modes (sc_q_mode)

When you assign a highly precise number to a less precise fixed-point variable, you lose fractional bits. Quantization defines how that loss is handled in the sc_fxval to sc_fxnum cast:

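  1. SC_TRN (Truncation): Default. Simply discards the extra bits, which in two's complement rounds towards negative infinity.
  2. SC_TRN_ZERO: Truncates towards zero.
  3. SC_RND (Round to positive infinity): Rounds to the nearest representable value by adding half an LSB and truncating. Ties go towards positive infinity.
  4. SC_RND_ZERO: Rounds to the nearest representable value. Ties go towards zero.
  5. SC_RND_MIN_INF: Rounds to the nearest representable value. Ties go towards negative infinity.
  6. SC_RND_INF: Rounds to the nearest representable value. Ties go away from zero.
  7. SC_RND_CONV (Convergent Rounding / Banker's Rounding): Rounds to the nearest representable value. Ties go to the even neighbour (the one whose kept LSB is 0). Minimizes statistical bias in DSP algorithms.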
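
The seven modes listed above differ only in how the discarded fraction is resolved. The sketch below models each one in plain C++ on a double, assuming the value and the scaled intermediate are exactly representable. QMode and quantize are hypothetical names, not the SystemC API.

```cpp
#include <cmath>

// Hypothetical enum mirroring the seven quantization modes by name.
enum class QMode { Trn, TrnZero, Rnd, RndZero, RndMinInf, RndInf, RndConv };

// Quantize `value` to `fwl` fractional bits and return it as a double.
inline double quantize(double value, int fwl, QMode mode) {
    const double s = std::ldexp(value, fwl);  // scale: target LSB has weight 1
    const double lo = std::floor(s);
    const double frac = s - lo;               // discarded part, in [0, 1)
    double q = lo;
    switch (mode) {
    case QMode::Trn:       q = lo; break;                    // toward -inf
    case QMode::TrnZero:   q = std::trunc(s); break;         // toward zero
    case QMode::Rnd:       q = std::floor(s + 0.5); break;   // ties toward +inf
    case QMode::RndMinInf: q = std::ceil(s - 0.5); break;    // ties toward -inf
    case QMode::RndInf:    q = std::round(s); break;         // ties away from zero
    case QMode::RndZero:                                     // ties toward zero
        q = (s >= 0.0) ? std::ceil(s - 0.5) : std::floor(s + 0.5);
        break;
    case QMode::RndConv:                                     // ties to even
        if (frac > 0.5)      q = lo + 1.0;
        else if (frac < 0.5) q = lo;
        else                 q = (std::fmod(lo, 2.0) == 0.0) ? lo : lo + 1.0;
        break;
    }
    return std::ldexp(q, -fwl);
}
```

The modes only disagree on negative values and exact ties. For example, with two fractional bits, -2.6875 becomes -2.75 under Trn but -2.5 under TrnZero, and the tie 2.625 becomes 2.75 under Rnd but 2.5 under RndConv.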

Overflow Modes (sc_o_mode)

When an assigned value falls outside the representable range, above the maximum or below the minimum (integer bits are lost), overflow handling determines the outcome:

  1. SC_WRAP (Wrap-around): Default. The bits simply roll over, ignoring the lost MSBs.
  2. SC_WRAP_SM: Wrap-around with Sign Magnitude representation.
  3. SC_SAT (Saturation): Clips to the maximum positive or negative representable value. Crucial for DSP (e.g., a loud positive audio sample is clipped at full scale instead of flipping to a loud negative one).
  4. SC_SAT_ZERO: Clips to zero on overflow.
  5. SC_SAT_SYM: Symmetrical saturation. (e.g., if max is 7, min is -7 instead of -8).
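
Four of the modes above can be sketched on a raw integer that counts LSBs. This is a plain C++ model, not the library code. OMode and handle_overflow are hypothetical names, SC_WRAP_SM is left out, and the symmetric mode is assumed to treat the range as [-max, max], matching the 7 / -7 example in the list.

```cpp
#include <cstdint>

// Hypothetical enum mirroring four of the overflow modes by name.
enum class OMode { Wrap, Sat, SatZero, SatSym };

// Apply overflow handling to `raw` (a signed count of LSBs) for a signed
// target that is `wl` bits wide. Assumes 1 < wl < 63.
inline std::int64_t handle_overflow(std::int64_t raw, int wl, OMode mode) {
    const std::int64_t max = (std::int64_t{1} << (wl - 1)) - 1;
    // Assumption: symmetric saturation excludes the most negative code.
    const std::int64_t min = (mode == OMode::SatSym) ? -max : -max - 1;
    if (raw >= min && raw <= max) return raw;  // in range: unchanged
    switch (mode) {
    case OMode::Sat:
    case OMode::SatSym:
        return raw > max ? max : min;          // clip to the nearest bound
    case OMode::SatZero:
        return 0;                              // any overflow becomes zero
    case OMode::Wrap: {
        // Keep the low wl bits and reinterpret them as two's complement.
        const std::int64_t modulus = std::int64_t{1} << wl;
        const std::int64_t r = ((raw % modulus) + modulus) % modulus;
        return r >= modulus / 2 ? r - modulus : r;
    }
    }
    return raw;  // not reached
}
```

As a usage example, the value 14 with two fractional bits is the raw integer 56. In a 5-bit target, Wrap yields -8 (that is, -2.0) and Sat yields 15 (that is, 3.75).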

The Fast Types (sc_fixed_fast)

The standard types (sc_fixed, sc_ufixed) use arbitrary-precision arithmetic internally (sc_fxval), which relies on heap allocations and loops. The fast types sc_fixed_fast and sc_ufixed_fast are limited to 53 bits of precision (the mantissa size of a standard double). If every operand and every intermediate result, such as a full-width product, fits in 53 bits, consider using them instead.

In the Accellera implementation, sc_fixed_fast derives from sc_fxnum_fast. Arithmetic operations convert operands to sc_fxval_fast, which is backed directly by a native C++ double (m_val). This bypasses all array allocations and delegates the math directly to your CPU's FPU, which can speed up simulation considerably. The final assignment cast still applies the target's quantization and overflow modes, but results are bit-accurate only while values stay within the 53-bit limit.
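
The 53-bit limit comes from the double itself, which a small plain C++ check can demonstrate. This is a sketch assuming an IEEE 754 double; survives_double is a hypothetical helper name.

```cpp
#include <cstdint>

// Returns true if the integer survives a round trip through double unchanged.
// Intended for magnitudes well below 2^63 so the cast back cannot overflow.
inline bool survives_double(std::int64_t v) {
    const double d = static_cast<double>(v);  // rounds if v needs > 53 bits
    return static_cast<std::int64_t>(d) == v;
}
```

The value 2^53 survives the round trip, while 2^53 + 1 does not. A double-backed fixed-point value silently loses low-order bits in the same way once a result needs more than 53 bits.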

End-to-End Example: DSP Accumulator

Here is a complete sc_main example that demonstrates how truncation and saturation affect signal processing values.

#define SC_INCLUDE_FX // Required to include fixed-point headers
#include <systemc>
#include <iostream>
#include <iomanip>
 
int sc_main(int argc, char* argv[]) {
    // Suppress SystemC deprecation warnings
    sc_core::sc_report_handler::set_actions("/IEEE_Std_1666/deprecated", sc_core::SC_DO_NOTHING);
 
    std::cout << "--- SystemC Fixed-Point Tutorial ---" << std::endl;
 
    // 1. Basic Declaration
    // wl = 8, iwl = 4 -> 4 integer bits, 4 fractional bits.
    // Signed type, so range is [-8.0, 7.9375]
    sc_dt::sc_fixed<8, 4> basic_val;
    basic_val = 3.5;
    std::cout << "Basic Value: " << basic_val << std::endl;
 
    // 2. Exploring Quantization (Rounding vs Truncation)
    // Source number needs high precision
    sc_dt::sc_fixed<16, 4> high_prec = 2.6875; // 0010.1011
 
    // Target: only 2 fractional bits. 
    // Truncation (Default)
    sc_dt::sc_fixed<6, 4, sc_dt::SC_TRN> trn_val = high_prec; 
    // Rounding (Adds to LSB)
    sc_dt::sc_fixed<6, 4, sc_dt::SC_RND> rnd_val = high_prec; 
 
    std::cout << "\n--- Quantization ---" << std::endl;
    std::cout << "Original  (16,4): " << high_prec << std::endl;
    std::cout << "Truncated (6,4) : " << trn_val << " (Lost precision)" << std::endl;
    std::cout << "Rounded   (6,4) : " << rnd_val << " (Rounded up)" << std::endl;
 
    // 3. Exploring Overflow (Wrap vs Saturation)
    // Source number needs high integer range
    sc_dt::sc_fixed<8, 8> large_val = 14; 
 
    // Target: only 3 integer bits and 2 fractional bits (signed, range [-4, 3.75])
    // Wrap (Default)
    sc_dt::sc_fixed<5, 3, sc_dt::SC_TRN, sc_dt::SC_WRAP> wrap_val = large_val;
    // Saturation
    sc_dt::sc_fixed<5, 3, sc_dt::SC_TRN, sc_dt::SC_SAT> sat_val = large_val;
 
    std::cout << "\n--- Overflow ---" << std::endl;
    std::cout << "Original   (8,8): " << large_val << std::endl;
    std::cout << "Wrapped    (5,3): " << wrap_val << " (Rolled over)" << std::endl;
    std::cout << "Saturated  (5,3): " << sat_val  << " (Clipped to max positive)" << std::endl;
 
    // 4. Bit-level Introspection
    std::cout << "\n--- Bit-level introspection ---" << std::endl;
    // sc_fixed allows reading/writing individual bits using []
    // Bits are indexed from 0 (LSB) to wl-1 (MSB)
    sc_dt::sc_fixed<4, 4> mask = 5; // Binary 0101
    std::cout << "Value: " << mask << ", Binary: ";
    for (int i = 3; i >= 0; --i) {
        std::cout << mask[i];
    }
    std::cout << std::endl;
 
    // Set the MSB (sign bit): 0101 becomes 1101, which is -3 in two's complement
    mask[3] = 1;
    std::cout << "After flipping MSB: " << mask << std::endl;
 
    return 0;
}
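
The bit-level step at the end of the example can be cross-checked with a plain C++ model of the same two's-complement arithmetic. This is a sketch only; to_bits and set_bit_signed are hypothetical helpers, not part of SystemC, and they assume 0 < wl < 63.

```cpp
#include <cstdint>
#include <string>

// Render the low `wl` bits of `raw` as a string, MSB first.
inline std::string to_bits(std::int64_t raw, int wl) {
    std::string s;
    for (int i = wl - 1; i >= 0; --i) {
        s += ((raw >> i) & 1) ? '1' : '0';
    }
    return s;
}

// Set bit `i`, then reinterpret the low `wl` bits as a signed integer.
inline std::int64_t set_bit_signed(std::int64_t raw, int i, int wl) {
    const std::int64_t modulus = std::int64_t{1} << wl;
    const std::int64_t r = (raw | (std::int64_t{1} << i)) & (modulus - 1);
    return r >= modulus / 2 ? r - modulus : r;
}
```

For a 4-bit value, to_bits(5, 4) gives "0101", and setting bit 3 turns the value into -3, whose bit pattern is "1101".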

LRM Strictness: #define SC_INCLUDE_FX

By default, the SystemC header (#include <systemc>) does not include the fixed-point library. The fixed-point headers bring a massive amount of template code into your translation unit, drastically slowing down compilation.

The LRM specifies that you must define the macro SC_INCLUDE_FX before including <systemc> in any file that uses fixed-point types.

#define SC_INCLUDE_FX
#include <systemc>

If you forget this, you will receive "type not declared" errors from the compiler for sc_fixed.

Conclusion

Understanding sc_fixed and sc_q_mode/sc_o_mode is critical for designing DSP algorithms, Neural Network accelerators, and modem pipelines in SystemC. By using bit-accurate datatypes and favoring sc_fixed_fast where values fit within 53 bits, you can keep hardware accuracy while improving simulation speed.
