Datatype Performance and Correctness
Choosing between C++ types, SystemC integer types, bit vectors, logic vectors, fixed-point types, and TLM byte arrays.
How to Read This Lesson
These core semantics are where experienced SystemC engineers earn their calm. We will name the scheduler rule, then show how the source enforces it.
Datatype Performance and Correctness
SystemC provides an extensive library of custom datatypes because hardware modeling requires precise bit widths, four-state logic, and fixed-point arithmetic. However, a common beginner trap is using the most "hardware-looking" type everywhere. This drastically reduces simulation performance and makes the C++ code cumbersome to read.
The IEEE 1666 LRM strictly defines these datatypes. Knowing when to use native C++ types versus SystemC types is a hallmark of an expert SystemC architect.
Under the Hood: C++ Implementation in Accellera SystemC
How are SystemC datatypes implemented, and why are some slower than others?
- sc_int<W> and sc_uint<W> (Fast): In the Accellera source code, these are thin class templates over a base class that holds a single native 64-bit data member. An sc_uint<32>, for example, keeps its value in one 64-bit unsigned word. The operators (+, -, &) are heavily inlined and mask the upper bits to enforce the width W, which lets the C++ compiler optimize them to nearly the speed of native integers.
- sc_bigint<W> and sc_biguint<W> (Slow): Intended for W > 64, these store the value as an array of 32-bit words (heap-allocated by default in the SystemC 2.3.x sources). Basic arithmetic requires a for loop across multiple words, much like a software big-number library.
- sc_lv<W> (Very Slow): A logic vector cannot use one bit per position, because each position has four states (0, 1, Z, X). The Accellera implementation keeps two word arrays, a data array and a control array, and every logical operation has to apply the LRM four-state truth tables across both.
- Proxy Classes (sc_subref and relatives): A major performance pitfall is the use of proxy classes for bit-selection ([]) and part-selection (range()). When you write reg.range(15, 8), SystemC returns a temporary proxy object (sc_dt::sc_uint_subref when reg is an sc_uint; sc_dt::sc_subref for bit and logic vectors). Deeply nested selections and concatenations create many such temporaries, and some proxy operations go through virtual calls, which can noticeably degrade simulation speed.
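The width-masking idea behind the limited-precision integers can be sketched without SystemC. The MaskedUint template below is a hypothetical stand-in written for this lesson, not the Accellera class. It assumes only what the text states: one 64-bit word whose upper bits are cleared after each operation.

```cpp
#include <cstdint>

// Hypothetical stand-in for a W-bit unsigned integer (1 <= W <= 64).
// Not the Accellera implementation; it only illustrates masking to W bits.
template <unsigned W>
class MaskedUint {
    static_assert(W >= 1 && W <= 64, "width must fit in one 64-bit word");

public:
    constexpr MaskedUint(std::uint64_t v = 0) : value_(v & mask()) {}

    // All-ones mask for the low W bits. Shifting right by (64 - W) keeps the
    // shift count in the range 0..63, so W == 64 needs no special case.
    static constexpr std::uint64_t mask() {
        return ~std::uint64_t{0} >> (64 - W);
    }

    constexpr std::uint64_t value() const { return value_; }

    // Arithmetic runs on the native word; the constructor wraps to W bits.
    friend constexpr MaskedUint operator+(MaskedUint a, MaskedUint b) {
        return MaskedUint(a.value_ + b.value_);
    }
    friend constexpr MaskedUint operator&(MaskedUint a, MaskedUint b) {
        return MaskedUint(a.value_ & b.value_);
    }

private:
    std::uint64_t value_;
};
```

With this sketch, adding 1 to a 12-bit value of 0xFFF wraps to 0, which is the behavior a 12-bit hardware register would show. Each operation costs one native instruction plus one mask.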
Source and LRM Trail
Advanced core behavior should always be checked against Docs/LRMs/SystemC_LRM_1666-2023.pdf before source details. For implementation, read .codex-src/systemc/src/sysc/kernel and .codex-src/systemc/src/sysc/communication, especially the scheduler, events, object hierarchy, writer policy, report handler, and async update path.
The LRM Datatype Categories
The standard defines several datatype groups under the sc_dt namespace:
- Native C++ Types (int, uint32_t, bool). Performance: Maximum. Use Case: Virtual Platform (TLM) internal state, counters, flags, memory arrays.
- Limited-Precision Fixed-Width Integers (sc_dt::sc_int<W>, sc_dt::sc_uint<W>). Performance: High (implemented using 64-bit native integers under the hood). Valid for $1 \le W \le 64$. Use Case: Register fields, exact small hardware width arithmetic.
- Arbitrary-Precision Integers (sc_dt::sc_bigint<W>, sc_dt::sc_biguint<W>). Performance: Slow (operates on arrays of words). Intended for $W > 64$. Use Case: Cryptographic keys, very wide buses, wide memory payloads.
- Bit and Logic Vectors (sc_dt::sc_bv<W>, sc_dt::sc_lv<W>). Performance: Very Slow (uses proxy objects, stores bit arrays, applies 4-state logic for sc_lv). Use Case: Pin-level RTL interfaces, unknown ('X') or high impedance ('Z') states.
- Fixed-Point Types (sc_dt::sc_fixed, sc_dt::sc_ufixed). Performance: Moderate to Slow (handles quantization and overflow). Available only when SC_INCLUDE_FX is defined before the SystemC header is included. Use Case: DSP algorithms, AMS (Analog Mixed Signal) boundaries.
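To see why four-state vectors cost more than two-state ones, consider a conceptual sketch of a four-state AND. The Logic4 enum and and4 functions are hypothetical names for this illustration, not sc_dt::sc_logic or sc_dt::sc_lv. The sketch stores one enum per position for readability, whereas the Accellera library uses its own packed storage.

```cpp
#include <array>
#include <cstddef>

// Hypothetical four-state value for illustration; not sc_dt::sc_logic.
enum class Logic4 : unsigned char { L0 = 0, L1 = 1, Z = 2, X = 3 };

// AND truth table indexed as [lhs][rhs]: a 0 on either side forces 0,
// 1 AND 1 is 1, and every other combination is unknown (X).
inline Logic4 and4(Logic4 a, Logic4 b) {
    static const Logic4 table[4][4] = {
        /* 0 */ {Logic4::L0, Logic4::L0, Logic4::L0, Logic4::L0},
        /* 1 */ {Logic4::L0, Logic4::L1, Logic4::X,  Logic4::X },
        /* Z */ {Logic4::L0, Logic4::X,  Logic4::X,  Logic4::X },
        /* X */ {Logic4::L0, Logic4::X,  Logic4::X,  Logic4::X },
    };
    return table[static_cast<unsigned>(a)][static_cast<unsigned>(b)];
}

// Element-wise AND over a W-element vector: one table lookup per position,
// where a two-state vector could AND a whole machine word in one instruction.
template <std::size_t W>
std::array<Logic4, W> and4(const std::array<Logic4, W>& a,
                           const std::array<Logic4, W>& b) {
    std::array<Logic4, W> out{};
    for (std::size_t i = 0; i < W; ++i) {
        out[i] = and4(a[i], b[i]);
    }
    return out;
}
```

A two-state type needs a single bitwise AND for up to 64 positions, so even an optimized four-state implementation has more state to track per bit. This is the reason to keep logic vectors at pin-level boundaries.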
The Proxy Object Problem
A major performance pitfall in SystemC datatypes is the use of proxy classes for bit-selection ([]) and part-selection (range()).
When you write reg.range(15, 8), SystemC does not return an integer. It returns a temporary proxy object (sc_dt::sc_uint_subref when reg is an sc_uint; sc_dt::sc_subref for bit and logic vectors). If you nest these deeply, the C++ compiler has to create and destroy many temporary proxy objects, which can noticeably degrade simulation speed.
Best Practice: Convert to native C++ types for complex arithmetic, then assign back to SystemC types only at the module boundaries.
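One way to follow this practice is to do part-selects with plain shift-and-mask on native words. The helpers below are a minimal sketch; field_mask, extract_bits, and insert_bits are hypothetical names for this lesson, not SystemC API.

```cpp
#include <cstdint>

// Hypothetical helpers: part-select on a plain 32-bit word.
// Precondition for all three: 0 <= lo <= hi <= 31.

// Mask with ones in bit positions [hi:lo].
inline std::uint32_t field_mask(unsigned hi, unsigned lo) {
    // Build the mask in 64 bits so a full 32-bit field does not overflow the shift.
    return static_cast<std::uint32_t>(
        ((std::uint64_t{1} << (hi - lo + 1)) - 1) << lo);
}

// Equivalent in spirit to reading word.range(hi, lo).
inline std::uint32_t extract_bits(std::uint32_t word, unsigned hi, unsigned lo) {
    return (word & field_mask(hi, lo)) >> lo;
}

// Equivalent in spirit to word.range(hi, lo) = field; returns the new word.
inline std::uint32_t insert_bits(std::uint32_t word, unsigned hi, unsigned lo,
                                 std::uint32_t field) {
    const std::uint32_t m = field_mask(hi, lo);
    return (word & ~m) | ((field << lo) & m);
}
```

For instance, extract_bits(0xF00, 11, 8) yields 0xF, and insert_bits(0, 31, 28, 0xF) yields 0xF0000000. No proxy objects are created, and the result can be assigned to a SystemC type once at the module boundary.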
Complete Example: Datatype Trade-offs
Here is a complete sc_main example that demonstrates how to mix native C++ types with SystemC limited-precision integers, and how to use part-select proxies safely.
    #include <systemc>
    #include <cstdint>
    #include <iostream>

    SC_MODULE(DatatypeDemo) {
        // Port using exact-width hardware type
        sc_core::sc_in<sc_dt::sc_uint<12>> address_in{"address_in"};

        // Internal state using fast native C++ type (Best Practice for VPs)
        uint32_t internal_memory[4096];

        // Hardware-accurate register representing a 32-bit control register
        sc_dt::sc_uint<32> control_reg;

        SC_CTOR(DatatypeDemo) {
            SC_METHOD(process_transaction);
            sensitive << address_in;
            dont_initialize();

            // Initialize memory
            for (int i = 0; i < 4096; i++) internal_memory[i] = 0;
            control_reg = 0;
        }

        void process_transaction() {
            // 1. Read from SystemC type to native C++ type (Fast)
            uint32_t addr = address_in.read();

            // 2. Perform operations using native C++ (Fast)
            if (addr < 4096) {
                internal_memory[addr] = 0xDEADBEEF;
            }

            // 3. Using SystemC Proxy Objects (range) correctly
            // Extracting bits [11:8] as a 4-bit unsigned integer
            sc_dt::sc_uint<4> page = address_in.read().range(11, 8);

            // Packing bits into the control register
            // Avoid deep nesting: reg.range() = (a, b);
            control_reg.range(3, 0) = page;
            control_reg.range(31, 28) = 0xF;

            // Convert to native integers for printing so that std::hex
            // formats them like any other unsigned value.
            std::cout << "@ " << sc_core::sc_time_stamp()
                      << " Addr: 0x" << std::hex << addr
                      << " Page: 0x" << page.to_uint()
                      << " Control Reg: 0x" << control_reg.to_uint() << "\n";
        }
    };

    // Testbench to drive the module
    SC_MODULE(Testbench) {
        sc_core::sc_signal<sc_dt::sc_uint<12>> addr_sig{"addr_sig"};
        DatatypeDemo* demo;

        SC_CTOR(Testbench) {
            demo = new DatatypeDemo("demo_inst");
            demo->address_in(addr_sig);
            SC_THREAD(drive);
        }

        void drive() {
            wait(10, sc_core::SC_NS);
            addr_sig.write(0x0A4); // Write 12-bit value
            wait(10, sc_core::SC_NS);
            addr_sig.write(0xF00);
        }

        ~Testbench() {
            delete demo;
        }
    };

    int sc_main(int argc, char* argv[]) {
        Testbench tb("tb");
        std::cout << "Starting simulation...\n";
        sc_core::sc_start(50, sc_core::SC_NS);
        return 0;
    }
Explanation of the Execution
When run, the output shows:
Starting simulation...
@ 10 ns Addr: 0xa4 Page: 0x0 Control Reg: 0xf0000000
@ 20 ns Addr: 0xf00 Page: 0xf Control Reg: 0xf000000f
Notice how address_in.read().range(11, 8) correctly extracts the top 4 bits of the 12-bit address. When driving 0xF00, the top nibble is F, which is packed into the lowest 4 bits of the 32-bit control_reg.
Using uint32_t for the internal_memory ensures that the simulation runs at native C++ speeds for the bulk of the data storage, while sc_dt::sc_uint is reserved for explicit hardware boundaries.
Deep Dive: Accellera Source for sc_signal and update()
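The sc_signal<T> channel illustrates the Evaluate-Update paradigm of SystemC. Because it is a class template, its member functions live in the header (src/sysc/communication/sc_signal.h in the Accellera source). sc_signal derives from sc_prim_channel, which provides request_update() and the virtual update() hook.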
The write() Implementation
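When you call write(const T&), the signal does not change its current value immediately. It stores the requested value in m_new_val and, if that value differs from the current one, asks the kernel for an update. A simplified version of the Accellera code (writer-policy checks omitted) looks like this:
    template<class T>
    inline void sc_signal<T>::write(const T& value_) {
        bool value_changed = !(m_cur_val == value_);
        m_new_val = value_;
        if( value_changed ) {
            this->request_update(); // Inherited from sc_prim_channel
        }
    }
The request_update() call adds the channel to the kernel's list of primitive channels awaiting update (kept by sc_prim_channel_registry in the Accellera sources), so that its update() runs in the coming update phase.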
The update() Phase
After the Evaluate phase finishes (all runnable processes have run), the kernel walks the update list and calls the update() virtual function on each primitive channel that requested it. For sc_signal, a simplified version looks like:
    template<class T>
    inline void sc_signal<T>::update() {
        if( !(m_new_val == m_cur_val) ) {
            m_cur_val = m_new_val;
            // Delta notification: wakes processes sensitive to value_changed_event()
            m_value_changed_event.notify(SC_ZERO_TIME);
        }
    }
This guarantees that all processes running in the same evaluate phase see the same old value. The new value becomes visible only in the next delta cycle, which is how SystemC models registers that update together.
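The same two-phase idea can be sketched in plain C++ without the kernel. TwoPhaseSignal and swap_step are hypothetical names for this illustration, not the Accellera sc_signal. The boolean return values stand in for request_update() and for the value-changed notification.

```cpp
// Hypothetical two-phase value holder; not the Accellera sc_signal.
template <typename T>
class TwoPhaseSignal {
public:
    explicit TwoPhaseSignal(T init = T{}) : cur_(init), next_(init) {}

    // Evaluate phase: readers always see the committed value.
    const T& read() const { return cur_; }

    // Evaluate phase: record the requested value. Returns true when an
    // update is needed (where a real channel would call request_update()).
    bool write(const T& v) {
        next_ = v;
        return !(next_ == cur_);
    }

    // Update phase: commit the pending value. Returns true if the value
    // changed (where a real channel would notify sensitive processes).
    bool update() {
        if (next_ == cur_) return false;
        cur_ = next_;
        return true;
    }

private:
    T cur_;   // value visible to readers
    T next_;  // value requested during the current evaluate phase
};

// Two "processes" exchange values through signals, as two registers would.
inline void swap_step(TwoPhaseSignal<int>& a, TwoPhaseSignal<int>& b) {
    // Evaluate: both reads observe the old values, regardless of order.
    a.write(b.read());
    b.write(a.read());
    // Update: commit both pending values.
    a.update();
    b.update();
}
```

Calling swap_step on signals holding 1 and 2 leaves them holding 2 and 1. With ordinary variables, the second assignment would have read the value just overwritten by the first, which is the ordering hazard the evaluate-update split removes.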