Virtual Platform Patterns

How to Read This Lesson

Treat this as engineering practice, not trivia. The patterns here are the ones that keep large models understandable after the original author has moved on.

A virtual platform is a SystemC model of a system that software can run against. It usually trades pin-level detail for speed and architectural usefulness.

Source and LRM Trail

Practice lessons should still cite their roots. Use Docs/LRMs/SystemC_LRM_1666-2023.pdf for behavior, .codex-src/systemc for the reference kernel, and .codex-src/systemc-common-practices for reusable modeling patterns. The goal is to turn standard rules into habits that survive real project scale.

Industry Standard Architectures

When building virtual platforms in SystemC, avoid proprietary or vendor-specific bus models unless strictly necessary. Instead, model the architecture after the official Accellera TLM-2.0 loosely-timed (LT) and approximately-timed (AT) open-source examples, or the industry-standard Doulos Simple Bus.

Most standards-compliant platforms contain:

CPU or instruction-set simulator wrapper (Initiator)
memory map and address decoder (Interconnect/Bus)
RAM and ROM models (Targets)
timers, interrupt controllers, UARTs, DMA (Targets & Initiators)
debug and tracing utilities

Address Decoding & The Simple Bus

An interconnect routes transactions by address. To demonstrate this pattern, here is a complete, compilable TLM-2.0 Simple Router model. It uses a simple_target_socket to receive transactions from an initiator (e.g., a CPU) and forwards them to the correct target (e.g., Memory or UART) through one simple_initiator_socket per target.

#include <systemc>
#include <tlm>
#include <tlm_utils/simple_initiator_socket.h>
#include <tlm_utils/simple_target_socket.h>
#include <cstdint>   // uint64_t
#include <iostream>  // std::cout
 
using namespace sc_core;
 
// 1. The Targets
SC_MODULE(MemoryBlock) {
  tlm_utils::simple_target_socket<MemoryBlock> socket{"socket"};
  SC_CTOR(MemoryBlock) { socket.register_b_transport(this, &MemoryBlock::b_transport); }
  void b_transport(tlm::tlm_generic_payload& trans, sc_time& delay) {
    delay += sc_time(10, SC_NS);
    trans.set_response_status(tlm::TLM_OK_RESPONSE);
  }
};
 
SC_MODULE(UartBlock) {
  tlm_utils::simple_target_socket<UartBlock> socket{"socket"};
  SC_CTOR(UartBlock) { socket.register_b_transport(this, &UartBlock::b_transport); }
  void b_transport(tlm::tlm_generic_payload& trans, sc_time& delay) {
    delay += sc_time(50, SC_NS);
    trans.set_response_status(tlm::TLM_OK_RESPONSE);
    if (trans.get_command() == tlm::TLM_WRITE_COMMAND) {
      std::cout << "[UART] Char: " << static_cast<char>(*trans.get_data_ptr()) << "\n";
    }
  }
};
 
// 2. The Simple Bus / Interconnect
SC_MODULE(SimpleBus) {
  tlm_utils::simple_target_socket<SimpleBus> target_socket{"target_socket"};
  tlm_utils::simple_initiator_socket<SimpleBus> init_socket_mem{"init_socket_mem"};
  tlm_utils::simple_initiator_socket<SimpleBus> init_socket_uart{"init_socket_uart"};
 
  SC_CTOR(SimpleBus) {
    target_socket.register_b_transport(this, &SimpleBus::b_transport);
  }
 
  void b_transport(tlm::tlm_generic_payload& trans, sc_time& delay) {
    uint64_t addr = trans.get_address();
    
    // Address Map: 
    // 0x0000 - 0x0FFF : Memory
    // 0x1000 - 0x1FFF : UART
    if (addr < 0x1000) {
      init_socket_mem->b_transport(trans, delay);
    } else if (addr < 0x2000) {
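      // Localize the address for the target. The base protocol lets an
      // interconnect rewrite the address on the forward path only, so it
      // is not written again after the call returns.
      trans.set_address(addr - 0x1000);
      init_socket_uart->b_transport(trans, delay);
    } else {
      trans.set_response_status(tlm::TLM_ADDRESS_ERROR_RESPONSE);
    }
  }
};
 
// 3. The CPU (Initiator)
SC_MODULE(CpuModel) {
  tlm_utils::simple_initiator_socket<CpuModel> socket{"socket"};
  SC_CTOR(CpuModel) { SC_THREAD(run); }
  void run() {
    tlm::tlm_generic_payload trans;
    sc_time delay = SC_ZERO_TIME;
    char data = 'A';
    
    // Write to UART. Every attribute is set explicitly because the
    // initiator cannot rely on values left over from an earlier use.
    trans.set_command(tlm::TLM_WRITE_COMMAND);
    trans.set_address(0x1000);
    trans.set_data_ptr(reinterpret_cast<unsigned char*>(&data));
    trans.set_data_length(1);
    trans.set_streaming_width(1); // Equal to data length: no streaming
    trans.set_byte_enable_ptr(nullptr);
    trans.set_dmi_allowed(false);
    trans.set_response_status(tlm::TLM_INCOMPLETE_RESPONSE);
    
    socket->b_transport(trans, delay);
    if (trans.is_response_error()) {
      SC_REPORT_ERROR("CpuModel", trans.get_response_string().c_str());
    }
    wait(delay); // Synchronize with simulation time
  }
};
 
int sc_main(int argc, char* argv[]) {
  CpuModel cpu("cpu");
  SimpleBus bus("bus");
  MemoryBlock mem("mem");
  UartBlock uart("uart");
  
  // Binding
  cpu.socket.bind(bus.target_socket);
  bus.init_socket_mem.bind(mem.socket);
  bus.init_socket_uart.bind(uart.socket);
  
  sc_start();
  return 0;
}

The bus checks the address, forwards the payload, and passes the timing annotation through to the target. Modifying the generic payload address is allowed on the forward path only: under the LRM's address-attribute rules the interconnect does not write the address again once it has forwarded the transaction, so the initiator must treat the address as undefined on return and set it afresh before reusing the payload.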
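As the number of targets grows, the if/else chain in the router is usually replaced by a table. The following is a minimal sketch of that decode step in plain C++, with no SystemC dependency so it can be tried in isolation. The names MapEntry, Decoded, and decode are hypothetical and do not come from the TLM library.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical map entry: addresses in [base, base + size) route to `port`.
struct MapEntry {
  std::uint64_t base;
  std::uint64_t size;
  unsigned port;
};

// Result of a successful decode.
struct Decoded {
  unsigned port;
  std::uint64_t local_addr;  // address relative to the target's base
};

// Returns true and fills `out` when one region covers the whole access.
// A false return is where a bus would set TLM_ADDRESS_ERROR_RESPONSE.
inline bool decode(const std::vector<MapEntry>& map, std::uint64_t addr,
                   std::uint64_t len, Decoded& out) {
  assert(len > 0);
  for (const MapEntry& e : map) {
    if (addr < e.base) continue;
    const std::uint64_t offset = addr - e.base;
    // Reject accesses that start inside the region but run past its end.
    if (offset < e.size && len <= e.size - offset) {
      out.port = e.port;
      out.local_addr = offset;
      return true;
    }
  }
  return false;
}
```

With the map from the listing (memory at 0x0000, UART at 0x1000, each 0x1000 bytes), decode(map, 0x1000, 1, d) selects port 1 with local address 0. Checking the access length as well as the start address is a design choice of this sketch: it rejects accesses that straddle two regions.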

Timing Strategy

Choose timing deliberately:

Untimed: fastest, good for pure software enablement.
Loosely timed (LT): good for early performance and firmware work, relies on b_transport and quantum keeping (temporal decoupling).
Approximately timed (AT): useful when ordering and protocol phases (e.g. BEGIN_REQ, END_REQ) matter.
Cycle-accurate: slower, needed for detailed microarchitecture questions.

Interrupts and Register Models

Interrupts can be signals, events, TLM messages, or register-visible state depending on the platform style. Firmware should observe a coherent interrupt controller behavior: pending bits, enables, priorities, acknowledge paths, and side effects.
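To make the pending, enable, priority, and acknowledge behavior concrete, here is a small sketch of interrupt controller state in plain C++. It is an illustration under stated assumptions, not a model of any real controller: IrqController and its methods are hypothetical, there are 32 lines, and a lower line number is assumed to mean higher priority.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 32-line interrupt controller.
// Assumption: a lower line number has higher priority.
class IrqController {
 public:
  // A peripheral asserts its interrupt line.
  void raise(unsigned line) {
    assert(line < 32);
    pending_ |= bit(line);
  }
  // Firmware writes the enable register.
  void set_enable(std::uint32_t mask) { enable_ = mask; }
  // Level of the interrupt output seen by the CPU.
  bool irq_out() const { return (pending_ & enable_) != 0; }
  // Firmware-visible pending register.
  std::uint32_t pending() const { return pending_; }
  // Acknowledge path: returns the highest-priority enabled pending line
  // and clears its pending bit (the side effect), or -1 if none is active.
  int acknowledge() {
    const std::uint32_t active = pending_ & enable_;
    for (unsigned line = 0; line < 32; ++line) {
      if (active & bit(line)) {
        pending_ &= ~bit(line);
        return static_cast<int>(line);
      }
    }
    return -1;
  }

 private:
  static std::uint32_t bit(unsigned line) { return std::uint32_t(1) << line; }
  std::uint32_t pending_ = 0;
  std::uint32_t enable_ = 0;
};
```

In a platform, raise() would be driven by a signal or event from the peripheral, and acknowledge() by a register read in the controller's b_transport.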

Register behavior is where many virtual platforms become valuable. Model reset values, read-only bits, write-one-to-clear bits, reserved fields, side effects, and timing when needed.
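A register with mixed access types can be sketched with two masks. The example below is plain C++ and purely illustrative: Reg32 and its field names are hypothetical, and bits outside both masks are treated as read-only or reserved, so software writes leave them unchanged.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 32-bit register with per-bit access behavior.
struct Reg32 {
  std::uint32_t value;
  std::uint32_t reset_value;
  std::uint32_t rw_mask;   // plain read/write bits
  std::uint32_t w1c_mask;  // write-one-to-clear bits
  // Bits in neither mask are read-only or reserved for software.

  void reset() { value = reset_value; }
  std::uint32_t read() const { return value; }

  // Software write, as it would arrive through b_transport.
  void write(std::uint32_t data) {
    assert((rw_mask & w1c_mask) == 0);              // masks must not overlap
    value = (value & ~rw_mask) | (data & rw_mask);  // update RW bits only
    value &= ~(data & w1c_mask);                    // clear W1C bits written as 1
  }

  // Hardware-side update, e.g. the device setting a status bit.
  void hw_set(std::uint32_t bits) { value |= bits; }
};
```

For example, with rw_mask 0x000000FF and w1c_mask 0x00000F00, writing 0x00000200 clears status bit 9 and leaves every other W1C bit alone.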

Under the Hood: TLM-2.0 Interfaces (`tlm_fw_transport_if`)

Virtual platforms rely on the TLM-2.0 transport interfaces. In src/tlm_core/tlm_2/tlm_2_interfaces/tlm_fw_bw_ifs.h, you'll find tlm_fw_transport_if, which combines b_transport, nb_transport_fw, get_direct_mem_ptr, and transport_dbg. Because these methods are pure virtual, every target must provide all four, and the initiator depends only on the interface, never on the concrete target class. An initiator socket is a port onto this interface, and a target socket exports the target's implementation of it. A b_transport call from the initiator is therefore an ordinary C++ virtual function call into the target, with no SystemC kernel context switch on the path, which is why loosely-timed models can reach very high transaction rates (often quoted as millions of transactions per second).

IEEE 1666-2023 LRM: TLM-2.0 Modeling and Architectures

When constructing a virtual platform (VP) in SystemC, the IEEE 1666-2023 standard provides a dedicated set of rules, interfaces, and patterns in Chapters 10 through 16 to ensure interoperability, speed, and accuracy. The structure of the Simple Bus example follows these rules.

Loosely-Timed (LT) vs Approximately-Timed (AT) (LRM 10.3)

The standard defines two primary modeling styles for virtual platforms based on the accuracy required:

Loosely-Timed (LT)

Goal: Maximum simulation speed for software execution.
Mechanism: Uses the blocking transport interface (b_transport). A single function call carries the transaction from initiator to target and back.
Timing: Uses Temporal Decoupling (LRM 10.3.1). Processes run ahead of simulation time and synchronize when they reach the end of a quantum. The target adds its latency to the delay argument (e.g., delay += sc_time(10, SC_NS)) instead of calling wait().
Use Case: Executing operating systems, firmware, and drivers where the exact bus cycle latency is less important than the architectural logic.
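The bookkeeping behind temporal decoupling can be sketched without SystemC. The class below is a simplified plain C++ stand-in for the idea behind tlm_utils::tlm_quantumkeeper, not its implementation: LocalClock and run_transactions are hypothetical names, times are plain nanosecond counters, and sync() advances a counter at the point where a real initiator thread would call wait().

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for quantum keeping. Times are in nanoseconds.
class LocalClock {
 public:
  explicit LocalClock(std::uint64_t quantum_ns) : quantum_ns_(quantum_ns) {
    assert(quantum_ns > 0);
  }
  // Accumulate an annotated delay without yielding to the scheduler.
  void inc(std::uint64_t delay_ns) { local_offset_ns_ += delay_ns; }
  // True once the initiator has run a full quantum ahead.
  bool need_sync() const { return local_offset_ns_ >= quantum_ns_; }
  // In SystemC this is where the thread would call wait(local offset).
  void sync() {
    sim_time_ns_ += local_offset_ns_;
    local_offset_ns_ = 0;
    ++sync_count_;
  }
  std::uint64_t current_time_ns() const { return sim_time_ns_ + local_offset_ns_; }
  std::uint64_t sim_time_ns() const { return sim_time_ns_; }
  unsigned sync_count() const { return sync_count_; }

 private:
  std::uint64_t quantum_ns_;
  std::uint64_t local_offset_ns_ = 0;
  std::uint64_t sim_time_ns_ = 0;
  unsigned sync_count_ = 0;
};

// Issue `count` transactions of fixed latency, syncing only when needed.
inline void run_transactions(LocalClock& clk, unsigned count,
                             std::uint64_t latency_ns) {
  for (unsigned i = 0; i < count; ++i) {
    clk.inc(latency_ns);  // what a target's `delay +=` contributes
    if (clk.need_sync()) clk.sync();
  }
}
```

With a 100 ns quantum and 25 transactions of 10 ns each, the initiator yields only twice instead of 25 times, which is the source of the LT speedup.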

Approximately-Timed (AT)

Goal: Accurate architectural exploration and performance modeling.
Mechanism: Uses the non-blocking transport interface (nb_transport_fw and nb_transport_bw). A single transaction is broken into multiple phases (e.g., request and response) that are passed back and forth between initiator and target.
Timing: nb_transport calls must not block, so delays are expressed through the timing annotation argument and timed event notifications (for example, a payload event queue) rather than by calling wait() inside the transport call. Components contend for shared resources, and arbitration rules are modeled.
Phases (LRM 14.1.2): The base protocol defines BEGIN_REQ, END_REQ, BEGIN_RESP, and END_RESP.
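The ordering of the four phases can be written down as a small transition check. This is a plain C++ sketch under stated assumptions: the Phase enum, its Idle state, and both functions are hypothetical, and the sketch ignores the return-value shortcuts (such as early completion) that the base protocol also allows. It does accept a target skipping END_REQ and answering directly with BEGIN_RESP.

```cpp
#include <cassert>

// Hypothetical per-transaction phase tracker. Idle is not a TLM phase;
// it marks a transaction that has not started yet.
enum class Phase { Idle, BeginReq, EndReq, BeginResp, EndResp };

// True when `to` may follow `from` for a single transaction.
inline bool legal_transition(Phase from, Phase to) {
  switch (to) {
    case Phase::BeginReq:  return from == Phase::Idle;
    case Phase::EndReq:    return from == Phase::BeginReq;
    // BEGIN_RESP directly after BEGIN_REQ implies END_REQ.
    case Phase::BeginResp: return from == Phase::EndReq || from == Phase::BeginReq;
    case Phase::EndResp:   return from == Phase::BeginResp;
    case Phase::Idle:      return false;
  }
  return false;
}

// Advance the tracker, asserting that the step is legal.
inline void advance(Phase& current, Phase next) {
  assert(legal_transition(current, next));
  current = next;
}
```

A checker like this is typically kept in a monitor between initiator and target so that protocol violations are caught where they occur.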

The Generic Payload (LRM Chapter 14)

Interoperability in a virtual platform requires that all models agree on the data structure being transmitted. The LRM defines tlm_generic_payload (often referred to as tlm::tlm_generic_payload in the source).

Mandatory Attributes (LRM 14.8-14.16)

The standard dictates the precise semantics of the Generic Payload attributes:

Command (m_command): TLM_READ_COMMAND, TLM_WRITE_COMMAND, or TLM_IGNORE_COMMAND.
Address (m_address): A 64-bit integer (sc_dt::uint64).
Data Pointer (m_data): A pointer to the payload bytes.
Data Length (m_length): The number of bytes to read/write.
Byte Enables (m_byte_enable): Allows for sparse transfers (e.g., writing only 2 bytes in a 4-byte word).
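Streaming Width (m_streaming_width): Defines streaming behavior. If streaming width < data length, the address wraps back to the start address after every streaming-width bytes, as when writing repeatedly to a FIFO. For an ordinary transfer, set it equal to the data length.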
Response Status (m_response_status): Initially TLM_INCOMPLETE_RESPONSE. Must be set by the target to TLM_OK_RESPONSE, TLM_ADDRESS_ERROR_RESPONSE, etc.
DMI Allowed (m_dmi): A boolean hint, set by the target, that a Direct Memory Interface request for this address is likely to succeed.
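How a target might combine data length, byte enables, and streaming width on a write can be sketched in plain C++. The function apply_write below is hypothetical and deliberately simplified: it ignores endianness and bus width, treats 0xFF as the byte-enabled value, and repeats the byte enable array when it is shorter than the data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch of a target-side write. Assumptions: byte i of the data array
// lands at addr + (i % streaming_width), and 0xFF marks an enabled byte.
inline void apply_write(std::vector<std::uint8_t>& mem, std::uint64_t addr,
                        const std::uint8_t* data, std::size_t length,
                        const std::uint8_t* byte_enable, std::size_t be_length,
                        std::size_t streaming_width) {
  assert(streaming_width > 0);
  for (std::size_t i = 0; i < length; ++i) {
    bool enabled = true;
    if (byte_enable != nullptr) {
      assert(be_length > 0);
      // The byte enable array repeats if it is shorter than the data.
      enabled = (byte_enable[i % be_length] == 0xFF);
    }
    if (enabled) {
      // at() throws instead of writing out of range.
      mem.at(addr + (i % streaming_width)) = data[i];
    }
  }
}
```

Writing four bytes with byte enables {0xFF, 0xFF, 0x00, 0x00} updates only the first two bytes. Writing four bytes with a streaming width of 2 leaves the last two data bytes in the two addressed locations, because the address wraps after two bytes.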

Generic Payload Memory Management (LRM 14.4)

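In a high-speed VP, allocating and deallocating tlm_generic_payload objects with new/delete for every transaction is a significant and avoidable cost. The LRM therefore defines a memory manager interface (tlm_mm_interface). A memory manager is required for transactions sent through the non-blocking interface, whose lifetime extends beyond a single call; with b_transport it is optional, and an initiator may instead reuse a payload held on the stack or as a member, as the CPU model above does. With a memory manager, initiators take payloads from a pool and acquire them; when the last release drops the reference count to zero, the payload is handed back to the manager, which returns it to the pool without deallocation.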
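The pooling idea can be shown in plain C++. This is a stand-in, not the TLM API: Payload and PayloadPool are hypothetical, and in real code the reference count lives in tlm_generic_payload while the pool implements tlm_mm_interface.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical payload carrying only a reference count.
struct Payload {
  unsigned ref_count = 0;
};

// Pool that recycles payloads instead of freeing them.
class PayloadPool {
 public:
  // Reuse a free payload when possible; heap-allocate only on a miss.
  Payload* allocate() {
    if (!free_list_.empty()) {
      Payload* p = free_list_.back();
      free_list_.pop_back();
      return p;
    }
    storage_.push_back(std::unique_ptr<Payload>(new Payload));
    return storage_.back().get();
  }
  // Each component that holds the payload takes a reference.
  void acquire(Payload* p) { ++p->ref_count; }
  // The last release returns the payload to the pool.
  void release(Payload* p) {
    assert(p->ref_count > 0);
    if (--p->ref_count == 0) free_list_.push_back(p);
  }
  std::size_t heap_allocations() const { return storage_.size(); }

 private:
  std::vector<std::unique_ptr<Payload>> storage_;  // owns every payload
  std::vector<Payload*> free_list_;                // payloads ready for reuse
};
```

After one allocate, acquire, and release cycle, the next allocate returns the same object and heap_allocations() stays at 1.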

Direct Memory Interface (DMI) (LRM Chapter 11.2)

For virtual platforms executing code from memory, even the fast b_transport is too slow because it requires a virtual function call and address decoding for every single instruction fetch.

DMI solves this.

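The initiator sends a normal b_transport transaction with the DMI Allowed flag cleared.
The target (e.g., RAM) completes the transaction and sets the flag to true as a hint that DMI is available; the initiator checks the flag when the call returns.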
The initiator calls get_direct_mem_ptr().
The bus translates the address, forwards the request, and returns a tlm_dmi object. This object contains a direct C++ raw pointer (unsigned char* dmi_ptr) to the host machine's memory, along with the start/end address range and read/write permissions.
The initiator caches this pointer. Future reads/writes to that address range bypass the TLM sockets entirely and use direct C++ array indexing.

This single feature is a large part of why SystemC VPs can boot an operating system such as Linux in a practical amount of time.
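On the initiator side, the cached pointer is usually wrapped in a small lookup structure. The sketch below is plain C++ and hypothetical throughout: DmiRegion mirrors the kind of information a tlm_dmi object carries, DmiCache holds a single region for brevity, and invalidate() stands in for the reaction to a DMI invalidation from the target.

```cpp
#include <cstdint>

// Hypothetical copy of the information returned by a DMI request.
struct DmiRegion {
  std::uint8_t* ptr;    // host pointer corresponding to `start`
  std::uint64_t start;  // first bus address covered (inclusive)
  std::uint64_t end;    // last bus address covered (inclusive)
  bool read_ok;
  bool write_ok;
};

// Single-entry DMI cache kept by an initiator.
class DmiCache {
 public:
  void install(const DmiRegion& r) { region_ = r; valid_ = true; }
  // Drop the cached region if it overlaps the invalidated range.
  void invalidate(std::uint64_t start, std::uint64_t end) {
    if (valid_ && start <= region_.end && end >= region_.start) valid_ = false;
  }
  // Fast path: returns false on a miss so the caller can use b_transport.
  bool try_read(std::uint64_t addr, std::uint8_t& out) const {
    if (!hit(addr) || !region_.read_ok) return false;
    out = region_.ptr[addr - region_.start];
    return true;
  }
  bool try_write(std::uint64_t addr, std::uint8_t value) {
    if (!hit(addr) || !region_.write_ok) return false;
    region_.ptr[addr - region_.start] = value;
    return true;
  }

 private:
  bool hit(std::uint64_t a) const {
    return valid_ && a >= region_.start && a <= region_.end;
  }
  DmiRegion region_ = DmiRegion();
  bool valid_ = false;
};
```

A miss or a permission failure is not an error; it simply sends the access down the normal transport path, where the target may grant DMI again.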

Deep Dive: Accellera Source Code

The Core Interfaces in `tlm_core`

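If you explore the Accellera source code, you'll see the separation of the SystemC kernel (src/sysc/) from the TLM library (src/tlm_core/).

In src/tlm_core/tlm_2/tlm_2_interfaces/tlm_fw_bw_ifs.h, you see the C++ translation of the LRM. The real header declares tlm_fw_transport_if as a template over the protocol types that inherits its four methods from separate interface classes; flattened into one class for readability, it looks like this: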

class tlm_fw_transport_if : public virtual sc_core::sc_interface {
public:
  virtual void b_transport(tlm_generic_payload& trans, sc_core::sc_time& t) = 0;
  virtual tlm_sync_enum nb_transport_fw(tlm_generic_payload& trans, tlm_phase& phase, sc_core::sc_time& t) = 0;
  virtual bool get_direct_mem_ptr(tlm_generic_payload& trans, tlm_dmi& dmi_data) = 0;
  virtual unsigned int transport_dbg(tlm_generic_payload& trans) = 0;
};

Notice that b_transport returns void rather than a status enum. The outcome of a transaction is reported solely through the response_status attribute of the payload object, which the initiator checks after the call returns.

The Standard Sockets vs Utils

The standard defines tlm_initiator_socket and tlm_target_socket in tlm_core/tlm_2/tlm_sockets/. These are the raw binding points.

However, the examples and common patterns use tlm_utils. In src/tlm_utils/simple_target_socket.h, the implementation essentially wraps a tlm_target_socket and provides the target implementation itself:

// Conceptual simple_target_socket internal binding
template <typename MODULE, ...>
class simple_target_socket : public tlm_target_socket<...> {
   // It contains a private class that implements tlm_fw_transport_if
   class fw_process : public tlm_fw_transport_if {
       // ... implements b_transport by calling the user's registered callback
   };
   fw_process m_fw_process;
 
public:
   simple_target_socket() {
       // It binds its own base class socket to its internal implementation!
       this->bind(m_fw_process); 
   }
};

This self-binding mechanism is what allows you to instantiate a simple_target_socket, register a callback, and never have to derive your module from tlm_fw_transport_if.
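The same pattern can be reproduced in a few lines of plain C++, which makes the mechanism easier to see. Everything here is a hypothetical miniature, not the tlm_utils code: Txn, FwTransportIf, MiniTargetSocket, and Ram are invented names, and the delay is a plain nanosecond counter.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical transaction and forward interface.
struct Txn {
  std::uint64_t address;
  bool ok;
};

struct FwTransportIf {
  virtual void b_transport(Txn& trans, std::uint64_t& delay_ns) = 0;
  virtual ~FwTransportIf() {}
};

// Miniature convenience socket: it owns an adapter that implements the
// interface and forwards each call to a registered member function.
template <typename MODULE>
class MiniTargetSocket {
  typedef void (MODULE::*Callback)(Txn&, std::uint64_t&);

  struct Adapter : FwTransportIf {
    MODULE* mod = nullptr;
    Callback cb = nullptr;
    void b_transport(Txn& trans, std::uint64_t& delay_ns) override {
      assert(mod != nullptr && cb != nullptr);  // a callback must be registered
      (mod->*cb)(trans, delay_ns);
    }
  };
  Adapter adapter_;

 public:
  void register_b_transport(MODULE* mod, Callback cb) {
    adapter_.mod = mod;
    adapter_.cb = cb;
  }
  // What an initiator would call through after binding.
  FwTransportIf* operator->() { return &adapter_; }
};

// The module never derives from FwTransportIf itself.
struct Ram {
  MiniTargetSocket<Ram> socket;
  Ram() { socket.register_b_transport(this, &Ram::b_transport); }
  void b_transport(Txn& trans, std::uint64_t& delay_ns) {
    delay_ns += 10;
    trans.ok = true;
  }
};
```

Calling ram.socket->b_transport(t, delay) goes through the virtual interface into the adapter and then into Ram::b_transport, which is the same two-step dispatch the convenience socket performs.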

Implementing the Interconnect (Bus)

In our Simple Bus example, the b_transport method rewrites the payload address before forwarding the transaction. The LRM explicitly permits an interconnect to modify the address to route the transaction to a target (LRM 14.14), but only on the forward path, before the transaction is passed downstream. The LRM does not require the interconnect to restore the original value afterward; under the address-attribute rules the address is not written again once the transaction has been forwarded. The initiator should therefore treat the address as undefined when the call returns and set it again before reusing the payload.

Furthermore, an interconnect handling DMI requests (get_direct_mem_ptr) must translate the addresses in the returned tlm_dmi struct. If a target returns a DMI range of 0x0000 - 0x0FFF, and the target is mapped at 0x1000 on the bus, the bus must adjust the DMI struct's range to 0x1000 - 0x1FFF before returning it to the CPU.
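The range adjustment amounts to adding the target's base address to both ends of the returned range. The plain C++ sketch below uses hypothetical names (DmiRange, to_bus_range). Clamping the range to the size of the target's window is a defensive assumption of this sketch, so that a target reporting a larger range cannot grant access outside its mapping.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical inclusive address range, as carried in a DMI descriptor.
struct DmiRange {
  std::uint64_t start;
  std::uint64_t end;
};

// Convert a target-local DMI range to bus addresses for a target mapped
// at `base` with a window of `window_size` bytes.
inline DmiRange to_bus_range(DmiRange target, std::uint64_t base,
                             std::uint64_t window_size) {
  assert(window_size > 0);
  assert(target.start <= target.end);
  const std::uint64_t last = window_size - 1;  // last valid local address
  assert(target.start <= last);
  // Clamp so the granted range never extends past the target's window.
  const std::uint64_t local_end = target.end < last ? target.end : last;
  DmiRange bus = {base + target.start, base + local_end};
  return bus;
}
```

For the case in the text, to_bus_range({0x0000, 0x0FFF}, 0x1000, 0x1000) yields 0x1000 to 0x1FFF.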

By adhering to these LRM contracts and leveraging the tlm_utils classes, SystemC virtual platforms can integrate CPU instruction set simulators (such as QEMU or Arm Fast Models) with custom peripherals and approach real-time execution speeds.