Chapter 12: Virtual Platform Construction

VP Architecture & Design

Designing the memory map and overall architecture for a multi-component Virtual Platform using standard TLM patterns.

How to Read This Lesson

For virtual platforms, imagine a firmware engineer trying to boot real software on your model. Every abstraction choice should help that person move faster without lying about the hardware.

Building a Virtual Platform: Architecture

Up until this chapter, we have looked at isolated SystemC concepts: ports, events, TLM sockets, and memory management. But how do you combine all of these into a massive, bootable Virtual Platform (VP)?

In this multi-part tutorial, we will build a completely functional Virtual Platform from scratch.

We will follow the widely used Doulos Simple Bus patterns and the Accellera TLM 2.0 LT/AT coding styles. This keeps our architecture vendor-neutral, interoperable, and LRM-compliant.

Source and LRM Trail

Virtual platform lessons combine standard TLM behavior with architecture practice. Use Docs/LRMs/SystemC_LRM_1666-2023.pdf for TLM and kernel rules, .codex-src/systemc/src/tlm_core/tlm_2 for sockets and payloads, .codex-src/cci for configurable platforms, and .codex-src/systemc-common-practices for reusable patterns.

The Goal

We are going to build a System-on-Chip (SoC) comprising:

  1. A CPU Wrapper (Initiator): A mock Instruction Set Simulator (ISS) that initiates TLM Loosely Timed (LT) memory-mapped read/write transactions.
  2. A Router (Interconnect): A TLM bus that routes the CPU's generic payload transactions to the correct peripheral based on the memory address.
  3. A RAM Module (Target): A contiguous block of memory.
  4. A Timer Peripheral (Target): A memory-mapped hardware timer.

The Memory Map

Every memory-mapped SoC relies on a Memory Map. When the CPU writes to physical address 0x4000_0004, the Router must determine which target socket to forward the transaction to.

Here is the hardware memory map for our Virtual Platform:

Peripheral   Base Address   Size               End Address
RAM          0x0000_0000    256 KB (0x40000)   0x0003_FFFF
Timer        0x4000_0000    4 KB (0x1000)      0x4000_0FFF
UART         0x4001_0000    4 KB (0x1000)      0x4001_0FFF
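
The memory map above can also be written as data, which keeps the decode logic in one place. The following is a minimal sketch in plain C++ with no SystemC dependency. The names Region, Decoded, kMemoryMap, and decode are hypothetical and chosen for illustration only; they are not part of TLM or of this tutorial's later code.

```cpp
#include <cassert>
#include <cstdint>

// One row of the memory map. The struct layout is an assumption of this sketch.
struct Region {
    const char*   name;
    std::uint64_t base;
    std::uint64_t size;
};

// The table above, expressed as data.
static const Region kMemoryMap[] = {
    {"RAM",   0x00000000, 0x40000},
    {"Timer", 0x40000000, 0x1000},
    {"UART",  0x40010000, 0x1000},
};

struct Decoded {
    const Region* region;  // owning region, or nullptr if the address is unmapped
    std::uint64_t offset;  // address relative to the region base
};

// Linear search over the map; fine for a handful of regions.
inline Decoded decode(std::uint64_t addr) {
    for (const Region& r : kMemoryMap) {
        assert(r.size > 0);  // an empty region would never match
        if (addr >= r.base && addr - r.base < r.size) {
            return Decoded{&r, addr - r.base};
        }
    }
    return Decoded{nullptr, 0};
}
```

Calling decode(0x40000004) selects the Timer row with a local offset of 0x4, while decode(0x00040000) falls in the gap just past the end of RAM and returns a null region.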

Virtual Platform Skeleton Example & Kernel Mechanics

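The following is a complete, runnable skeleton of the VP top-level architecture. It instantiates the initiator, the router, and one target (the RAM), and binds them using simple_initiator_socket and simple_target_socket. The Timer and UART regions from the memory map are not modeled yet, so the router answers accesses to them with an address error.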

Under the Hood (Accellera TLM Kernel): Why do we use tlm_utils::simple_initiator_socket instead of tlm::tlm_initiator_socket? With the raw base sockets, your sc_module must inherit from and implement the relevant transport interface itself: tlm_bw_transport_if on the initiator side and tlm_fw_transport_if on the target side, including the methods you do not use. The tlm_utils convenience sockets hide this boilerplate. They are not macros; they expose registration methods such as register_b_transport() that bind the socket to an ordinary C++ member function.

Furthermore, how does a transaction travel from the CPU to the Router? TLM 2.0 sockets are built on sc_core::sc_port and sc_core::sc_export. The call cpu.socket.bind(router.target_socket) records the connection, and the kernel resolves it to an interface pointer at the end of elaboration. When run_program() calls socket->b_transport(), no OS context switch or event queue is involved. Because the port has already been resolved to a pointer, the CPU's thread executes the router's b_transport as a direct, blocking C++ function call. The call stack of run_program literally extends down into the Router and RAM. This is a major reason TLM LT simulation is fast.

#include <cstdint>   // uint32_t, uint64_t
#include <iostream>  // std::cout
#include <systemc>
#include <tlm>
#include <tlm_utils/simple_initiator_socket.h>
#include <tlm_utils/simple_target_socket.h>
 
// 1. Mock CPU Initiator
SC_MODULE(CPU_Initiator) {
    tlm_utils::simple_initiator_socket<CPU_Initiator> socket;
    
    SC_CTOR(CPU_Initiator) : socket("cpu_socket") {
        SC_THREAD(run_program);
    }
    
    void run_program() {
        tlm::tlm_generic_payload trans;
        sc_core::sc_time delay = sc_core::SC_ZERO_TIME;
        
        // Mock writing to the RAM (Address 0x0000_0004)
        uint32_t data = 0xDEADBEEF;
        trans.set_command(tlm::TLM_WRITE_COMMAND);
        trans.set_address(0x00000004);
        trans.set_data_ptr(reinterpret_cast<unsigned char*>(&data));
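        trans.set_data_length(4);
        trans.set_streaming_width(4);       // Equal to data length: not a streaming access
        trans.set_byte_enable_ptr(nullptr); // No byte enables
        trans.set_response_status(tlm::TLM_INCOMPLETE_RESPONSE);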
        
        std::cout << "@" << sc_core::sc_time_stamp() << " [CPU] Sending Write to 0x4" << std::endl;
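        socket->b_transport(trans, delay);
        
        // Check the response: an unmapped address comes back as an error
        if (trans.is_response_error()) {
            SC_REPORT_ERROR("CPU", trans.get_response_string().c_str());
        }
        
        wait(delay); // Advance time based on interconnect/target delay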
    }
};
 
// 2. Mock Peripheral Target (RAM)
SC_MODULE(RAM_Target) {
    tlm_utils::simple_target_socket<RAM_Target> socket;
    
    SC_CTOR(RAM_Target) : socket("ram_socket") {
        socket.register_b_transport(this, &RAM_Target::b_transport);
    }
    
    void b_transport(tlm::tlm_generic_payload& trans, sc_core::sc_time& delay) {
        std::cout << "@" << sc_core::sc_time_stamp() << " [RAM] Received transaction at offset 0x" 
                  << std::hex << trans.get_address() << std::dec << std::endl; // Restore decimal output
        
        trans.set_response_status(tlm::TLM_OK_RESPONSE);
        delay += sc_core::sc_time(10, sc_core::SC_NS); // Add RAM latency
    }
};
 
// 3. Simple Router (Interconnect)
SC_MODULE(SimpleRouter) {
    tlm_utils::simple_target_socket<SimpleRouter> target_socket;
    // One initiator socket per downstream target; only the RAM is wired up in this skeleton
    tlm_utils::simple_initiator_socket<SimpleRouter> init_socket_ram;
    
    SC_CTOR(SimpleRouter) : target_socket("target_socket"), init_socket_ram("init_socket_ram") {
        target_socket.register_b_transport(this, &SimpleRouter::b_transport);
    }
    
    void b_transport(tlm::tlm_generic_payload& trans, sc_core::sc_time& delay) {
        uint64_t addr = trans.get_address();
        
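        // Memory Map Decoder Logic: RAM occupies 0x0000_0000 to 0x0003_FFFF.
        // RAM's base is 0, so no lower-bound check or base subtraction is needed here.
        if (addr <= 0x0003FFFF) {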
            // Forward to RAM
            init_socket_ram->b_transport(trans, delay);
        } else {
            trans.set_response_status(tlm::TLM_ADDRESS_ERROR_RESPONSE);
        }
        delay += sc_core::sc_time(2, sc_core::SC_NS); // Add Routing Latency
    }
};
 
// 4. Top-Level Virtual Platform
SC_MODULE(VirtualPlatform) {
    CPU_Initiator cpu;
    SimpleRouter  router;
    RAM_Target    ram;
    
    SC_CTOR(VirtualPlatform) : cpu("cpu"), router("router"), ram("ram") {
        // Bind Initiator -> Router -> Targets
        cpu.socket.bind(router.target_socket);
        router.init_socket_ram.bind(ram.socket);
    }
};
 
int sc_main(int argc, char* argv[]) {
    VirtualPlatform vp("vp");
    sc_core::sc_start(100, sc_core::SC_NS);
    return 0;
}

Component Breakdown

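  1. The Initiator (CPU): Creates the tlm_generic_payload, sets the command, physical address, and data pointer, and calls b_transport(). It then calls wait(delay) to consume the delay accumulated by the router and target, bringing simulation time in line with its local time. A production LT initiator would typically use tlm_utils::tlm_quantumkeeper and synchronize only when the quantum is exceeded.
  2. The Interconnect (Router): Implements a decode function based on the memory map. The skeleton has a single initiator socket for the RAM; a production router has one initiator socket per target. Before forwarding a transaction, a production router subtracts the region's base address (e.g., addr - 0x4000_0000) and writes the result back with set_address() so the peripheral only observes a relative local offset.
  3. The Targets (Peripherals): Execute the read/write logic on their internal memory arrays, advance the delay reference by their inherent processing latency, and set TLM_OK_RESPONSE. The skeleton's RAM only logs the access and sets the response; it does not store data yet.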
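
The base-address subtraction and the delay accumulation described above can be sketched without SystemC. The model below is an illustration under stated assumptions: Payload, Status, Target, MockTarget, and Router are hypothetical stand-ins for the TLM types, and the delay is a plain integer count of nanoseconds rather than an sc_time.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for tlm_generic_payload and its response status.
enum class Status { Incomplete, Ok, AddressError };

struct Payload {
    std::uint64_t address;
    Status        status;
};

// Stand-in for the blocking transport interface of a target.
struct Target {
    virtual ~Target() {}
    virtual void b_transport(Payload& trans, std::uint64_t& delay_ns) = 0;
};

// A target that records the offset it observed and adds a fixed latency.
struct MockTarget : Target {
    std::uint64_t latency_ns;
    std::uint64_t last_offset;
    explicit MockTarget(std::uint64_t latency) : latency_ns(latency), last_offset(0) {}
    void b_transport(Payload& trans, std::uint64_t& delay_ns) override {
        last_offset  = trans.address;
        trans.status = Status::Ok;
        delay_ns    += latency_ns;
    }
};

struct Router : Target {
    struct Entry { std::uint64_t base; std::uint64_t size; Target* target; };
    std::vector<Entry> entries;
    std::uint64_t      routing_latency_ns;

    explicit Router(std::uint64_t latency) : routing_latency_ns(latency) {}

    void add(std::uint64_t base, std::uint64_t size, Target* t) {
        assert(t != nullptr && size > 0);
        entries.push_back(Entry{base, size, t});
    }

    void b_transport(Payload& trans, std::uint64_t& delay_ns) override {
        const std::uint64_t global = trans.address;
        trans.status = Status::AddressError;      // result if no region matches
        for (const Entry& e : entries) {
            if (global >= e.base && global - e.base < e.size) {
                trans.address = global - e.base;  // target sees a local offset
                e.target->b_transport(trans, delay_ns);
                break;
            }
        }
        delay_ns += routing_latency_ns;           // routing cost, hit or miss
    }
};
```

With a 10 ns target and a 2 ns router, a routed access returns with 12 ns of accumulated delay, which matches the RAM and routing latencies used in the skeleton. An access to an unmapped address returns AddressError with only the routing latency added.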

Standard and Source Deep Dive: Port Binding

Port binding is the topological glue of a SystemC model. The IEEE 1666-2023 LRM, in Section 4.2.1 (Elaboration) and Sections 6.11-6.13 (Ports, Exports, Interfaces), defines how structural connections are made and verified.

Inside the Accellera Source: sc_port_b and sc_port_registry

In src/sysc/communication/sc_port.h/cpp, every sc_port<IF> derives from the class template sc_port_b<IF>, which in turn derives from the non-template base class sc_port_base. When you declare sc_port<BusIf> bus{"bus"};, the sc_port_base constructor ultimately registers the port by calling insert(this) on the simulation context's port registry.

The sc_port_registry (declared alongside the port classes in src/sysc/communication/sc_port.h and owned by sc_simcontext) is the list of every port in the simulation.

When you write cpu.bus.bind(subsystem.target); in your C++ code, you are invoking the bind() method on sc_port. However, this does not immediately resolve the C++ pointer! Instead, the port simply stores a generic pointer to the bound object in an internal array (because a port can be bound to multiple channels if the port's N parameter is > 1).

The Elaboration Phase: complete_binding()

The real magic happens when sc_start() is called. Before simulation begins, sc_start() invokes sc_simcontext::elaborate(), which ultimately calls sc_port_registry::complete_binding().

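If you trace from sc_simcontext::elaborate() in src/sysc/kernel/sc_simcontext.cpp into src/sysc/communication/sc_port.cpp, you will see sc_port_registry::complete_binding() iterate over every port in the design and call that port's own complete_binding(). For each port: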

  1. It traverses the binding tree. If Port A is bound to Port B, and Port B is bound to Channel C, it recursively walks from A -> B -> C to find the actual sc_interface implementation.
  2. Type Checking: It uses C++ RTTI (dynamic_cast) to verify that the bound object actually implements the interface type IF required by the port.
    // Abstract representation of the kernel's check (message strings are illustrative):
    IF* target_if = dynamic_cast<IF*>(bound_interface);
    if (!target_if) { SC_REPORT_ERROR("port binding", "interface mismatch"); }
  3. It resolves the final interface pointer and stores it directly in the port's m_interface pointer (a port bound to several channels also keeps the full list in a vector).
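
The record-now, resolve-later scheme in the steps above can be modeled in a few lines of plain C++. This is a toy sketch, not the Accellera implementation: ToyPortBase, ToyPort, registry, elaborate, and the Memory and Timer channels are all hypothetical names, and the real kernel reports errors through its report handler rather than by throwing.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <vector>

// Toy stand-ins; none of these are the real SystemC classes.
struct Interface { virtual ~Interface() {} };
struct BusIf : Interface { virtual int read(int addr) = 0; };
struct Memory : BusIf { int read(int addr) override { return addr * 2; } };
struct Timer : Interface {};  // deliberately does not implement BusIf

class ToyPortBase;

// Plays the role of the port registry: every port adds itself on construction.
inline std::vector<ToyPortBase*>& registry() {
    static std::vector<ToyPortBase*> ports;
    return ports;
}

class ToyPortBase {
public:
    ToyPortBase() { registry().push_back(this); }
    virtual ~ToyPortBase() {
        std::vector<ToyPortBase*>& r = registry();
        r.erase(std::remove(r.begin(), r.end(), this), r.end());
    }
    // bind() only records the connection; nothing is resolved yet.
    void bind(ToyPortBase& parent) { parent_ = &parent; }    // port-to-port
    void bind(Interface& channel)  { channel_ = &channel; }  // port-to-channel
    virtual void complete_binding() = 0;
protected:
    // Walk port -> port -> ... -> channel (assumes no binding cycles).
    Interface* resolve() const {
        const ToyPortBase* p = this;
        while (p->channel_ == nullptr && p->parent_ != nullptr) p = p->parent_;
        return p->channel_;
    }
private:
    ToyPortBase* parent_  = nullptr;
    Interface*   channel_ = nullptr;
};

template <class IF>
class ToyPort : public ToyPortBase {
public:
    void complete_binding() override {
        iface_ = dynamic_cast<IF*>(resolve());  // RTTI check against IF
        if (iface_ == nullptr)
            throw std::runtime_error("unbound port or interface mismatch");
    }
    // After elaboration this is a null check plus a pointer return.
    IF* operator->() { assert(iface_ != nullptr); return iface_; }
private:
    IF* iface_ = nullptr;
};

// Resolve every registered port, as the kernel does before simulation starts.
inline void elaborate() {
    for (ToyPortBase* p : registry()) p->complete_binding();
}
```

If an inner port is bound to an outer port and the outer port to a Memory, then after elaborate() both ports hold the same Memory pointer and inner->read(21) is an ordinary virtual call. Binding a ToyPort<BusIf> to a Timer makes elaborate() throw, because the dynamic_cast to BusIf fails.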

Zero-Overhead Simulation Dispatch

Why delay pointer resolution until complete_binding()? Because once elaboration finishes, the port has an absolute, direct C++ pointer to the implementing channel.

In src/sysc/communication/sc_port.h, the overloaded operator-> (defined on sc_port_b<IF>) is extraordinarily simple. A simplified view:

template <class IF>
inline IF* sc_port_b<IF>::operator -> () {
    // The real code first reports an error if m_interface is null (port not bound)
    return m_interface;
}

During simulation, when a thread executes bus->write(0x10, data);, there are no map lookups, no string comparisons, and no routing tables. Apart from the null check, it costs the same as a direct C++ virtual function call on the channel object.
