VP Architecture & Design
Designing the memory map and overall architecture for a multi-component Virtual Platform using standard TLM patterns.
How to Read This Lesson
For virtual platforms, imagine a firmware engineer trying to boot real software on your model. Every abstraction choice should help that person move faster without lying about the hardware.
Building a Virtual Platform: Architecture
Up until this chapter, we have looked at isolated SystemC concepts: ports, events, TLM sockets, and memory management. But how do you combine all of these into a massive, bootable Virtual Platform (VP)?
In this multi-part tutorial, we will build a completely functional Virtual Platform from scratch.
We will follow the Doulos simple-bus patterns and the Accellera TLM-2.0 loosely-timed (LT) and approximately-timed (AT) coding styles. This keeps the architecture vendor-neutral, interoperable with other TLM-2.0 models, and consistent with the LRM.
Source and LRM Trail
Virtual platform lessons combine standard TLM behavior with architecture practice. Use Docs/LRMs/SystemC_LRM_1666-2023.pdf for TLM and kernel rules, .codex-src/systemc/src/tlm_core/tlm_2 for sockets and payloads, .codex-src/cci for configurable platforms, and .codex-src/systemc-common-practices for reusable patterns.
The Goal
We are going to build a System-on-Chip (SoC) comprising:
- A CPU Wrapper (Initiator): A mock Instruction Set Simulator (ISS) that initiates TLM Loosely Timed (LT) memory-mapped read/write transactions.
- A Router (Interconnect): A TLM bus that routes the CPU's generic payload transactions to the correct peripheral based on the memory address.
- A RAM Module (Target): A contiguous block of memory.
- A Timer Peripheral (Target): A memory-mapped hardware timer.
The Memory Map
Every memory-mapped SoC relies on a Memory Map. When the CPU writes to physical address 0x4000_0004, the Router must determine which target socket to forward the transaction to.
Here is the hardware memory map for our Virtual Platform:
| Peripheral | Base Address | Size | End Address |
|---|---|---|---|
| RAM | 0x0000_0000 | 256 KB (0x40000) | 0x0003_FFFF |
| Timer | 0x4000_0000 | 4 KB (0x1000) | 0x4000_0FFF |
| UART | 0x4001_0000 | 4 KB (0x1000) | 0x4001_0FFF |
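The table above can be captured as data and searched by a small decode function. The following is a minimal sketch in plain C++ with no SystemC dependency; `Region`, `Decoded`, `kMemoryMap`, and `decode` are hypothetical names chosen for illustration, not part of any TLM API.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

// One row of the memory map table (hypothetical type for illustration).
struct Region {
    std::string_view name;
    std::uint64_t base;
    std::uint64_t size;
};

// The three rows from the table above.
constexpr std::array<Region, 3> kMemoryMap{{
    {"RAM",   0x00000000, 0x40000},
    {"Timer", 0x40000000, 0x1000},
    {"UART",  0x40010000, 0x1000},
}};

// Result of a successful decode: the region that was hit and the local offset.
struct Decoded {
    std::string_view name;
    std::uint64_t offset;
};

// Returns the matching region and local offset, or std::nullopt if unmapped.
inline std::optional<Decoded> decode(std::uint64_t addr) {
    for (const Region& r : kMemoryMap) {
        assert(r.size != 0 && "empty region in memory map");
        // Comparing (addr - base) against size avoids overflow in base + size.
        if (addr >= r.base && addr - r.base < r.size) {
            return Decoded{r.name, addr - r.base};
        }
    }
    return std::nullopt;
}
```

With this table, `decode(0x40000004)` selects the Timer with local offset `0x4`, while `decode(0x00040000)` (one byte past the end of RAM) is unmapped.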
Virtual Platform Skeleton Example & Kernel Mechanics
The skeleton in this section is a complete, runnable outline of the VP top-level architecture. It demonstrates how to instantiate the initiator, the router, and a target (the RAM), and bind them using simple_initiator_socket and simple_target_socket.
Under the Hood (Accellera TLM Kernel):
Why do we use tlm_utils::simple_initiator_socket instead of tlm::tlm_initiator_socket?
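The raw base sockets require the module to implement the transport interfaces itself: a target inherits from the forward transport interface (tlm_fw_transport_if), an initiator inherits from the backward interface (tlm_bw_transport_if), and each socket must be bound to that implementation manually. The tlm_utils convenience sockets encapsulate this boilerplate. They provide callback registration methods such as register_b_transport() (ordinary member functions, not macros) that forward each incoming call to a C++ member function on your module.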
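The forwarding idea behind a convenience socket can be sketched in plain C++. This is an illustration of the mechanism only, not the Accellera implementation: `Payload`, `FwTransportIf`, `MiniTargetSocket`, and `Ram` are hypothetical stand-ins, and the delay is a plain nanosecond counter rather than `sc_time`.

```cpp
#include <cassert>
#include <cstdint>

// Heavily reduced stand-in for the generic payload (hypothetical).
struct Payload {
    std::uint64_t address = 0;
    bool ok = false;
};

// Stand-in for the forward transport interface.
struct FwTransportIf {
    virtual ~FwTransportIf() = default;
    virtual void b_transport(Payload& trans, std::uint64_t& delay_ns) = 0;
};

// Convenience socket: implements the interface itself and forwards each call
// to a member function registered by the owning module.
template <typename Module>
class MiniTargetSocket : public FwTransportIf {
public:
    using Callback = void (Module::*)(Payload&, std::uint64_t&);

    void register_b_transport(Module* owner, Callback cb) {
        owner_ = owner;
        callback_ = cb;
    }

    void b_transport(Payload& trans, std::uint64_t& delay_ns) override {
        assert(owner_ && callback_ && "no b_transport callback registered");
        (owner_->*callback_)(trans, delay_ns);
    }

private:
    Module* owner_ = nullptr;
    Callback callback_ = nullptr;
};

// The module does not inherit from the interface; it only registers a method.
struct Ram {
    MiniTargetSocket<Ram> socket;

    Ram() { socket.register_b_transport(this, &Ram::handle); }

    void handle(Payload& trans, std::uint64_t& delay_ns) {
        trans.ok = true;
        delay_ns += 10;  // assumed RAM latency in nanoseconds
    }
};
```

A caller that only holds a `FwTransportIf&` to `ram.socket` reaches `Ram::handle` through one virtual call and one member-pointer call, which is the boilerplate the module author no longer writes by hand.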
Furthermore, how does a transaction travel from the CPU to the Router?
TLM 2.0 sockets are built on sc_core::sc_port and sc_core::sc_export. The call cpu.socket.bind(router.target_socket) records the connection while the hierarchy is constructed, and the kernel resolves it to an interface pointer at the end of elaboration. When run_program() calls socket->b_transport(), no OS context switch and no event queue is involved. The call is a virtual function call through the resolved pointer, which the simple_target_socket forwards to the registered callback, so the CPU's thread executes SimpleRouter::b_transport as an ordinary blocking C++ function call. The thread context (the stack of run_program) extends down into the Router and the RAM. This absence of scheduling overhead is a major reason TLM LT simulation is fast.
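The nesting described above can be made visible with a plain C++ model that logs when each component is entered and left. This is a sketch under simplifying assumptions: `TransportIf`, `Trace`, and the three structs are hypothetical stand-ins, and the payload and delay arguments are omitted so that only the call structure remains.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which calls are entered and left.
using Trace = std::vector<std::string>;

// Stand-in for the blocking transport interface.
struct TransportIf {
    virtual ~TransportIf() = default;
    virtual void b_transport(Trace& trace) = 0;
};

struct Ram : TransportIf {
    void b_transport(Trace& trace) override {
        trace.push_back("ram: enter");
        trace.push_back("ram: return");
    }
};

struct Router : TransportIf {
    TransportIf* downstream = nullptr;  // resolved once, at bind time

    void bind(TransportIf& target) { downstream = &target; }

    void b_transport(Trace& trace) override {
        assert(downstream && "router socket is unbound");
        trace.push_back("router: enter");
        downstream->b_transport(trace);  // nested call on the caller's stack
        trace.push_back("router: return");
    }
};

struct Cpu {
    TransportIf* socket = nullptr;

    void bind(TransportIf& target) { socket = &target; }

    Trace run_program() {
        assert(socket && "cpu socket is unbound");
        Trace trace;
        trace.push_back("cpu: call");
        socket->b_transport(trace);      // returns only after the RAM is done
        trace.push_back("cpu: resumed");
        return trace;
    }
};
```

After binding `router` to `ram` and `cpu` to `router`, the trace returned by `run_program()` shows the RAM being entered and left strictly inside the router call, which is itself inside the CPU call: no scheduler is involved at any point.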
```cpp
#include <cstdint>
#include <iostream>
#include <systemc>
#include <tlm>
#include <tlm_utils/simple_initiator_socket.h>
#include <tlm_utils/simple_target_socket.h>

// 1. Mock CPU Initiator
SC_MODULE(CPU_Initiator) {
    tlm_utils::simple_initiator_socket<CPU_Initiator> socket;

    SC_CTOR(CPU_Initiator) : socket("cpu_socket") {
        SC_THREAD(run_program);
    }

    void run_program() {
        tlm::tlm_generic_payload trans;
        sc_core::sc_time delay = sc_core::SC_ZERO_TIME;

        // Mock writing to the RAM (Address 0x0000_0004)
        std::uint32_t data = 0xDEADBEEF;
        trans.set_command(tlm::TLM_WRITE_COMMAND);
        trans.set_address(0x00000004);
        trans.set_data_ptr(reinterpret_cast<unsigned char*>(&data));
        trans.set_data_length(4);
        trans.set_streaming_width(4);        // Equal to data length: not a streaming access
        trans.set_byte_enable_ptr(nullptr);  // All bytes enabled
        trans.set_dmi_allowed(false);
        trans.set_response_status(tlm::TLM_INCOMPLETE_RESPONSE);

        std::cout << "@" << sc_core::sc_time_stamp() << " [CPU] Sending Write to 0x4" << std::endl;
        socket->b_transport(trans, delay);

        // The target reports success or failure through the response status
        if (trans.is_response_error()) {
            SC_REPORT_ERROR("CPU_Initiator", trans.get_response_string().c_str());
        }
        wait(delay); // Advance time based on interconnect/target delay
    }
};

// 2. Mock Peripheral Target (RAM)
SC_MODULE(RAM_Target) {
    tlm_utils::simple_target_socket<RAM_Target> socket;

    SC_CTOR(RAM_Target) : socket("ram_socket") {
        socket.register_b_transport(this, &RAM_Target::b_transport);
    }

    void b_transport(tlm::tlm_generic_payload& trans, sc_core::sc_time& delay) {
        std::cout << "@" << sc_core::sc_time_stamp() << " [RAM] Received transaction at offset 0x"
                  << std::hex << trans.get_address() << std::dec << std::endl;
        trans.set_response_status(tlm::TLM_OK_RESPONSE);
        delay += sc_core::sc_time(10, sc_core::SC_NS); // Add RAM latency
    }
};

// 3. Simple Router (Interconnect)
SC_MODULE(SimpleRouter) {
    tlm_utils::simple_target_socket<SimpleRouter> target_socket;
    // One initiator socket per downstream target (only the RAM in this skeleton)
    tlm_utils::simple_initiator_socket<SimpleRouter> init_socket_ram;

    SC_CTOR(SimpleRouter) : target_socket("target_socket"), init_socket_ram("init_socket_ram") {
        target_socket.register_b_transport(this, &SimpleRouter::b_transport);
    }

    void b_transport(tlm::tlm_generic_payload& trans, sc_core::sc_time& delay) {
        const std::uint64_t addr = trans.get_address();

        // Memory Map Decoder Logic (RAM: 0x0000_0000 to 0x0003_FFFF)
        if (addr <= 0x0003FFFF) {
            // Forward to RAM; its base address is 0, so addr is already the local offset
            init_socket_ram->b_transport(trans, delay);
        } else {
            trans.set_response_status(tlm::TLM_ADDRESS_ERROR_RESPONSE);
        }
        delay += sc_core::sc_time(2, sc_core::SC_NS); // Add Routing Latency
    }
};

// 4. Top-Level Virtual Platform
SC_MODULE(VirtualPlatform) {
    CPU_Initiator cpu;
    SimpleRouter router;
    RAM_Target ram;

    SC_CTOR(VirtualPlatform) : cpu("cpu"), router("router"), ram("ram") {
        // Bind Initiator -> Router -> Targets
        cpu.socket.bind(router.target_socket);
        router.init_socket_ram.bind(ram.socket);
    }
};

int sc_main(int argc, char* argv[]) {
    VirtualPlatform vp("vp");
    sc_core::sc_start(100, sc_core::SC_NS);
    return 0;
}
```
Component Breakdown
- The Initiator (CPU): Creates the tlm_generic_payload, sets the command, physical address, and data pointer, and calls b_transport(). It then yields via wait(delay) so that simulation time catches up with the delay accumulated along the Loosely Timed path.
- The Interconnect (Router): Implements a decode function based on the memory map. Before forwarding the transaction to a target through the matching initiator socket, a production router will subtract the base address (e.g., addr - 0x4000_0000) so the peripheral only observes a relative local offset.
- The Targets (Peripherals): Execute the read/write logic on their internal memory arrays, advance the delay reference by their inherent processing latency, and set TLM_OK_RESPONSE. The RAM in the skeleton only logs the access; the storage itself is left out.
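The base-address subtraction and the target-side read/write logic from the breakdown above can be sketched together in plain C++. This is a minimal illustration, not the tutorial's final router: `Transaction`, `MemoryTarget`, and `Router` are hypothetical types, delays are plain nanosecond counters, and the latency values passed in by the caller are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

enum class Command { Read, Write };
enum class Status { Incomplete, Ok, AddressError };

// Reduced stand-in for the generic payload (hypothetical).
struct Transaction {
    Command command = Command::Read;
    std::uint64_t address = 0;
    unsigned char* data = nullptr;
    std::size_t length = 0;
    Status status = Status::Incomplete;
};

// Target: a byte array addressed by local offset only.
class MemoryTarget {
public:
    MemoryTarget(std::size_t size, std::uint64_t latency_ns)
        : storage_(size, 0), latency_ns_(latency_ns) {}

    void b_transport(Transaction& t, std::uint64_t& delay_ns) {
        // Reject accesses that start or end outside the array.
        if (t.address > storage_.size() || t.length > storage_.size() - t.address) {
            t.status = Status::AddressError;
            return;
        }
        unsigned char* cell = storage_.data() + t.address;
        if (t.command == Command::Write) {
            std::memcpy(cell, t.data, t.length);
        } else {
            std::memcpy(t.data, cell, t.length);
        }
        delay_ns += latency_ns_;  // target latency
        t.status = Status::Ok;
    }

private:
    std::vector<unsigned char> storage_;
    std::uint64_t latency_ns_;
};

// Router: decodes the address, rewrites it to a local offset, forwards.
class Router {
public:
    void map(std::uint64_t base, std::uint64_t size, MemoryTarget& target) {
        entries_.push_back(Entry{base, size, &target});
    }

    void b_transport(Transaction& t, std::uint64_t& delay_ns) {
        for (const Entry& e : entries_) {
            if (t.address >= e.base && t.address - e.base < e.size) {
                t.address -= e.base;  // target sees a local offset
                e.target->b_transport(t, delay_ns);
                delay_ns += 2;        // routing latency, as in the skeleton
                return;
            }
        }
        t.status = Status::AddressError;  // no region matched
    }

private:
    struct Entry {
        std::uint64_t base;
        std::uint64_t size;
        MemoryTarget* target;
    };
    std::vector<Entry> entries_;
};
```

For example, with a 4 KB target mapped at `0x40000000`, a write to `0x40000004` arrives at the target as offset `0x4`, and the delay seen by the caller is the target latency plus the 2 ns routing latency.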
Standard and Source Deep Dive: Port Binding
Port binding is the topological glue of a SystemC model. The IEEE 1666-2023 LRM Section 4.2.1 (Elaboration) and Sections 6.11-6.13 (Ports, Exports, Interfaces) define how structural connections are made and verified.
Inside the Accellera Source: sc_port_b and sc_port_registry
In src/sysc/communication/sc_port.h and sc_port.cpp, every specialized sc_port<IF> class derives from the template class sc_port_b<IF>, which in turn derives from the non-template base class sc_port_base.
When you declare sc_port<BusIf> bus{"bus"};, the base-class constructor ultimately calls get_port_registry()->insert(this) on the simulation context.
The sc_port_registry (declared alongside the port classes in sc_port.h and owned by sc_simcontext) is the global list of every port in the simulation.
When you write cpu.bus.bind(subsystem.target); in your C++ code, you are invoking the bind() method on sc_port. However, this does not necessarily resolve the final interface pointer. Instead, the port records the bound object in an internal list of binding records (a list, because a port can be bound to multiple channels when its N parameter is greater than 1, or 0 for no upper limit).
The Elaboration Phase: complete_binding()
The real magic happens when sc_start() is called.
Before simulation begins, sc_start() invokes sc_simcontext::elaborate(), which ultimately calls sc_port_registry::complete_binding().
If you trace src/sysc/communication/sc_port.cpp, you will see complete_binding() iterate over every port in the design. For each port:
- It traverses the binding tree. If Port A is bound to Port B, and Port B is bound to Channel C, it recursively walks from A -> B -> C to find the actual sc_interface implementation.
- Type Checking: It uses C++ RTTI (dynamic_cast) to verify that the target object actually implements the interface type IF required by the port.

  ```cpp
  // Abstract representation of the kernel's check (not the literal source):
  IF* target_if = dynamic_cast<IF*>(bound_object);
  if (!target_if) {
      SC_REPORT_ERROR("port binding", "interface type mismatch");
  }
  ```
- It resolves the final interface pointer and stores it in the port's interface vector (m_interface_vec); the m_interface member caches the first entry.
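The deferred-binding scheme described above (register every port, record bindings, resolve them in one pass before simulation) can be modeled in a few dozen lines of plain C++. This is a simplified sketch, not the Accellera source: `Interface`, `BusIf`, `Memory`, `Other`, `PortRegistry`, `PortBase`, and `Port` are hypothetical names, each port holds a single binding, and errors are reported with exceptions instead of the SystemC report handler.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Stand-in for sc_interface.
struct Interface { virtual ~Interface() = default; };

// A bus interface, a channel implementing it, and an unrelated channel.
struct BusIf : Interface { virtual int read(int addr) = 0; };
struct Memory : BusIf { int read(int addr) override { return addr * 2; } };
struct Other : Interface {};

class PortBase;

// List of every port; resolution is deferred to complete_binding().
struct PortRegistry {
    std::vector<PortBase*> ports;
    void complete_binding();
};

class PortBase {
public:
    explicit PortBase(PortRegistry& reg) { reg.ports.push_back(this); }
    virtual ~PortBase() = default;

    // bind() only records the request; nothing is resolved yet.
    void bind(PortBase& parent) { parent_ = &parent; }
    void bind(Interface& channel) { channel_ = &channel; }

    // Walk port -> port -> ... -> channel, then resolve the typed pointer.
    void complete_binding() {
        const PortBase* p = this;
        while (p->parent_ != nullptr) { p = p->parent_; }
        if (p->channel_ == nullptr) { throw std::runtime_error("port is unbound"); }
        resolve(*p->channel_);
    }

protected:
    virtual void resolve(Interface& channel) = 0;

private:
    PortBase* parent_ = nullptr;
    Interface* channel_ = nullptr;
};

inline void PortRegistry::complete_binding() {
    for (PortBase* p : ports) { p->complete_binding(); }
}

template <typename IF>
class Port : public PortBase {
public:
    using PortBase::PortBase;

    // After resolution, dispatch is a pointer load plus a virtual call.
    IF* operator->() {
        assert(iface_ && "complete_binding() has not run");
        return iface_;
    }

protected:
    void resolve(Interface& channel) override {
        iface_ = dynamic_cast<IF*>(&channel);  // RTTI type check
        if (iface_ == nullptr) { throw std::runtime_error("interface type mismatch"); }
    }

private:
    IF* iface_ = nullptr;
};
```

Binding an inner `Port<BusIf>` to an outer `Port<BusIf>`, and the outer port to a `Memory`, leaves both ports pointing at the same `Memory` object once `PortRegistry::complete_binding()` has run. Binding a `Port<BusIf>` to an `Other` channel fails at that point with a type mismatch rather than at the first access during simulation.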
Zero-Overhead Simulation Dispatch
Why delay pointer resolution until complete_binding()? Because once elaboration finishes, the port has an absolute, direct C++ pointer to the implementing channel.
In src/sysc/communication/sc_port.h, the overloaded operator-> (defined on sc_port_b<IF>) is very simple. Apart from a check that the port is bound, it reduces to:

```cpp
template <class IF>
inline IF* sc_port_b<IF>::operator -> () {
    return m_interface;
}
```

During simulation, when a thread executes bus->write(0x10, data);, there are no map lookups, no string comparisons, and no routing tables. The cost is a pointer load followed by an ordinary C++ virtual function call on the channel object.