Chapter 12: Virtual Platform Construction

VP Requirements & Abstraction Level

Defining Loosely Timed (LT) vs Approximately Timed (AT) models and gathering requirements for a Virtual Platform.

How to Read This Lesson

For virtual platforms, imagine a firmware engineer trying to boot real software on your model. Every abstraction choice should help that person move faster without lying about the hardware.

VP Requirements & Abstraction Level

When architecting a Virtual Platform (VP), the first question an Electronic System Level (ESL) architect must answer is: What is the abstraction level?

If the goal is to boot an operating system (Linux, Android) as fast as possible to develop software early, you need Loosely Timed (LT) models. If the goal is to analyze bus contention, memory bandwidth, and cache hit ratios, you need Approximately Timed (AT) models. Now let's look at how the Accellera TLM standard defines these mechanically.

Source and LRM Trail

Virtual platform lessons combine standard TLM behavior with architecture practice. Use Docs/LRMs/SystemC_LRM_1666-2023.pdf for TLM and kernel rules, .codex-src/systemc/src/tlm_core/tlm_2 for sockets and payloads, .codex-src/cci for configurable platforms, and .codex-src/systemc-common-practices for reusable patterns.

Loosely Timed (LT) vs Approximately Timed (AT)

The IEEE 1666 TLM 2.0 standard explicitly defines these two coding styles.

Loosely Timed (LT)

  • Goal: Maximum simulation speed.
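  • Mechanism: Uses the b_transport blocking interface. Under the Hood: A transaction completes in a single C++ function call from the initiator socket to the target, so it normally needs no scheduler event or context switch (a target may still call wait() inside b_transport, but fast models avoid it). Timing is passed by reference (sc_core::sc_time& delay) as an offset from the current simulation time and accumulated by the initiator. This Temporal Decoupling lets the initiator run ahead of the scheduler and yield only when its quantum expires.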
  • Use Case: Software development, firmware validation, functional verification.
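The saving from temporal decoupling can be estimated with a small plain C++ model. This is only a sketch: it does not use the SystemC kernel or tlm_quantumkeeper, and run_decoupled, DecouplingResult, and the fixed per-access cost are hypothetical names and assumptions introduced for illustration. It also ignores that the real quantum keeper aligns sync points to global quantum boundaries.

```cpp
#include <cassert>
#include <cstdint>

// Plain C++ model of temporal decoupling (no SystemC kernel involved).
struct DecouplingResult {
    std::uint64_t sync_count;     // how often the initiator yields to the scheduler
    std::uint64_t total_time_ns;  // simulated time consumed overall
};

// Issues `transactions` accesses that each cost `cost_ns`, yielding to the
// (imaginary) scheduler only when the local offset reaches `quantum_ns`.
inline DecouplingResult run_decoupled(std::uint64_t transactions,
                                      std::uint64_t cost_ns,
                                      std::uint64_t quantum_ns) {
    assert(quantum_ns > 0 && "quantum must be non-zero in this model");
    std::uint64_t local_ns = 0;   // time the initiator has run ahead
    std::uint64_t global_ns = 0;  // time the scheduler has already seen
    std::uint64_t syncs = 0;
    for (std::uint64_t i = 0; i < transactions; ++i) {
        local_ns += cost_ns;           // accumulate locally, do not yield
        if (local_ns >= quantum_ns) {  // quantum used up
            global_ns += local_ns;     // hand the offset to the scheduler
            local_ns = 0;
            ++syncs;
        }
    }
    return {syncs, global_ns + local_ns};
}
```

Under these assumptions, run_decoupled(1000, 10, 1000) synchronizes 10 times, whereas a 10 ns quantum forces a synchronization on every one of the 1000 accesses. The simulated time is the same in both cases; only the number of scheduler interactions changes.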

Approximately Timed (AT)

  • Goal: Cycle-approximate performance analysis.
  • Mechanism: Uses the nb_transport_fw and nb_transport_bw non-blocking interfaces. Under the Hood: A single transaction is broken into multiple phases identified by tlm::tlm_phase values (BEGIN_REQ, END_REQ, BEGIN_RESP, END_RESP in the base protocol). Components usually use a Payload Event Queue (tlm_utils::peq_with_cb_and_phase), which relies on sc_event notifications to schedule payload processing at a later delta cycle or simulation time. This lets the model represent pipeline stages and bus arbitration, but each transaction now costs several scheduler wakeups instead of one function call, which adds up quickly over a long software run.
  • Use Case: Architectural exploration, performance bottleneck analysis.
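The four-phase ordering can be sketched as a tiny state check in plain C++. This is an illustrative model only: Phase, follows, and is_full_handshake are hypothetical names, not part of the TLM-2.0 library, and the sketch validates only the full four-phase sequence. The real base protocol also permits shortcuts (for example, a target may skip END_REQ by sending BEGIN_RESP, or complete early by returning TLM_COMPLETED), which this model deliberately rejects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain C++ stand-in for the four base-protocol phases (not tlm::tlm_phase).
enum class Phase { BeginReq, EndReq, BeginResp, EndResp };

// True if `next` directly follows `current` in the full four-phase handshake.
inline bool follows(Phase current, Phase next) {
    switch (current) {
        case Phase::BeginReq:  return next == Phase::EndReq;
        case Phase::EndReq:    return next == Phase::BeginResp;
        case Phase::BeginResp: return next == Phase::EndResp;
        case Phase::EndResp:   return false;  // transaction already complete
    }
    return false;
}

// True if `trace` is exactly BEGIN_REQ, END_REQ, BEGIN_RESP, END_RESP.
inline bool is_full_handshake(const std::vector<Phase>& trace) {
    if (trace.size() != 4 || trace.front() != Phase::BeginReq) return false;
    for (std::size_t i = 0; i + 1 < trace.size(); ++i) {
        if (!follows(trace[i], trace[i + 1])) return false;
    }
    return true;
}
```

Each arrow in that sequence is a separate nb_transport call, and typically a separate scheduled event, which is where the extra simulation cost of AT comes from compared with a single b_transport call.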

End-to-End LT Initiator Example with Accellera Quantum Keeper

In a Doulos Simple Bus compliant VP, we predominantly use LT to boot software. Here is an LT initiator that uses the standard tlm_utils::tlm_quantumkeeper to manage Temporal Decoupling.

#include <cstdint>   // uint32_t
#include <iostream>  // std::cout
#include <systemc>
#include <tlm>
#include <tlm_utils/simple_initiator_socket.h>
#include <tlm_utils/tlm_quantumkeeper.h>
 
SC_MODULE(LT_CPU_Model) {
    tlm_utils::simple_initiator_socket<LT_CPU_Model> socket;
    tlm_utils::tlm_quantumkeeper m_qk;
 
    SC_CTOR(LT_CPU_Model) : socket("socket") {
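        // Set the global quantum (e.g., sync every 1000 ns). This is a static
        // function, so the value applies to every quantum keeper in the platform.
        tlm_utils::tlm_quantumkeeper::set_global_quantum(sc_core::sc_time(1000, sc_core::SC_NS));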
        // Reset the local time offset
        m_qk.reset();
        
        SC_THREAD(execute_instructions);
    }
 
    void execute_instructions() {
        tlm::tlm_generic_payload trans;
        uint32_t data = 0;
 
        // Temporal Decoupling: Accumulate time locally without yielding to the SystemC scheduler
        for (int i = 0; i < 1000; i++) {
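            trans.set_command(tlm::TLM_READ_COMMAND);
            trans.set_address(0x1000);
            trans.set_data_ptr(reinterpret_cast<unsigned char*>(&data));
            trans.set_data_length(4);
            trans.set_streaming_width(4);        // equal to data length: not a streaming access
            trans.set_byte_enable_ptr(nullptr);  // no byte enables
            trans.set_dmi_allowed(false);        // clear before every reuse of the payload
            trans.set_response_status(tlm::TLM_INCOMPLETE_RESPONSE);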
 
            // Fetch the current local delay from the quantum keeper
            sc_core::sc_time local_delay = m_qk.get_local_time();
            
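            socket->b_transport(trans, local_delay);

            // The target reports success or failure through the response status
            if (trans.is_response_error()) {
                SC_REPORT_ERROR("LT_CPU_Model", trans.get_response_string().c_str());
            }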
            
            // Add internal CPU instruction execution latency
            local_delay += sc_core::sc_time(10, sc_core::SC_NS);
 
            // Update the quantum keeper with the new accumulated local delay
            m_qk.set(local_delay);
 
            // need_sync() reports whether the local time offset has reached the end of
            // the current quantum. sync() then calls wait() for that offset, yielding to
            // the SystemC scheduler, and resets the local time to SC_ZERO_TIME.
            if (m_qk.need_sync()) {
                std::cout << "@" << sc_core::sc_time_stamp() << " [CPU] Quantum exceeded. Syncing scheduler." << std::endl;
                m_qk.sync();
            }
        }
    }
};
 
int sc_main(int argc, char* argv[]) {
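    // Boilerplate for standalone compilation: no module is instantiated here, so
    // nothing is simulated. A real platform would create LT_CPU_Model, bind its
    // socket to a target, and call sc_core::sc_start().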
    return 0;
}

Gathering requirements up front and using the standard tlm_quantumkeeper helps you avoid the costly mistake of writing slow, cycle-accurate models when the software team only needs a fast functional platform.
