Multi-Processor System-on-Chip 2. Liliana Andrade

      to a finite number of n + 1 bits (n fraction bits, 1 sign bit). The reference, as well as other parts of the test bed, is coded in double-complex precision (64-bit imaginary, 64-bit real); hence, type casting also takes place at the kernel interfaces.
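      As a rough illustration of the quantization described above, the following Python sketch casts a double-precision complex sample to n + 1 bits per component (n fraction bits, 1 sign bit). The function names, the round-to-nearest behaviour and the saturation to the range [-1, 1 - 2^-n] are assumptions made for this example; the text does not state which rounding or overflow mode the test bed uses.

```python
def quantize(x: float, n_frac: int) -> float:
    """Quantize x to a signed fixed-point value with n_frac fraction bits
    plus 1 sign bit (assumed two's-complement range [-1, 1 - 2**-n_frac])."""
    scale = 1 << n_frac
    # Assumption: round to nearest (Python's round() resolves ties to even).
    q = round(x * scale)
    # Assumption: saturate instead of wrapping on overflow.
    q = max(-scale, min(scale - 1, q))
    return q / scale


def quantize_complex(z: complex, n_frac: int) -> complex:
    """Quantize real and imaginary parts independently, mimicking the type
    cast from double-complex to fixed point at a kernel interface."""
    return complex(quantize(z.real, n_frac), quantize(z.imag, n_frac))


if __name__ == "__main__":
    sample = 0.3 + 0.7j
    # 15 fraction bits + 1 sign bit gives a 16-bit word per component.
    print(quantize_complex(sample, 15))
```

      With round-to-nearest and no saturation, the error per component is bounded by half a least significant bit, that is 2^-(n+1).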

      where P_noise is the average error power of the data-carrying subcarriers and P_signal is the average power of the data-carrying subcarriers. The average power of the data-carrying subcarriers is invariant to the QAM modulation order; hence, the measured EVM is the same for different QAM modulations, although the available budget is not. Measurement results are shown in Figure 1.12 in EVM_dB notation, with the budget constraints overlaid on the results.
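      The equation these two power terms belong to is not reproduced in this excerpt. The sketch below therefore assumes the conventional definition EVM_dB = 10 log10(P_noise / P_signal), with P_noise taken as the mean squared error between measured and reference subcarrier values. The function names and the sample data are hypothetical and are not taken from the test bed.

```python
import math


def mean_power(samples):
    """Average power (mean squared magnitude) of complex samples."""
    return sum(abs(s) ** 2 for s in samples) / len(samples)


def evm_db(reference, measured):
    """EVM in dB under the assumed definition 10*log10(P_noise / P_signal).

    reference: ideal values of the data-carrying subcarriers
    measured:  values produced by the kernel under test
    Note: raises ValueError if measured equals reference exactly (log of 0).
    """
    p_signal = mean_power(reference)
    p_noise = mean_power([m - r for m, r in zip(measured, reference)])
    return 10.0 * math.log10(p_noise / p_signal)


if __name__ == "__main__":
    # Hypothetical unit-power QPSK-like points with a fixed 0.1 offset.
    ref = [1 + 0j, -1 + 0j, 1j, -1j]
    meas = [r + 0.1 for r in ref]
    print(evm_db(ref, meas))
```

      In this example the error power is 0.01 relative to a unit-power signal, so the result is close to -20 dB.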

      Third, the transition region is centered on the point ACCbits = databits and extends ⌈log2 M⌉ bits away from that center point (rounding is towards positive infinity, as in ⌈3.14⌉ = 4). M corresponds to the number of MACs per accumulator in the algorithm shown in Figure 1.9, which explains the need for extra bits in logarithmic proportion to the number of MACs. Hence, the following conclusions can be drawn:

       – a data type with a bit-length of 16 is sufficient for all QAM modulations required by the standard, leaving a buffer budget that ranges from 25.27 dB for QPSK to 8.37 dB for 1024-QAM. This leftover budget can be used by other DBB transmitter processing blocks;

       – at least ⌈log2 M⌉ accumulator guard bits are needed to avoid signal degradation due to the EVM transition region.
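      The guard-bit rule in the second conclusion can be sketched in a few lines of Python. The helper names are hypothetical, and M = 7 with 16-bit data is used only as an example value; the accumulator width shown is simply the data width plus the ceiling of log2 M guard bits, which is one reading of the rule above rather than a statement of the vDSP's actual accumulator format.

```python
def guard_bits(num_macs: int) -> int:
    """Return ceil(log2(num_macs)) using integer arithmetic only."""
    if num_macs < 1:
        raise ValueError("num_macs must be at least 1")
    # For M >= 1, (M - 1).bit_length() equals ceil(log2(M)).
    return (num_macs - 1).bit_length()


def accumulator_bits(data_bits: int, num_macs: int) -> int:
    """Assumed minimum accumulator width: data width plus guard bits."""
    return data_bits + guard_bits(num_macs)


if __name__ == "__main__":
    # Example values: 16-bit data, M = 7 MACs per accumulator.
    print(guard_bits(7), accumulator_bits(16, 7))
```

      Using the integer bit_length avoids floating-point rounding surprises that math.log2 followed by math.ceil could introduce for very large M.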

      Figure 1.12. GFDM EVM for varied data and ACC complex bit-lengths compared to adjusted 3GPP EVM DBB requirements (3GPP 2018b, 2019a)

      Implementing algorithms on wide vector processors, such as the vDSP that we used, introduces a series of considerations that need to be taken into account. The design solution space grows further when an algorithm has multiple loops: we need to decide, for example, which loop to vectorize and in which order to nest the loops, and both choices have a notable impact on the kernel's overall performance and requirements. These considerations add yet another layer of complexity and deserve a chapter of their own. Here, we cover only the most important elements, in abridged form, to arrive at the next set of HW requirements: How does the 6G candidate waveform kernel map onto SotA vDSPs under corner workloads? How much of the vDSP core cycle budget does it require?

      Is it practical to run the kernel on the vDSP, provided that the current vDSP load is sufficiently low?

      Figure 1.13. vDSP simplified HW block diagram

Use case                              Throughput   TTI [μs]   K [#]   M [#]   Kernels req. [#]   Deadline [μs]
low-end LTE legacy                    72           1000       128     7       1                  500
high-end FR2 4×CA, µ = 3, 400 MHz     3,168        125        4,096   7       4                  62.5