# Evaluation of 3D-DiamondMesh homogeneous and heterogeneous architectures

Lakshmi Kiranmai. V<sup>1</sup>, Kota Naga Srinivasarao Batta<sup>2</sup>, and Sowjanya Kotte<sup>3</sup>

Abstract: The present work extends the 2D DiamondMesh to 3D homogeneous and heterogeneous DiamondMesh architectures. By incorporating diagonal links into the conventional mesh topology, the 2D DiamondMesh improves network performance while retaining the regular, simple and scalable properties of the Mesh topology. To further improve the performance of the network, 2D DiamondMesh has been extended to 3D DiamondMesh by stacking the 2D layers vertically and interconnecting them with through-silicon vias (TSVs). In this work, 3D DiamondMesh has been evaluated for different network sizes. Also, five heterogeneous 3D DiamondMesh architectures have been proposed and evaluated. The results indicate that heterogeneity in topology across the layers yields either a remarkable reduction in latency with a slight area overhead or a reduction in area with a slight penalty in performance. An average reduction of 7.83%, 9.59%, 13.18%, 4.80%, 8.85%, 10.50%, 6.56% and 10.63% in average packet latency (APL) can be observed with 4Layer-64node XDMesh, DiamondMesh, DMesh, DiamondMesh+Mesh, DiamondMesh+XDMesh, DiamondMesh+DMesh, DMesh+Mesh, DMesh+XDMesh architectures respectively over the conventional 4Layer-64node Mesh architecture. Similarly, an average reduction of 9.26%, 21.21%, 25.00%, 10.47%, 15.21%, 23.00%, 12.50% and 16.90% can be observed with 4Layer-256node XDMesh, DiamondMesh, DMesh, DiamondMesh+Mesh, DiamondMesh+XDMesh, DiamondMesh+DMesh, DMesh+Mesh, DMesh+XDMesh architectures respectively over the conventional 4Layer-256node Mesh architecture.

#### I. INTRODUCTION

Network-on-Chip (NoC), a modular, scalable, robust, high-performance communication infrastructure, has emerged as a revolutionary approach to interconnect a large number of cores in complex MultiProcessor System-on-Chip (MPSoC) architectures [1]. Topology refers to the arrangement of nodes and the connecting links among the nodes of the interconnection architecture. The selection of topology is one of the key design aspects of the interconnection architecture. It has a significant effect on the latency, area and power consumption of NoC-based MPSoCs as it determines the wire lengths, node degree and routing strategies [2]. Numerous NoC topologies like Mesh, Torus, Spidergon, CMesh, BFT, Tree, Ring, Hybrid Hexagonal Star, etc. have been proposed [2]-[6]. Mesh is the most widely used architecture for implementing less complex SoCs due to its simple, regular and scalable structure and its short-range links. As Mesh has a large network diameter, its performance degrades for large networks. Several techniques have been proposed to improve the performance of the mesh topology and enable its application in complex SoCs. One of these is to incorporate diagonal links in the traditional Mesh topology to reduce the latency and improve the throughput of the NoC. A few diagonal Mesh based topologies like DMesh [7], XDMesh [8], ZMesh [9] and DiamondMesh [10] have been proposed. Among these, DiamondMesh, proposed and evaluated in [10], has been shown to be more balanced than the other referred diagonal mesh based topologies in terms of the area-performance-power tradeoffs. The benefits of the DiamondMesh topology over the other referred diagonal mesh topologies in terms of the power-area-latency tradeoff are more pronounced under heavy traffic than under low traffic. The key contributions of the DiamondMesh work [10] include the evaluation and analysis of the topological parameters of DiamondMesh, the performance evaluation of the topology in terms of power, area and latency, and the analysis of the power-area-latency tradeoff over the other diagonal mesh topologies under different synthetic traffic patterns and real-time benchmarks.

\*This work was not supported by any organization

In the present work, to evaluate the performance of larger architectures, DiamondMesh has been extended to 3D-DiamondMesh and the performance under different synthetic traffic patterns for different network sizes has been analysed. The key aspects of the paper include:

- Performance evaluation of 3D-DiamondMesh, comparison of the experimental findings with the 3D-Mesh topology and state-of-the-art diagonal mesh topologies
  DMesh, ZMesh, XDMesh that are extended to 3D-DMesh, 3D-ZMesh, 3D-XDMesh.
- Different heterogeneous configurations have been proposed and evaluated.

#### II. RELATED WORK

During the past few years, different diagonal Mesh based topologies have been proposed. Chifeng Wang et al. [7] proposed the DMesh topology, constructed with diagonal links across the baseline Mesh topology. DMesh utilises an X-architecture routing approach to reduce latency at the expense of moderate area and power overheads. Their results indicate that incorporating diagonal links is a more area- and power-efficient approach to improving NoC performance than using larger buffers. Md. Hasan Furhad et al. [8] presented an extended diagonal mesh topology termed XDMesh to improve network performance. XDMesh outperformed other topologies such as mesh, extended-butterfly fat tree and diametrical mesh in terms of latency, throughput, power consumption and area. Prasad et al. [9] proposed and evaluated the ZMesh topology, a diagonal mesh based topology. The topological parameters of ZMesh such as network diameter, bisection width and the number of edges were explored. ZMesh was evaluated and compared with Mesh, DMesh, CMesh and PDNoC in terms of latency and power under different synthetic and real-time traffic patterns. ZMesh performed better than Mesh, PDNoC and CMesh. ZMesh had the lowest power-latency product (PLP) up to an injection rate of 0.3; beyond 0.3, however, the PLP of ZMesh was higher than that of DMesh. Kiranmai et al. [10] proposed DiamondMesh, a diagonal Mesh based topology. The topological parameters of DiamondMesh were evaluated and analysed. For 16-node and 64-node networks, the simulation findings indicated that DiamondMesh outperformed Mesh, XDMesh and ZMesh in terms of throughput, latency and power-latency product. The findings also indicated that the latency and throughput characteristics of DiamondMesh were close to those of DMesh, with a considerable reduction in area and power consumption.

<sup>1</sup>Lakshmi Kiranmai. V is with the Department of Electronics and Communication Engineering, National Institute of Technology Warangal, India kiranmaik@student.nitw.ac.in

<sup>2</sup>Kota Naga Srinivasarao Batta is with the Department of Electronics and Communication Engineering, National Institute of Technology Warangal, India srinu.bkn@nitw.ac.in

<sup>3</sup>Sowjanya Kotte is with the Department of Electronics and Communication Engineering, Kakatiya Institute of Technology and Sciences, Warangal, India ks.ece@kitsw.ac.in

Networks-on-Chip combined with 3D integrated circuits (ICs) form 3D-NoCs that improve the performance, footprint and scalability of complex SoC architectures. Compared to 2D NoCs, 3D-NoCs have shorter links, a smaller footprint, and reduced latency and power [11]. Furthermore, 3D ICs enable the integration of CMOS circuits with heterogeneous technologies [12]. Research on 3D-NoCs and heterogeneous 3D-NoCs has been emerging in recent times. In this context, the present work extends the 2D DiamondMesh to 3D DiamondMesh and performs a comprehensive study of homogeneous and heterogeneous 3D DiamondMesh architectures for different network sizes under different traffic patterns.

#### III. 3D-DIAMONDMESH

2D-DiamondMesh layers have been stacked vertically and interconnected with TSVs (Through-Silicon Vias) to form 3D-DiamondMesh, as depicted in Fig. 1(a). The 3D architectures of the Mesh, XDMesh, ZMesh and DMesh topologies under investigation are depicted in Fig. 1(b)-(e).

Heterogeneity has been introduced in such a way that two alternate layers are of the same topology and the other two alternate layers are of a different topology. Five heterogeneous architectures, shown in Fig. 2, have been proposed and evaluated as part of the present work.
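The alternating layer assignment can be sketched as follows. This is a minimal illustration, assuming layers are numbered from 0 so that Topology1 occupies layers 0 and 2 and Topology2 occupies layers 1 and 3, as listed in Table II. The function name `layer_topologies` is hypothetical and is not part of any simulator.

```python
from typing import List, Tuple

# The five heterogeneous pairs (Topology1, Topology2) listed in Table II.
HETEROGENEOUS_PAIRS: List[Tuple[str, str]] = [
    ("DiamondMesh", "Mesh"),
    ("DiamondMesh", "XDMesh"),
    ("DiamondMesh", "DMesh"),
    ("DMesh", "Mesh"),
    ("DMesh", "XDMesh"),
]


def layer_topologies(topology1: str, topology2: str, layers: int = 4) -> List[str]:
    """Return the topology of each layer, index 0 being the first layer.

    Even-numbered layers use topology1 and odd-numbered layers use topology2.
    """
    return [topology1 if z % 2 == 0 else topology2 for z in range(layers)]


if __name__ == "__main__":
    for t1, t2 in HETEROGENEOUS_PAIRS:
        print(f"{t1}+{t2}: {layer_topologies(t1, t2)}")
```

Calling `layer_topologies(t, t)` with the same name twice describes a homogeneous stack, so one helper covers both families of architectures.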

#### IV. EVALUATION

The Ratatoskr simulator [13] has been employed to simulate and evaluate the performance of DiamondMesh and the other considered topologies. Ratatoskr is an open-source framework for analysing the power, performance and area of Networks-on-Chip (NoCs). It supports cycle-accurate simulation and heterogeneous 3D integration. The performance of the considered topologies has been evaluated for different network sizes under synthetic traffic patterns that include uniform random, transpose and bit-reversal. Network size is specified in terms of the total number of nodes and also as $[X \times Y \times Z]$, where X denotes the number of nodes in the X-direction, Y denotes the number of nodes in the Y-direction and Z denotes the number of layers in the 3D-NoC architecture.





Fig. 1: Homogeneous Architectures (a) 3D-DiamondMesh (b) 3D-Mesh (c) 3D-XDMesh (d) 3D-ZMesh (e) 3D-DMesh



Fig. 2: Heterogeneous Architectures (a) DiamondMesh+Mesh (b) DiamondMesh+XDMesh (c) DiamondMesh+DMesh (d) DMesh+Mesh (e) DMesh+XDMesh

### A. Evaluation methodology

Firstly, homogeneous 3D-NoC architectures have been simulated and the findings have been analysed for different network sizes. DXY routing, proposed in [10], has been used to simulate the 2D diagonal mesh based topologies. For the 3D diagonal mesh based topologies, DXY has been extended to the DXYZ routing algorithm to route packets across the layers. The conventional XYZ algorithm has been used for the 3D-Mesh topology. Pseudo code for DXYZ routing is shown in Algorithm 1. Secondly, the five proposed heterogeneous architectures have been simulated for 64-node and 256-node network sizes and the findings have been analysed.

## Algorithm 1 Pseudo code for DXYZ routing

**Require:** Current Source id (src) and Destination id (dst)

**Ensure:** Router to which packet has to be routed

Step 1: Follow DXY routing based on the X and Y co-ordinates of src and dst

Step 2: Once the X and Y co-ordinates of src and dst are the same, go to Step 3; otherwise repeat Step 1

Step 3:

**if** Z co-ordinate of dst is less than that of src **then**

&nbsp;&nbsp;&nbsp;&nbsp;*route DOWN*

**else**

&nbsp;&nbsp;&nbsp;&nbsp;*route UP*

**end if**
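The control flow of Algorithm 1 can be sketched in Python as below. This is an illustration under stated assumptions, not the authors' implementation. The DXY rules are defined in [10] and are not reproduced in this paper, so the in-plane step is passed in as a function; the default `xy_step` is plain dimension-ordered XY routing used only as a stand-in. The port names, the coordinate orientation, and the `LOCAL` result for a packet that has already arrived are likewise assumptions.

```python
from typing import Callable, List, Tuple

Coord2D = Tuple[int, int]
Coord3D = Tuple[int, int, int]

# Unit moves for each output port (orientation is an assumption).
MOVES = {
    "EAST": (1, 0, 0), "WEST": (-1, 0, 0),
    "NORTH": (0, 1, 0), "SOUTH": (0, -1, 0),
    "UP": (0, 0, 1), "DOWN": (0, 0, -1),
}


def xy_step(cur: Coord2D, dst: Coord2D) -> str:
    """Stand-in for DXY: plain XY routing, NOT the DXY algorithm of [10]."""
    if cur[0] != dst[0]:
        return "EAST" if dst[0] > cur[0] else "WEST"
    if cur[1] != dst[1]:
        return "NORTH" if dst[1] > cur[1] else "SOUTH"
    return "LOCAL"


def dxyz_step(cur: Coord3D, dst: Coord3D,
              planar_step: Callable[[Coord2D, Coord2D], str] = xy_step) -> str:
    """Return the output port for one hop, following Algorithm 1."""
    # Steps 1-2: route within the layer until X and Y match the destination.
    if cur[:2] != dst[:2]:
        return planar_step(cur[:2], dst[:2])
    # Step 3: X and Y match, so move vertically through the TSVs.
    if dst[2] < cur[2]:
        return "DOWN"
    if dst[2] > cur[2]:
        return "UP"
    return "LOCAL"  # assumption: packet has arrived, eject to the core


def trace(src: Coord3D, dst: Coord3D) -> List[str]:
    """List the ports taken from src to dst using the stand-in planar step."""
    hops, cur = [], src
    while (port := dxyz_step(cur, dst)) != "LOCAL":
        hops.append(port)
        cur = tuple(c + d for c, d in zip(cur, MOVES[port]))
    return hops


if __name__ == "__main__":
    print(trace((0, 0, 0), (2, 1, 3)))
```

With the stand-in planar step, a packet from (0, 0, 0) to (2, 1, 3) first completes its in-plane hops and only then climbs three layers, which is the ordering that Steps 1 to 3 prescribe.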

### B. Discussion of simulation results

1) Homogeneous architectures: Table I shows the configuration parameters used to simulate the architectures. 2D, 3D 2-layer and 3D 4-layer architectures with network sizes ranging from 16 nodes to 256 nodes have been simulated. The architectures are simulated under uniform, transpose and bit-reversal traffic patterns.

## TABLE I: CONFIGURATION PARAMETERS FOR HOMOGENEOUS CONFIGURATIONS

| Parameter                            | Value                             |
|--------------------------------------|-----------------------------------|
| Simulation time                      | 10000 ns                          |
| No. of virtual channels              | 2                                 |
| Buffer depth                         | 4                                 |
| Router                               | Input-buffered                    |
| Routing algorithm                    | DXYZ for diagonal topologies      |
|                                      | XYZ for Mesh                      |
| Network size $(X \times Y \times Z)$ | 16 nodes $(4 \times 4 \times 1)$  |
| X: no. of nodes in X-direction       | 36 nodes $(6 \times 6 \times 1)$  |
| Y: no. of nodes in Y-direction       | 64 nodes $(8 \times 8 \times 1)$  |
| Z: no. of layers                     | 32 nodes $(4 \times 4 \times 2)$  |
|                                      | 72 nodes $(6 \times 6 \times 2)$  |
|                                      | 128 nodes $(8 \times 8 \times 2)$ |
|                                      | 64 nodes $(4 \times 4 \times 4)$  |
|                                      | 144 nodes $(6 \times 6 \times 4)$ |
|                                      | 256 nodes $(8 \times 8 \times 4)$ |

## TABLE II: CONFIGURATION PARAMETERS FOR HETEROGENEOUS CONFIGURATIONS

| Parameter                            | Value                             |
|--------------------------------------|-----------------------------------|
| Simulation time                      | 10000 ns                          |
| No. of virtual channels              | 2                                 |
| Buffer depth                         | 4                                 |
| Router                               | Input-buffered                    |
| Routing algorithm                    | DXYZ for diagonal topologies      |
|                                      | XYZ for Mesh                      |
| Network size $(X \times Y \times Z)$ | 64 nodes $(4 \times 4 \times 4)$  |
| X: no. of nodes in X-direction       | 256 nodes $(8 \times 8 \times 4)$ |
| Y: no. of nodes in Y-direction       |                                   |
| Z: no. of layers                     |                                   |
| Heterogeneous architectures          | DiamondMesh+Mesh                  |
| (Topology1+Topology2)                | DiamondMesh+XDMesh                |
| Topology1 for Layers '0' & '2'       | DiamondMesh+DMesh                 |
| Topology2 for Layers '1' & '3'       | DMesh+Mesh                        |
|                                      | DMesh+XDMesh                      |

The simulation findings of Average Packet Latency (APL) for the uniform traffic pattern are plotted in Fig. 3(a)-(i). From the plots, it can be observed that the APL of DiamondMesh lies between that of Mesh and ZMesh, which have the highest APL, and that of DMesh, which has the lowest APL among all the topologies. DiamondMesh thus surpasses Mesh, ZMesh and XDMesh in terms of APL for all network sizes, and only DMesh achieves lower latency, at the expense of a larger number of links. More links can improve performance, but they also increase area overhead and power consumption. The APL of ZMesh is lower than that of Mesh only up to injection rates of 0.2 or 0.3. As the load increases, the APL of ZMesh rises above that of Mesh. The APL of DiamondMesh is consistently lower than that of Mesh, ZMesh and XDMesh and is close to that of DMesh for all loads and all network sizes. Even though DiamondMesh and ZMesh have the same number of links, DiamondMesh outperforms ZMesh because its network diameter is smaller, which follows from the different pattern in which the diagonal links are incorporated in the two topologies. A similar trend has been observed for the transpose and bit-reversal traffic patterns, so the plots for those patterns have been omitted for brevity. Table III shows the number of links required to construct each topology. Table IV shows the tradeoff between the number of links and APL for the DiamondMesh and DMesh topologies.

It can be observed from the plots that, at an injection rate of 0.8, for the 16-node and 256-node networks, DiamondMesh reduces latency by 14.75% and 34.66% respectively compared to an identically configured Mesh, while DMesh reduces it by 20.45% and 41.7% respectively. The APL reduction is slightly better with DMesh than with DiamondMesh, but at the expense of more links. Table IV shows that, at an injection rate of 0.8, the APL of DiamondMesh for the 16-node and 256-node networks is 7.15% and 12.06% higher respectively than that of DMesh, while DiamondMesh uses nearly 15% fewer links than DMesh.
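The link-reduction row of Table IV can be reproduced from the link counts in Table III. The short script below is a worked check rather than part of the original evaluation; the dictionary and function names are illustrative, and the percentage is assumed to be taken relative to the DMesh link count.

```python
# Link counts copied from Table III, keyed by network size (X, Y, Z).
DIAMONDMESH_LINKS = {
    (4, 4, 1): 49, (6, 6, 1): 121, (8, 8, 1): 225,
    (4, 4, 2): 114, (6, 6, 2): 278, (8, 8, 2): 514,
    (4, 4, 4): 244, (6, 6, 4): 592, (8, 8, 4): 1092,
}
DMESH_LINKS = {
    (4, 4, 1): 58, (6, 6, 1): 146, (8, 8, 1): 274,
    (4, 4, 2): 132, (6, 6, 2): 328, (8, 8, 2): 612,
    (4, 4, 4): 280, (6, 6, 4): 692, (8, 8, 4): 1288,
}

# First row of Table IV, in the same order as the dictionaries above.
TABLE_IV_LINK_REDUCTION = [15.52, 17.12, 17.88, 13.64, 15.24, 16.01, 12.86, 14.45, 15.22]


def percent_reduction(baseline: float, value: float) -> float:
    """Percentage by which `value` is smaller than `baseline`."""
    return 100.0 * (baseline - value) / baseline


if __name__ == "__main__":
    for size, expected in zip(DMESH_LINKS, TABLE_IV_LINK_REDUCTION):
        computed = percent_reduction(DMESH_LINKS[size], DIAMONDMESH_LINKS[size])
        print(size, f"{computed:.2f}%", "(Table IV:", expected, ")")
```

For the 16-node network this gives (58 - 49) / 58 = 15.52%, and for the 256-node network (1288 - 1092) / 1288 = 15.22%, matching the tabulated values.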

979-8-3503-0219-6/23/\$31.00 ©2023 IEEE

It can be observed from the plots of the 2D 64-node network and the identically configured 3D 4-layer 64-node network (16 nodes per layer) that there is an average reduction of 40.18%, 39.07%, 38.21%, 48.44%, 19.14%, 13.49% in APL with 3D-Mesh, XDMesh, ZMesh, DiamondMesh, DMesh architectures respectively compared with their corresponding 2D architectures. The reduction in APL with 3D-Mesh, 3D-XDMesh and 3D-ZMesh over their corresponding 2D architectures is higher than that of the DiamondMesh and DMesh topologies. The reduction of APL with DiamondMesh compared to Mesh is 22.81% for the 2D 64-node network, whereas for the 3D 64-node network it is only 11.28%. Similarly, the reduction of APL with DMesh compared to Mesh is 26.98% for the 2D 64-node network, whereas for the 3D 64-node network it is only 1.475%. Thus, architectures with a larger network diameter benefit more from 3D-NoC technology than architectures with a smaller network diameter. Overall, it can be inferred that the 3D architectures outperform their corresponding identically configured 2D architectures.

2) Heterogeneous architectures: The five heterogeneous architectures shown in Fig. 2, formed by interconnecting the nodes in alternate layers with two different topologies, have been simulated under the uniform traffic pattern. Table II shows the configuration parameters of the simulation. The simulation findings are plotted in Fig. 4. An average reduction of 7.83%, 9.59%, 13.18%, 4.80%, 8.85%, 10.50%, 6.56% and 10.63% in APL can be observed with 4Layer-64node XDMesh, DiamondMesh, DMesh, DiamondMesh+Mesh, DiamondMesh+XDMesh, DiamondMesh+DMesh, DMesh+Mesh, DMesh+XDMesh architectures respectively over the conventional 4Layer-64node Mesh architecture. Similarly, an average reduction of 9.26%, 21.21%, 25.00%, 10.47%, 15.21%, 23.00%, 12.50% and 16.90% can be observed with 4Layer-256node XDMesh, DiamondMesh, DMesh, DiamondMesh+Mesh, DiamondMesh+XDMesh, DiamondMesh+DMesh, DMesh+Mesh, DMesh+XDMesh architectures respectively over the conventional 4Layer-256node Mesh architecture.

It can be observed that DiamondMesh+XDMesh achieves a higher reduction in APL over Mesh than homogeneous XDMesh does, and that its reduction in APL is close to that of DiamondMesh while requiring fewer links. Thus, heterogeneity in topology across the layers of a 3D-NoC architecture could be a possible approach to trading off key metrics such as the area, latency and power of the architecture. Several variations of heterogeneity across the layers are possible, of which five architectures have been suggested and analysed in the present work.

#### V. CONCLUSION

The performance of 3D-DiamondMesh for different network sizes has been analysed comprehensively. Among the investigated topologies, the proposed 3D-DiamondMesh has been shown to be a balanced topology in terms of performance and area overhead. Also, the findings indicate that the 3D architectures outperform their corresponding identically configured 2D architectures. Further,



Fig. 3: Average Packet Latency characteristics of the considered topologies (a)16 nodes(1L) (b)36 nodes(1L) (c)64 nodes(1L) (d)32 nodes(2L) (e)72 nodes(2L) (f)128 nodes(2L) (g)64 nodes (4L) (h)144 nodes (4L) (i)256 nodes (4L) 1L:1 Layer, 2L: 2 Layers, 4L: 4 Layers



Fig. 4: Average Packet Latency characteristics of heterogeneous architectures (a) 64 nodes(4L) (b) 256 nodes(4L)

## TABLE III: NUMBER OF LINKS

|                        | 16 nodes $(4 \times 4 \times 1)$ | 36 nodes $(6 \times 6 \times 1)$ | 64 nodes $(8 \times 8 \times 1)$ | 32 nodes $(4 \times 4 \times 2)$ | 72 nodes $(6 \times 6 \times 2)$ | 128 nodes $(8 \times 8 \times 2)$ | 64 nodes $(4 \times 4 \times 4)$ | 144 nodes $(6 \times 6 \times 4)$ | 256 nodes $(8 \times 8 \times 4)$ |
|------------------------|------|------|------|------|------|------|------|------|------|
| Mesh                   | 40   | 96   | 176  | 96   | 228  | 416  | 208  | 492  | 896  |
| XDMesh                 | 46   | 106  | 190  | 108  | 248  | 444  | 232  | 532  | 952  |
| ZMesh (or) DiamondMesh | 49   | 121  | 225  | 114  | 278  | 514  | 244  | 592  | 1092 |
| DMesh                  | 58   | 146  | 274  | 132  | 328  | 612  | 280  | 692  | 1288 |
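The entries of Table III follow a regular pattern that can be written in closed form for square layers of n x n nodes. The expressions below are fitted to the tabulated values and are not stated in the paper, so they should be read as an assumption. In particular, the count only matches if one extra link per node is included, which may correspond to the router-to-core link; that interpretation is a guess. All names in the sketch are illustrative.

```python
def planar_diagonals(topology: str, n: int) -> int:
    """Diagonal links per n x n layer, fitted to Table III (an assumption)."""
    if topology == "Mesh":
        return 0
    if topology == "XDMesh":
        return 2 * (n - 1)
    if topology in ("ZMesh", "DiamondMesh"):
        return (n - 1) ** 2
    if topology == "DMesh":
        return 2 * (n - 1) ** 2
    raise ValueError(f"unknown topology: {topology}")


def link_count(topology: str, n: int, layers: int) -> int:
    """Total links of a homogeneous n x n x layers network, fitted to Table III."""
    mesh_links = 2 * n * (n - 1)      # horizontal and vertical mesh links per layer
    per_node_links = n * n            # one extra link per node (fitted; meaning assumed)
    per_layer = mesh_links + planar_diagonals(topology, n) + per_node_links
    inter_layer = (layers - 1) * n * n  # one vertical link per node between adjacent layers
    return layers * per_layer + inter_layer


if __name__ == "__main__":
    for topology in ("Mesh", "XDMesh", "DiamondMesh", "DMesh"):
        row = [link_count(topology, n, z) for z in (1, 2, 4) for n in (4, 6, 8)]
        print(f"{topology:12s}", row)
```

Under this fit, DiamondMesh adds (n - 1)^2 diagonal links per layer and DMesh adds twice as many, which is consistent with the 13% to 18% link reduction reported in Table IV.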

## TABLE IV: TRADEOFF BETWEEN THE NUMBER OF LINKS AND APL

|                        | 16 nodes $(4 \times 4 \times 1)$ | 36 nodes $(6 \times 6 \times 1)$ | 64 nodes $(8 \times 8 \times 1)$ | 32 nodes $(4 \times 4 \times 2)$ | 72 nodes $(6 \times 6 \times 2)$ | 128 nodes $(8 \times 8 \times 2)$ | 64 nodes $(4 \times 4 \times 4)$ | 144 nodes $(6 \times 6 \times 4)$ | 256 nodes $(8 \times 8 \times 4)$ |
|------------------------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| % reduction in no. of links in DiamondMesh compared to DMesh | 15.52 | 17.12 | 17.88 | 13.64 | 15.24 | 16.01 | 12.86 | 14.45 | 15.22 |
| % increase in latency for DiamondMesh compared to DMesh, injection rate = 0.1 | 6.83 | 5.88 | 5.15 | 6.12 | 5.12 | 4.47 | 4.86 | 4.34 | 3.99 |
| % increase in latency for DiamondMesh compared to DMesh, injection rate = 0.8 | 7.15 | 10.07 | 15.31 | 6.12 | 8.07 | 12.23 | 4.92 | 7.35 | 12.06 |

five heterogeneous 3D architectures have been suggested and evaluated. The analysis of the results indicates that heterogeneity in topology across the layers either reduces latency with a slight area overhead or reduces the area with a slight penalty in performance. Future scope includes the study of heterogeneous 3D-NoCs with efficient and adaptive routing techniques.

### REFERENCES

- [1] L. Benini and G. De Micheli, "Networks on chips: a new SoC paradigm", in Computer, vol. 35, no. 1, pp. 70-78, Jan. 2002, doi: 10.1109/2.976921.
- [2] Kundu, S., and Chattopadhyay, S., "Network-on-Chip: The Next Generation of System-on-Chip Integration" (1st ed.). CRC Press. (2015). https://doi.org/10.1201/9781315216072
- [3] S. Kumar et al, "A Network on Chip Architecture and Design Methodology", Proc. of ISVLSI, pp. 117-124, 2002.
- [4] Kundu, S. and Chattopadhyay, S., "Network-on-chip architecture design based on mesh-of-tree deterministic routing topology", International Journal of High Performance Systems Architecture, 1(3), pp.163-182. (2008).
- [5] Pande et al., "High-Throughput Switch-based Interconnect for future SoCs", Proc. Third IEEE Int'l Workshop System-on-Chip for Real-Time Applications, pp. 304-310, 2003.
- [6] Lakshmi Kiranmai, V. and Srinivasarao, B.K.N., 2022. "A Novel Hybrid Hexagonal Star Topology for On-Chip Interconnection Networks". Journal of Circuits, Systems and Computers, p.2350076.
- [7] Wang C, Hu W-H, Lee SE, Bagherzadeh N, "Area and power-efficient innovative congestion-aware network-on-chip architecture". J Syst Archit 57(1):24–38. https://doi.org/10.1016/j.sysarc.2010.10.009
- [8] Furhad MH, Kim J-M, "An extended diagonal mesh topology for network-on-chip architectures", Int J Multimed Ubiquitous Eng 10(10):197–210, (2015)
- [9] Prasad N, Mukherjee P, Chattopadhyay S, Chakrabarti I "Design and evaluation of ZMesh topology for on-chip interconnection networks", J Parallel Distrib Comput 113:17–36 (2018)
- [10] Varanasi, Lakshmi Kiranmai, and B.K.N. Srinivasarao. "Design and evaluation of an energy efficient DiamondMesh topology for on-chip interconnection networks", Design Automation for Embedded Systems 26, no. 3-4 (2022): 161-187.
- [11] Vasilis F. Pavlidis, and G. Friedman, "3-D Topologies for Networks-on-Chip", IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 15, no. 10, October 2007.


- [12] W. R. Davis et al., "Demystifying 3D ICs: The pros and cons of going vertical", IEEE Design Test Comput., vol. 22, no. 6, pp. 498–510, Nov./Dec. 2005.
- [13] J. M. Joseph, L. Bamberg, I. Hajjar, B. R. Perjikolaei, A. G. Ortiz and T. Pionteck, "Ratatoskr: An open-source framework for in-depth power, performance and area analysis in 3D NoCs", CoRR abs/1912.05670, 2019