What properties should a particularly fast supercomputer have? How should its memory be structured? To what extent does it differ from conventional computer architectures? How can this be illustrated using basic arithmetic operations and matrix multiplication as examples?

In the following, a bit length of 64 is used for calculation; in practice, however, it can be higher. The basis is, among other things, particularly fast counting lines that are used bitwise for addition. A counting line is traversed, e.g., by particles such as photons or electrons in a suitable medium; the medium is unimportant here, since only the theoretical concept is described. The counting lines count the number of set bits of equal significance across several adjacent binary numbers.

The decisive factor is the high speed of the particles. The per-bit counts over the 64 numbers are in turn stored as binary numbers and added after bitwise shifting. Since this happens in parallel, an addition of \(2^{32}\) binary numbers of bit length 32 is achieved in \(\mathcal{O}(1)\). Subtractions are implemented using the two's complement. Repeated addition enables multiplication and, via forming the inverse, division. A hierarchical memory enables fast comparisons (see Radio Computing).
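The arithmetic behind the counting lines can be sketched as follows: for each bit position, count the set bits across all summands, then add the shifted counts. This minimal Python sketch reproduces only the arithmetic, not the physical parallelism; the function name and parameters are illustrative, not part of the original text.

```python
def counting_line_add(numbers, width=64):
    """Simulate counting-line addition: one 'counting line' per bit
    position counts the set bits of that significance across all
    numbers; the counts, shifted to their positions, sum to the total."""
    # counts[i] = number of set bits at position i over all summands
    counts = [sum((x >> i) & 1 for x in numbers) for i in range(width)]
    # add the counts, each shifted back to its bit position
    return sum(c << i for i, c in enumerate(counts))
```

In hardware, each counting line fills in parallel, so all per-position counts are available simultaneously; only the final shifted addition of the (short) count values remains.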

For the fast multiplication of two, w.l.o.g. square, matrices, each with \(2^n\) entries for \(n \in {}^{\nu}\mathbb{N}\), these are written into a memory ball by simultaneous replication (i.e. simultaneous sending and storing). The multiplication in \(\mathcal{O}(1)\) takes place in its nodes, each of which stores one entry of each matrix. The entries of a row are then added in parallel in \(\mathcal{O}(1)\) via counting lines as described above, which yields the result matrix.
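The node-parallel scheme can be sketched sequentially: every node holds one entry of each matrix and forms their product; the row-wise sums of these products give the result matrix. This is a plain Python sketch of the data flow only (the hardware performs all products and sums in parallel); the function name is an illustrative assumption.

```python
def memory_ball_matmul(A, B):
    """Sketch of the node-parallel matrix product: node (i, j, k)
    stores A[i][k] and B[k][j] and forms their product in O(1);
    the per-row sums (in hardware via counting lines) yield C = A @ B."""
    n = len(A)
    # one product per node
    products = [[[A[i][k] * B[k][j] for k in range(n)]
                 for j in range(n)] for i in range(n)]
    # sum the products belonging to each result entry
    return [[sum(products[i][j]) for j in range(n)] for i in range(n)]
```

Since all \(n^3\) products exist simultaneously in the nodes and each of the \(n^2\) sums runs over its own counting lines, the whole product takes constant depth under the stated assumptions.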

Here, the root lies in the innermost shell, its leaves in the next outer shell, and so on. The shells are rotatably mounted to prevent the innermost nodes from wearing out or being overused. The contacts (pins) sit in their own rotatable shells. The memory addresses are managed as in virtual memory, which mathematically can also correspond to a memory cube. From time to time the shells are rotated, also in the event of failures.
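One way to picture the virtual addressing under rotation: the shell an address belongs to stays fixed, while the slot within the shell is shifted by a per-shell rotation offset so that physical wear is spread evenly. The following is a hypothetical sketch under the simplifying assumption of equally sized shells; none of these names appear in the original text.

```python
def physical_node(addr, shell_size, rotation):
    """Map a virtual address to a physical (shell, slot) pair.
    rotation[s] is the current angular offset of shell s; rotating
    a shell changes the mapping without touching virtual addresses."""
    shell = addr // shell_size            # which shell the address lives in
    slot = (addr + rotation[shell]) % shell_size  # wear-leveled slot
    return shell, slot
```

Rotating a shell (incrementing its offset and moving the data accordingly) leaves all virtual addresses valid, which is what makes periodic rotation and failure handling transparent to the software.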

The memory concept presented is also suitable for other hardware components such as hard disks. It enables shorter distances than the conventional one. This is beneficial for indexing database tables, swarm exploration, and threads. The replaceable spherical root unit is designed with multiple redundancies to provide a sound failure concept. To reach it from the outside, the memory unit can be swung out. A memory tree starts from each root.

Overall, the memory can be (virtually) addressed as a single memory tree (so-called *memory duality*). This permits fast sorting methods such as bitsort (see Theoretical Informatics). For this purpose, the memory cells may be implemented as simpler processor units with their own intelligence. For technical or computational reasons, they can also be designed in duplicate. This concept can also be used in reduced form in PCs. In any case, there is a clear gain in speed.
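The sorting idea can be illustrated by a bitmap-style sort, here assumed as one plausible reading of "bitsort": each distinct non-negative integer sets one bit in a large bit field, and reading the set bits in order yields the sorted sequence. With one memory cell per possible value, setting and scanning are parallelizable; the sequential Python sketch below only shows the principle.

```python
def bitsort(values, width=64):
    """Bitmap sort sketch for distinct non-negative integers < width:
    set bit v for each value v, then read the set bits in ascending
    order. With parallel memory cells both phases have constant depth."""
    field = 0
    for v in values:
        field |= 1 << v          # mark value v as present
    return [i for i in range(width) if (field >> i) & 1]
```

The assumption of distinct values keeps the sketch minimal; duplicates would require a small counter per cell instead of a single bit.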

© 2023 by Boris Haase