The following section presupposes Set Theory and answers for \(k, m, n \in {}^{\nu}\mathbb{N}\): How can \(n \le k^m\) values be sorted in \(\mathcal{O}(1)\) that are coded in the base-\(k\) system? How can two matrices \(\in {}^{\nu}\mathbb{R}^{n \times n}\) be multiplied in \(\mathcal{O}({_e}n)\)? How can the minimum or maximum framework of a graph with \(n\) nodes be calculated in \(\mathcal{O}({_e}n)\)? What does answer the halting problem? Why is \(P = NP\) true?

Let the memory be partially organised in the form of a binomial tree of order \(k \in {}^{\nu}\mathbb{N}\) with \(k^{\grave{m}}\) memory locations, where \(k^{\grave{m}} = 2^h\) for \(h \in {}^{\nu}\mathbb{N}\). Each memory location is able to completely accommodate a value to be sorted and each edge corresponds to a data line. A continuous index is appended to multiple values and forms together with the old value the new value. Each \(j\)-th intermediate node corresponds hierarchically with the \(j\)-th position of the value on base \(k\) to be sorted.

If leaves store (in one step) the values’ addresses instead of the values themselves, they can be retrieved, sorted, into the root in a further step. They are maximally pushed into the tree until they separate. If a value passes through an intermediate node, this sets its bit to 1, but to 0 if the intermediate node has no successor to read. The sorting order determines the first value. Saving, checking the successor, masking bits and reading out takes \(\mathcal{O}(n)\).

Here and in the following, a memory architecture with autonomous memory units is used that has simple computational capabilities. It can also be simulated by interconnected, distributed computers. Sorting in an additional value increases the complexity by \(\mathcal{O}({_e}n)\) since this corresponds to a binary search. It is inversely proportional to the number of tree nodes in the memory. If \(d \in {}^{\nu}\mathbb{N}\) data lines are used, it decreases by the factor \(\tilde{d}\). If the values occupy \(g\) times the number of memory space, it increases by the factor \(\tilde{d}\).

If the memory does not contain a tree of degree \(k\), it increases by the factor \({_k}n\), because then every intermediate node must be read to a value. Using the first \(k^p\) memory locations in another way with \(p \in {}^{\nu}\mathbb{N}\), requires all intermediate nodes of the \(p\)-th stage to be searched through. The complexity increases marginally if the memory is well filled, otherwise, by the worst-case factor \(\mathcal{O}(k^p)\).

Matrices to be multiplied are to be placed in such a way \(n\)-square by \(n\)-square into a memory cube that every memory unit multiplies two values, and every entry of the product matrix is determined as sum with \(n\) summands in \(\mathcal{O}({_e}n)\) (vector processor). Further simplifications such as the Strassen algorithm are possible. Tensor products can be efficiently represented by storage \(n\)-cubes, which remain in three-dimensional space through corresponding data lines.

The optimal sorting algorithm *bit sort* sorts stably with the time complexity \(\mathcal{O}(n)\) and requires \(\mathcal{O}(n)\) additional storage space. Requirements: Two values are given as bit sequence, can be compared and all their bits processed in \(\mathcal{O}(1)\). Further is assumed that the memory is partially realised as a binary tree and that the path from root to leaf can also be traversed in \(\mathcal{O}(1)\).

If \(p \in {}^{\nu}\mathbb{N}\) is the number of parallel steps for a parallel bit sort, the time complexity is \(\mathcal{O}(\tilde{p}n)\). Each insertion into the always sorted tree or output of a value has the time complexity \(\mathcal{O}(1)\). The memory is provided with the following intelligence: each tree node stores the value reaching it independently and can pass it on to the next tree node if it has already been visited, or, otherwise, can return its address to the central unit. The root does not correspond to any bit value.

Starting with the most significant bit of a value, each of its consecutively processed bits determines the next node within the binary tree. If all bits have been processed or if there is a difference from the compared bit and the associated node has not yet been visited, the non-tree address of the remaining bit string of the value is stored in the following node if any, or the number of values for this node, given that equal values need not be ordered.

The time complexity of all methods which sort values using a bit length \(b = q a \in {}^{\nu}\mathbb{N}\) alongside an address length \(a \in {}^{\nu}\mathbb{N}\) increases from \(\mathcal{O}(1)\) to \(\mathcal{O}(q)\) for each value. If \(r\) is the average value of \(q\), at least one \(\mathcal{O}(n)\) must be replaced by \(\mathcal{O}(r n)\) in the time complexity of the sorting procedure. If \(s \in {}^{\nu}\mathbb{N}\) is the average number of different (!) values that end up in a bucket after sorting to bit length \(a\) and whose bit length is greater than \(a\), then the time complexity of the procedure is \(\mathcal{O}(({_e}s + r) n)\).

In the process, the mentioned values of a bucket are stored in one AVL tree per bucket or in a B-tree, if they are stored in a memory which is relatively slow compared to the main memory. There, only the significant part value of the address length of a value is processed. If is dispensed with completely sorted values after every processing of a value, the complete sort is temporally reduced to \(\mathcal{O}(r n)\) by using two main memory areas.

The second main memory area is cleared after each pass so as to be available for the next pass until a bucket has been processed. In the first main memory area, the addresses are always just written back in the current sorting order before further sorting the values with the same part value just processed. Each value must be stored contiguously in memory, so that each part value can be accessed quickly.

The use of the \(\mathcal{O}(r n)\) method for the master data and the \(\mathcal{O}(({_e}s + r) n)\) method for the update data both combine well. In particular, table indexes can be created quickly. For large tables, it makes a significant difference whether an index is created in \(\mathcal{O}(r n)\) instead of in \(\mathcal{O}(r {_e}n n)\), and whether a value is then on average searched for in \(\mathcal{O}({_e}s + r)\) instead of in \(\mathcal{O}({_e}n + r)\) or worse. Sorting by mixing sorted lists is also possible in \(\mathcal{O}(r n)\).

Discussion of bit sort: It is optimal under the given conditions for parallel programming, and well suited both for the rapid ongoing building of table indexes in databases and for search. Its hardware implementation is rather unusual.

Let \(n \in {}^{\nu}\mathbb{N}\) be the node number of a finite simple connected graph. (If it is not simple, only the smallest parallel edge and no loops are allowed by sorting out in \(\mathcal{O}(n)\). If it is not connected, all isolated nodes are removed in \(\mathcal{O}(n)\).) If negative weights are excluded, Dijkstra’s algorithm needs \(\mathcal{O}(n)\) steps for every minimum path in the graph, since sorting as well as parallel adding and comparing is possible in \(\mathcal{O}(1)\).

Its minimum or maximum framework can be calculated in \(\mathcal{O}({_e}n n)\) or even in \(\mathcal{O}({_e}n)\). The edges of a node can be sorted in parallel in \(\mathcal{O}(n)\) or \(\mathcal{O}(1)\). Let all edges be coloured white at the beginning. Each node appends the smaller and then the larger number of participating nodes of the edge with the smallest weight of all the white-coloured edges that emanate from it to this bit by bit, as a value. It stores the smallest value that reaches it. This is possible in \(\mathcal{O}(n)\) or \(\mathcal{O}(1)\).

Blackening of the edges between every two nodes of the same minimal framework requires at most the same time. Each node then has a nearest neighbouring node. Continuing with the white coloured edges in \(\mathcal{O}({_e}n)\) steps, the smallest value of a minimal framework can be determined in \(\mathcal{O}(n)\) or \(\mathcal{O}(1)\) until the entire minimal framework is formed in \(\mathcal{O}({_e}n n)\) or \(\mathcal{O}({_e}n)\). For a maximum framework, subtract all edge weights from a sufficiently large constant.

Halting theorem: To reach a fix \(r_m\) of the number \(0.r_{1}r_{2}…r_{m}\), a programme halts for arbitrary (pseudo-) random numbers \(r_j \in {}^{\nu}\mathbb{N}\) with \(k \in {}^{\nu}\mathbb{N}^{*}\) positions and base \(p := 2^k\) in \(m \in \mathbb{N}^{*}\) steps giving probability \(\tilde{p}\).

Proof: This is an optimal result of probability calculation^{1}cf. Knuth, Donald Ervin: *The Art of Computer Programming Volume 2*; 3rd Ed.; 1997; Addison Wesley; Reading, p. 10 – 40.\(\square\)

Remark: Chance is undecidable. This means that there are programmes halting with any probability, and Gödel’s incompleteness theorems follow. The Gödel sentence “I am not provable.” is sophistical in that it has the bogus argument petitio principii ex negativo. In order to solve the halting problem, a self-reference is senseless, which commands and prohibits halting at the same time. The self-reference makes the diagonal argument for the holding problem undue.

Theorem: If \(\mathcal{L}\) is the set of all (type 0) languages \(L_k\) of finite words with maximum length \(\nu\) for mid-finite \(|\mathcal{L}| = 2^{({\nu}^{\grave{\nu}}-1)/\acute{\nu}}\) (using the GS) and \(k \in \{1, …, |\mathcal{L}|\}\), then \(P = NP = \mathcal{L}\).

Proof: Since every deterministic Turing machine can also be considered non-deterministic^{2}cf. Hoffmann, Dirk W.: *Theoretische Informatik*; 4., akt. Aufl.; 2018; Carl Hanser; München, p. 162, 318 and 361 f., \(P \subseteq NP\) is true. It contains \(P\) all linear languages \(L_k\), and \(NP\) only those ones.\(\square\)

© 2006-2019 by Boris Haase