Linear Programming

The following likewise presupposes set theory, topology and nonstandard analysis.

Diameter theorem for polytopes and polyhedra: The diameter of every \(n\)-dimensional polytope or polyhedron given by \(m\) constraints with \(m,n\in{}^{\omega}\mathbb{N}_{\ge 2}\) is at most \(2(m + n - 3)\).

Proof: At most \(\acute{m}\) hyperplanes can be combined to form an incomplete cycle of dimension \(2\), and there are at most \(n-2\) options to deviate sideways in the remaining dimensions. Traversing each minimal segment requires at most two edges and thus yields the factor \(2\).\(\square\)

Theorem on Strassen’s algorithm: For sufficiently large \(n := 2^{\ell}\), \(\ell \in {}^{\nu}\mathbb{N}^*\), \(ß := {}_27\) and \(A \in {}^{\nu}\mathbb{C}^{n \times n}\), the GS proves the reduction of the running time \(T(n) = \mathcal{O}(n^{ß})\)¹ by approximately \(\tilde{3}\) for the computation of \[ AA^H = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} A_{11}^H & A_{21}^H \\ A_{12}^H & A_{22}^H \end{pmatrix} = \begin{pmatrix} A_{11}A_{11}^H+A_{12}A_{12}^H & A_{11}A_{21}^H+A_{12}A_{22}^H \\ A_{21}A_{11}^H+A_{22}A_{12}^H & A_{21}A_{21}^H+A_{22}A_{22}^H \end{pmatrix}.\;\square \]

Theorem on fast matrix multiplication: With \(A\) and \(B = (b_{ij}) \in {}^{\nu}\mathbb{C}^{n \times n}\) as above and \(n\) as before, \(AB\) can be computed for \[ s := \min\{2^k > \max_{i,j} (|a_{ij}|^2, |b_{ij}|^2)\hat{n} : k \in {}^{\nu}\mathbb{Z}\} \] from \[ A_{11}(s B_{11} + B_{12}),\; A_{21}(s B_{11} + B_{12}),\; A_{12}(s B_{21} + B_{22}),\; A_{22}(s B_{21} + B_{22}) \] by computing modulo \(s\) in running time \(\mathcal{O}(p^2)\) with \(p = \ell n\) (see Matrix Multiplication).\(\square\)
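The packing idea behind the fast-multiplication theorem can be sketched in a few lines: two products share one multiplication because \(A(sB_1 + B_2) = s\,AB_1 + AB_2\), and choosing \(s = 2^k\) larger than every entry of \(AB_2\) lets both products be read off again. A minimal sketch assuming nonnegative integer matrices; the function name and the simplification to one unpartitioned product are illustrative, not the theorem's exact scheme:

```python
import numpy as np

def packed_products(A, B1, B2):
    """Compute A@B1 and A@B2 with a single matrix product by packing
    B1 and B2 into one integer matrix via the scaling factor s = 2^k.
    Assumes nonnegative integer entries (object dtype for exactness)."""
    n = A.shape[1]
    m = max(int(A.max()), int(B1.max()), int(B2.max()))
    # every entry of A@B2 is at most m*m*n, so s = 2^k > m*m*n separates
    s = 1 << (m * m * n).bit_length()
    C = A.dot(s * B1 + B2)          # one multiplication for two products
    return C // s, C % s            # unpack: A@B1 and A@B2

A  = np.array([[1, 2], [3, 4]], dtype=object)
B1 = np.array([[5, 6], [7, 8]], dtype=object)
B2 = np.array([[1, 0], [0, 1]], dtype=object)
P1, P2 = packed_products(A, B1, B2)
assert (P1 == A.dot(B1)).all() and (P2 == A.dot(B2)).all()
```

The object dtype keeps the arithmetic exact for arbitrarily large integers, in the spirit of computing with long word lengths.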

Theorem: For all regular \(C\;(= A^HA)\) with \(n\) as before, \(C^{-1} \in {}^{\nu}\mathbb{C}^{n\times n}\) can be computed via the Schur complement² and identity matrices \(I\) multiplied with \(\varepsilon \in \mathcal{O}(\tilde{\nu})\) in running time \(T(n) = \mathcal{O}(p^2)\), since \[ T(n) = \hat{T}(\check{n}) + \mathcal{O}(p^2) = 4T(\tilde{4}n) +\mathcal{O}(p^2 + \tilde{2}p^2) = \dots = \mathcal{O}(p^2) \] holds by virtue of the GS.\(\square\)
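The Schur-complement recursion can be illustrated with the classical block-inversion identity, halving the dimension at each level. This is a sketch for Hermitian positive definite \(C\) only; `block_inverse` and the plain `numpy` arithmetic are illustrative, and the \(\varepsilon\)-perturbation and fast block multiplication of the theorem are omitted:

```python
import numpy as np

def block_inverse(C):
    """Recursive inversion of a Hermitian positive definite matrix via
    the Schur complement S = D - B^H A^{-1} B of the top-left block A."""
    n = C.shape[0]
    if n == 1:
        return np.array([[1.0 / C[0, 0]]])
    k = n // 2
    A, B = C[:k, :k], C[:k, k:]
    Bh, D = C[k:, :k], C[k:, k:]
    Ai = block_inverse(A)
    S  = D - Bh @ Ai @ B                 # Schur complement of A in C
    Si = block_inverse(S)
    TL = Ai + Ai @ B @ Si @ Bh @ Ai      # standard 2x2 block-inverse formula
    TR = -Ai @ B @ Si
    BL = -Si @ Bh @ Ai
    return np.block([[TL, TR], [BL, Si]])

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
C = M.T @ M + 4 * np.eye(4)              # well-conditioned SPD test matrix
assert np.allclose(block_inverse(C) @ C, np.eye(4))
```

With fast multiplication of the half-sized blocks, the usual analysis of this recursion gives a running-time recurrence of the form \(T(n) = 2T(n/2) + \mathcal{O}(M(n))\) for block-multiplication cost \(M(n)\).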

Corollary: Every solvable linear system \(Ax = b\) can be solved for regular or, as above, \(\varepsilon\)-perturbed singular \(A^HA\) with accuracy in \(]0, 1[\) in running time \(\mathcal{O}(p^2)\).\(\square\)
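A minimal numerical sketch of the \(\varepsilon\)-perturbation: solving via the regularised normal equations makes the system matrix regular even when \(A^HA\) is singular. The function name, the concrete \(\varepsilon\) and the plain `numpy.linalg.solve` call are illustrative assumptions, not the corollary's exact procedure:

```python
import numpy as np

def solve_regularised(A, b, eps=1e-10):
    """Solve Ax = b via the epsilon-perturbed normal equations
    (A^H A + eps*I) x = A^H b; the perturbation eps*I regularises
    the system even when A^H A is singular."""
    n = A.shape[1]
    C = A.conj().T @ A + eps * np.eye(n)
    return np.linalg.solve(C, A.conj().T @ b)

# overdetermined but consistent system
A = np.array([[2.0, 0.0], [0.0, 3.0], [0.0, 0.0]])
b = np.array([4.0, 9.0, 0.0])
x = solve_regularised(A, b)
assert np.allclose(A @ x, b, atol=1e-6)
```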

Theorem: The intex method solves every solvable LP with inter-/extrapolations in \(\mathcal{O}(p^2)\).

Proof and algorithm: Let \(z := m + n\) and \(d \in [0, 1]\) be the density of \(A\). First we normalise and scale \[ b^{T}y - c^{T}x \le 0,\quad Ax \le b,\quad A^{T}y \ge c. \] Let \[ P_r := \{(x, y)^T \in {}^{\nu}\mathbb{R}_{\ge 0}^{z} : b^{T}y - c^{T}x \le r \in [0, \rho],\; Ax - b \le r{\upharpoonright}_m,\; c - A^{T}y \le r{\upharpoonright}_n\} \] have radius \[\rho := s|\min \{b_1, …, b_m, -c_1, …, -c_n\}|\] and scaling factor \(s \in [1, 2]\). Then \(0{\upharpoonright}_z \in \partial P_{\rho}\) holds.

By the strong duality theorem³ the LP \[ \min \{ r \in [0, \rho] : (x, y)^T \in P_r\} \] also solves the LPs \[ \max \{{c}^{T}x : x \in {}^{\nu}\mathbb{R}_{\ge 0}^{n},\; Ax \le b\} \quad\text{and}\quad \min \{{b}^{T}y : y \in {}^{\nu}\mathbb{R}_{\ge 0}^{m},\; A^{T}y \ge c\}. \] Its solution is the geometric centroid \(g\) of the polytope \(P_0\). With \[ v_k^* := \min \check{v}_k + \max \check{v}_k,\quad k = 1, …, \grave{z}, \] \(g\) is approximated by \(v_0 := (x_0, y_0, r_0)^T\) until \(||\Delta v||_1\) is sufficiently small. The solution \(t^o(x^o, y^o, r^o)^T\) of the two-dimensional LP \[ \min \{r \in [0, \rho] : t \in {}^{\nu}\mathbb{R}_{> 0},\; t(x_0, y_0)^T \in P_r\} \] gives a better approximation to \(g\). One obtains \(r \le \check{\rho}\), and the procedure is repeated with \(t^o(x^o, y^o)^T\) until, if applicable, \(g \in P_0\) has been computed in \(\mathcal{O}({}_2\rho^2 dmn)\).

Solving all two-dimensional LPs \(\min_k r_k\) by bisection for \(r_k \in {}^{\nu}\mathbb{R}_{\ge 0}\), \(k = 1, …, z\), each in \(\mathcal{O}(p^2)\), yields \(u \in {}^{\nu}\mathbb{R}^{z}\) with \( u_k := \Delta v_k \Delta r_k/r\) and \(r := \min_k \Delta r_k. \) For simplicity assume \(|\Delta v_1| = \dots = |\Delta v_{z}|\). In this setting \(\min r_{\grave{z}}\) for \(v^* := v + wu\) with \(w \in {}^{\nu}\mathbb{R}_{\ge 0}\) is likewise to be solved.
If \(\min_k \Delta r_k\, r = 0\), the computation stops; otherwise the procedure is repeated until \(\min r = 0\) or \(\min r > 0\) is established.\(\square\)
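The inner one-parameter step can be made concrete: for a fixed direction \((x_0, y_0)\), the maximal constraint violation as a function of the scaling \(t\) is a pointwise maximum of affine functions, hence convex, so a one-dimensional search suffices. The following is a sketch of just this subproblem, not the intex method itself; `min_violation`, the ternary search and the bound `t_max` are illustrative assumptions:

```python
import numpy as np

def min_violation(A, b, c, x0, y0, t_max=10.0, iters=100):
    """For a fixed direction (x0, y0) >= 0, minimise over t in [0, t_max]
    the maximal violation r(t) of b^T y - c^T x <= 0, Ax <= b, A^T y >= c
    at (t*x0, t*y0). r(t) is a maximum of affine functions of t, hence
    convex, so ternary search converges to its minimiser."""
    def r(t):
        return max(b @ (t * y0) - c @ (t * x0),
                   (A @ (t * x0) - b).max(),
                   (c - A.T @ (t * y0)).max(), 0.0)
    lo, hi = 0.0, t_max
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if r(m1) <= r(m2):
            hi = m2
        else:
            lo = m1
    t = (lo + hi) / 2
    return t, r(t)

# toy LP: max x1 + x2 s.t. x <= (1, 1); optimum (1, 1), dual y = (1, 1)
A = np.eye(2); b = np.ones(2); c = np.ones(2)
t, r = min_violation(A, b, c, np.full(2, 0.5), np.full(2, 0.5))
assert abs(r) < 1e-9 and abs(t - 2.0) < 1e-6
```

For this toy LP the search scales the interior direction \((0.5, 0.5)\) up by \(t = 2\) to the optimal \((1, 1)\) with zero residual violation.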

Remark: The running time for \(m, n \in {}^{\nu}\mathbb{N}^*\) is \(\mathcal{O}(dmn)\), where \(d\) is the density of the matrix \(A\). Deliberate rounding errors in \(b\) and \(c\) can make a solution unique. If the method is optimised for distributed computing in \({}^{\nu}\mathbb{R}^{\nu}\), the running time is only \(\mathcal{O}(1)\). If \(\acute{q}_0 \le \lfloor q_j \rfloor - q_j\) holds for integer-valued solutions \(q_j\), the method performs well for (mixed-)integer problems and (non-)convex (Pareto) optimisation.

Remark: The method exploits the coincidence of the computed geometric centroid and the inball centre of \(P_0\). Pairwise rotation of the coordinate axes of the former and of \(c\) by a simple rotation matrix ensures this in a comparable running time. If necessary, all constraints can temporarily be relaxed by the same small amount. Numbers of length \(\mathcal{O}(\nu)\) can only be processed in \(\mathcal{O}(p)\). The following statements can also be transferred to the complex case without difficulty:

Conclusion: The LP \( \max \{{||x – {x}^{o}||}_1 : {c}^{T}x = {c}^{T}{x}^{o},\; Ax \le b,\; x – {x}^{o} \in {[-1, 1]}^{n},\; x \in {}^{\nu}\mathbb{R}_{\ge 0}^{n}\} \) can, starting from a first solution \({x}^{o}\), compute a second one in \(\mathcal{O}(p^2)\), with \({y}^{o}\) treated analogously.\(\square\)

Conclusion: The LP \( \max \{\nu|\lambda_j| + ||x_j||_1 : Ax_j = \lambda_j x_j,\; |\lambda_j| \in [0, r_j],\; x_j \in {[-1, 1]}^{n*}\} \) can, for each \(j = 1, …, n\), determine the eigenvalue \(\lambda_j \in {}^{\nu}\mathbb{R}\) and eigenvector \(x_j \in {}^{\nu}\mathbb{R}^{n}\) of the matrix \(A \in {}^{\nu}\mathbb{R}^{n \times n}\) in \(\mathcal{O}(p^2)\).\(\square\)

Conclusion: The LPs \(\max \{x_{j} : Ax = 0\}\) yield all solutions of \(Ax = b\) in \(\mathcal{O}(p^2)\). The matrix \(A\) is regular if and only if the LP \(\max \{||x||_{1} : Ax = 0\} = 0\).\(\square\)

Conclusion: Let \(\alpha_{j} := A_{.j}^{-1}\) for \(j = 1, …, n\) be the columns of \(A^{-1} \in {}^{\nu}\mathbb{R}^{n \times n}\) and \(\delta_{ij}\) the Kronecker delta. For regular \(A\) the eigenproduct is non-zero and each system \( A \alpha_{j} = (\delta_{1j}, …, \delta_{nj})^{T} \) can be solved in \(\mathcal{O}(p^2)\).\(\square\)
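The column-by-column inversion reads directly as code. In this sketch, `numpy.linalg.solve` stands in for the \(\mathcal{O}(p^2)\) solver of the corollary:

```python
import numpy as np

def inverse_by_columns(A):
    """Compute A^{-1} column by column: the j-th column alpha_j solves
    A @ alpha_j = e_j, the j-th unit vector (Kronecker deltas)."""
    n = A.shape[0]
    cols = [np.linalg.solve(A, np.eye(n)[:, j]) for j in range(n)]
    return np.column_stack(cols)

A = np.array([[4.0, 1.0], [2.0, 3.0]])
assert np.allclose(inverse_by_columns(A) @ A, np.eye(2))
```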

Conclusion: The polynomial \(c_{\acute{n}}x^{\acute{n}} + \dots + c_1x + c_0\) can be determined in \(\mathcal{O}(p^2)\) from measurements \((x_j, y_j) \in {}^{\nu}\mathbb{R}^2\) and residuals \(r_j \in {}^{\nu}\mathbb{R}\) for \(j = 1, …, m\) as well as \((c, x)^T \in {}^{\nu}\mathbb{R}^{\grave{n}}\) from \[ y = r + Xc,\quad X^Tr = 0,\quad X = (x_j^k)_{j,k},\ k = 0, …, \acute{n}, \] including additional constraints.\(\square\)
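The characterisation \(y = r + Xc\), \(X^T r = 0\) is exactly the normal-equation description of a least-squares polynomial fit, which a standard solver reproduces. A sketch, with `numpy.linalg.lstsq` standing in for the LP-based computation:

```python
import numpy as np

def fit_polynomial(x, y, n):
    """Least-squares fit of a degree-(n-1) polynomial: the model
    y = r + Xc with X^T r = 0 characterises the least-squares
    coefficients c for the Vandermonde matrix X."""
    X = np.vander(x, n, increasing=True)        # columns x^0, ..., x^{n-1}
    c, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ c
    assert np.allclose(X.T @ r, 0, atol=1e-8)   # residual orthogonality
    return c, r

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2 + 3 * x + x**2                            # exact quadratic data
c, r = fit_polynomial(x, y, 3)
assert np.allclose(c, [2.0, 3.0, 1.0]) and np.allclose(r, 0)
```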

Conclusion: The discrete Fourier transform (see scientific computing) can determine the Padé approximation \(\frac{a(x)}{b(x)} = {\LARGE{\textbf{+}}}_{m=0}^n a_m x^m / {\LARGE{\textbf{+}}}_{m=0}^n b_m x^m\) of \(f(x)\) for \(x \in {}^{\nu}\mathbb{R}\) in \(\mathcal{O}(p^2)\).\(\square\)
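For comparison, the classical route to the Padé approximation solves a small Toeplitz system for the denominator coefficients; this is not the DFT variant referred to above, but it determines the same object. A sketch with the normalisation \(b_0 = 1\):

```python
import numpy as np
from math import factorial

def pade(c, n):
    """Classical [n/n] Pade approximation from Taylor coefficients c:
    the denominator b (with b_0 = 1) solves the Toeplitz system
    sum_j b_j c_{m-j} = 0 for m = n+1, ..., 2n, and the numerator
    follows from a_m = sum_{j<=m} b_j c_{m-j} for m = 0, ..., n."""
    T = np.array([[c[n + 1 + i - j] for j in range(1, n + 1)]
                  for i in range(n)])
    rhs = -np.array([c[n + 1 + i] for i in range(n)])
    b = np.concatenate(([1.0], np.linalg.solve(T, rhs)))
    a = np.array([sum(b[j] * c[m - j] for j in range(m + 1))
                  for m in range(n + 1)])
    return a, b

c = [1.0 / factorial(k) for k in range(4)]   # Taylor coefficients of exp
a, b = pade(c, 1)                            # [1/1]: (1 + x/2)/(1 - x/2)
assert np.allclose(a, [1.0, 0.5]) and np.allclose(b, [1.0, -0.5])
```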

Corollary: Intex and two-dimensional bisection or Newton methods solve most solvable convex programmes \[ \min \{f_{1}(x) : x \in {}^{\nu}\mathbb{R}^{n},\ (f_{2}(x), …, f_{m}(x))^{T} \le 0\} \] with convex functions \(f_{k} \in {}^{\nu}\mathbb{R}\) for \(k = 1, …, m\) in polynomial time, provided that the number of operands \(x_{j}\) of the \(f_{k}\) is \(\le {\nu}^{\nu-3}\) and there exists an \(x\) with \(f_{k}(x) < 0\) for all \(k > 1\).\(\square\)

Corollary: The LP \( \max \{x : y_{\acute{m}} – xy_m = b_{\acute{m}},\; y_n = y_0 = 0,\; x \le x_0 \in {}^{\nu}\mathbb{R}\} \) can, for \(m = 1, …, n\), determine in \(\mathcal{O}(p^2)\) a value \(x \in {}^{\nu}\mathbb{R}\) that solves \(b_{\acute{n}}x^{\acute{n}} + \dots + b_1x + b_0 = 0\) via Horner’s scheme.\(\square\)
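The LP encodes Horner's recurrence \(y_{\acute{m}} = b_{\acute{m}} + x\,y_m\) with \(y_n = 0\), so that \(y_0\) is the polynomial value and \(y_0 = 0\) characterises a root. A sketch of this reading combined with plain bisection; `root_by_bisection` and the sign-change assumption are illustrative, not the LP itself:

```python
def horner(b, x):
    """Evaluate b[n-1]*x^{n-1} + ... + b[1]*x + b[0] by Horner's scheme:
    y_n = 0, y_{m-1} = b[m-1] + x*y_m, and y_0 is the polynomial value."""
    y = 0.0
    for coeff in reversed(b):
        y = coeff + x * y
    return y

def root_by_bisection(b, lo, hi, iters=200):
    """Find a root of the polynomial in [lo, hi], assuming a sign change,
    mirroring the LP's search for a feasible x with y_0 = 0."""
    f_lo = horner(b, lo)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f_lo * horner(b, mid) <= 0:
            hi = mid
        else:
            lo, f_lo = mid, horner(b, mid)
    return (lo + hi) / 2

# x^2 - 2 = 0 on [0, 2] gives sqrt(2)
assert abs(root_by_bisection([-2.0, 0.0, 1.0], 0.0, 2.0) - 2**0.5) < 1e-12
```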

The intex method solves most solvable LPs in \(\mathcal{O}(p^2)\) by combining inter- and extrapolations with centroid projections. It exploits the geometry of the feasible polyhedron and its sections in projections and is independent of pivot rules. In situations where additional algebraic structure is present (Toeplitz, Hankel, block-circulant or generally convolution-like), a geometric search is still possible, but typically less efficient than a direct structured factorisation. In such cases the FTD and the FSR methods are complementary to the intex method.

The FTD exploits the fact that a regular matrix \(A\in{}^{\nu}\mathbb{R}^{n\times n}\) whose entries vary only slowly along the diagonals or can be described by convolution operators is well approximated by the product of two Toeplitz matrices \(T_{1},T_{2}\). Initially, \(x_{0}=\max|A_{ij}|\) is chosen; subsequently, the sequences \(x_{k},y_{k}\) are formed by local arithmetic means of neighbouring entries. The diagonal sums \[ z_{1}=x_{0}y_{0}+x_{1}y_{1}+\ldots,\qquad z_{2}=x_{0}y_{-1}+x_{1}y_{0}+\ldots \] are computed recursively until \(A\approx T_{1}T_{2}\) is achieved. Normalising the columns of \(T_{1}\) and \(T_{2}\) stabilises the recursion. Backward substitution along the diagonals yields a unique approximation. The number of steps per diagonal is \(\mathcal{O}(\ell)\). Once \(T_{1}^{-1}\) and \(T_{2}^{-1}\) have been computed, the purely structured operations imply \[ T(n)=\mathcal{O}(p^{2}),\qquad p=\ell n. \] For symmetric systems, starting from \(H=T_{1}^{T}T_{1}\), one obtains the factorisation \(H\approx P^{T}DP\) with diagonal matrix \(D\) and block-Toeplitz matrix \(P\), which is used in the LUX method (Linear Universal Exact) for solving structured LPs. It replaces classical Cholesky or QR factorisations and preserves Toeplitz, Hankel and circulant structures exactly.
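The structured operation the FTD builds on, applying a Toeplitz matrix via convolution, can be sketched directly. Function names are illustrative; `np.convolve` computes the convolution by the direct formula, while an FFT-based convolution would bring the cost to \(\mathcal{O}(n \log n)\):

```python
import numpy as np

def toeplitz_matvec(col, row, x):
    """Multiply a Toeplitz matrix (first column col, first row row,
    with row[0] == col[0]) by a vector x via convolution: all diagonal
    values form one profile t, and (Tx)_i = sum_j t_{i-j} x_j."""
    n = len(x)
    t = np.concatenate((row[::-1], col[1:]))   # full diagonal profile
    return np.convolve(t, x)[n - 1:2 * n - 1]

col = np.array([1.0, 2.0, 3.0])   # first column
row = np.array([1.0, 4.0, 5.0])   # first row
x = np.array([1.0, 1.0, 1.0])
T = np.array([[1.0, 4.0, 5.0],
              [2.0, 1.0, 4.0],
              [3.0, 2.0, 1.0]])
assert np.allclose(toeplitz_matvec(col, row, x), T @ x)
```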

In parallel, the FSR methods are based on recursive shifts and modulo reconstruction. A matrix \(A\) is decomposed into low- and high-weight parts, \[ A=(A_{\mathrm{hi}}\ll s)+A_{\mathrm{lo}}, \] where \(\ll s\) denotes a generative shift. The products of the individual parts are computed recursively and then recombined modulo a suitably chosen \(s\) to yield an exact result. In this way all intermediate values remain dynamically small, the recursion is fully parallelisable, and BigInt or BigFloat arithmetic does not lead to overflow problems. Thus, for matrix multiplication, inversion and Schur-complement steps the same complexity as for FTD holds, \[ T_{\mathrm{FSR}}(n)=\mathcal{O}(p^{2}). \] The three methods complement one another in a natural way. Intex solves LPs by reducing them to the computation of a centroid in the feasible polyhedron and is particularly suitable for problems with weak or no algebraic structure. FTD solves structured linear systems and symmetric factorisations exactly and requires only convolution operators and diagonal updates. FSR performs the arithmetic operations recursively and in a memory-efficient manner and is suitable for very large integers and high precision. If a Toeplitz, Hankel or general convolution structure is present in an LP or in a linear system, the composition
\[ A \approx T_{1}T_{2} \;\text{via FTD}, \qquad T_{1}^{-1},\, T_{2}^{-1} \;\text{recursively via FSR}, \qquad \text{projection or LP cut via intex}, \]
reduces the overall effort to \(\mathcal{O}(p^{2})\) for the structured linear systems and the LPs. In this way, large problems in the generative number spaces \({}^{\nu}\mathbb{R}^{n\times n}\) can be solved stably and exactly. The methods are independent of the word length of the entries and scale almost linearly for numbers of size \(\mathcal{O}(\nu)\).
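The high/low decomposition that the FSR methods start from can be sketched for exact integer matrices. This shows only the split and shift recombination; the modulo reconstruction and the parallel recursion described above are omitted, and the names are illustrative:

```python
import numpy as np

def split_multiply(A, B, bits):
    """Multiply integer matrices by splitting each into high- and
    low-weight parts, A = A_hi * s + A_lo with s = 2^bits (a shift),
    multiplying the smaller-entry parts and recombining by shifts;
    all intermediate entries stay dynamically small."""
    s = 1 << bits
    A_hi, A_lo = A // s, A % s
    B_hi, B_lo = B // s, B % s
    return (A_hi.dot(B_hi) * s * s              # shift by 2*bits
            + (A_hi.dot(B_lo) + A_lo.dot(B_hi)) * s
            + A_lo.dot(B_lo))

rng = np.random.default_rng(1)
A = rng.integers(0, 1 << 16, (4, 4)).astype(object)  # exact BigInt entries
B = rng.integers(0, 1 << 16, (4, 4)).astype(object)
assert (split_multiply(A, B, 8) == A.dot(B)).all()
```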

The combination of FTD, FSR and intex thus forms a universal toolkit for structured linear algebra and optimisation. It combines structure (Toeplitz, Hankel, circulant), recursion (shift, convolution) and geometry (inter-/extrapolation) into a method that guarantees both regularity and exact solutions in short running time. This provides a coherent approach to problems that previously were only treated separately via numerical or geometric methods.

The following orders of magnitude summarise the intersection points between classical and new methods and the resulting newly computable regimes. The thresholds given are to be understood heuristically and depend on implementation, hardware and required accuracy; however, they indicate regions in which the new methods could systematically outperform classical approaches.

Each entry below lists the problem class (with \(p = \ell n\)), the classical method with complexity \(T = \mathcal{O}(n^3)\), the new method with complexity \(T = \mathcal{O}(p^2)\), and the intersection region with newly computable regimes:

Dense general LP. Classical: Simplex, Interior-Point, Face algorithms. New: intex. For small \(m,n \lesssim 10^2\) classical methods are usually faster; for \(m+n \gtrsim 10^3\) (in particular for dense matrices) the quadratic cost of the intex method dominates. Newly computable: very large dense LPs with \(m,n \in {}^{\nu}\mathbb{N}^*\).

Structured linear systems (Toeplitz, Hankel, block-circulant). Classical: Cholesky, QR, SVD, with loss of structure and high memory demand. New: FTD (Toeplitz factorisation) combined with FSR. Already for \(n \gtrsim 10^3\) structured factorisation becomes advantageous compared with unstructured methods. Newly computable: large Toeplitz and Hankel systems with exact structure and memory conservation, including BigInt and BigFloat arithmetic.

Matrix inversion and Schur complement on \(C = A^HA\). Classical: direct inversion, classical block methods. New: fast matrix multiplication, Schur complement with FSR. Comparable for moderate dimensions; for \(n \gtrsim 10^3\) clear advantage of the recursive methods. Newly computable: high-dimensional systems with regular \(A^HA\), in particular for very long word lengths.

Eigenvalues, eigenvectors, polynomial and Padé approximation. Classical: QR iteration, classical least-squares methods. New: intex-based LP formulations for eigenvalue and interpolation problems. For small matrices classical methods are preferable; for \(n \gtrsim 10^3\) and structured or sparse coefficient matrices, intex-based formulations perform better. Newly computable: large spectral problems and high-order approximations with exact constraints.

Structured LPs with convolution or Toeplitz matrix. Classical: Interior-Point without exploiting the structure, effectively close to \(\mathcal{O}(n^3)\). New: combination of FTD, FSR and intex with overall cost \(\mathcal{O}(p^2)\). The intersection region arises once the structure of \(A\) is more dominant than its mere density, typically from \(n \gtrsim 10^3\) onwards. Newly computable: large convex programmes with Toeplitz, Hankel or circulant structures in \(A\) that previously could only be treated roughly or stochastically.

code of the simplex method

© 2008-2025 by Boris Haase

References

  1. see Golub, Gene H.; van Loan, Charles F.: Matrix Computations; 3rd ed.; 1996; Johns Hopkins University Press; Baltimore, p. 31 ff.
  2. see Knabner, Peter; Barth, Wolf: Lineare Algebra; 2nd, revised and expanded ed.; 2018; Springer; Berlin, p. 223.
  3. see Vanderbei, Robert J.: Linear Programming; 3rd ed.; 2008; Springer; New York, p. 60–65.