Scientific Computing

Scientific Computing
Scientific Computing (image credit: AI-generated)

The following section presupposes Set Theory and Nonstandard Analysis.

From Crown Series to Euler-Maclaurin Formulas

Definition: Let \(\varepsilon \in [\tilde{\nu}, 1 – \tilde{\nu}]\). Let according to the trapezoidal rule \({}^{T}{\uparrow}_{z\in A}{f(z){\downarrow}z:={\LARGE{\textbf{+}}}_{z\in A}{\tilde{2}(f(z)+f(\overset{\rightharpoonup}{z}))(\overset{\rightharpoonup}{z}-z)}}\) and according to the midpoint rule \({}^{M}{\uparrow}_{z\in A}{f(z){\downarrow}z:={\LARGE{\textbf{+}}}_{z\in A}{f(\tilde{2}(z\,+\overset{\rightharpoonup}{z}}))(\overset{\rightharpoonup}{z}-z)}\). For \(j, k \in {}^{\omega}\mathbb{N}, f \in \mathcal{C}_{\pi}^{j+2}\) and Fourier coefficients \(c_k := \tilde{\hat{\pi}}{\uparrow}_{-\pi}^{\pi}{f(t)\tilde{\epsilon}^{\underline{k}t}{{\downarrow}t}}\), the series \({\LARGE{\textbf{+}}}_{k=-\omega}^{\omega}{c_k\epsilon^{\underline{k}t}}\) is called Fourier series, with possibly other period lengths than \(\hat{\pi }\)1. Let \(f_n^*(z) = f(\eta_nz)\) sisters of the TS \(f(z) \in \mathcal{O}(D)\) centred on 0 on the domain \(D \subseteq {}^{\omega}\mathbb{C}\) where \(m, n \in {}^{\omega}\mathbb{N}^{*}\) and \(\eta_n^m := \underline{1}^{2^{\lceil m/n \rceil}}\). Then let \(\delta_n^*f = \check{f} – \check{f}_n^*\) the halved sister distances of \(f\). Let \(s \in {}^{\omega}\mathbb{Z}^{*}, \mu_s^m := z^{-s}m!/(m – s)!\) and \(\widetilde{|s|\text{-}!} := 0\). On the level of TS, \(\mu\) and \(\eta\) form a \(\mu\)-\(\eta\)-calculus.\(\triangle\)

Remark: The latter allows representing integrals and derivatives easily and finitely closed2. It holds for the TS \({}^2f(x){\uparrow}_0^x{\uparrow}_0^{y}{g(v){\downarrow}v{\downarrow}y} = f(x\mu_2)g(x\mu_{-2}).\)

Example: \(\text{Re} \; c \in [\tilde{\nu}, 1 + \tilde{\nu}[, c \in {}^{\omega}\mathbb{C}\) and \(\varepsilon := \tilde{2}^j, j \in {}^{\omega}\mathbb{N}^{*}\) imply \(\zeta (n+c)=a_n{\LARGE{\textbf{+}}}_{k=1}^{n}{\delta_n^* v_c(\varepsilon u^k)}\) for \(z \in {}^{1-\tilde{\nu}}\dot{\mathbb{C}} \subset \mathbb{D}\)3 and \(v_c(z):={\LARGE{\textbf{+}}}_{m=1}^{\omega }{\zeta (m+c){{z}^{m}}}=z{\LARGE{\textbf{+}}}_{m=1}^{\omega }{{{{\widetilde{m}}}^{c}}\widetilde{z-m}}.\square\)

Example: A Richardson extrapolation determines the digamma function \(\psi\) and\[\zeta(\grave{n}) =\widetilde{\varepsilon^n\hat{n}}{\LARGE{\textbf{+}}}_{k=1}^n{(\psi(\varepsilon u^k \underline{1}^{2\tilde{n}}) – \psi(\varepsilon u^k))} + \mathcal{O}(\varepsilon^{\hat{n}})\] for \(n = 2\) and \(\varepsilon = \tilde{2}^{13}\) as \(\zeta(3) = 2^{24}{\LARGE{\textbf{+}}}_{k=1}^2{(\psi(\underline{\varepsilon} u^k) – \psi(\varepsilon u^k))} + \mathcal{O}(10^{-16})\).

Representation theorem for integrals: The TS (see below) \(f(z) \in \mathcal{O}(D)\) centred on 0 for \(D \subseteq {}^{\omega}\mathbb{C}\) gives for \(\grave{m}, n \in {}^{\omega}\mathbb{N}^*\)\[{\uparrow}_0^z…{\uparrow}_0^{\zeta_2}{f(\zeta_1){\downarrow}\zeta_1\;…\;{\downarrow}\zeta_n} = f(z\mu_{-n}).\square\]Representation theorem for derivatives: For \({}^{\widetilde{\nu}}\dot{\mathbb{C}} \subset \mathbb{D} \subseteq {}^{\omega}\mathbb{C}, n\)-th unit roots, TS\[f(z):=f(0) + {\LARGE{\textbf{+}}}_{m=1}^{\omega }{\widetilde{m!}\ {}^mf(0){z^m}},\]\(\varepsilon := \tilde{2}^j\tilde{r}, j \in {}^{\omega}\mathbb{Z}, n = {\epsilon}^{\sigma} \in {}^{\omega}\mathbb{N}^{*}, u :={\epsilon}^{\tilde{n} \hat{\underline{\pi}}}\) and \(f\)’s radius of convergence \(r \in {}^{\nu}{\mathbb{R}}_{>0}\) imply\[{}^nf(0)=2^{jn}\acute{n}!{\LARGE{\textbf{+}}}_{k=1}^{n}{\delta_n^* f(\tilde{2}^j u^k)}.\square\]Universal multistep theorem: For \(n \in {}^{\nu}\mathbb{N}_{\le p}, k, m, p \in {}^{\nu}\mathbb{N}^{*}, {\downarrow}\overset{\rightharpoonup}{x} \in\, ]0, 1[, x \in [a, b] \subseteq {}^{\omega}\mathbb{R}, y : [a, b] \rightarrow {}^{\omega}\mathbb{R}^q,\) \(f : [a, b]\times{}^{\omega}\mathbb{R}^{q \times n} \rightarrow {}^{\omega}\mathbb{R}^q, g_k(\overset{\rightharpoonup}{x}) := g_{\acute{k}}(x)\), and \(g_0(a) = f(\overset{\leftharpoonup}{a}, y_0, … , y_{\acute{n}})\), the TS of the initial value problem \(y^\prime(x) = f(x, y((\rightharpoonup)^0 x), … , y((\rightharpoonup)^{\acute{n}} x))\) of order \(n\) implies\[y(\overset{\rightharpoonup}{x}) = y(x) + {\downarrow}\overset{\rightharpoonup}{x}{\LARGE{\textbf{$\pm$}}}_{k=1}^{p}{\left (g_{p-k}(\overset{\rightharpoonup}{x}){\LARGE{\textbf{+}}}_{m=k}^{p}{\widetilde{m!}\tbinom{\acute{m}}{\acute{k}}}\right )} + \mathcal{O}(({\downarrow}\overset{\rightharpoonup}{x})^{\grave{p}}).\square\]Theorem for (anti-) derivatives of CS: For \(m \in {}^{\omega}\mathbb{Z}\), \(q = \tilde{\varepsilon}(z-a)\), \(a \in \mathbb{D}\) and \(j, k \in \mathbb{N}_{<n}\), modular arithmetic4 and \(n\)-th unit roots result in the analogous TS:\[{}^mf_{\tilde{n}}(z) := \tilde{n}(q^j)^T(\delta_{jk}\widetilde{\varepsilon q}^jj!/(j-m)!)({\tilde{u}}^{jk})(f({\varepsilon u}^k+a))+R_n^{m}(z).\square\]Reversion theorem of CS: For \(y \in f(\mathbb{D}), y(a) = b\) and \({}^1y(a) \ne 0\), Bürmann’s theorem yields5 :\[f_{\tilde{n}}^{-1}(y) := a + \tilde{n} {\LARGE{\textbf{+}}}_{m=1}^n{\widetilde{m}{\tilde{\varepsilon}}^{\acute{m}}(y – b)^m({\tilde{u}}^{\acute{m}k})^T(f(\varepsilon u^k + a)^{-m})}+R_n^{-}(y).\square\]Remark: Put \(y = 0\) for a sufficiently small \(\tilde{n}|a – z_0|\) such that \(z_0 \in {}^{\omega}\mathbb{C}\) is a zero of \(f(z) \in {}^{\omega}\mathbb{C}\). The previous theorem allows to easily transform implicit differential equations for CS into explicit ones.

Theorem for derivatives of Fourier series: For \(f \in \mathcal{C}_{\pi}^{m+2}\)6, \(m \in {}^{\omega}\mathbb{N}, j \in [-\acute{n}, n] \cap \mathbb{Z}, t \in [-\pi, \pi]\) and \(k \in \mathbb{N}_{<\hat{n}}\), the following CS analogue is implied allowing integration of individual terms:\[{}^m\check{f}_{\tilde{n}}(t):=\tilde{n}(u^{\tilde{\pi}jnt})^T(\delta_{(j+\acute{n})k}\underline{j}^m)(\tilde{u}^{jk})(\check{f}(\tilde{n}\pi k – \pi))+\mathcal{O}(\tilde{n}).\square\]Conclusion: Supporting points \(kr := \tilde{n}\pi k\) of the smooth \(f(kr)\) yield for \(j\) like \(k\) interpolating in \(\mathcal{O}(\sigma n)\):\[{}^m\check{f}_{\tilde{n}}(t):=\tilde{n}(u^{\tilde{r}jt})^T(\delta_{jk}\underline{j}^m)(\tilde{u}^{jk})(\check{f}(kr)).\square\]Remark: The identity instead of \(\delta_n^*\) provides arbitrarily precise approximations for the \({}^nf\). The error is comparable for analogously defined \(m\)-dimensional CS with \(\binom{m+n}{n}\) derivatives.

Theorem (fixed-point method for CS): Every \(z \in {}^{\omega}\mathbb{C}\) of an arbitrary \(m\)-polynomial \(p(z) = 0\) with \(m \in [2, n] \cap \mathbb{N}\) for \(n := 2^r, r \in {}^{\nu}\mathbb{N}^*\) and coefficients from \({}^{\nu}\mathbb{C}\) can be (initiating) determined also in \(\mathcal{O}(\sigma n).\)

Proof and algorithm: Let \(U = (\tilde{u}^{jk})\) for \(j, k \in \mathbb{N}_{<n}, u :={\epsilon}^{\tilde{n} \hat{\underline{\pi}}}, q := \hat{z}\) and \(s_k := p(\tilde{2}u^k)\). A simple transform achieves \(|q| < \tilde{2}\) for all zeros \(\zeta\) of \(p(z)\) and \(p(0) = 1\). It follows from \(p(z) = \tilde{n}(q^j)^TUs = \tilde{n}\mu^Ts = 0\) the simplified iteration \(\mu^* = U_{1}^{-T}\mu U((\delta_{jk}\tilde{u}^j)U^{-1}\mu-(U_{\acute{n}}^{-T}\mu+\beta s^T\mu, \beta s^T\mu, …, \beta s^T\mu)^T)\). Here let \(\delta_{jk}\) the Kronecker delta, starting point \(q := \tilde{2}\) and \(\beta \in {}^{\nu}\mathbb{C}^*\) such that each time \(||\mu^*-\mu||\) is roughly halved and \(\mu^Ts = 0\) holds. Finish by polynomial division where \(m > 2.\square\)

Approximation theorem: The \({}^sf(x) \in {}^{\omega}\mathbb{R}\) for \(x \in {}^{\omega}\mathbb{R}\) as above allow computing the interpolant

\(g(x) := {\LARGE{\textbf{+}}}_{r=0}^{\acute{m}}{\chi_{]x_r, x_{\grave{r}}[}(x)((x_{\grave{r}}-x)p_r(x)+(x-x_r)p_{\grave{r}}(x))/(x_{\grave{r}}-x_r)}+\)\({\LARGE{\textbf{+}}}_{r=0}^m{\chi_{\{x_r\}}(x)p_r(x)}\)

for \(m, n \in {}^{\nu}\mathbb{N}\) and \(p_r(x) := {\LARGE{\textbf{+}}}_{s=0}^n{{}^sf(x_r){(x-x_r)}^s/s!}\) in \(\mathcal{O}(\sigma mn)\) where \({}^sf(x_r) = {}^sg(x_r)\) holds for every \(x_r \in {}^{\omega}\mathbb{R}\). Replace in the complex case \({}^{\omega}\mathbb{R}\) by \({}^{\omega}\mathbb{C}\) and put \(x = \gamma(t) \in {}^{\omega}\mathbb{C}\) for the path \(\gamma(t)\) where \(t \in {}^{\omega}\mathbb{R}.\square\)

Theorem (differential equations): Let \(d(z) \in {}^{\omega}\mathbb{C}^{n+2}\) vector of CS for \(n \in {}^{\omega}\mathbb{N}\). If the system \({}^{\grave{n}}y = \left(\complement_0^n\ {}^{m}y, 1\right)d(z) \in {}^{\omega}\mathbb{C}\) where (not) \(\left(\complement_0^n\ {}^{m}y(a)\right)^T\in {}^{\omega}\mathbb{C}^{\grave{n}}\) and \(a, z \in {}^{\omega}\mathbb{C}\) has the solution \(y_0\), this can be determined (also by automatic differentiation) in \(\mathcal{O}(n^3)\) giving the error mentioned above.\(\square\)

Remark: If function values of crown points \(\varepsilon u^m + a\) around the crown centre \(a\) are evaluated, where \(\varepsilon = \tilde{2}\) and \(n = 64\) are well-chosen, and terminating CS (possibly by using Laurent and Fourier series) are used, both a polynomial division may be executed and (at least in sections) analytic, nonlinear or partial (integro-) differential equation systems can be solved.

Example: For \(b = (f(\varepsilon u^0 + a), …, f(\varepsilon u^{\acute{n}} + a), 1)^T \in {}^{\omega}\mathbb{C}^{\grave{n}}\) and an upper triangle matrix \(T(z) \in {}^{\omega}\mathbb{C}^{\grave{n}\times\grave{n}}\), ordinary differential equations have only solutions if \(T(z)b = 0\) holds.

Series theorem for integrals: For \(c \in [0, a], f \in {}^{\nu}\mathcal{C}^{n}\) and \(n, \hat{p}, r\,(= 2^p), t\,(=2^v), \hat{v} \in {}^{\nu}2\mathbb{N}^*\), CS show

\({\uparrow}_0^a f(w){\downarrow}w = {\uparrow}_0^{\tilde{t}a}{\LARGE{\textbf{+}}}_{s=1}^t{f(x+\acute{s}\tilde{t}a){\downarrow}x} = {\uparrow}_0^1 g(y){\downarrow}y\)\(= {\LARGE{\textbf{+}}}_{\check{q}=1}^{\check{r}}{{\LARGE{\textbf{+}}}_{\check{m}=0}^{\check{n}-1}{\widetilde{\grave{m}!}\ {}^m\hat{g}\left (\tilde{r}\acute{q} \right){\tilde{r}}^{\grave{m}}}}+\mathcal{O}\left(\widetilde{\grave{n}!}\ {}^ng(\tilde{a}c){\tilde{r}}^n\right).\square\)

Remarks: CS equivalents may replace any \({}^mg\) for \(g(y) := af(ay)\) and \(a > 0\). Transforming \(z := {\epsilon}^{\pm x}\) makes here infinite bounds of integration finite. A finite integral decreases the remainder’s modulus sufficiently. The midpoint rule is here more advantageous than the trapezoidal rule.

Example: For \({}_{\mp}g(\vartheta) := \check{\pi}(1 – {\varepsilon}^2 {\sin}^2(\check{\pi}\vartheta))^{\mp\tilde{2}}\), complete elliptic integrals of the first and second kind are \[{\uparrow}_0^{1} {}_{\mp}g(\vartheta) {\downarrow}\vartheta = {}_{\mp}g(\check{1}) + \widetilde{24}\ {}_{\mp}^2g(\check{1}) + \widetilde{1920}\ {}_{\mp}^4g(\check{1}) + \mathcal{O}\left(\widetilde{9!}\ {}_{\mp}^6g(\check{1})\right).\]Second Euler-Maclaurin formula: For \(f(\check{q}) = g(\tilde{r}\acute{q})\) and \(k = \grave{a} = \grave{r}\), the preceding theorem yields in \(\mathcal{O}(\sigma n)\)\[{\LARGE{\textbf{+}}}_{\check{q}=1}^{\check{r}} f(\check{q}) = {\uparrow}_{\check{1}}^{\check{k}} f(x){\downarrow}x + {\LARGE{\textbf{+}}}_{\check{m}=1}^{\check{n}-1} H_m \left ({}^{\acute{m}}f(\check{k}) – {}^{\acute{m}}f(\check{1}) \right ) + \mathcal{O}\left (H_n \left ({}^{\acute{n}}f(\check{k}) – {}^{\acute{n}}f(\check{1}) \right )\right ).\]Proof: For \(h(x)=x/\)sin \(x\) and \(H_m := \underline{1}^m\widetilde{m!}\ {}^mh(0)=\widetilde{m!}B_m(2-2^m) \rightarrow 2\tilde{\underline{\pi}}\text{-}^m\),\[{\LARGE{\textbf{+}}}_{\check{q}=1}^{\check{r}} g(\tilde{r}\acute{q}) = \check{r}{\uparrow}_0^1 g(y){\downarrow}y – {\LARGE{\textbf{+}}}_{\check{q}=1}^{\check{r}}{\LARGE{\textbf{+}}}_{\check{m}=1}^{\check{n}-1} {\widetilde{\grave{m}!}\ {}^mg(\tilde{r}\acute{q})} + \mathcal{O}\left(\widetilde{\grave{n}!}\ {}^ng(\tilde{a}c){\tilde{r}}^n\right)\]results in the claim by inserting and combining the factorials that are counted depending on partitions where \(B_m\) (cf. Nonstandard Analysis) are Bernoulli numbers and \(B_{\grave{m}} = 0, B_0=1\) as well as \(B_1=-\tilde{2}.\square\)

First Euler-Maclaurin formula: Mathematical induction for \(n\) shows also that7

\({\LARGE{\textbf{+}}}_{\check{q}=0}^{\check{r}}f(\check{q})=\uparrow_0^{\check{r}}{f\left(x\right){\downarrow}x}+\check{f}(\check{r})+\check{f}(0)+{\LARGE{\textbf{+}}}_{\check{m}=1}^{\check{n}-1}{\widetilde{m!}}B_m\left({}^{\acute{m}}f(\check{r})-{}^{\acute{m}}f(0)\right)+\)\(\mathcal{O}\left({\widetilde{n!}}B_n\left({}^{\acute{n}}f(\check{r})-{}^{\acute{n}}f(0)\right)\right).\square\)

Remark: Replace derivatives of the Euler-Maclaurin formulas by CS equivalents, too.

The Crown-FFT-Aberth Method for Root-finding

The Crown-FFT-Aberth method for root-finding

Let

\[ p(z)= {\LARGE{\textbf{+}}}_{j=0}^{m}{c_jz^j} \in{}^{\nu}\mathbb C[z], \qquad m\in{}^{\nu}\mathbb N^*, \qquad c_m\ne0. \]

We seek the zeros of \(p\). In contrast to a local Newton method, it is not assumed that a starting value close to an individual zero is already known. Instead, the polynomial is first represented globally around a freely chosen centre by means of a crown shift.

Let \( a\in{}^{\nu}\mathbb C \) be a crown centre. Furthermore, let \( n=2^r, r\in{}^{\nu}\mathbb N^*, n>m, \) and let \( u:=\epsilon^{\tilde n\hat{\underline{\pi}}} \) be a primitive \(n\)-th root of unity. For \(\sigma\in{}^{\nu}\mathbb Z\), let the crown radius be given by \(\rho:=2^\sigma\). This choice is numerically favourable, since scaling by powers of \(\rho\) is binary exact, provided that no overflow or underflow occurs.

The crown points are

\[ z_k:=a+\rho u^k, \qquad k=0,\dots,\acute n. \]

Set

\[ s_k:=p(z_k)=p(a+\rho u^k). \]

Scaled CS

The scaled variable \( z=a+\rho q \) implies

\[ P(q):=p(a+\rho q) = {\LARGE{\textbf{+}}}_{j=0}^{m}{d_jq^j}. \]

The coefficients \(d_j\) are obtained from the crown values by projection onto the roots of unity:

\[ d_j = \tilde n {\LARGE{\textbf{+}}}_{k=0}^{\acute n} {s_k\tilde u^{\,jk}}, \qquad j=0,\dots,m. \]

The unscaled TS or CS in \( h=z-a \) with \( b_j=\tilde\rho^{\,j}d_j \) is

\[ p(a+h)= {\LARGE{\textbf{+}}}_{j=0}^{m}{b_jh^j}. \]

For root-finding, the scaled form \(P(q)\) is preferable, since, with a suitable choice of \(\rho\), the desired \(q\)-zeros lie in a moderate region, often close to the unit circle.

FFT formulation

The sequence of crown values \( s=(s_0,\dots,s_{\acute n})^\top \) is projected by means of the fast Fourier transform. Up to the sign convention of the FFT used, one has

\[ (d_0,\dots,d_{\acute n})^\top = \tilde n\,\operatorname{FFT}(s). \]

For polynomials with \(m<n\), the reconstruction is alias-free over the exact field. Numerically, it is advisable to use \( n\ge \widehat m+1, \) usually again as a power of two.

Initial values for Aberth-Ehrlich

For the scaled polynomial

\[ P(q)= {\LARGE{\textbf{+}}}_{j=0}^{m}{d_jq^j} \]

one needs \(m\) distinct initial values. These are not obtained from the neighbourhoods of individual zeros, but are chosen uniformly on a starting circle.

Let \( v:=\epsilon^{\tilde m\hat{\underline{\pi}}} \) be a primitive \(m\)-th root of unity, and let \( R_A\in{}^{\nu}\mathbb R_{>0} \) be an Aberth starting-circle radius. The initial values are then set as

\[ q_i^{(0)} = R_A v^{\,i-\check 1}, \qquad i=1,\dots,m. \]

In ordinary notation this is

\[ q_i^{(0)} = R_A \exp\left(\tilde{m}\hat{\underline{\pi}}(i-\check 1)\right). \]

The half angular step \(i-\check 1\) avoids simple symmetry degeneracies. If \(\rho\) is chosen such that the relevant zeros of \(p\) are approximately captured by \( |z_\ell-a|\le \rho \), then the corresponding scaled zeros \( q_\ell=\tilde\rho(z_\ell-a) \) typically lie in the unit circle. In that case, \( R_A=1 \) is a natural choice.

If no geometric information is available, a Cauchy bound for \(P\) may be used:

\[ \acute R_A = |\tilde{d}_m|\max_{0\le j<m}|d_j|, \qquad d_m\ne0. \]

Then all zeros of \(P\) lie in the disc \( |q|\le R_A. \)

Aberth-Ehrlich step without eigenvalue decomposition

For current approximations \( q_1,\dots,q_m \), let

\[ \Delta_i:= \frac{P(q_i)}{{}^1P(q_i)}. \]

The Aberth-Ehrlich step8 is then

\[ q_i^+ = q_i – \frac{\Delta_i} { 1-\Delta_i {\LARGE{\textbf{+}}}_{\substack{\ell=1\\ \ell\ne i}}^{m} {\widetilde{q_i-q_\ell}} }. \]

The Newton component \(\Delta_i\) pulls \(q_i\) towards a zero of \(P\). The second term in the denominator acts as a mutual repulsion between the approximations. This prevents several approximations from collapsing onto the same zero without further control. After the iteration, the rescaling \(z_i=a+\rho q_i\) is performed.

Stopping criterion

A stopping criterion based purely on step size is not sufficient for closely neighbouring zeros or clusters. Hence a scaled evaluation error is also used:

\[ \eta(q):= \frac{|P(q)|} {{\LARGE{\textbf{+}}}_{j=0}^{m}{|d_j||q|^j}}. \]

After a minimum number of iterations, the iteration is stopped if both

\[ \max_i |q_i^+-q_i| \le \tau_{\mathrm{step}}\max(1,\max_i|q_i|) \]

and

\[ \max_i\eta(q_i) \le \tau_{\mathrm{res}} \]

hold. This prevents premature termination for numerically sensitive clusters.

Refinement in application mode

In a real application, \(p\) is given either by coefficients, by an evaluation rule, or by a locally reconstructed CS. The final refinement may only use this actually available representation for \(p_1(z_i) := {}^1p(z_i) \ne 0\):

\[ z_i \leftarrow z_i-\tilde p_1(z_i) p(z_i). \]

For coefficient input, the evaluation may be carried out, for example, by Horner’s scheme. For a black-box function, the original function is evaluated.

Multiple zeros

Multiple zeros are intrinsically ill-conditioned. A small perturbation of the coefficients can split an \(r\)-fold zero into \(r\) nearby simple zeros. This is not a weakness of the crown shift, but a property of the root map itself.

If coefficients are available with sufficient accuracy, a square-free factorisation is useful. Let \( g:=\gcd(p,{}^1p). \) Then \( p_{\mathrm{sf}}:=p/g \) is the square-free part of \(p\). The zeros of \(p_{\mathrm{sf}}\) are simple. The multiplicities are then reconstructed from the separated factors.

If a multiplicity \(r\) is locally known or numerically estimated, then instead of the ordinary Newton step one may use the modified form

\[ z \leftarrow z- r\tilde p_1(z)p(z). \]

Precision-adaptive formulation

For clusters, nearly multiple zeros, or very poorly scaled polynomials, the working precision can be increased. Let \(\beta\) be the number of bits used. If the residual criterion is not met, or if a cluster remains unstable, one sets \(\beta\leftarrow\hat\beta\), i.e. the working precision is doubled. The crown evaluation, the Fourier projection, the Aberth-Ehrlich step, and the refinement are then repeated with increased precision. The scaling \( z=a+\rho q \) with \( \rho=2^\sigma \) is retained.

Benchmark mode

For synthetic tests, zeros \(r_1,\dots,r_m\) may be prescribed and the test polynomial \[ p(z)={\LARGE{\textbf{$\times$}}}_{j=1}^{m}{(z-r_j)} \] may be evaluated in factorised form. This factorised evaluation avoids the ill-conditioned inverse passage from zeros to monomial coefficients. In particular, the Wilkinson effect of the test generation is not confused with the error of the root-finding method.

A refinement at the known factors \[ p(z)={\LARGE{\textbf{$\times$}}}_{j=1}^{m}{(z-r_j)} \] which is possible in benchmark mode, is only a test oracle. It serves to isolate the quality of the initial approximations produced by crown-FFT and Aberth-Ehrlich. It is not part of the real solver when the zeros \(r_j\) are unknown.

Algorithm

\[ \begin{array}{ll} 1. & \text{Choose } a,\rho,n \text{ with } n>m, \text{ preferably } n=2^r,\ \rho=2^\sigma. \\[0.1em] 2. & \text{Compute the crown values } s_k=p(a+\rho u^k), \quad k=0,\dots,\acute n. \\[0.1em] 3. & \text{Compute the coefficients by FFT: } d_j= \tilde n {\LARGE{\textbf{+}}}_{k=0}^{\acute n} {s_k\tilde u^{\,jk}}. \\[0.1em] 4. & \text{Set } P(q)= {\LARGE{\textbf{+}}}_{j=0}^{m}{d_jq^j}. \\[0.1em] 5. & \text{Initialise } q_i^{(0)}=R_Av^{\,i-\check 1}, \quad i=1,\dots,m. \\[0.1em] 6. & \text{Solve } P(q)=0 \text{ by Aberth-Ehrlich.} \\[0.1em] 7. & \text{Set } z_i=a+\rho q_i. \\[0.1em] 8. & \text{In application mode, refine by } z_i\leftarrow z_i-p(z_i)/{}^1p(z_i). \\[0.1em] 9. & \text{Check the residuals and increase the precision if necessary.} \end{array} \]

Justification

From

\[ p(a+\rho u^k) = {\LARGE{\textbf{+}}}_{q=0}^{m}{d_qu^{qk}} \]

and the orthogonality of the roots of unity,

\[ \tilde n {\LARGE{\textbf{+}}}_{k=0}^{\acute n} {u^{(q-j)k}} = \delta_{jq}, \]

it follows that

\[ \tilde n {\LARGE{\textbf{+}}}_{k=0}^{\acute n} {p(a+\rho u^k)\tilde u^{\,jk}} = d_j. \]

With \(z=a+\rho q\), the scaled CS \(P(q)=p(z)\) is therefore reconstructed exactly, provided that \(m<n\). Since this is a bijective affine transformation, one has

\[ p(z)=0 \quad\Longleftrightarrow\quad P(q)=0. \]

The Aberth-Ehrlich iteration supplies simultaneous approximations to the zeros of \(P\). Rescaling then gives approximations to the zeros of \(p\). The refinement at the original \(p\) only reduces the numerical residual error and does not change the principle.

Cost

Let \(C_p\) be the cost of one evaluation of \(p\). The crown evaluation costs \(\mathcal O(nC_p)\), and the FFT costs \(\mathcal O(n\;{}_2n)\). For \(\ell\) Aberth-Ehrlich iterations, one obtains \(\mathcal O(\ell m^2)\) for degree \(m\), since the mutual repulsion terms contain quadratically many differences. For \(n=\mathcal O(m)\) this gives the total cost

\[ \mathcal O(mC_p+m\;{}_2m+\ell m^2). \]

For coefficient-given polynomials, typically \( C_p=\mathcal O(m). \) For simple, well-separated zeros, \(\ell\) is usually moderate in practice. For clusters, multiple zeros, or nearly multiple zeros, \(\ell\) may become much larger; in that case, adapted precision or a square-free factorisation is preferable.

Classical comparison

In the classical Aberth method applied to the original polynomial

\[ p(z)= {\LARGE{\textbf{+}}}_{j=0}^{m}{c_jz^j}, \]

the starting values are often chosen directly on a Cauchy circle with \(\acute R= |\tilde{c}_m| \max_{0\le j<m}|c_j|\):

\[ z_i^{(0)} = R \epsilon^{\tilde m\hat{\underline{\pi}}(i-\check 1)}, \qquad i=1,\dots,m. \]

By contrast, the crown-FFT-Aberth method first shifts and scales the problem by \(z=a+\rho q\). This often makes the initialisation in the \(q\)-plane simpler, typically with \(R_A=1\). The starting values do not need to be close to individual zeros; they merely need to cover the relevant zero region adequately.

Classification

The components of the method should be considered separately. The crown-FFT step produces a shifted global polynomial representation. Aberth-Ehrlich replaces the eigenvalue decomposition of the companion matrix. Newton, or modified Newton, refines at the original problem. The method is therefore not a local Newton method with a guessed starting value, but a global initialisation method by crown shifting. It avoids the assumption that one has already started near an individual zero.\(\square\)

From the Extraction of Linear Factors to Runge–Kutta

Algorithmic Refoundation: Extraction of Linear Factors Instead of Blind Evaluation

Standard approximation algorithms often operate in a structurally blind manner. They evaluate the target function at discrete nodes or calculate local derivatives without questioning their global topology. If a classical algorithm encounters a pole, it numerically interprets it simply as an extremely steep ascent. The result is a procedural breakdown: linear systems become singular, rounding errors escalate uncontrollably into NaN (Not a Number) values, and the resulting polynomial loses its mathematical validity.

The approach pursued here abandons this purely reactive path and implements a proactive structural extraction (into linear factors). The computational process is thereby divided into three strictly separated phases:

  1. Structural Scan: Prior to any series expansion, a preliminary algorithm actively searches for true singularities and roots. Through the rigorous evaluation of the modulus of complex numbers, singularities on the imaginary axis are also identified flawlessly.
  2. Analytical Extraction of Linear Factors: The recognised algebraic structures are extracted directly from the original function. Poles are multiplied out, and roots are divided out. What remains is a mathematically smoothed, analytical residual function that avoids numerical escalation.
  3. Approximation and Reconstruction: The actual approximation tools are applied only to this stable residual function. Since the behaviour of the residual function is generally well-behaved, significantly lower polynomial degrees are sufficient for the approximation. In the final reconstruction step, the previously isolated algebraic poles and roots are exactly and losslessly reintegrated into the numerator and denominator polynomials of the final model.

The Blind Spot of the Real Axis: Complex Near-Singularities

An one-dimensional scan along the real axis is highly efficient, yet it possesses a fundamental blind spot: complex near-singularities. Let the rational function \(f(z) = \frac{z^2 + \epsilon^2}{z^2 + \delta^2}\) be considered for very small, real values of \(\epsilon\) and \(\delta\).

On the real axis, this function is continuous, smooth, strictly positive, and completely regular. A purely real structural search algorithm finds no anomalies. Nevertheless, purely complex roots at \(\pm i\epsilon\) and poles at \(\pm i\delta\) lie hidden in immediate proximity to the real axis.

These hidden poles act as analytical barriers. They limit the radius of convergence of classical power series to \(R = \delta\), which dooms any global approximation beyond this narrow interval to failure. To ensure truly robust extraction of linear factors, structural analysis must therefore not be restricted to the mathematical ideal line of the real axis. It is imperative that it is expanded into a two-dimensional tolerance tube (strip) in order to capture those complex singularities whose sphere of influence reaches deep into the real domain.

This paradigm shift – the methodological separation of the defining algebraic structure and the analytical remainder – protects the system at a fundamental level from numerical overfitting. The computation avoids simulating singularities through error-prone oscillations; instead, the model adapts from the ground up to the true, internal structure of the function.

ODE example: Crown series and convolution (FFT) vs. Runge-Kutta

As a deliberately simple, but nonlinear, reference problem, consider \[ {^1}y(x) = y(x)^2,\qquad y(0)=1. \] The exact solution is \[ y(x)=\widetilde{\acute{x}\text{-}}, \] with a pole at \(x=1\). Hence the radius of convergence of the TS about \(x_0\) equals \(R(x_0)=\acute{x}_0\)-.

Crown series as a Taylor expansion

For a step from \(x_0\) to \(x_0+h\), the local series expansion \[ y(x_0+s) = {\LARGE{\textbf{+}}}_{m=0}^{n} a_m\, s^m \;+\; \mathcal{O}(s^{\grave{n}}) \] is used (here \(s=x-x_0\)). Then, \[ {^1}y(x_0+s) = {\LARGE{\textbf{+}}}_{m=0}^{\acute{n}} \grave{m}\,a_{\grave{m}}\, s^m \;+\; \mathcal{O}(s^{n}). \] For the square, the Cauchy product (convolution) \[ y(x_0+s)^2 = \left({\LARGE{\textbf{+}}}_{m=0}^{n} a_m s^m\right)\left({\LARGE{\textbf{+}}}_{m=0}^{n} a_m s^m\right) = {\LARGE{\textbf{+}}}_{m=0}^{\hat{n}} c_m s^m \] is obtained with \[ c_m := {\LARGE{\textbf{+}}}_{j=0}^{m} a_j\, a_{m-j}. \] Matching coefficients in \({^1}y=y^2\) yields, for \(m=0,\dots,\acute{n}\), the recursion \[ \grave{m}\;a_{\grave{m}} = c_m = {\LARGE{\textbf{+}}}_{j=0}^{m} a_j\, a_{m-j}, \qquad a_0 = y(x_0). \] Thus, the \(a_m\) are determined entirely by convolutions and contain no binomial coefficients.

Convolution via FFT (core idea)

For the discrete convolution, vectors are used. Let \(A=(a_0,\dots,a_n)\). For a fast convolution, use zero-padding to length \(\acute{m} \ge \hat{n}\) (typically \(m=2^{\lceil {}_2(\hat{n}+1)\rceil}\)) and the DFT 9: \[ \breve{A}_k := {\LARGE{\textbf{+}}}_{j=0}^{\acute{m}} A_j\, u^{jk}, \qquad u := \exp(\underline{\hat{\pi}}\tilde{m}), \qquad k=0,\dots,\acute{m}. \] Then, for the convolution coefficients \(C=A*A\), componentwise, \[ \breve{C}_k = \breve{A}_k^2, \qquad C_j = \tilde{m}\,{\LARGE{\textbf{+}}}_{k=0}^{\acute{m}} \breve{C}_k\,\tilde{u}^{jk}. \] The FFT realises these transforms in \(\mathcal{O}(m\;{}_2m)\). Important: this is precisely where the CS becomes practical, since derivatives and products can be handled vectorially — and hence in an FFT-ready manner.

Evaluating the step

Once \(A=(a_0,\dots,a_n)\) has been determined, the step value follows by polynomial evaluation \[ y(x_0+h)\approx {\LARGE{\textbf{+}}}_{k=0}^{n} a_k\, h^k, \] e.g.\ via Horner’s scheme (multiplications/additions only).

Concrete computation for \(x_0=0\), \(h=0.1\), \(n=16\)

Here, the exact solution is \(y(x)=\widetilde{\acute{x}\text{-}}\), hence \[ a_k = 1\quad (k\ge 0), \] because \[ \widetilde{\acute{s}\text{-}} = {\LARGE{\textbf{+}}}_{k=0}^{\omega} s^k. \] Thus, the \(n\)-th partial sum is \[ y(0.1)\approx {\LARGE{\textbf{+}}}_{k=0}^{16} (0.1)^k = \frac{1-(0.1)^{17}}{1-0.1}. \] The exact value is \[ y(0.1)=\frac{1}{0.9}=1.\overline{1} \] and the remainder term equals \[ \left|y(0.1)-{\LARGE{\textbf{+}}}_{k=0}^{16} (0.1)^k\right| = \frac{(0.1)^{17}}{0.9} \approx 1.111111111111111111\cdot 10^{-17}. \]

Comparison: classical fourth-order Runge-Kutta

For \({^1}y=y^2\), one RK4 step10 with \(y_0=y(x_0)\), step size \(h\) is \[ \begin{array}{rcl} u_1 &=& y_0^2,\\ u_2 &=& \left(y_0+h\check{u}_1\right)^2,\\ u_3 &=& \left(y_0+h\check{u}_2\right)^2,\\ u_4 &=& \left(y_0+h u_3\right)^2,\\[0.4em] y_1 &=& y_0 + \tilde{6}h\,(u_1+\hat{u}_2+\hat{u}_3+u_4). \end{array} \] With \(x_0=0\), \(y_0=1\), \(h=0.1\) numerically \[ y_{\mathrm{RK4}}(0.1)\approx 1.1111104900521944, \] is obtained and therefore \[ \left|y(0.1)-y_{\mathrm{RK4}}(0.1)\right| \approx 6.21\cdot 10^{-7}. \]

Fair comparison (principle)

  • Accuracy target: For high accuracies (e.g. \(10^{-16}\), \(10^{-34}\)), RK4 must reduce the step size substantially or increase the order; otherwise the local truncation term dominates.
  • Crown/convolution: One step can be chosen much larger (for an analytic solution) as long as \(h<R(x_0)\), with accuracy controlled primarily by \(n\) (series depth) and working precision.
  • Runtime: RK4 costs a few function evaluations per step, whereas crown/convolution requires the computation of many coefficients (products/convolutions) plus one polynomial evaluation. The advantage arises when one crown step replaces many RK steps and when the convolutions are organised via FFT—especially for large \(n\). Furthermore, a complete function is represented.

Remark on “continued differentiability”

For analytic right-hand sides, the series method systematically leads to high derivative orders, since \[ {}^ky(x_0) = k!\,a_k \] and \(a_k\) are generated by convolutions/compositions. Conceptually, this is closer to “differentiating by computation” (vector arithmetic/FFT) than to quadrature with a fixed node scheme.

code of the FFT form

© 2010-2026 by Boris Haase

top

References

  1. see Walter: Analysis 2, 5., erw. Aufl.; 2002; Springer; Berlin, p. 358 – 364
  2. cf. Remmert, Reinhold: Funktionentheorie 1; 3., verb. Aufl.; 1992; Springer; Berlin, p. 165 f.
  3. cf. Remmert, Reinhold: Funktionentheorie 2; 1. unveränd. Nachdruck der 1. Aufl.; 1992; Springer; Berlin, p. 42
  4. cf. Knuth, Donald Ervin: The Art of Computer Programming Volume 2; 3rd Ed.; 1997; Addison Wesley; Reading, p. 302 – 311
  5. see Whittaker, Edmund T.; Watson, George N.: A Course of Modern Analysis; 5th Ed.; 2021; Cambridge University Press; Cambridge, p. 129 ff.
  6. cf. Walter, loc. cit.
  7. cf. Mollin, Richard A.: Advanced Number Theory with Applications; 1st Ed.; 2017; Routledge; Milton Park, p. 193 f.
  8. cf. Aberth, Oliver: Iteration Methods for Finding all Zeros of a Polynomial Simultaneously; Mathematics of Computation 27(122); 1996, p. 341
  9. cf. Schwarz, Hans Rudolf; Köckler, Norbert: Numerische Mathematik; 7., überarb. Aufl.; 2009; Vieweg + Teubner; Wiesbaden, p. 155 – 161.
  10. see loc. cit., p. 358