The Shape of the Differences of the Complex Zeros of Three-Term Exponential Polynomials
In my last blog, I looked at the distribution of the distances of the real zeros of functions of the form cos(t) + cos(α t) + cos(β t) with incommensurate α, β. And after analyzing the real case, I now want to have a look at the differences of the zeros of three-term exponential polynomials of the form A exp(i α z) + B exp(i β z) + C exp(i γ z) for real α, β, γ. (While we could rescale to set A = 1 and α = 0 without changing the zero set, keeping all of A, B, C and α, β, γ will make the resulting formulas look more symmetric.) Looking at the zeros in the complex plane, one does not see any obvious pattern. But by forming differences of pairs of zeros, regularities and patterns emerge, which often give some deeper insight into a problem. We do not make any special assumptions about the incommensurability of α, β, γ.
The differences of the zeros of this type of function are all located on oval-shaped curves. We will find a closed form for these ovals. Using experimental mathematics techniques, we will show that the ovals are described by the solutions of the following equation:

|A|^(2(β−γ)) |B|^(2(γ−α)) |C|^(2(α−β)) Y(α−γ, δx, δy) Y(γ−β, δx, δy) Y(β−α, δx, δy) = 1

… where:

Y(σ, x, y) = (cosh(σ y) − cos(σ x))^σ

and δx + i δy is the difference of two zeros.
Here this situation is visualized for the function exp(i z) + 2/3 exp(i √2 z) + 1/2 exp(i √3 z), meaning α = 1, β = √2, γ = √3 and A = 1, B = 2/3, C = 1/2. We calculate a few dozen exact zeros and plot the described curve.
✕
edzs = Sort[ N[z /. Solve[ Exp[I z] + 2/3 Exp[I Sqrt[2] z] + 1/2 Exp[I Sqrt[3] z] == 0 \[And] -5 < Im[z] < 5 \[And] 0 < Re[z] < 500, z]]]; |
✕
Show[{(* curves of zeros *) RegionPlot[(2/3)^(2 (Sqrt[3] - 1)) (1/2)^(2 (1 - Sqrt[2])) * (Cosh[(Sqrt[2] - 1) y] - Cos[(Sqrt[2] - 1) x])^( Sqrt[2] - 1) (Cosh[(1 - Sqrt[3]) y] - Cos[(1 - Sqrt[3]) x])^( 1 - Sqrt[ 3]) (Cosh[(Sqrt[3] - Sqrt[2]) y] - Cos[(Sqrt[3] - Sqrt[2]) x])^( Sqrt[3] - Sqrt[2]) > 1, {x, 0, 55}, {y, -3, 3}, PlotPoints -> 60, PlotStyle -> Directive[RGBColor[0.88, 0.61, 0.14], Opacity[0.3]], BoundaryStyle -> None], (* numerically calculated zeros *) ListPlot[-ReIm[Apply[Subtract, Subsets[edzs, {2}], {1}]], PlotRange -> {{0, 55}, {-3, 3}}]}, AspectRatio -> 1/3] |
While one easily sees the ovals emerge from numerically calculated zeros, how does one find a closed form for the curves on which they all fall? Using an experimental mathematics approach that includes symbolic polynomial manipulations as well as numerical techniques, including high-precision calculations, one can find the previously shown closed form of the curves. In this blog, I will show how to find these curves.
Expressions containing cosh(σ y) − cos(σ x) for real σ occur from time to time in complex analysis---for instance, in the Dirichlet kernel of a strip (Widder, 1961) or in fluid dynamics (see e.g. Baker and Pham, 2006).
A natural generalization of the function used in the last blog is f_α(z) = exp(i z) + exp(i α z) + exp(i α² z) for real α. There is a lot of literature on exponential polynomials, which are prime examples of almost periodic functions (see e.g. Jessen and Tornehave, 1945; Moreno, 1973; Sepulcre, 2016; and Mora, Sepulcre and Vidal, 2013).
Let us have a quick look at this function. Like in the last blog, we use the special instance α = φ, the golden ratio.
✕
fGoldenExp[z_] := Exp[I z] + Exp[I GoldenRatio z] + Exp[I GoldenRatio^2 z] |
We calculate the first zeros, extrema and inflection points.
✕
zeis = Table[ z /. Solve[ D[fGoldenExp[z], {z, k}] == 0 \[And] -5 < Im[z] < 5 \[And] 0 < Re[z] < 50, z], {k, 0, 2}]; |
Plotting the zeros, extrema and inflection points in the complex plane shows that the real parts are nearly identical for each "group" of zero, extremum and inflection point triples.
✕
Legended[ContourPlot[ Evaluate[# == 0 & /@ ReIm[fGoldenExp[x + I y] ]], {x, 000, 050}, {y, -3, 3}, PlotPoints -> 50, AspectRatio -> 1/2, Epilog :> {Purple, PointSize[0.01], Point[N[ReIm[#]] & /@ zeis[[1]]], Darker[Green], Point[N[ReIm[#]] & /@ zeis[[2]]], Darker[Red], Point[N[ReIm[#]] & /@ zeis[[3]]]}], LineLegend[{Directive[Thick, Purple], Darker[Green], Darker[Red]}, {"zeros", "extrema", "inflection points"}]] |
(The "nice" vertical alignment of the zeros of the function and its derivatives is not always the case---for instance, when the frequencies in the exponents have different signs, the alignment is broken.)
I now calculate ~5,000 zeros of fGoldenExp. This time, we can't use the differential equation technique from the last blog; instead, we use Solve. We sort the zeros by increasing real part.
✕
Monitor[fGoldenExpZeros = SortBy[N@ Flatten[Table[ z /. Solve[fGoldenExp[z] == 0 \[And] -5 < Im[z] < 5 \[And] 100 k <= Re[z] < 100 k + 100, z], {k, 0, 200}]], Re];, k] |
✕
Length[fGoldenExpZeros] |
The values of the three exponential summands at the zeros form interesting shapes in the complex plane.
✕
Legended[ Graphics[{Thickness[0.001], Transpose[{{RGBColor[0.36, 0.50, 0.71], RGBColor[0.88, 0.61, 0.14], RGBColor[0.56, 0.69, 0.19]}, Line /@ Transpose[Function[z, {{{0, 0}, ReIm[Exp[I z]]} , {{0, 0}, ReIm[Exp[I GoldenRatio z]]}, {{0, 0}, ReIm[Exp[I GoldenRatio^2 z]]}}] /@ RandomSample[fGoldenExpZeros, 500]]}]}, Frame -> True], LineLegend[{Directive[Thick, RGBColor[0.36, 0.50, 0.71]], Directive[Thick, RGBColor[0.88, 0.61, 0.14]], Directive[Thick, RGBColor[0.56, 0.69, 0.19]]}, {Exp[I z], Exp[I GoldenRatio z], Exp[I GoldenRatio^2 z]}]] |
As one can already see from the graphic, the term exp(i φ z) is never the smallest and never the largest in absolute value at a zero.
✕
(Function[z, Sort[{{Abs[Exp[I z]], 1}, {Abs[Exp[I GoldenRatio z]], 2}, {Abs[Exp[I GoldenRatio^2 z]], 3}}][[2, 2]]] /@ fGoldenExpZeros) // Union |
Looking at the number of curves of vanishing real and imaginary part for large positive and negative imaginary parts of z shows that the slowest and the fastest oscillating terms dominate the function behavior in the upper and the lower half-plane, respectively. The mean spacing along the real axis between zeros follows from this observation as 2π/(φ² − 1).
✕
2 Pi/(GoldenRatio^2 - 1) // N |
This value agrees well with the spacing derived from the calculated zeros.
✕
Re[fGoldenExpZeros[[-1]]]/Length[fGoldenExpZeros] |
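This density can also be cross-checked independently of the notebook via the argument principle. The following Python sketch (an illustrative check; the rectangle and the sampling density are ad hoc choices, not part of the original computation) follows the phase of e^(iz) + e^(iφz) + e^(iφ²z) along the boundary of the rectangle 0 ≤ Re z ≤ 100, |Im z| ≤ 3 and reads off the winding number, i.e. the number of enclosed zeros; it should come out within one or two of 100 (φ² − 1)/(2π) ≈ 25.7.

```python
import cmath
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

def f(z):
    return cmath.exp(1j*z) + cmath.exp(1j*PHI*z) + cmath.exp(1j*PHI**2*z)

def winding(corners, steps_per_unit=400):
    """Winding number of f around 0 along a closed polygonal contour."""
    total = 0.0
    prev = cmath.phase(f(corners[0]))
    pts = corners + [corners[0]]
    for a, b in zip(pts, pts[1:]):
        n = max(2, int(abs(b - a) * steps_per_unit))
        for k in range(1, n + 1):
            cur = cmath.phase(f(a + (b - a) * k / n))
            d = cur - prev
            # unwrap the phase jump into (-pi, pi]
            while d <= -math.pi:
                d += 2 * math.pi
            while d > math.pi:
                d -= 2 * math.pi
            total += d
            prev = cur
    return total / (2 * math.pi)

# rectangle 0 < Re z < 100, -3 < Im z < 3, traversed counterclockwise
w = winding([0 - 3j, 100 - 3j, 100 + 3j, 0 + 3j])
print(w, 100 * (PHI**2 - 1) / (2 * math.pi))
```

Tracking the unwrapped phase along a densely sampled contour avoids having to integrate f′/f numerically around the rectangle.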
Plotting the zeros with the real parts modulo the mean spacing confirms the calculated value. All roots are within a constant distance from the reduced center.
✕
Manipulate[ Graphics[{Thickness[0.001], RGBColor[0.36, 0.51, 0.71], Opacity[0.3], Line[{Mod[Re[#], \[CapitalDelta]], Im[#]} & /@ Take[fGoldenExpZeros, 1000]]}], {{\[CapitalDelta], 2 Pi/(GoldenRatio^2 - 1.)}, 3, 5, Appearance -> "Labeled"}, TrackedSymbols :> True, SaveDefinitions -> True] |
At the zeros, the real part of a zero (mod 2π) is strongly correlated with the imaginary part of the next zero. For a given real part, there is a unique imaginary part.
✕
Histogram3D[{Mod[Re[#1], 2 Pi], Im[#2]} & @@@ Partition[fGoldenExpZeros, 2, 1], 100] |
The real and imaginary parts of the derivative of fGoldenExp have the following distributions at the zeros.
✕
Histogram[#[fGoldenExp' /@ fGoldenExpZeros], 100] & /@ {Re, Im} |
The distribution of the differences of the zeros, with real parts reduced modulo the periods 2π, 2π/φ and 2π/φ² of the three summands, has some unexpected shapes.
✕
Function[p, Histogram3D[{Mod[#1, p], #2} & @@@ (ReIm /@ (-Subtract @@@ Subsets[RandomSample[fGoldenExpZeros, 2000], {2}])), 100]] /@ (2 Pi/{1, GoldenRatio, GoldenRatio^2 }) |
Following the main thread of the distribution of zero distances, I sort the zeros by increasing real parts and calculate the differences. Interestingly, one gets a Stonehenge-like distribution for the differences in the complex plane.
✕
differencesGoldenExp = Differences[fGoldenExpZeros]; |
✕
Histogram3D[ReIm /@ differencesGoldenExp, 100] |
The figure looks like a perfect circle. I fit an ellipse to the data.
✕
ellipseData = ((Re[#] - xm)^2/a^2 + (Im[#] - ym)^2/b^2 - 1^2) & /@ differencesGoldenExp; |
✕
fmEllipse = FindMinimum[ellipseData.ellipseData, {{xm, Mean[ReIm /@ differencesGoldenExp][[1]]}, {ym, 0}, {a, 1}, {b, 1}}, PrecisionGoal -> 10] |
Interestingly, the figure is nearly a circle. The blue circle is the best-fit ellipse, and the black points are the observed differences. (Trying to fit a rotated ellipse does not improve the fit.)
✕
{#, Show[#, PlotRange -> {{4, 4.1}, {1.11, 1.15}}]} &[ Graphics[{Blue, Circle[{xm, ym}, Abs[{a, b}]] /. fmEllipse[[2]], PointSize[0.002], Black, Point[ReIm /@ differencesGoldenExp]}, Axes -> True]] |
But it is not quite a circle or even an ellipse; zooming in, one sees some small oscillating deviations from the circle. The following plot shows the distance of the zero differences from the fitted center as a function of the complex argument.
✕
ListPlot[({Arg[# - (xm + I ym)], Abs[# - (xm + I ym)]} /. fmEllipse[[2]]) & /@ differencesGoldenExp] |
Successive differences are often quite different. I connect successive differences in the complex plane. The angles between two successive line segments seem to be approximately constant.
✕
Graphics[{Thickness[0.001], RGBColor[0.36, 0.51, 0.71], Line[ReIm /@ Take[differencesGoldenExp, 200]]}] |
The angle between successive line segments in the last image is quite localized to a narrow range, with the minimal and maximal angles occurring the most frequently.
✕
Histogram[ VectorAngle[#1 - #2, #3 - #2] & @@@ Partition[ReIm /@ differencesGoldenExp, 3, 1], 100, PlotRange -> {{0, Pi}, All}] |
The pair correlation of successive differences shows a strict correlation of successive zeros.
✕
Histogram3D[Partition[Arg /@ differencesGoldenExp, 2, 1], 100] |
These patterns observed for successive differences are all also present in the differences of next-neighboring zeros. The following graphics array shows the zero differences for j=3,4,...,11.
✕
Table[Graphics[{Thickness[0.001], RGBColor[0.36, 0.51, 0.71], Line[ReIm[#[[-1]] - #[[1]]] & /@ Partition[Take[differencesGoldenExp, 500], k, 1]]}, ImageSize -> 160], {k, 3, 11}] // Partition[#, 3] & |
Two properties were observed: 1) zeros, extrema and inflection points line up with nearly identical real parts; and 2) the differences of successive zeros lie approximately on an ellipse. These properties are not special to the exponential polynomial with α = φ; they hold for general α. For generic α, we still see these ellipses. And, similar to the previous Stonehenge image, the rightmost parts of the histogram are often the largest.
Similar to what I did in my last blog, we shift the argument of fGoldenExp by a zero z0 and show how the function behaves in the neighborhoods of zeros. The following graphic shows the curves of the vanishing real part in gray and the vanishing imaginary part in blue in the neighborhood of the first 30 zeros. The genesis of the zero accumulation on near-circles is clearly visible.
✕
Show[{Table[ ContourPlot[Evaluate[# == 0 & /@ ReIm[fGoldenExp[z0 + (x + I y)]]], {x, 0, 10}, {y, -2, 2}, ContourStyle -> {Directive[Gray, Opacity[0.4], Thickness[0.001]], Directive[RGBColor[0.36, 0.51, 0.71], Opacity[0.6], Thickness[0.001]]}], {z0, Take[fGoldenExpZeros, 30]}], Graphics[{Purple, PointSize[0.004], Point[ReIm /@ Table[fGoldenExpZeros[[j + 1]] - fGoldenExpZeros[[j]], {j, 30}]] , Point[ ReIm /@ Table[ fGoldenExpZeros[[j + 2]] - fGoldenExpZeros[[j]], {j, 30}]]}]}, AspectRatio -> Automatic] |
Comparing the last graphic with a version that uses randomized phases between the three exponential terms does not reproduce the circle patterns of the zeros.
✕
fGoldenExpRandomPhases[ z_, {\[CurlyPhi]2_, \[CurlyPhi]3_}] := (-Exp[I \[CurlyPhi]2] - Exp[I \[CurlyPhi]3]) Exp[I z] + Exp[I \[CurlyPhi]2] Exp[I GoldenRatio z] + Exp[I \[CurlyPhi]3] Exp[I GoldenRatio^2 z] |
✕
Module[{\[CurlyPhi]2, \[CurlyPhi]3}, plData = Table[\[CurlyPhi]2 = RandomReal[{0, 2 Pi}]; \[CurlyPhi]3 = RandomReal[{0, 2 Pi}]; {ContourPlot[ Evaluate[# == 0 & /@ ReIm[fGoldenExpRandomPhases[ x + I y, {\[CurlyPhi]2, \[CurlyPhi]3}]]], {x, 0, 10}, {y, -2, 2}, ContourStyle -> {Directive[Gray, Opacity[0.4], Thickness[0.001]], Directive[RGBColor[0.36, 0.51, 0.71], Opacity[0.6], Thickness[0.001]]}], Point[ ReIm /@ (z /. Solve[fGoldenExpRandomPhases[ z, {\[CurlyPhi]2, \[CurlyPhi]3}] == 0 \[And] 0 < Re[z] < 10 \[And] -3 < Im[z] < 3, z])] // Quiet} , {30}]; Show[{First /@ plData, Graphics[{Purple, PointSize[0.004], Last /@ plData}]}, AspectRatio -> Automatic]] |
This time, it does not make sense to form the envelope with respect to, say, the phase φ2, because the resulting equation is independent of the phases and so has no family of solutions, but rather just isolated points.
✕
Factor[Subtract @@ Eliminate[{fGoldenExpRandomPhases[ z, {\[CurlyPhi]2, \[CurlyPhi]3}] == 0, D[fGoldenExpRandomPhases[ z, {\[CurlyPhi]2, \[CurlyPhi]3}], \[CurlyPhi]2] == 0} /. \[CurlyPhi]2 -> Log[C]/I, C] ] |
It is possible to derive a closed form for the circle-shaped curves on which the differences of the zeros are located.
The locations of the zero differences on the ellipse-shaped curves are curious. Can we get a closed-form equation for these shapes? As it turns out, we can. Since the derivation is a bit longer, we carry it out in this appendix. Rather than dealing with the general α, β, γ situation, we will deal with the exponents 1, α, 1 + α, and then generalize to generic α, β, γ by guessing based on the golden ratio result. (For α = φ, we have 1 + α = α², which recovers the golden ratio case.)
✕
f\[Alpha][z_] := Exp[I z] + Exp[I \[Alpha] z] +(* \[Equal]\[ThinSpace]Exp[ I \[Alpha]^2 z] for \[Alpha]\[ThinSpace]\[Equal]\[ThinSpace]\[Phi] *) Exp[I z] Exp[I \[Alpha] z] |
We start by writing down the conditions for z0 and z0 + δz, which should both be zeros of f_α.
✕
{f\[Alpha][z0], f\[Alpha][z0 + \[Delta]z]} // ExpandAll |
We separate real and imaginary parts using z0 = x0 + i y0 and δz = δx + i δy; supplement with two trigonometric identities; and rewrite cos(x0), sin(x0), cos(α x0) and sin(α x0) as the polynomial variables c0, s0, cα and sα.
✕
eqs1 = {Re[f\[Alpha][x0 + I y0]] // ComplexExpand // TrigExpand, Im[f\[Alpha][x0 + I y0]] // ComplexExpand // TrigExpand, Re[f\[Alpha][x0 + I y0 + \[Delta]x + I \[Delta]y]] // ComplexExpand // ExpandAll // TrigExpand, Im[f\[Alpha][x0 + I y0 + \[Delta]x + I \[Delta]y]] // ComplexExpand // ExpandAll // TrigExpand, Cos[x0]^2 + Sin[x0]^2 - 1, Cos[\[Alpha] x0]^2 + Sin[\[Alpha] x0]^2 - 1 } /. {Cos[x0] -> c0, Sin[x0] -> s0, Cos[\[Alpha] x0] -> c\[Alpha], Sin[\[Alpha] x0] -> s\[Alpha]} |
This system of equations does describe the possible positions of the zeros at distance δx + i δy. A quick numerical experiment confirms this.
✕
of1 = eqs1.eqs1 /. \[Alpha] -> GoldenRatio; Monitor[nsol1 = Module[{fm}, Table[ fm = FindMinimum[Evaluate[of1], {c0, RandomReal[{-1, 1}]}, {s0, RandomReal[{-1, 1}]}, {c\[Alpha], RandomReal[{-1, 1}]}, {s\[Alpha], RandomReal[{-1, 1}]}, {\[Delta]x, RandomReal[{2, 4}]}, {\[Delta]y, RandomReal[{-1.5, 1.5}]}, {y0, RandomReal[{-1.5, 1.5}]}, PrecisionGoal -> 12, AccuracyGoal -> 15, WorkingPrecision -> 30, Method -> "Newton"] // Quiet; If[fm[[1]] < 10^-8, fm, Sequence @@ {}], {j, 100}]];, j] |
✕
Graphics[{RGBColor[0.36, 0.51, 0.71], Point[{\[Delta]x, \[Delta]y} /. N[nsol1[[All, 2]]]]}, PlotRange -> {{0, 8}, {-2, 2}}, Axes -> True] |
Now we want to eliminate the variables c0, s0, cα and sα to obtain universally valid formulas for any root. We introduce some more polynomial variables for the exponential terms and then eliminate the four terms that contain x0.
✕
eqs2 = Numerator[ Together[eqs1 /. \[Alpha] y0 -> Log[Y\[Alpha]0] /. y0 -> Log[Y0] /. \[Alpha] \[Delta]y -> Log[\[Delta]Y\[Alpha]] /. \[Delta]y -> Log[\[Delta]Y]]] |
✕
gb2 = GroebnerBasis[eqs2, {}, { c0, s0, c\[Alpha], s\[Alpha]}, MonomialOrder -> EliminationOrder] // Factor; |
Now we have 15 equations.
✕
Length[gb2] |
These equations still describe the positions of the roots.
✕
of2 = Evaluate[ gb2.gb2 /. {\[Delta]Y -> Exp[\[Delta]y], \[Delta]Y\[Alpha] -> Exp[\[Alpha] \[Delta]y]} /. Y0 -> Exp[y0] /. Y\[Alpha]0 -> Exp[\[Alpha] y0] /. \[Alpha] -> GoldenRatio]; Monitor[nsol2 = Module[{fm}, Table[ fm = FindMinimum[Evaluate[of2], {\[Delta]x, RandomReal[{2, 4}]}, {\[Delta]y, RandomReal[{-1.5, 1.5}]}, {y0, RandomReal[{-2, 2}]}, PrecisionGoal -> 12, AccuracyGoal -> 15, WorkingPrecision -> 30, Method -> "Newton"] // Quiet; If[fm[[1]] < 10^-8, fm, Sequence @@ {}] , {j, 100}]];, j] |
✕
Graphics[{RGBColor[0.36, 0.51, 0.71], Point[{\[Delta]x, \[Delta]y} /. N[nsol2[[All, 2]]]]}, PlotRange -> {{0, 8}, {-2, 2}}, Axes -> True] |
Let's hope that we do not need all 15 equations to find an equation that describes the three values δx, δy and y0. Fortunately, using just the three smallest elements of the GroebnerBasis result still yields the desired zero shape.
✕
gb2Sorted = SortBy[gb2, Length]; |
✕
of3 = Evaluate[#.# &[ Take[gb2Sorted, 3]] /. {\[Delta]Y -> Exp[\[Delta]y], \[Delta]Y\[Alpha] -> Exp[\[Alpha] \[Delta]y]} /. Y0 -> Exp[y0] /. Y\[Alpha]0 -> Exp[\[Alpha] y0] /. \[Alpha] -> GoldenRatio]; Monitor[nsol3 = Module[{fm}, Table[ fm = FindMinimum[Evaluate[of3], {\[Delta]x, RandomReal[{2, 4}]}, {\[Delta]y, RandomReal[{-1.5, 1.5}]}, {y0, RandomReal[{-2, 2}]}, PrecisionGoal -> 12, AccuracyGoal -> 15, WorkingPrecision -> 30, Method -> "Newton"] // Quiet; If[fm[[1]] < 10^-8, fm, Sequence @@ {}] , {j, 100}]];, j] |
✕
Graphics[{RGBColor[0.36, 0.51, 0.71], Point[{\[Delta]x, \[Delta]y} /. N[nsol3[[All, 2]]]]}, PlotRange -> {{0, 8}, {-2, 2}}, Axes -> True] |
These three equations can be further reduced to just two equations.
✕
eqs3 = List @@ FullSimplify[And @@ (# == 0 & /@ Take[gb2Sorted, 3])] |
Plotting these two equations together as functions of δx, δy and y0 shows that the regions where both equations are fulfilled are just the ellipse-shaped rings we are after.
✕
eqs3A = (List @@ eqs3) /. {\[Delta]Y -> Exp[\[Delta]y], \[Delta]Y\[Alpha] -> Exp[\[Alpha] \[Delta]y], Y0 -> Exp[y0], Y\[Alpha]0 -> Exp[\[Alpha] y0]} |
✕
ContourPlot3D[Evaluate[eqs3A /. \[Alpha] -> GoldenRatio], {\[Delta]x, 0, 8}, {\[Delta]y, -2, 2}, {y0, -2, 2}, MeshFunctions -> {#3 &}, BoxRatios -> Automatic, ViewPoint -> {0.39, 3.059, 1.39}] |
To obtain one equation that describes the rings, we also have to eliminate the imaginary part of the reference zero, meaning y0. Unfortunately, because the two terms exp(y0) and exp(α y0) are not algebraically related, we cannot use GroebnerBasis or Resultant to eliminate y0. But we are lucky and can solve the first equation for y0.
✕
sol3A = Solve[eqs3A[[1]], y0] |
The resulting implicit equation for the rings is a bit ugly.
✕
(Subtract @@ eqs3A[[2]] == 0) /. sol3A[[2]] |
But it can be simplified to a quite nice-looking closed form.
✕
FullSimplify[% /. sol3A[[2]], \[Delta]x > 0 \[And] \[Delta]y \[Element] Reals \[And] \[Alpha] > 0] |
Plotting this equation together with the zeros calculated above shows a perfect match of the zeros and the closed forms of the curves.
✕
ContourPlot[ Evaluate[Cos[\[Delta]x] + (-Cos[\[Alpha] \[Delta]x] + Cosh[\[Alpha] \[Delta]y])^\[Alpha] (-Cos[\[Delta]x - \[Alpha] \ \[Delta]x] + Cosh[\[Delta]y - \[Alpha] \[Delta]y])^(1 - \[Alpha]) == Cosh[\[Delta]y] /. \[Alpha] -> GoldenRatio], {\[Delta]x, 0, 8}, {\[Delta]y, -2, 2}, AspectRatio -> Automatic, Prolog -> {RGBColor[0.88, 0.61, 0.14], PointSize[0.01], Opacity[0.5], Table[Point[ ReIm[#[[-1]] - #[[1]]] & /@ Partition[Take[fGoldenExpZeros, 200], j, 1]], {j, 2, 6}]}, AxesLabel -> {"\[Delta]x", "\[Delta]y"} ] |
Now, taking into account that for general α, β, γ the resulting formula that describes the roots must be symmetric in α, β and γ, and that it must reduce to the result just derived for the exponents 1, α, 1 + α, it is not difficult to conjecture a closed form for the rings of the general three-term sum exp(i α z) + exp(i β z) + exp(i γ z). We have the following implicit description for the relative zero positions. (We use all three of α, β, γ to make the equation fully symmetric.)
✕
zeros\[Alpha]\[Beta]\[Gamma][{\[Alpha]_, \[Beta]_, \[Gamma]_}, {\ \[Delta]x_, \[Delta]y_}] := (Cosh[(\[Beta] - \[Alpha]) \[Delta]y] - Cos[(\[Beta] - \[Alpha]) \[Delta]x])^(\[Beta] - \[Alpha]) \ (Cosh[(\[Gamma] - \[Beta]) \[Delta]y] - Cos[(\[Gamma] - \[Beta]) \[Delta]x])^(\[Gamma] - \[Beta]) \ (Cosh[(\[Gamma] - \[Alpha]) \[Delta]y] - Cos[(\[Gamma] - \[Alpha]) \[Delta]x])^(\[Alpha] - \[Gamma]) == 1 |
A quick check for the exponential polynomial exp(i √3 z) + exp(i e z) + exp(i π z) confirms the conjectured equation.
✕
Monitor[fSqrt3EPiExpZeros = SortBy[Flatten[Table[ z /. Solve[Exp[I Sqrt[3] z] + Exp[I E z] + Exp[I Pi z] == 0 \[And] -5 < Im[z] < 5 \[And] 50 k <= Re[z] < 50 k + 50, z], {k, 0, 20}]], Re];, k] |
✕
ContourPlot[ Evaluate[ zeros\[Alpha]\[Beta]\[Gamma][{Sqrt[3], E, Pi}, {\[Delta]x, \[Delta]y}] ], {\[Delta]x, 0, 25}, {\[Delta]y, -2, 2}, AspectRatio -> Automatic, PerformanceGoal -> "Quality", MaxRecursion -> 1, PlotPoints -> {120, 40}, WorkingPrecision -> 40, Prolog -> {RGBColor[0.88, 0.61, 0.14], PointSize[0.005], Opacity[0.5], Table[Point[ ReIm[#[[-1]] - #[[1]]] & /@ Partition[Take[N@fSqrt3EPiExpZeros, 200], j, 1]], {j, 2, 6}]}, AxesLabel -> {"\[Delta]x", "\[Delta]y"} ] |
In addition to this visual check, we should perform a more stringent test. To do this, we have a look at the difference of the two sides of the conjectured equation at the zero differences.
✕
zeros\[Alpha]\[Beta]\[Gamma]Difference[{\[Alpha]_, \[Beta]_, \ \[Gamma]_}, {\[Delta]x_, \[Delta]y_}] = Subtract @@ zeros\[Alpha]\[Beta]\[Gamma][{\[Alpha], \[Beta], \[Gamma]}, {\ \[Delta]x, \[Delta]y}] |
Checking the identity with all zeros calculated to one thousand digits shows that the conjectured identity indeed holds. While this is not a proof, it is a very comforting check.
✕
With[{zerosHP = N[fSqrt3EPiExpZeros, 1000]}, Table[zeros\[Alpha]\[Beta]\[Gamma]Difference[{Sqrt[3], E, Pi}, ReIm[#[[-1]] - #[[1]]]] & /@ Partition[zerosHP, j, 1], {j, 2, 6}]] // Abs // Max |
The function Y(σ, x, y) that appears in the last formula has the following shape in {x, y, σ} space.
✕
Y[\[Sigma]_, {x_, y_}] := (Cosh[\[Sigma] y] - Cos[\[Sigma] x])^\[Sigma] |
✕
ContourPlot3D[ Y[\[Sigma], {x, y}] == 1 , {x, -4 Pi, 4 Pi}, {y, -4, 4}, {\[Sigma], 0, 3}, AxesLabel -> {x, y, \[Sigma]}, ViewPoint -> {0.64, -3.02, 1.37}, MeshFunctions -> {#3 &}, BoxRatios -> {2, 1, 1}, PlotPoints -> {80, 40, 60}, MaxRecursion -> 0] |
As a function of x and y, the function Y obeys two symmetric differential equations:
✕
{D[f[x, y], x]^2 - D[f[x, y], y]^2 + 2 D[f[x, y], y]^2 \[Sigma] - 2 f[x, y] D[f[x, y], y, y] \[Sigma] + f[x, y]^2 \[Sigma]^4, D[f[x, y], x]^2 - D[f[x, y], y]^2 - 2 D[f[x, y], x]^2 \[Sigma] + 2 f[x, y] D[f[x, y], x, x] \[Sigma] + f[x, y]^2 \[Sigma]^4} /. f -> Function[{x, y}, Y[\[Sigma], {x, y}] ] // Simplify |
And the even simpler equation:
✕
\[Sigma] f[x, y] D[f[x, y], x, y] + D[f[x, y], x] (D[f[x, y], y] - \[Sigma] D[f[x, y], y]) /. f -> Function[{x, y}, Y[\[Sigma], {x, y}] ] // Simplify |
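Here is a quick numerical spot check of this last identity outside the symbolic computation, using central finite differences in Python (a sketch; the sample points and step size are arbitrary choices, and the residual is measured relative to the size of the two terms):

```python
import math

def Y(sigma, x, y):
    # the function from the closed form: (cosh(sigma*y) - cos(sigma*x))^sigma
    return (math.cosh(sigma*y) - math.cos(sigma*x))**sigma

def pde_residual(sigma, x, y, h=1e-4):
    # central finite differences for Y_x, Y_y and the mixed derivative Y_xy
    fx = (Y(sigma, x+h, y) - Y(sigma, x-h, y)) / (2*h)
    fy = (Y(sigma, x, y+h) - Y(sigma, x, y-h)) / (2*h)
    fxy = (Y(sigma, x+h, y+h) - Y(sigma, x+h, y-h)
           - Y(sigma, x-h, y+h) + Y(sigma, x-h, y-h)) / (4*h*h)
    f = Y(sigma, x, y)
    # sigma f Y_xy + (1 - sigma) Y_x Y_y should vanish identically
    lhs = sigma*f*fxy + (1 - sigma)*fx*fy
    scale = abs(sigma*f*fxy) + abs((1 - sigma)*fx*fy)
    return abs(lhs) / scale

worst = max(pde_residual(s, x, y)
            for s in (0.5, 1.7, 2.618)
            for x, y in ((1.0, 0.3), (2.5, -1.2), (0.7, 2.0)))
print(worst)
```

The relative residual sits at the level of the finite-difference truncation error, as expected for an exact identity.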
One can easily generalize the previous formula that describes the location of the zero differences to the case A exp(i α z) + B exp(i β z) + C exp(i γ z).
✕
zeroABC\[Alpha]\[Beta]\[Gamma]Curve[{A_, B_, C_}, {\[Alpha]_, \[Beta]_, \[Gamma]_}, {x_, y_}] := Abs[A]^(2 (\[Beta] - \[Gamma])) Abs[B]^(2 (\[Gamma] - \[Alpha])) Abs[C]^(2 (\[Alpha] - \[Beta])) Y[\[Alpha] - \[Gamma], {x, y}] Y[\[Gamma] - \[Beta], {x, y}] Y[\[Beta] - \[Alpha], {x, y}] |
Here is a random example of a three-term sum with prefactors.
✕
f2[{A_, B_, C_}, {\[Alpha]_, \[Beta]_, \[Gamma]_}, z_] := A Exp[I \[Alpha] z] + B Exp[I \[Beta] z] + C Exp[I \[Gamma] z] |
The numerically calculated zero differences all lie on the implicitly described curve zeroABCαβγCurve.
✕
With[{\[Alpha] = Sqrt[2], \[Beta] = Sqrt[3], \[Gamma] = E/3, A = 1/2 Exp[2 I], B = 3/4 Exp[3^(1/3) I], C = 5/4}, edzs = Sort[ N[z /. Solve[ f2[{A, B, C}, {\[Alpha], \[Beta], \[Gamma]}, z] == 0 \[And] -5 < Im[z] < 5 \[And] 0 < Re[z] < 500, z], 100]]; zeroPairs = -Subtract @@@ Subsets[edzs, {2}]; lp = ListPlot[ReIm /@ zeroPairs, PlotRange -> {{0, 30}, {-2, 2}}]; Show[{RegionPlot[ Evaluate[ zeroABC\[Alpha]\[Beta]\[Gamma]Curve[{A, B, C}, {\[Alpha], \[Beta], \[Gamma]}, {x, y}] > 1], {x, 0, 30}, {y, -2, 2}, PlotPoints -> 60, PlotStyle -> Directive[RGBColor[0.88, 0.61, 0.14], Opacity[0.3]], BoundaryStyle -> None], lp}, AspectRatio -> 1/3]] |
Phases in the three exponents have no influence on the positions and shapes of the ovals. Here is an example that demonstrates this. The blue points, with zero phases, are on the same curve as the yellow/brown points that come from the exponential polynomial with phases. Only their positions on the curve depend on the phases.
✕
f3[{A_, B_, C_}, {\[Alpha]_, \[Beta]_, \[Gamma]_}, {\[CurlyPhi]\[Alpha]_, \ \[CurlyPhi]\[Beta]_, \[CurlyPhi]\[Gamma]_}, z_] := A Exp[I \[Alpha] z + I \[CurlyPhi]\[Alpha]] + B Exp[I \[Beta] z + I \[CurlyPhi]\[Beta]] + C Exp[I \[Gamma] z + I \[CurlyPhi]\[Gamma]] |
✕
With[{\[Alpha] = Sqrt[2], \[Beta] = Sqrt[3], \[Gamma] = E/3, A = 1/2, B = 3/4, C = 5/4, \[CurlyPhi]\[Alpha] = 1, \[CurlyPhi]\[Beta] = 2, \[CurlyPhi]\[Gamma] = 3}, edzs1 = Sort[ N[z /. Solve[ f3[{A, B, C}, {\[Alpha], \[Beta], \[Gamma]}, {0, 0, 0}, z] == 0 \[And] -5 < Im[z] < 5 \[And] 0 < Re[z] < 500, z], 100]]; edzs2 = Sort[ N[z /. Solve[ f3[{A, B, C}, {\[Alpha], \[Beta], \[Gamma]}, {\[CurlyPhi]\[Alpha], \ \[CurlyPhi]\[Beta], \[CurlyPhi]\[Gamma]}, z] == 0 \[And] -5 < Im[z] < 5 \[And] 0 < Re[z] < 500, z], 100]]; ListPlot[{ReIm /@ (-Subtract @@@ Subsets[edzs1, {2}]), ReIm /@ (-Subtract @@@ Subsets[edzs2, {2}])}, PlotRange -> {{0, 30}, {-2, 2}}] ] |
The ovals don't always have to be separated. For appropriate parameter values A, B, C, α, β, γ, the ovals can merge into strips. Here is an example.
✕
ContourPlot[ Evaluate[zeroABC\[Alpha]\[Beta]\[Gamma]Curve[{1, 2, 1}, {0, Log[2], Log[3]}, {x, y}] == 1], {x, 0, 40}, {y, -5, 5}, PlotPoints -> {80, 20}, AspectRatio -> 1/2] |
If we use α, β, γ that are not incommensurable, the zeros still lie on the curve described by zeroABCαβγCurve. In this case, we can sometimes get closed forms for all zeros. Here is a simple example that brings us back to the golden ratio shown previously.
✕
Solve[ 2 + 2 Exp[1/2 I z] + I Exp[1 I z] == 0, z] |
For C[1]==0, we form the difference of the two zeros.
✕
diff = (Subtract @@ (z /. % /. ConditionalExpression[x_, _] :> x /. C[1] -> 0)) // FullSimplify |
✕
zeroABC\[Alpha]\[Beta]\[Gamma]Curve[{2, 2, I}, {0, 1/2, 1}, ReIm[diff]] /. (ri : (_Re | _Im)) :> ComplexExpand[ri, TargetFunctions -> {Re, Im}] // FullSimplify |
✕
N[%, 50] |
The two expressions in the denominator are exotic representations of φ and 1/φ.
✕
N[{Cosh[ArcTan[Im[((1 + I) - Sqrt[-1 + 2 I])^I]/ Re[((1 + I) - Sqrt[-1 + 2 I])^I]]], Cos[Log[Abs[((1 + I) - Sqrt[-1 + 2 I])^I]]]} - {GoldenRatio, 1/GoldenRatio}, 20] // Quiet |
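One can confirm this numerically in a few lines of Python as well (an independent spot check using the cmath module; the principal branches of the complex square root and of complex exponentiation are used throughout):

```python
import cmath
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

# the inner expression from the closed-form zero difference
w = ((1 + 1j) - cmath.sqrt(-1 + 2j)) ** 1j

a = math.cosh(math.atan(w.imag / w.real))  # should equal the golden ratio
b = math.cos(math.log(abs(w)))             # should equal 1/golden ratio
print(a - PHI, b - 1 / PHI)
```

Both differences come out at machine-precision level.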
Unfortunately, there is no similar equation that describes the zeros of a sum of four exponentials. The addition of a fourth exponential term changes the behavior of the zero differences dramatically. We first calculate 100+ zeros of the three-term sum exp(i √2 z) + exp(i ζ(3) z) + exp(i e/2 z).
✕
zs = With[{L = 200}, Monitor[Flatten@Table[ N[z /. Solve[Exp[I Sqrt[2] z] + Exp[I Zeta[3] z] + Exp[I E/2 z] == 0 \[And] -30 < Im[z] < 20 \[And] j L < Re[z] < j L + L, z], 20] , {j, 0, 20}] , j]]; |
We then calculate the ε-dependent zeros of exp(i √2 z) + exp(i ζ(3) z) + exp(i e/2 z) + ε exp(i 3^(1/3) z) using the differential equations obeyed by these zeros.
✕
nds = NDSolveValue[ {D[Exp[I Sqrt[2] z[\[CurlyEpsilon]]] + Exp[I Zeta[3] z[\[CurlyEpsilon]]] + Exp[I E/2 z[\[CurlyEpsilon]]] + \[CurlyEpsilon] Exp[ I 3^(1/3) z[\[CurlyEpsilon]]], \[CurlyEpsilon]] == 0, z[0] == zs}, z, {\[CurlyEpsilon], 0, 1}] |
Graphically, as ε increases, the zeros themselves mostly change their real parts.
✕
pt[\[CurlyEpsilon]_Real] := ReIm[nds[\[CurlyEpsilon]]]
ParametricPlot[pt[\[CurlyEpsilon]], {\[CurlyEpsilon], 0, 1}, AspectRatio -> 1/3] |
But the differences of the zeros show a much more complicated dependence on ε.
✕
diffs[\[CurlyEpsilon]_Real] := With[{zeros = SortBy[nds[\[CurlyEpsilon]], Re]}, ReIm[Flatten[ Table[zeros[[i + j]] - zeros[[i]], {j, 1, 4}, {i, Length[zeros] - j}]]]] |
✕
ListPlot[Transpose[ Table[diffs[N[\[CurlyEpsilon]]], {\[CurlyEpsilon], 0, 1, 1/100}]], Joined -> True, PlotStyle -> Thickness[0.002]] |
Generically, in the case of four exponentials, the differences between zeros are no longer located on curves, but fill regions of the complex plane densely. The following input calculates about 75,000 zeros of a four-term exponential polynomial. (In the notebook, this cell is set to unevaluatable because it runs for a few hours.)
✕
L = 200; counter = 0; Monitor[ zs = Flatten@Table[ newZeros = N[z /. Solve[Exp[I Sqrt[2] z] + Exp[I Zeta[3] z] + Exp[I E/2 z] + Exp[I 3^(1/3) z] == 0 \[And] -30 < Im[z] < 20 \[And] jj L < Re[z] < jj L + L, z], 20]; counter = counter + Length[newZeros]; newZeros, {jj, 0, 10^4}];, {jj, counter}] |
Plotting the first few differences shows how the zero differences fill out a stripe along the real axis.
✕
ListPlot[Table[ ReIm[#[[-1]] - #[[1]]] & /@ ( Partition[SortBy[N[zs], Re], jk, 1]), {jk, 8}], PlotStyle -> {PointSize[0.001]}, Frame -> True, PlotRange -> {{10, All}, All}, AspectRatio -> 1/2] |
Reduced forms of a sum of four exponentials, e.g. one constant term and the remaining three terms algebraically dependent, show an intermediate degree of complexity in the zero differences. Here is an example.
✕
f\[Alpha][z_] := A + 2 Exp[I z] + Exp[I \[Alpha] z] Exp[-I z] + Exp[I \[Alpha] z] Exp[I z] |
✕
Monitor[sol = Flatten[ Table[ Solve[(f\[Alpha][z] == 0 /. {A -> 1/3, \[Alpha] -> Sqrt[3]}) && -10 < Im[z] < 10 && 20 k < Re[z] < 20 k + 20, z], {k, 0, 50}], 1], k]; |
✕
zs = Cases[N[z /. sol], _Complex]; |
✕
ListPlot[ReIm[Subtract @@@ (Reverse /@ Subsets[zs, {2}])], PlotRange -> {{0, 12}, All}, Frame -> True, AspectRatio -> 1/3] |
Finding a closed form for these curves is surely possible, but the symbolic expressions that are needed in intermediate steps are quite large. So we will postpone this calculation to a later time.
To summarize: continuing some mathematical experiments about the positions of zeros of sums of complex trigonometric functions, "strange" circles were observed. Using a mixture of visualization, numerical experiments and algebraic computations, all of which work seamlessly together in the Wolfram Language, we were able to determine a closed-form equation for the positions of these "circles."
Identifying Peaks in Distributions of Zeros and Extrema of Almost-Periodic Functions: Inspired by Answering a MathOverflow Question
One of the Holy Grails of mathematics is the Riemann zeta function, especially its zeros. One representation of ζ(s) is the infinite sum Σ_(n≥1) n^(−s). In the last few years, the interest in partial sums of such infinite sums and their zeros has grown. A single cosine or sine function is periodic, and the distribution of its zeros is straightforward to describe. A sum of two cosine functions can be written as a product of two cosines, cos(a t) + cos(b t) = 2 cos((a + b) t/2) cos((a − b) t/2). Similarly, a sum of two sine functions can be written as a product of a sine and a cosine, sin(a t) + sin(b t) = 2 sin((a + b) t/2) cos((a − b) t/2). This reduces the zero-finding of a sum of two cosines or sines to the case of a single one. A sum of three cosine or sine functions, e.g. cos(t) + cos(α t) + cos(β t), is already much more interesting.
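The two-term reduction can be made concrete with a small Python sketch (independent of the Wolfram Language computations below): by the product form, the zeros of cos(t) + cos(φ t) are exactly the union of two arithmetic progressions, one for each vanishing factor.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

def f(t):
    return math.cos(t) + math.cos(PHI * t)

# the product form 2 cos((1+PHI)t/2) cos((1-PHI)t/2) vanishes exactly when
# one of the factors does, i.e. on two arithmetic progressions of t values
zeros = [(2*k + 1) * math.pi / (1 + PHI) for k in range(10)] \
      + [(2*k + 1) * math.pi / (PHI - 1) for k in range(10)]
worst = max(abs(f(t)) for t in zeros)
print(worst)
```

No root search is needed; every listed t annihilates one factor of the product and hence the sum.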
Fifteen years ago, in the notes to chapter 4 of Stephen Wolfram’s A New Kind of Science, a log plot of the distribution of the distances of successive zeros of a sum of three trigonometric functions with incommensurate frequencies—showing characteristic peaks—was shown.
And a recent question on MathOverflow.net asked for the closed forms of the positions of the maxima of the distribution of the distances of successive zeros of almost-periodic functions of the form cos(t) + cos(α t) + cos(β t) for incommensurate α and β. The post showed some interesting-looking graphics, the first of which was the following one for cos(t) + cos(φ t) + cos(φ² t) (φ is the golden ratio), for the distance between successive zeros (taking into account the first 100k positive zeros):
At first, one might be skeptical seeing such an unusual-looking histogram, but a quick one-liner that calculates all zeros in the intervals (0, 1000 π) and (10^6, 10^6 + 1000 π) does confirm a distribution of the shape shown above, a universal shape independent of the argument range for large enough intervals.
✕
Function[x0, Histogram[Differences[ t /. Solve[ Cos[t] + Cos[GoldenRatio t] + Cos[GoldenRatio^2 t] == 0 \[And] x0 < t < x0 + 10^3 \[Pi], t]], 200]] /@ {0, 10^6} |
And the MathOverflow post goes on to conjecture that the continued fraction expansions of α and β are involved in the closed-form expressions for the positions of the clearly visible peaks at zero distances of about 1, 1.5, 1.8 and 3.0. In the following, we will reproduce this plot, as well as generate, interpret and analyze many related ones; we will also try to come to an intuitive as well as algebraic understanding of why the distribution looks the way it does.
It turns out the answer is simpler, and one does not need the continued fraction expansion of α and β. Analyzing hundreds of thousands of zeros, plotting the curves around zeros and extrema and identifying envelope curves lets one conjecture that the positions of the four dominant singularities are twice the smallest roots of four explicit equations.
The idea that the concrete Diophantine properties of α and β determine the positions of the zeros and their distances seems quite natural. A well-known example where the continued fraction expansion does matter is the difference set of the set of numbers i + α j for a fixed irrational α (Bleher, 1990). Truncating this set, one obtains the following distributions for the spacings between neighboring values. The distributions are visibly different and do depend on the continued fraction properties of α.
✕
sumSpacings[\[Alpha]_, n_] := With[{k = Ceiling[Sqrt[n/\[Alpha]]]}, Take[#, UpTo[n]] &@N@ Differences[ Sort[Flatten[ Table[N[i + \[Alpha] j, 10], {i, Ceiling[\[Alpha] k]}, {j, k}]]]]] |
✕
ListLogLogPlot[Tally[sumSpacings[1/#, 100000]], Filling -> Axis, PlotRange -> {{10^-3, 1}, All}, PlotLabel -> HoldForm[\[Alpha] == 1/#], Ticks -> {{ 0.01, 0.1, 0.5}, Automatic}] & /@ {GoldenRatio, Pi, CubeRoot[3]} |
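For readers who want to replicate the spacing experiment outside the Wolfram Language, here is a Python sketch of sumSpacings (the function name is mine; the cutoffs follow the blog's code): it builds the truncated set {i + αj}, sorts it and takes neighboring differences.

```python
import math

def sum_spacings(alpha, n):
    """Neighboring spacings of the truncated set {i + alpha*j},
    mirroring the blog's sumSpacings."""
    k = math.ceil(math.sqrt(n / alpha))
    vals = sorted(i + alpha * j
                  for i in range(1, math.ceil(alpha * k) + 1)
                  for j in range(1, k + 1))
    # differences of the sorted values, truncated to the first n
    return [b - a for a, b in zip(vals, vals[1:])][:n]

golden = (1 + math.sqrt(5)) / 2
sp = sum_spacings(1 / golden, 10000)
print(len(sp), min(sp), max(sp))
```

Tallying the values of `sp` for different α (as the ListLogLogPlot above does) reproduces the α-dependent discrete spacing structure.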
Because 1, α and β are incommensurate, one could intuitively assume that all possible relative phases between the three cos terms occur with increasing x. And because most relative phase situations do occur, the details of the (continued fraction) digits of α and β do not matter. But not every possible curve shape of the sum of the three cos terms can be realized after a zero; the occurring phases might not be equally or uniformly distributed. Concrete realizations of the phases will define a boundary curve of the possible curve shapes. At these boundaries (envelopes), the curves will cluster, and these clustered curves will lead to the spikes visible in the aforementioned graphic. This clustering, together with the almost-periodic property of the function, leads to the sharp peaks in the distribution of the zeros.
The first image in the previously mentioned post uses the function cos(x) + cos(φx) + cos(φ²x), with φ being the golden ratio. Some structures in the zero distribution are universal and occur for all functions of the form cos(x) + cos(αx) + cos(βx) for generic nonrational α, β, but some structures are special to the concrete coefficients φ, φ², the reason being that the relation 1 + φ = φ² slips in some dependencies between the three summands cos(x), cos(φx) and cos(φ²x). (The three-parameter case cos(γx) + cos(αx) + cos(βx) can be rescaled to the above case with only two parameters, α/γ and β/γ.) At the same time, the choice α = φ, β = φ² has most of the features of the generic case. As we will see, using α, α² for other quadratic irrationals α generates zero-spacing distributions with additional structures.
In this blog, I will demonstrate how one could come to the conjecture that the above four equations describe the positions of the singularities using static and interactive visualizations for gaining intuition of the behavior of functions and families of functions; numerical computations for obtaining tens of thousands of zeros; and symbolic polynomial computations to derive (sometimes quite large) equations.
Although it is just a simple sum of three real cosines with linear arguments, its zeros, extrema and function values will show a remarkable variety of shapes and distributions.
After investigating an intuitive approach based on envelopes, an algebraic approach will be used to determine the peaks of the zero distributions as well as the maximal possible distance between two successive zeros.
Rather than stating and proving the result for the positions of the peaks, in this blog I want to show how, with graphical and numerical experiments, one is naturally led to the result. Despite cos(x) + cos(φx) + cos(φ²x) (φ being the golden ratio) being such a simple-looking function, it turns out that the distribution and correlation of its function values, zeros and extrema are rich sources of interesting, and for most people unexpected, structures. So in addition to finding the peak positions, I will construct and analyze various related graphics and features, as well as some related functions.
While this blog does contain a fair amount of Wolfram Language code, the vast majority of inputs are short and straightforward. Only a few more complicated functions will have to be defined.
The overall plan will be the following:
From time to time, we will take a little detour to have a look at some questions that come up naturally while carrying out the outlined steps. Although simple and short looking, the function has a lot of interesting features, and even in this long blog (the longest one I have ever written!), not all features of interest can be discussed.
The distribution of function values of sums of three sine/cosine functions is discussed in Blevins, 1997, and the pair correlation between successive zeros was looked at in Maeda, 1996.
Here are some almost-periodic functions, all of the form cos(x) + cos(αx) + cos(βx) for incommensurate α and β.
✕
fGolden[x_] := Cos[x] + Cos[GoldenRatio x] + Cos[GoldenRatio^2 x] |
✕
fPi[x_] := Cos[x] + Cos[Pi x] + Cos[Pi^2 x] |
✕
fSqrt[x_] := Cos[x] + Cos[Sqrt[2] x] + Cos[Sqrt[3] x] |
✕
fCbrt[x_] := Cos[x] + Cos[CubeRoot[2] x] + Cos[CubeRoot[3] x] |
Here is a plot of these four functions. Note that they have a visibly different character, e.g. the blue curve, on average, doesn’t seem to extend as much toward negative values as the other curves.
✕
Plot[{fGolden[x], fPi[x], fSqrt[x], fCbrt[x]} // Evaluate, {x, 0, 10 Pi}, PlotStyle -> Thickness[0.002], PlotLegends -> "Expressions"] |
The following interactive demonstration allows one to change the parameters α and β as well as the range over which the function is plotted. (Try to find parameter values such that four or more consecutive extrema are all above or below the real axis.)
✕
Manipulate[ Plot[Evaluate[ Cos[x0 + \[Delta]x] + Cos[\[Alpha]\[Beta][[1]] (x0 + \[Delta]x)] + Cos[\[Alpha]\[Beta][[2]] (x0 + \[Delta]x)]], {\[Delta]x, -X, X}, PlotPoints -> 60, PlotRange -> 3.2, PlotLabel -> Row[{{"\[Alpha]", "\[Beta]"}, "\[Equal]\[ThinSpace]", NumberForm[\[Alpha]\[Beta], 3]}]], {{\[Alpha]\[Beta], {3., 2.5}, "\[Alpha],\[Beta]"}, {0, 0}, {3, 3}}, {{X, 22.7}, 0, 30, Appearance -> "Labeled"}, {{x0, 473, Subscript["x", 0]}, 0, 1000, Appearance -> "Labeled"}, TrackedSymbols :> True] |
And here is a 2D plot of the function over the (x, α) plane (with β = α²). The interplay between periodicity and broken periodicity is nicely visible.
✕
With[{m = 600}, ReliefPlot[ Log[Abs[Table[ Evaluate[N[1. Cos[x] + Cos[\[Alpha] x] + Cos[\[Alpha]^2 x]]] , {\[Alpha], 0, 3/2, 3/2/m}, {x, 0, 48 Pi, 48 Pi/(2 m)}]] + 1], FrameTicks -> True, AspectRatio -> 1/3, DataRange -> {{0, 48 Pi}, {0, 3/2}}, FrameLabel -> {x, \[Alpha]}]] |
The interplay between translation symmetry along the x axis and symmetry violations becomes even more visible if one colors the connected regions where the function value is negative.
✕
rp = RegionPlot[ Cos[x] + Cos[\[Alpha] x] + Cos[\[Alpha]^2 x] < 0, {x, 0, 48 Pi}, {\[Alpha], 0, 5/4}, PlotPoints -> {200, 80}, AspectRatio -> 1/3, FrameLabel -> {x, \[Alpha]}] |
We color the connected regions with different colors. The connected regions of one color will gain more meaning below when we construct parts of the associated Riemann surface.
✕
Graphics[{EdgeForm[], Antialiasing -> False, Blend[{RGBColor[0.88, 0.61, 0.14], RGBColor[0.36, 0.51, 0.71]}, RandomReal[]], MeshPrimitives[#, 2]} & /@ ConnectedMeshComponents[ MeshRegion[#1, Cases[#2, _Polygon, \[Infinity]]] & @@ Cases[rp, _GraphicsComplex, \[Infinity]][[1]]], AspectRatio -> 1/3, Frame -> True, FrameLabel -> {x, \[Alpha]}] |
The zero set of the more general function cos(x) + cos(αx) + cos(βx) forms a regular network-like surface in (x, α, β) space. A 3D plot of the zero surface shows this nicely. The irregularities in the last image with the colored regions arise from slicing the 3D surface from the next image with the surface β = α².
✕
ContourPlot3D[ Cos[x] + Cos[\[Alpha] x] + Cos[\[Beta] x] == 0, {x, 0, 80}, {\[Alpha], 0, 3/2}, {\[Beta], 0, 3/2}, RegionFunction -> (#2 < #3 &), MeshFunctions -> {Norm[{#1, #2}] &}, PlotPoints -> {60, 60, 160}, MaxRecursion -> 0, BoxRatios -> {3, 1, 1}, ViewPoint -> {-1.04, -3.52, 0.39}, AxesLabel -> {x, \[Alpha], \[Beta]}, ImageSize -> 400] |
Let’s now focus on the first function from above, fGolden(x) = cos(x) + cos(φx) + cos(φ²x). As already mentioned, this function is a bit special compared to the generic case where α and β are totally unrelated. Expanding the two golden ratios, we have the following representation.
✕
fGolden[x] // FunctionExpand // Simplify |
This function will generate a distribution of zero distances with sharp discontinuities, something that will not happen in the generic case.
✕
Plot[fGolden[x], {x, 0, 33 Pi}] |
Interestingly, and probably unexpectedly, one sees many positions with function values near 3, but no function values near –3.
The special nature of the three-cosine sum comes from the fundamental identities that define the golden ratio.
✕
Entity["MathematicalConstant", "GoldenRatio"]["Identities"] // Take[#, 5] & // TraditionalForm |
While fGolden will never strictly repeat itself (it has no period; FunctionPeriod returns 0)...
✕
FunctionPeriod[fGolden[x], x] |
... when we calculate the squared difference of fGolden(x) and fGolden(x + T), integrated over an interval of length 2π, one sees that the function nearly repeats itself many, many times. (To make the values of T where fGolden(x) and fGolden(x + T) are nearly identical clearly visible, I plot the negative logarithm of the difference, meaning spikes correspond to the values of T where fGolden(x) ≈ fGolden(x + T) over a domain of size 2π.)
✕
overlapDiff = (Integrate[#, {x, 0, X}] & /@ Expand[(fGolden[x] - fGolden[T + x])^2]); cfOverlapDiff = Compile[{T, X}, Evaluate[overlapDiff]]; |
✕
Plot[-Log[Abs[cfOverlapDiff[T, 2 Pi]]], {T, 1, 10000}, AxesOrigin -> {0, -4}, PlotPoints -> 2000, PlotRange -> All, AxesLabel -> {T, None}] |
Locating the peak around T ≈ 2,400 more precisely, one sees that in this neighborhood the two functions nearly coincide. The difference over a 2π interval is quite small, on the order of 0.005. We can easily locate the precise position of this local maximum.
✕
FindMaximum[-Log[Abs[overlapDiff /. X -> 2 Pi]], {T, 2369}, WorkingPrecision -> 50, PrecisionGoal -> 20] // N[#, 20] & |
In the left graphic, the two curves are not distinguishable; the right plot shows the difference between the two curves.
✕
With[{T = 2368.763898630}, {Plot[{fGolden[x], fGolden[T + x]}, {x, 0, 2 Pi}], Plot[fGolden[x] - fGolden[T + x], {x, 0, 2 Pi}, PlotLabel -> "difference"]}] |
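As a hedged numeric aside (my own interpretation, not from the original post): the located almost-period 2368.7639 is very close to 2π·377, and 377 is a Fibonacci number. The denominators of the continued-fraction convergents of φ are exactly the Fibonacci numbers, so q·φ (and hence q·φ² = q·φ + q) is nearly an integer for Fibonacci q, making T = 2πq an almost-period. A quick Python check:

```python
import math

golden = (1 + math.sqrt(5)) / 2

def dist_to_int(x):
    """Distance of x to the nearest integer."""
    return abs(x - round(x))

q = 377                      # a Fibonacci number
T = 2 * math.pi * q
print(T)                     # close to the located peak near 2368.76
print(dist_to_int(q * golden), dist_to_int(q * golden**2))  # both tiny
```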
The choice α = φ, β = φ² results in some nontypical (compared to the case of general α, β) correlations between the last two summands. The following interactive demonstration shows a phase space–like plot of the function and its derivative with changeable α and β.
✕
Manipulate[ Column[{Row[{Row[{"\[Alpha]", "\[ThinSpace]\[Equal]\[ThinSpace]", \[Alpha]}], " | ", Row[{"\[Beta]", "\[ThinSpace]\[Equal]\[ThinSpace]", \[Beta]}]}], If[d == 2, ParametricPlot[ Evaluate[{Cos[x] + Cos[\[Alpha] x] + Cos[\[Beta] x], D[Cos[x] + Cos[\[Alpha] x] + Cos[\[Beta] x], x]}], {x, 0, 10^xMax}, AspectRatio -> 1, PlotPoints -> Ceiling[10^xMax 4], PlotStyle -> Directive[Thickness[0.001], RGBColor[0.36, 0.51, 0.71]], ImageSize -> 360, PlotRange -> {3 {-1, 1}, (1 + \[Alpha] + \[Beta]) {-1, 1}}, Frame -> True, Axes -> False], ParametricPlot3D[ Evaluate[{Cos[x] + Cos[\[Alpha] x] + Cos[\[Beta] x], D[Cos[x] + Cos[\[Alpha] x] + Cos[\[Beta] x], x], D[Cos[x] + Cos[\[Alpha] x] + Cos[\[Beta] x], x, x]}], {x, 0, 10^xMax}, AspectRatio -> 1, PlotPoints -> Ceiling[10^xMax 4], PlotStyle -> Directive[Thickness[0.001], RGBColor[0.36, 0.51, 0.71]], ImageSize -> 360, Axes -> False, PlotRange -> {3 {-1, 1}, (1 + \[Alpha] + \[Beta]) {-1, 1}, (1 + \[Alpha]^2 + \[Beta]^2) {-1, 1}} ]]}, Alignment -> Center] // TraditionalForm, {{d, 2, ""}, {2 -> Row[{Style["x", Italic], ",", Style["x", Italic], "'"}], 3 -> Row[{Style["x", Italic], ",", Style["x", Italic], "'", ",", Style["x", Italic], "''"}]}}, {{xMax, 2.8, Subscript["x", "max"]}, 0.1, 4}, {{\[Alpha], GoldenRatio}, 1/2, 3}, {{\[Beta], GoldenRatio^2}, 1/2, 3}, TrackedSymbols :> True] |
Among the positions where the function nearly repeats, there will also be arguments where fGolden(x) ≈ 3. Let’s find such values. To do this, one needs to find values of x such that simultaneously the three products x, φx and φ²x are very near to integer multiples of 2π. The function rationalize yields simultaneous rational approximations with an approximate error of ε. (See Zhang and Liu, 2017 for a nice statistical physics–inspired discussion of recurrences.)
✕
rationalize[\[Alpha]s_List, \[CurlyEpsilon]_] := Module[{B, RB, q, T, Q = Round[1/\[CurlyEpsilon]]}, B = Normal[SparseArray[Flatten[ {{1, 1} -> 1, MapIndexed[({1, #2[[1]] + 1} -> Round[Q #]) &, \[Alpha]s], Table[{j, j} -> Q, {j, 2, Length[\[Alpha]s] + 1}]}]]]; RB = LatticeReduce[B]; (* common denominator *) q = Abs[RB[[1, 1]]]; -Round[Rest[RB[[1]]]/Q - q \[Alpha]s]/q ] |
Here is an example of three rationals (with a common denominator) that, within an error of about 10⁻⁴⁰, are approximations of 1, φ, φ². (The first one is a rational approximation of 1 itself.)
✕
rationalize[{1, GoldenRatio, GoldenRatio^2}, 10^-40] |
✕
Block[{$MaxExtraPrecision = 1000}, N[% - {1, GoldenRatio, GoldenRatio^2}, 10]] |
Given these approximations, one can easily calculate the corresponding “almost” period T of fGolden (meaning fGolden(x + T) ≈ fGolden(x)).
✕
getPeriod[\[Alpha]s_, \[CurlyEpsilon]_] := Module[{rat, den, period}, rat = rationalize[\[Alpha]s, \[CurlyEpsilon]]; den = Denominator[Last[rat]]; period = 2 Pi den ] |
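At small sizes, the lattice-reduction step can be replaced by a brute-force search, which makes the mechanism transparent. In this Python sketch (function names are mine), the best common denominators for {φ, φ²} turn out to be Fibonacci numbers, and 2πq is then an almost-period:

```python
import math

golden = (1 + math.sqrt(5)) / 2

def dist_to_int(x):
    """Distance of x to the nearest integer."""
    return abs(x - round(x))

def best_common_denominator(alphas, qmax):
    """Brute-force stand-in for the blog's lattice-reduction rationalize:
    the q <= qmax making all q*alpha simultaneously closest to integers."""
    return min(range(1, qmax + 1),
               key=lambda q: max(dist_to_int(q * a) for a in alphas))

q = best_common_denominator([golden, golden**2], 1000)
print(q)  # a Fibonacci number; 2*pi*q is an almost-period of fGolden
```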
The arguments are listed in increasing order; at them, fGolden assumes values very close to 3.
✕
(nearlyThreeArguments = Table[period = getPeriod[{1, GoldenRatio, GoldenRatio^2}, 10^-exp]; {period, N[3 - fGolden[period], 10]}, {exp, 2, 20, 0.1}] // DeleteDuplicates) // Short[#, 6] & |
✕
ListLogLogPlot[nearlyThreeArguments] |
Around the very large almost-period computed with ε = 10⁻¹⁰⁰¹, the function again takes on the value 3 up to a correspondingly tiny difference.
✕
TLarge = getPeriod[{1, GoldenRatio, GoldenRatio^2}, 10^-1001] |
✕
Block[{$MaxExtraPrecision = 10000}, N[3 - fGolden[TLarge], 10]] |
Now let us look at the function in more detail, namely at the distribution of the function values of fGolden at various scales and discretizations. The distribution seems invariant with respect to the discretization scale (we use the same number of points in each of the four intervals).
✕
(With[{max = Round[500 #/0.001]}, Histogram[Table[Evaluate[N[fGolden[x]]], {x, 0, max Pi, #}], 500, PlotLabel -> Row[{"[", 0, ",", max Pi, "]"}]]] & /@ {0.001, 0.01, 0.1, 1}) // Partition[#, 2] & |
Interestingly, the three terms cos(x), cos(φx) and cos(φ²x) are always partially phase locked, and so not all function values between –3 and 3 are attained. Here are the values of the three summands for different arguments. Interpreted as 3-tuples within the cube [–1, 1]³, these values lie on a 2D surface. (This is not the generic situation for general α and β, in which case the cube is generically densely filled; see below.)
✕
Graphics3D[{PointSize[0.001], RGBColor[0.36, 0.51, 0.71], Opacity[0.4], Point[Union@ Round[Table[ Evaluate[ N[{Cos[x], Cos[GoldenRatio x], Cos[GoldenRatio^2 x]}]], {x, 0, 500 Pi, 0.02}], 0.01]]}, PlotRange -> {{-1, 1}, {-1, 1}, {-1, 1}}, Axes -> True, AxesLabel -> {x, Cos[GoldenRatio x], Cos[GoldenRatio^2 x]}] |
The points are all on a tetrahedron-like surface. Plotting the images of successive intervals of length 2π in different colors shows nicely how this surface is more and more covered by repeatedly winding around the tetrahedron-like surface, with no point ever assumed twice.
✕
Show[Table[ ParametricPlot3D[ Evaluate[N[{Cos[x], Cos[GoldenRatio x], Cos[GoldenRatio^2 x]}]], {x, k 2 Pi, (k + 1) 2 Pi}, PlotStyle -> Directive[ Blend[{{0, RGBColor[0.88, 0.61, 0.14]}, {30, RGBColor[0.36, 0.51, 0.71]}, {60, RGBColor[0.561, 0.69, 0.19]}}, k], Thickness[0.001]]], {k, 0, 60}], AxesLabel -> {x, Cos[GoldenRatio x], Cos[GoldenRatio^2 x]}] |
Ultimately, the fact that the points are all on a surface reduces to the fact that the points (x, φx, φ²x) mod 2π do not fill the cube [0, 2π]³ densely, but rather are located on parallel planes with finite distance between them.
✕
Graphics3D[{PointSize[0.002], RGBColor[0.36, 0.51, 0.71], Point[Table[ Mod[{x, GoldenRatio x, GoldenRatio^2 x}, 2 Pi], {x, 0., 10000}]]}, AxesLabel -> (HoldForm[Mod[# x, 2 Pi]] & /@ {1, GoldenRatio, GoldenRatio^2}), PlotRange -> {{0, 2 Pi}, {0, 2 Pi}, {0, 2 Pi}}] |
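The planes can be written down explicitly: since φ² = 1 + φ, the coordinates u = x mod 2π, v = φx mod 2π, w = φ²x mod 2π always satisfy w ≡ u + v (mod 2π), so the points lie on the parallel planes u + v − w = 2πk. A short Python confirmation of the argument (a sketch, not code from the blog):

```python
import math
import random

golden = (1 + math.sqrt(5)) / 2
two_pi = 2 * math.pi

random.seed(1)
for _ in range(1000):
    x = random.uniform(0, 1e4)
    u = x % two_pi
    v = (golden * x) % two_pi
    w = (golden**2 * x) % two_pi
    # w = u + v up to a multiple of 2*pi, since golden^2 * x = x + golden * x
    r = (u + v - w) % two_pi
    assert min(r, two_pi - r) < 1e-8
print("all points lie on the planes u + v - w = 2*pi*k")
```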
As we will look at distances often in this blog, let us have a quick look at the distances of points on the surface. To do this, we use 1,000 points per interval, and we will use 1,000 intervals. The left graphic shows the distribution of distances between consecutive points, and the right graphic shows the distance to the nearest point for all points. The peaks in the right-hand histogram come from points near the corners as well as from bands around the smooth parts of the tetrahedron-like surface.
✕
Module[{pts = Table[ {Cos[x], Cos[GoldenRatio x], Cos[GoldenRatio^2 x]}, {x, 0. 2 Pi, 1000 2 Pi, 2 Pi/1000}], nf}, nf = Nearest[pts]; {Histogram[EuclideanDistance @@@ Partition[pts, 2, 1], 1000, PlotRange -> All], Histogram[EuclideanDistance[#, nf[#, 2][[-1]]] & /@ pts, 1000, PlotRange -> All]}] |
The tetrahedron-like shape looks like a Cayley surface, and indeed, the value triples fulfill the Cayley surface equation x² + y² + z² = 1 + 2xyz. (Later, after complexifying the equation fGolden(x) = 0, it will be obvious why a Cayley surface appears here.)
✕
x^2 + y^2 + z^2 == 1 + 2 x y z /. {x -> Cos[t], y -> Cos[GoldenRatio t], z -> Cos[GoldenRatio^2 t]} // FullSimplify |
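One can also verify the Cayley-surface identity numerically, as a cross-check of the symbolic simplification; this Python sketch evaluates the residual x² + y² + z² − 1 − 2xyz on the cosine triples:

```python
import math
import random

golden = (1 + math.sqrt(5)) / 2

def cayley_residual(t):
    """Residual of the Cayley surface equation at the cosine triple."""
    x, y, z = math.cos(t), math.cos(golden * t), math.cos(golden**2 * t)
    return x * x + y * y + z * z - 1 - 2 * x * y * z

random.seed(0)
print(max(abs(cayley_residual(random.uniform(0, 1000)))
          for _ in range(1000)))  # vanishes up to floating-point rounding
```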
While the three functions cos(x), cos(φx), cos(φ²x) are algebraically related through the above polynomial equation, no pair of these three functions is related through an algebraic relation; with x ranging over the real numbers, any pair covers the square [–1, 1]² densely.
✕
With[{c1 = Cos[1 x], c\[Phi] = Cos[GoldenRatio x], c\[Phi]2 = Cos[GoldenRatio^2 x], m = 100000}, Graphics[{PointSize[0.001], RGBColor[0.36, 0.51, 0.71], Point[Table[#, {x, 0., m, 1}]]}] & /@ {{c1, c\[Phi]}, {c1, c\[Phi]2}, {c\[Phi], c\[Phi]2}}] |
This means that our function for a given x contains only two algebraically independent components. This algebraic relation will be essential for understanding the special case of the cosine sum for α = φ, and for getting some closed-form polynomials for the zero distances later.
This polynomial equation for the three summands also explains purely algebraically why the observed minimal function value of fGolden is not –3, but rather –3/2.
✕
MinValue[{x + y + z, x^2 + y^2 + z^2 == 1 + 2 x y z \[And] -1 < x < 1 \[And] -1 < y < 1 \[And] -1 < z < 1}, {x, y, z}] |
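A numeric sanity check of this bound in Python (sampling range and step are my choices): the sampled minimum approaches but never goes below −3/2, while the maximum 3 is attained at x = 0.

```python
import math

golden = (1 + math.sqrt(5)) / 2

def f_golden(x):
    return math.cos(x) + math.cos(golden * x) + math.cos(golden**2 * x)

# sample on [0, 3000] with step 0.01
vals = [f_golden(0.01 * k) for k in range(300001)]
print(min(vals), max(vals))  # min stays above -3/2; max is 3, at x = 0
```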
✕
cayley = ContourPlot3D[ x^2 + y^2 + z^2 == 1 + 2 x y z, {x, -1, 1}, {y, -1, 1}, {z, -1, 1}, Mesh -> None]; |
And the points are indeed on it. Advancing the argument in steps of 1 and coloring the spheres successively, one gets the following graphic.
✕
With[{M = 10000, \[Delta] = 1}, Show[{cayley, Graphics3D[Table[{ColorData["DarkRainbow"][x/M], Sphere[N[{Cos[1. x], Cos[GoldenRatio x], Cos[GoldenRatio^2 x]}], 0.02]}, {x, 0, M, \[Delta]}]]}, Method -> {"SpherePoints" -> 8}]] |
Here I will make use of the function that arises from fGolden by adding independent phases to the three cosine terms. The resulting set of points traced out as x varies forms a more complicated shape than just a Cayley surface.
✕
Manipulate[ With[{pts = Table[Evaluate[ N[{Cos[x + \[CurlyPhi]1], Cos[GoldenRatio x + \[CurlyPhi]2], Cos[GoldenRatio^2 x + \[CurlyPhi]3]}]], {x, 0, M Pi, 10^\[Delta]t}]}, Graphics3D[{PointSize[0.003], RGBColor[0.36, 0.51, 0.71], Thickness[0.002], Which[pl == "points", Point[pts], pl == "line", Line[pts], pl == "spheres", Sphere[pts, 0.2/CubeRoot[Length[pts]]]]}, PlotRange -> {{-1, 1}, {-1, 1}, {-1, 1}}, Method -> {"SpherePoints" -> 8}]], {{pl, "line", ""}, {"line", "points", "spheres"}}, {{M, 100}, 1, 500, Appearance -> "Labeled"}, {{\[Delta]t, -1.6}, -2, 1}, {{\[CurlyPhi]1, Pi/2, Subscript["\[CurlyPhi]", 1]}, 0, 2 Pi, Appearance -> "Labeled"}, {{\[CurlyPhi]2, Pi/2, Subscript["\[CurlyPhi]", 2]}, 0, 2 Pi, Appearance -> "Labeled"}, {{\[CurlyPhi]3, Pi/2, Subscript["\[CurlyPhi]", 3]}, 0, 2 Pi, Appearance -> "Labeled"}, TrackedSymbols :> True] |
The points are still located on an algebraic surface; the implicit equation of the surface is now the following.
✕
surface[{x_, y_, z_}, {\[CurlyPhi]1_, \[CurlyPhi]2_, \[CurlyPhi]3_}] := (x^4 + y^4 + z^4) - (x^2 + y^2 + z^2) + 4 x^2 y^2 z^2 - x y z (4 (x^2 + y^2 + z^2) - 5) Cos[\[CurlyPhi]1 + \[CurlyPhi]2 - \[CurlyPhi]3] + (2 (x^2 y^2 \ + x^2 z^2 + y^2 z^2) - (x^2 + y^2 + z^2) + 1/2) Cos[ 2 (\[CurlyPhi]1 + \[CurlyPhi]2 - \[CurlyPhi]3)] - x y z Cos[3 (\[CurlyPhi]1 + \[CurlyPhi]2 - \[CurlyPhi]3)] + Cos[4 (\[CurlyPhi]1 + \[CurlyPhi]2 - \[CurlyPhi]3)]/8 + 3/8 |
✕
surface[{Cos[t + \[CurlyPhi]1], Cos[GoldenRatio t + \[CurlyPhi]2], Cos[GoldenRatio^2 t + \[CurlyPhi]3]}, {\[CurlyPhi]1, \ \[CurlyPhi]2, \[CurlyPhi]3}] // TrigToExp // Factor // Simplify |
For vanishing phases φ₁, φ₂ and φ₃, one recovers the Cayley surface.
✕
surface[{x, y, z}, {0, 0, 0}] // Simplify |
Note that the implicit equation depends only on a linear combination of the three phases, namely φ₁ + φ₂ − φ₃. Here are some examples of the surface.
✕
Partition[ Table[ContourPlot3D[ Evaluate[surface[{x, y, z}, {c, 0, 0}] == 0], {x, -1.1, 1.1}, {y, -1.1, 1.1}, {z, -1.1, 1.1}, MeshFunctions -> (Norm[{#1, #2, #3}] &), MeshStyle -> RGBColor[0.36, 0.51, 0.71], PlotPoints -> 40, MaxRecursion -> 2, Axes -> False, ImageSize -> 180], {c, 6}], 3] |
As a side note, I want to mention that the triples (cos(t), cos(pt), cos(pᵈt)), with p being the smallest positive solution of pᵈ = 1 + p, are also all on the Cayley surface.
✕
Table[With[{p = Root[-1 - # + #^deg &, 1]}, x^2 + y^2 + z^2 - (1 + 2 x y z) /. {x -> Cos[t], y -> Cos[p t], z -> Cos[p^deg t]}] // FullSimplify, {deg, 2, 12}] |
An easy way to understand why the function values of fGolden are on a Cayley surface is to look at the complex exponential functions whose real parts are the three cosine terms. Due to the additivity of the exponents and the defining equality 1 + φ = φ², the corresponding formula becomes quite simply exp(iz) exp(iφz) = exp(iφ²z).
✕
Exp[I z] * Exp[I GoldenRatio z] == Exp[I GoldenRatio^2 z] // Simplify |
Using the last formula and splitting it into real and imaginary parts, it is now easy to derive the Cayley surface equation. We do this by writing down a system of equations for the real and imaginary components, the corresponding equations for all occurring arguments, and eliminating the trigonometric functions.
✕
GroebnerBasis[ {xc yc - zc, xc - (Cos[t] + I Sin[t]), yc - (Cos[GoldenRatio t] + I Sin[GoldenRatio t]), zc - (Cos[GoldenRatio^2 t] + I Sin[GoldenRatio^2 t]), x - Cos[t], y - Cos[GoldenRatio t], z - Cos[GoldenRatio^2 t], Cos[t]^2 + Sin[t]^2 - 1, Cos[GoldenRatio t]^2 + Sin[GoldenRatio t]^2 - 1, Cos[GoldenRatio ^2 t]^2 + Sin[GoldenRatio^2 t]^2 - 1 }, {}, {Cos[t], Cos[GoldenRatio t], Cos[GoldenRatio^2 t], Sin[t], Sin[GoldenRatio t], Sin[GoldenRatio^2 t], xc, yc, zc}] |
For generic α, β, the set of triples (cos(x), cos(αx), cos(βx)) with increasing x will fill the cube densely. For rational α, β, the points lie on closed 1D curves; for special algebraic values of α, β, they lie on 2D surfaces. The following interactive demonstration allows one to explore the position of the triples in the cube. The demonstration uses three 2D sliders (rather than just one) for specifying α and β to allow for minor modifications of the values of α and β.
✕
Manipulate[ With[{\[Alpha] = \[Alpha]\[Beta][[1]] + \[Delta]\[Alpha]\[Beta][[ 1]] + \[Delta]\[Delta]\[Alpha]\[Beta][[ 1]], \[Beta] = \[Alpha]\[Beta][[2]] + \[Delta]\[Alpha]\[Beta][[ 2]] + \[Delta]\[Delta]\[Alpha]\[Beta][[2]]}, Graphics3D[{PointSize[0.003], RGBColor[0.36, 0.51, 0.71], Opacity[0.66], Point[ Table[Evaluate[N[{Cos[x], Cos[\[Alpha] x], Cos[\[Beta] x]}]], {x, 0, M Pi, 10^\[Delta]t}]]}, PlotRange -> {{-1, 1}, {-1, 1}, {-1, 1}}, PlotLabel -> NumberForm[ Grid[{{"\[Alpha]", \[Alpha]}, {"\[Beta]", \[Beta]}}, Dividers -> Center], 6]]], {{M, 100}, 1, 500, Appearance -> "Labeled"}, {{\[Delta]t, -2, Row[{"\[Delta]" Style["\[NegativeThinSpace]t", Italic]}]}, -3, 1}, Grid[{{"{\[Alpha],\[Beta] }", "{\[Alpha], \[Beta]} zoom", "{\[Alpha], \[Beta]} zoom 2"}, {Control[{{\[Alpha]\[Beta], {0.888905, 1.27779}, ""}, {1/2, 1/2}, {3, 3}, ImageSize -> {100, 100}}], Control[{{\[Delta]\[Alpha]\[Beta], {0.011, -0.005}, ""}, {-0.02, -0.02}, {0.02, 0.02}, ImageSize -> {100, 100}}], Control[{{\[Delta]\[Delta]\[Alpha]\[Beta], {-0.00006, 0.00008}, ""}, {-0.0002, -0.0002}, {0.0002, 0.0002}, ImageSize -> {100, 100}}]}}] , TrackedSymbols :> True] |
The degree of filling depends sensitively on the values of α and β. Counting how many cells after rounding are used in the cube gives an estimation of the filling. For the case β = α², the following plot shows that the filling degree is a discontinuous function of α. The lowest filling degree is obtained for rational α with small denominators, which results in closed curves in the cube. (We use rational sampling values for α, which explains most of the small filling ratios.)
✕
usedCubes[\[Alpha]_] := Length[Union[Round[Table[Evaluate[ N[{Cos[x], Cos[\[Alpha] x], Cos[\[Alpha]^2 x]}]], {x, 0., 1000 Pi, Pi/100.}], (* rounding *)0.01]]] |
✕
ListLogPlot[ Table[{\[Alpha], usedCubes[\[Alpha]]}, {\[Alpha], 1, 2, 0.0005}], Joined -> True, GridLines -> {{GoldenRatio}, {}}] |
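A rough Python port of usedCubes (grid size and sample count are my choices, scaled down for speed) shows the same effect: a small-denominator rational α traces a closed curve and visits far fewer rounded cells than the golden ratio, whose triples wind around a 2D surface.

```python
import math

def used_cells(alpha, n_points=20000, step=math.pi / 100, grid=0.05):
    """Count distinct rounded grid cells visited by
    (cos x, cos(alpha x), cos(alpha^2 x)); a rough port of usedCubes."""
    cells = set()
    for k in range(n_points):
        x = k * step
        cells.add((round(math.cos(x) / grid),
                   round(math.cos(alpha * x) / grid),
                   round(math.cos(alpha**2 * x) / grid)))
    return len(cells)

golden = (1 + math.sqrt(5)) / 2
print(used_cells(3 / 2), used_cells(golden))
# the rational value visits far fewer cells than the golden ratio
```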
Plotting fGolden over a large domain shows clearly that not all possible values occur equally often. Function values near –1 seem to be attained much more frequently.
✕
Plot[Evaluate[fGolden[x]], {x, 0, 1000 Pi}, PlotStyle -> Directive[ Opacity[0.3], Thickness[0.001]], Frame -> True, PlotPoints -> 10000] |
How special is the golden ratio with respect to the function not taking on values near –3? The following graphic shows the smallest function value of cos(x) + cos(αx) + cos(α²x) over the domain 0 ≤ x ≤ 100π as a function of α. The special behavior near α = φ is clearly visible, but other values of α such that the extremal function value is greater than –3 do exist (especially rational α and/or rational α²).
✕
ListPlot[Monitor[ Table[{\[Alpha], Min[Table[ Evaluate[N[Cos[x] + Cos[\[Alpha] x] + Cos[\[Alpha]^2 x]]], {x, 0., 100 Pi, Pi/100.}]]}, {\[Alpha], 1, 2, 0.0001}], \[Alpha]], PlotRange -> All, AxesLabel -> {"\[Alpha]", None}] |
And the next graphic shows the minimum values over the (α, β) plane. I sample the function for x in the intervals [π, 10π] and [π, 100π]. Straight lines with simple rational relations between α and β emerge.
✕
cfMin = Compile[{max}, Table[Min[ Table[Cos[x] + Cos[\[Alpha] x] + Cos[\[Beta] x], {x, 1. Pi, max, Pi/100}]], {\[Beta], 0, 2, 1/300}, {\[Alpha], 0, 2, 1/300}]]; |
✕
ReliefPlot[cfMin[#], Frame -> True, FrameTicks -> True, DataRange -> {{0, 2}, {0, 2}}, ImageSize -> 200] & /@ {10 Pi, 100 Pi} |
Here are the two equivalent images for the maxima.
✕
cfMax = Compile[{max}, Table[Max[ Table[Cos[x] + Cos[\[Alpha] x] + Cos[\[Beta] x], {x, 1. Pi, max, Pi/100}]], {\[Beta], 1/2, 2, 1/300}, {\[Alpha], 1/2, 2, 1/300}]]; |
✕
ReliefPlot[cfMax[#], Frame -> True, FrameTicks -> True, DataRange -> {{1/2, 2}, {1/2, 2}}, ImageSize -> 200] & /@ {10 Pi, 100 Pi} |
Now I will work toward reproducing the first graphic that was shown in the MathOverflow post. The poster also gave a clever method to quickly calculate 100k+ zeros: solve the trivial differential equation F′(t) = f′(t) and record zero crossings on the fly using WhenEvent. As NDSolve adaptively selects step sizes, a zero of the function can be conveniently detected as a side effect of solving the differential equation for the function whose zeros we are interested in. I use an optional third argument for function values different from zero. This code is not guaranteed to catch every single zero—it might happen that a zero is missed; using the MaxStepSize option, one could control the probability of a miss. Because we are only interested in statistical properties of the zeros, a moderate MaxStepSize of 0.1 suffices.
✕
findZeros[f_, n_, f0_: 0, ndsolveoptions : OptionsPattern[]] := Module[{F, zeroList, counter, t0}, zeroList = Table[0, {n}]; counter = 0; Monitor[ NDSolve[{F'[t] == D[f[t], t], F[0] == f[0], WhenEvent[F[t] == f0, counter++; t0 = t; (* locate zero, store it, and stop after n zeros *) If[counter <= n, zeroList[[counter]] = t0, "StopIntegration"], "LocationMethod" -> {"Brent", PrecisionGoal -> 10}]}, F, {t, 0, 10^8}, ndsolveoptions, Method -> "BDF", PrecisionGoal -> 12, MaxSteps -> 10^8, MaxStepSize -> 0.1], {"zero counter" -> counter, "zero abscissa" -> t0}] // Quiet; zeroList] |
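The same zero hunt can be sketched in plain Python by bracketing sign changes on a fine grid and bisecting; like the WhenEvent version, it can in principle miss a near-tangent pair of zeros if the grid is too coarse (grid spacing and tolerance below are my choices):

```python
import math

golden = (1 + math.sqrt(5)) / 2

def f_golden(x):
    return math.cos(x) + math.cos(golden * x) + math.cos(golden**2 * x)

def find_zeros(f, x_max, dx=0.01, tol=1e-12):
    """Bracket sign changes of f on a grid of spacing dx, then bisect."""
    zeros = []
    a, fa = 0.0, f(0.0)
    while a < x_max:
        b = a + dx
        fb = f(b)
        if fa * fb < 0:            # sign change: bisect the bracket
            lo, hi = a, b
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            zeros.append(0.5 * (lo + hi))
        a, fa = b, fb
    return zeros

zeros = find_zeros(f_golden, 100.0)
print(len(zeros), zeros[:3])
```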
Here are the first 20 zeros of fGolden.
✕
zerosGolden = findZeros[fGolden, 20] |
They are indeed the zeros of fGolden.
✕
Plot[fGolden[x], {x, 0, 10 Pi}, Epilog -> {Darker[Red], Point[{#, fGolden[#]} & /@ zerosGolden]}] |
Of course, using Solve, one could also get the roots.
✕
Short[exactZerosGolden = x /. Solve[fGolden[x] == 0 \[And] 0 < x < 36, x], 4] |
The two sets of zeros coincide.
✕
Max[Abs[zerosGolden - exactZerosGolden]] |
While Solve gives more reliable results, it is slower, and for the statistics of the zeros that we are interested in, exact solutions are not needed.
Calculating 100k zeros takes on the order of a minute.
✕
(zerosGolden = findZeros[fGolden, 10^5];) // Timing |
The quality of the zeros is good enough for various histograms and plots to be made.
✕
Mean[Abs[fGolden /@ zerosGolden]] |
But one could improve the quality of the roots if needed, say by using a Newton method with the found roots as starting values.
✕
zerosGoldenRefined = FixedPoint[Function[x, Evaluate[N[x - fGolden[x]/fGolden'[x]]]], #, 10] & /@ Take[zerosGolden, All]; |
✕
Mean[Abs[fGolden /@ zerosGoldenRefined]] |
The process of applying Newton iterations to find zeros as a function of the starting values has its very own interesting features for the function fGolden, as the basins of attraction as well as the convergence properties are not equal for all zeros; e.g. some braided strands stand out. The following graphic gives a glimpse of the dramatic differences, but we will not look into this quite-interesting subtopic any deeper in this post.
✕
Graphics[{Thickness[0.001], Opacity[0.1], RGBColor[0.36, 0.51, 0.71], Table[BSplineCurve[Transpose[{N@ NestList[(# - fGolden[#]/fGolden'[#]) &, N[x0, 50], 20], Range[21]}]], {x0, 1/100, 40, 1/100}]}, PlotRange -> { {-10, 50}, All}, AspectRatio -> 1/3, Frame -> True, FrameLabel -> {x, "iterations"}, PlotRangeClipping -> True] |
If we consider the function sgn(fGolden(x)), it has a constant value between the zeros. The Fourier transform of this step function, calculated based on the zeros, shows all possible harmonics formed from the three frequencies 1, φ and φ².
✕
Module[{ft, nf, pl, labels}, (* Fourier transform *) ft = Compile[\[Omega], Evaluate[ Total[ Function[{a, b}, Sign[fGolden[ Mean[{a, b}]]] I (Exp[I a \[Omega]] - Exp[I b \[Omega]])/\[Omega] ] @@@ Partition[Take[zerosGolden, 10000], 2, 1]]]]; (* identify harmonics *) nf = Nearest[ SortBy[Last /@ #, LeafCount][[1]] & /@ Split[Sort[{N[#, 10], #} & /@ Flatten[ Table[Total /@ Tuples[{1, GoldenRatio, GoldenRatio^2, -1, -GoldenRatio, -GoldenRatio^2}, {j}], {j, 8}]]], #1[[1]] === #2[[ 1]] &]] // Quiet; (* plot Fourier transforms *) pl = Plot[Abs[ft[\[Omega]]], {\[Omega], 0.01, 8}, PlotRange -> {0, 1000}, Filling -> Axis, PlotPoints -> 100]; (* label peaks *) labels = SortBy[#, #[[1, 2]] &][[-1]] & /@ GatherBy[Select[{{#1, #2}, nf[#1][[1]]} & @@@ Select[Level[Cases[Normal[pl], _Line, \[Infinity]], {-2}], Last[#] > 100 &], Abs[#[[1, 1]] - #[[2]]] < 10^-2 &], Last]; Show[{pl, ListPlot[Callout @@@ labels, PlotStyle -> None]}, PlotRange -> {{0.1, 8.5}, {0, 1100}}, Axes -> {True, False}]] |
There is a clearly visible zero-free region in the distribution of the zeros mod 2π.
✕
Histogram[Mod[zerosGolden, 2 Pi], 200] |
Plotting the square of fGolden in a polar plot around the unit circle (in yellow/brown) shows the zero-free region nicely.
✕
PolarPlot[1 + fGolden[t]^2/6, {t, 0, 200 2 Pi}, PlotStyle -> Directive[Opacity[0.4], RGBColor[0.36, 0.51, 0.71], Thickness[0.001]], PlotPoints -> 1000, Prolog -> {RGBColor[0.88, 0.61, 0.14], Disk[]}] |
Modulo π we again see the two peaks and a flat distribution between them.
✕
Histogram[(# - Round[#, Pi]) & /@ zerosGolden, 200] |
The next graphic shows the distance between successive zeros (vertical) as a function of x. The zero-free region, as well as the to-be-discussed clustering of the roots, is clearly visible.
✕
Graphics[{RGBColor[0.36, 0.51, 0.71], Function[{a, b}, Line[{{a, b - a}, {b, b - a}}]] @@@ \ Partition[Take[zerosGolden, 10000], 2, 1]}, AspectRatio -> 1/2, Frame -> True, FrameLabel -> {x, "zero distance"}] |
Statistically, the three terms of the defining sum of fGolden do not contribute equally to the formation of the zeros.
✕
summandValues = {Cos[#], Cos[GoldenRatio #], Cos[GoldenRatio^2 #]} & /@ zerosGolden; |
✕
Histogram[#, 200] & /@ Transpose[summandValues] |
The slopes of fGolden at the zeros are strongly correlated with the slopes at successive zeros.
✕
Table[Histogram3D[{#[[1]], #[[-1]]} & /@ Partition[fGolden' /@ zerosGolden, j, 1], 100, PlotLabel -> Row[{"shift \[Equal] ", j - 1}]], {j, 2, 4}] |
A plot of the summands and their derivatives for randomly selected zeros shows that the values of the summands are not unrelated at the zeros. The graphic shows triangles made from the three (value, derivative) pairs of the summands.
✕
fGoldenfGoldenPrime[ x_] = {D[{Cos[x], Cos[GoldenRatio x], Cos[GoldenRatio^2 x]}, x], {Cos[ x], Cos[GoldenRatio x], Cos[GoldenRatio^2 x]}}; |
✕
Graphics[{Thickness[0.001], Opacity[0.05], RGBColor[0.88, 0.61, 0.14], Line[Append[#, First[#]] &@Transpose[fGoldenfGoldenPrime[#]]] & /@ \ RandomSample[zerosGolden, 10000]}] |
Here are the values of the three terms shown in a 3D plot. As a cross-section of the above Cayley surface, the point set forms a closed 1D curve.
✕
Graphics3D[{PointSize[0.002], RGBColor[0.36, 0.51, 0.71], Point[Union@Round[summandValues, 0.01]]}, PlotRange -> {{-1, 1}, {-1, 1}, {-1, 1}}, Axes -> True, AxesLabel -> {Cos[x], Cos[GoldenRatio x], Cos[GoldenRatio^2 x]}] |
And here is the distribution of the distances between successive zeros. This is the graphic shown in the original MathOverflow post. The position of the peaks is what was asked for.
✕
differencesGolden = Differences[zerosGolden]; |
✕
Histogram[differencesGolden, 1000, PlotRange -> All] |
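The zero positions and their spacing distribution can be cross-checked outside Mathematica. The following Python sketch (an independent numerical check, not the blog's own code) locates the real zeros of cos(x) + cos(ϕx) + cos(ϕ²x) by a sign-change scan with bisection refinement and reports the most common rounded spacing, which falls at one of the sharp peaks of the distribution:

```python
import math
from collections import Counter

PHI = (1 + math.sqrt(5)) / 2

def f_golden(x):
    return math.cos(x) + math.cos(PHI * x) + math.cos(PHI**2 * x)

def bisect(g, a, b, iters=100):
    """Refine a bracketed sign change of g down to machine precision."""
    ga = g(a)
    for _ in range(iters):
        m = 0.5 * (a + b)
        if (ga > 0) != (g(m) > 0):
            b = m
        else:
            a, ga = m, g(m)
    return 0.5 * (a + b)

def find_zeros(g, x_max, step=0.01):
    """Scan for sign changes of g and refine each bracket."""
    zeros = []
    x, gx = step, g(step)
    while x < x_max:
        y, gy = x + step, g(x + step)
        if (gx > 0) != (gy > 0):
            zeros.append(bisect(g, x, y))
        x, gx = y, gy
    return zeros

zeros = find_zeros(f_golden, 5000.0)
gaps = [b - a for a, b in zip(zeros, zeros[1:])]
mode_gap = Counter(round(g, 2) for g in gaps).most_common(1)[0][0]
print(len(zeros), mode_gap)
```

The step size 0.01 can in principle miss near-tangent zero pairs, so this is a sketch rather than a certified zero count.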
The pair correlation function between successive zero distances is localized along a few curve segments.
✕
Histogram3D[Partition[differencesGolden, 2, 1], 100, PlotRange -> All] |
The nonrandomness of successive zero distances also becomes visible by forming an angle path with the zero distances as step sizes.
✕
Graphics[{Thickness[0.001], Opacity[0.5], RGBColor[0.36, 0.51, 0.71], Line[AnglePath[{#, 2 Pi/5} & /@ Take[differencesGolden, 50000]]]}, Frame -> True] |
Here are some higher-order differences of the distances between successive zeros.
✕
{Histogram[Differences[zerosGolden, 2], 1000, PlotRange -> All], Histogram[Differences[zerosGolden, 3], 1000, PlotRange -> All]} |
Even the nearest-neighbor differences show a lot of correlation between them. The next graphic shows their bivariate distribution for shifts m = 1, …, 12, with each distribution in a different color.
✕
Show[Table[ Histogram3D[Partition[differencesGolden, m, 1][[All, {1, -1}]], 400, ColorFunction -> (Evaluate[{RGBColor[0.368417, 0.506779, 0.709798], RGBColor[0.880722, 0.611041, 0.142051], RGBColor[ 0.560181, 0.691569, 0.194885], RGBColor[ 0.922526, 0.385626, 0.209179], RGBColor[ 0.528488, 0.470624, 0.701351], RGBColor[ 0.772079, 0.431554, 0.102387], RGBColor[ 0.363898, 0.618501, 0.782349], RGBColor[1, 0.75, 0], RGBColor[0.647624, 0.37816, 0.614037], RGBColor[ 0.571589, 0.586483, 0.], RGBColor[0.915, 0.3325, 0.2125], RGBColor[0.40082222609352647`, 0.5220066643438841, 0.85]}[[ m]]] &)], {m, 12}], PlotRange -> All, ViewPoint -> {2, -2, 3}] |
The distribution of the slopes at the zeros has much less structure and shows the existence of a maximal slope.
✕
Histogram[fGolden' /@ zerosGolden, 1000] |
The distributions of the distances between successive zeros with positive slope and with negative slope are identical for the two signs.
✕
Function[lg, Histogram[Differences[Select[zerosGolden, lg[fGolden'[#], 0] &]], 1000, PlotLabel -> HoldForm[lg[Subscript["f", "\[Phi]"]'[x], 0]]]] /@ {Less, Greater} |
A plot of the values of the first versus the second derivative of fGolden at the zeros shows a strong correlation between these two values.
✕
Histogram3D[{fGolden'[#], fGolden''[#]} & /@ zerosGolden, 100, PlotRange -> All] |
The ratio of the values of the first to the second derivative at the zeros shows an interesting-looking distribution with two well-pronounced peaks at ±1. (Note the logarithmic vertical scale.)
✕
Histogram[ Select[fGolden''[#]/fGolden'[#] & /@ zerosGolden, Abs[#] < 3 &], 1000, {"Log", "Count"}] |
As we will see in later examples, this is typically not the case for generic sums of three cosine terms. Assuming that locally around a zero the function would look like a sum of three cosine terms with freely chosen phases, the possible region in the (f′, f″)-plane is larger. In the next graphic, the blue region shows the allowed region, and the orange points show the actually observed pairs.
✕
slopeCurvatureRegion[{\[CurlyPhi]2_, \[CurlyPhi]3_}, {\[Alpha]_, \ \[Beta]_}] := Module[{v = Sqrt[1 - (-Cos[\[CurlyPhi]2] - Cos[\[CurlyPhi]3])^2]}, {# v - \[Alpha] Sin[\[CurlyPhi]2] - \[Beta] Sin[\ \[CurlyPhi]3], Cos[\[CurlyPhi]2] - \[Alpha]^2 Cos[\[CurlyPhi]2] + Cos[\[CurlyPhi]3] - \[Beta]^2 Cos[\[CurlyPhi]3]} & /@ {-1, 1}] |
✕
Show[{ParametricPlot[ slopeCurvatureRegion[{\[CurlyPhi]2, \[CurlyPhi]3}, {GoldenRatio, GoldenRatio^2}], {\[CurlyPhi]2, 0, 2 Pi}, {\[CurlyPhi]3, 0, 2 Pi}, PlotRange -> All, AspectRatio -> 1, PlotPoints -> 120, BoundaryStyle -> None, FrameLabel -> {f', f''}], Graphics[{PointSize[0.003], RGBColor[0.88, 0.61, 0.14], Point[{fGolden'[#], fGolden''[#]} & /@ zerosGolden]}]}] |
For fPi, i.e. frequencies π and π², the possible region is indeed fully used.
✕
Show[{ParametricPlot[ slopeCurvatureRegion[{\[CurlyPhi]2, \[CurlyPhi]3}, {Pi, Pi^2}], {\[CurlyPhi]2, 0, 2 Pi}, {\[CurlyPhi]3, 0, 2 Pi}, PlotRange -> All, AspectRatio -> 1, PlotPoints -> 20, BoundaryStyle -> None, FrameLabel -> {f', f''}], Graphics[{PointSize[0.003], RGBColor[0.88, 0.61, 0.14], Opacity[0.4], Point[{fPi'[#], fPi''[#]} & /@ findZeros[fPi, 20000]]}]}] |
Similar to the fact that the function values of fGolden never drop below −3/2, the maximum slope observed at the zeros is smaller than the one realizable with a free choice of phases in a sum of three cosine terms with frequencies 1, ϕ, ϕ², which is 1 + ϕ + ϕ² ≈ 5.24.
✕
Max[Abs[fGolden' /@ zerosGolden]] |
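This bound is easy to probe numerically. The Python sketch below (not from the blog) uses the fact that the three phases x, ϕx, ϕ²x satisfy ϕ²x = x + ϕx, so they cannot be chosen independently; dense sampling of the derivative then stays visibly below the free-phase bound 1 + ϕ + ϕ²:

```python
import math

PHI = (1 + math.sqrt(5)) / 2
free_phase_bound = 1 + PHI + PHI**2   # ≈ 5.236, attainable only with independent phases

def f_prime(x):
    # derivative of cos(x) + cos(PHI x) + cos(PHI^2 x)
    return -(math.sin(x) + PHI * math.sin(PHI * x) + PHI**2 * math.sin(PHI**2 * x))

# sample the derivative densely on [0, 500]
observed_max = max(abs(f_prime(0.001 * k)) for k in range(500_000))
print(observed_max, free_phase_bound)
```

Note that ϕ² = ϕ + 1 gives 1 + ϕ + ϕ² = 2 + 2ϕ, so the free-phase bound is 2ϕ² ≈ 5.236.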
In the above histograms, we have seen the distribution for the distances of the zeros of fGolden. While in some sense zeros are always special, a natural generalization is to look at the distances between successive values x such that fGolden(x) = c, as a function of c. The following input calculates this data for c from −1.49 to 2.99. I exclude small intervals around the endpoints of the range of fGolden because there the distances between successive values become quite large and thus the calculation becomes time-consuming. (This calculation will take a few hours.)
✕
Monitor[cDistancesGolden = Table[ zeros = findZeros[fGolden, 25000, c]; {c, Tally[Round[Differences[zeros] , 0.005]]}, {c, -1.49, 2.99, 0.0025}], Row[{"c", "\[ThinSpace]=\[ThinSpace]", c}]] |
✕
cDistancesGoldenSA = SparseArray[ Flatten[Table[({j, Round[200 #1] + 1} -> #2) & @@@ cDistancesGolden[[j]][[2]], {j, Length[cDistancesGolden]}], 1]]; |
The following relief plot of the logarithm of the density shows that sharp peaks (as shown above for the zeros) exist for any value of c in (−3/2, 3).
✕
ReliefPlot[Log[1 + Take[cDistancesGoldenSA, All, {1, 1600}]], DataRange -> {{0, 1600 0.005}, {-1.49, 2.99}}, FrameTicks -> True, FrameLabel -> {"\[CapitalDelta] zeros", Subscript[ f, "\[Phi]"]}] |
Before analyzing fGolden in more detail, for comparison—and as a warmup for later calculations—let us look at the simpler case of a sum of two cos terms, concretely cos(x) + cos(ϕ x). The zeros also do not have a uniform distance.
✕
f2Terms[x_] := Cos[x] + Cos[GoldenRatio x] |
✕
zeros2Terms = findZeros[f2Terms, 10^5]; |
Here is a plot of f2Terms; the zeros are again marked with red dots.
✕
Plot[f2Terms[x], {x, 0, 12 Pi}, Epilog -> {Darker[Red], Point[{#, f2Terms[#]} & /@ Take[zeros2Terms, 100]]}] |
But one distance (at ≈2.4) is exponentially more common than the others (note the logarithmic scaling of the vertical axis!).
✕
Histogram[Differences[zeros2Terms], 1000, {"Log", "Count"}, PlotRange -> All] |
About 60% of all zeros seem to cluster near a distance of approximately 2.4. I write the sum of the two cosine terms as a product.
✕
f2Terms[x] // TrigFactor |
The two factors have different periods, the smaller one being approximately 4.8; half of that period, approximately 2.4, is the spacing of this factor's zeros.
✕
{FunctionPeriod[Cos[(Sqrt[5] - 1)/4 x], x], FunctionPeriod[Cos[(Sqrt[5] + 3)/4 x], x]} |
✕
N[%/2] |
Plotting the zeros for each of the two factors explains why one sees so many zero distances with approximate distance 2.4.
✕
Solve[f2Terms[x] == 0, x] |
✕
Plot[{Sqrt[2] Cos[(Sqrt[5] - 1)/4 x], Sqrt[2] Cos[(Sqrt[5] + 3)/4 x]}, {x, 0, 20 Pi}, Epilog -> {{Blue, PointSize[0.01], Point[Table[{(2 (2 k - 1) Pi)/(3 + Sqrt[5]), 0}, {k, 230}]]}, {Darker[Red], PointSize[0.01], Point[Table[{1/2 (1 + Sqrt[5]) (4 k - 1) Pi, 0}, {k, 10}]]}}] |
✕
SortBy[Tally[Round[Differences[zeros2Terms], 0.0001]], Last][[-1]] |
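The dominance of this single spacing is easy to reproduce outside Mathematica. This Python sketch (a hedged stand-in for the Wolfram computation above) finds the zeros of cos(x) + cos(ϕx) numerically and compares the most common spacing with the zero spacing 4π/(3 + √5) ≈ 2.4 of the faster factor:

```python
import math
from collections import Counter

PHI = (1 + math.sqrt(5)) / 2

def f2(x):
    return math.cos(x) + math.cos(PHI * x)

def find_zeros(g, x_max, step=0.01):
    """Sign-change scan with bisection refinement."""
    zeros, x, gx = [], step, g(step)
    while x < x_max:
        y, gy = x + step, g(x + step)
        if (gx > 0) != (gy > 0):
            a, b, ga = x, y, gx
            for _ in range(100):
                m = 0.5 * (a + b)
                if (ga > 0) != (g(m) > 0):
                    b = m
                else:
                    a, ga = m, g(m)
            zeros.append(0.5 * (a + b))
        x, gx = y, gy
    return zeros

zeros = find_zeros(f2, 10000.0)
gaps = [b - a for a, b in zip(zeros, zeros[1:])]
mode_gap = Counter(round(g, 3) for g in gaps).most_common(1)[0][0]
fast_spacing = 4 * math.pi / (3 + math.sqrt(5))   # zero spacing of the fast factor
print(mode_gap, fast_spacing)
```

Most consecutive-zero gaps come from two adjacent zeros of the fast factor with no zero of the slow factor in between, which is why roughly 60% of the gaps equal the fast spacing exactly.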
Let us look at fGolden and f2Terms next to each other. One observes an obvious difference between the two curves: in fGolden, one sees triples of minimum-maximum-minimum, with all three extrema being negative. This situation does not occur in f2Terms.
✕
GraphicsRow[{Plot[fGolden[x], {x, 0, 14 Pi}, PlotLabel -> Subscript[f, \[Phi]], ImageSize -> 320], Plot[f2Terms[x], {x, 0, 14 Pi}, PlotLabel -> Subscript[f, 2], ImageSize -> 320]}] |
Here is a plot of f2Terms after a zero for 1,000 randomly selected zeros. Graphically, one sees many curves crossing zero exactly at the most common zero distance. Mouse over the curves to see the underlying curve in red to better follow its graph.
✕
Show[Plot[f2Terms[t - #], {t, -2, 0 + 5}, PlotRange -> All, PlotStyle -> Directive[RGBColor[0.36, 0.51, 0.71], Thickness[0.001], Opacity[0.3]]] & /@ RandomSample[zeros2Terms, 1000], PlotRange -> All, Frame -> True, GridLines -> {{0}, {3}}, ImageSize -> 400] /. l_Line :> Mouseover[l, {Red, Thickness[0.002], Opacity[1], l}] |
Twice the distance from the origin to the first positive zero of f2Terms is indeed the most commonly occurring distance between zeros.
✕
{#, N[#]} &@(2 x /. Solve[f2Terms[x] == 0 \[And] 1 < x < 2, x] // Simplify) |
Let us note that adding more terms does not lead to more zeros. Consider the single-term function cos(ϕ² x).
✕
f2TermsB[x_] := Cos[GoldenRatio^2 x] |
✕
zeros2TermsB = findZeros[f2TermsB, 10^5]; |
It has many more zeros up to a given (large) bound than the function fGolden.
✕
Max[zeros2TermsB]/Max[zerosGolden] |
The reason for this is that the addition of the slower-oscillating functions (cos(x) and cos(ϕ x)) effectively “converts” some zeros into minimum-maximum-minimum triples entirely below (or maximum-minimum-maximum triples entirely above) the real axis. The following graphic visualizes this effect.
✕
Plot[{fGolden[x], Cos[GoldenRatio^2 x]}, {x, 10000, 10036}, Filling -> Axis] |
Now let’s come back to our function composed of three cos terms.
If we are at a given zero of fGolden, what will happen afterward? If we are at a zero that has distance d to its next zero, how differently can the function behave in the following intervals? For the function fGolden, there is a very strong correlation between the zero distance and the function value at the midpoints of the next few zero intervals.
✕
Table[Histogram3D[{#[[1, 1]], fGolden[#[[-1, 2]] + #[[-1, 1]]/2]} & /@ Partition[Transpose[{differencesGolden, Most[zerosGolden]}], k, 1], 100, PlotLabel -> Row[{"shift:\[ThinSpace]", k - 0.5}], AxesLabel -> {HoldForm[d], Subscript[f, \[Phi]], None}], {k, 4}] // Partition[#, 2] & |
Plotting the function starting at zeros shows envelopes. Using a mouseover effect allows one to highlight individual curves. The graphic shows a few clearly recognizable envelopes, near which many of the curves cluster. The four purple points indicate the intersections of the envelopes with the x axis.
✕
Show[Plot[fGolden[# + t], {t, -2, 0 + 5}, PlotRange -> All, PlotStyle -> Directive[RGBColor[0.36, 0.51, 0.71], Thickness[0.001], Opacity[0.3]]] & /@ RandomSample[zerosGolden, 1000], Epilog -> {Directive[Purple, PointSize[0.012]], Point[{#, 0} & /@ {1.81362, 1.04398, 1.49906, 3.01144}]}, PlotRange -> All, Frame -> True, GridLines -> {{0}, {-1, 3}}, ImageSize -> 400] /. l_Line :> Mouseover[l, {Red, Thickness[0.002], Opacity[1], l}] |
The envelope belonging to zeros that are followed by a maximum nearly reaching 3 explains the position of the largest peak in the zero-distance distribution.
✕
zerosGolden[[1]] |
Using transcendental Root objects, one can obtain a closed-form representation of the root that can be numericalized to any precision.
✕
Solve[fGolden[x] == 0 \[And] 1/2 < x < 3/2, x] |
✕
root3 = N[ Root[{Cos[#1] + Cos[GoldenRatio #1] + Cos[GoldenRatio^2 #1] &, 0.9068093955855631129`20}], 50] |
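For readers without Mathematica, the same root can be pinned down by plain bisection; this Python sketch reproduces the Root-object value to double precision (the bracket (1/2, 3/2) comes from the Solve call above):

```python
import math

PHI = (1 + math.sqrt(5)) / 2

def f_golden(x):
    return math.cos(x) + math.cos(PHI * x) + math.cos(PHI**2 * x)

a, b = 0.5, 1.5          # f_golden(0.5) > 0 > f_golden(1.5)
for _ in range(200):     # bisection: bracket width halves each step
    m = 0.5 * (a + b)
    if (f_golden(a) > 0) != (f_golden(m) > 0):
        b = m
    else:
        a = m
root3 = 0.5 * (a + b)
print(root3)             # ≈ 0.9068093955855631
```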
The next graphic shows the histogram of the zero distances together with the just-calculated root.
✕
Histogram[differencesGolden, 1000, ChartStyle -> Opacity[0.4], PlotRange -> {{1.7, 1.9}, All}, GridLines -> {{2 zerosGolden[[1]]}, {}}, GridLinesStyle -> Purple] |
We select zeros whose spacing to the next zero lies in a small neighborhood of the just-calculated zero spacing.
✕
getNearbyZeros[zeros_, z0_, \[Delta]_] := Last /@ Select[Transpose[{Differences[zeros], Most[zeros]}], z0 - \[Delta] < #[[1]] < z0 + \[Delta] &] |
✕
zerosAroundPeak = getNearbyZeros[zerosGolden, 2 root3, 0.001]; Length[zerosAroundPeak] |
Plotting these curves shifted by the corresponding zeros, such that they all have the point (0, 0) in common, shows that all these curves are indeed locally (nearly) identical.
✕
Show[Plot[fGolden[# + t], {t, -3, 6}, PlotRange -> All, PlotStyle -> Directive[RGBColor[0.36, 0.51, 0.71], Thickness[0.001], Opacity[0.3]]] & /@ RandomSample[zerosAroundPeak, 100], PlotRange -> All, Frame -> True, GridLines -> {{0}, {-1, 1, 3}}, ImageSize -> 400] /. l_Line :> Mouseover[l, {Red, Thickness[0.002], Opacity[1], l}] |
Curves that start at a zero and attain a function value of ≈3 will soon move apart. The next graphic shows the mean value and spread of the zero distances following a selected distance. This spread is the reason that the zero distances that appear after 2 root3 do not show up as further peaks in the distribution of zero distances.
✕
Module[{zeroPosis, data}, zeroPosis = Select[Last /@ Select[Transpose[{differencesGolden, Range[Length[zerosGolden] - 1]}], 2 root3 - 0.001 < #[[1]] < 2 root3 + 0.001 &], # < 99000 &]; data = Table[{j, Mean[#] \[PlusMinus] StandardDeviation[#]} &[ differencesGolden[[zeroPosis + j]]], {j, 100}]; Graphics[{RGBColor[0.36, 0.51, 0.71], Line[{{#1, Subtract @@ #2}, {#1, Plus @@ #2}} & @@@ data]}, AspectRatio -> 1/2, Frame -> True]] |
Instead of looking at the distance of successive roots of fGolden(x) = 0, one could look at the roots of fGolden(x) = c with an arbitrary right-hand side c. Based on the above graphics, the most interesting distribution might occur for c = −1. Similar to the two-summand case, the distance between the zeros of the fastest component, namely cos(ϕ² x), dominates the distribution. (Note the logarithmic vertical scale.)
✕
Histogram[ Differences[findZeros[fGolden, 10^5, -1]], 1000, {"Log", "Count"}, PlotRange -> All] |
The above plot of fGolden seemed to show that the smallest values attained are around –1.5. This is indeed the absolute minimum possible; this follows from the fact that the summand values lie on the Cayley surface.
✕
Minimize[{x + y + z, x^2 + y^2 + z^2 == 1 + 2 x y z \[And] -1 <= x <= 1 \[And] -1 <= y <= 1 \[And] -1 <= z <= 1}, {x, y, z}] |
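This minimization can be double-checked numerically. Because ϕ² = ϕ + 1, the three arguments x, ϕx, ϕ²x satisfy θ₃ = θ₁ + θ₂, so the attainable summand triples are exactly (cos θ₁, cos θ₂, cos(θ₁ + θ₂)), a parametrization of the Cayley surface patch used above. A brute-force Python scan (a sketch, not the blog's code) confirms the minimum −3/2, attained at θ₁ = θ₂ = 2π/3:

```python
import math

def s(t1, t2):
    # sum of the three summands under the constraint t3 = t1 + t2
    return math.cos(t1) + math.cos(t2) + math.cos(t1 + t2)

# brute-force scan of the torus on a 1000 x 1000 grid
n = 1000
grid_min = min(s(2 * math.pi * i / n, 2 * math.pi * j / n)
               for i in range(n) for j in range(n))
exact_min = s(2 * math.pi / 3, 2 * math.pi / 3)   # = 3 * cos(2 pi / 3) = -3/2
print(grid_min, exact_min)
```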
I use findZeros to find some near-minima positions. Note the relatively large distances between these minima. Because of numerical errors, one sometimes gets two nearby values, in which case duplicates are deleted.
✕
zerosGoldenMin = findZeros[fGolden, 100, -3/2] // DeleteDuplicates[#, Abs[#1 - #2] < 0.1 &] &; |
Close to these absolute-minimum positions, the function takes on two universal shapes. This is clearly visible by plotting fGolden in the neighborhoods of all 100 minima.
✕
Show[Plot[fGolden[# + t], {t, -2, 0 + 5}, PlotRange -> All, PlotStyle -> Directive[RGBColor[0.36, 0.51, 0.71], Thickness[0.001], Opacity[0.3]]] & /@ zerosGoldenMin, GridLines -> {{}, {-3/2, 3}}] |
The structure in the zero-distance distribution is already visible in relatively low-degree rational approximations of ϕ.
✕
fGoldenApprox[x_] = fGolden[x] /. GoldenRatio -> Convergents[GoldenRatio, 12][[-1]] |
✕
zerosGoldenApprox = findZeros[fGoldenApprox, 100000]; |
✕
Histogram[Differences[zerosGoldenApprox], 1000, PlotRange -> All] |
For small rational values of α, β in cos(x) + cos(α x) + cos(β x), one can factor the expression and calculate the roots of each factor symbolically to more quickly generate the list of zeros.
✕
getFirstZeros[f_[x_] + f_[\[Alpha]_ x_] + f_[\[Beta]_ x_] , n_] := Module[{sols, zs}, sols = Select[x /. Solve[f[x] + f[\[Alpha] x] + f[\[Beta] x] == 0, x] // N // ExpandAll // Chop, FreeQ[#, _Complex, \[Infinity]] &]; zs = getZeros[#, Ceiling[n/Length[sols]]] & /@ sols; Sort@Take[Flatten[zs], n]] |
✕
getZeros[ConditionalExpression[a_. + C[1] b_, C[1] \[Element] Integers], n_] := a + b Range[n] |
✕
Table[Histogram[ Differences[ getFirstZeros[ Cos[x] + Cos[\[Alpha] x] + Cos[\[Alpha]^2 x] /. \[Alpha] -> Convergents[GoldenRatio, d][[-1]], 100000]], 1000, PlotRange -> All, PlotLabel -> Row[{"convergent" , " \[Rule] ", d}]], {d, 2, 5}] // Partition[#, 2] & |
For comparison, here are the corresponding plots for sums of sine terms.
✕
Table[Histogram[ Differences[ getFirstZeros[ Sin[x] + Sin[\[Alpha] x] + Sin[\[Alpha]^2 x] /. \[Alpha] -> Convergents[GoldenRatio, d][[-1]], 100000]], 1000, PlotRange -> All, PlotLabel -> Row[{"convergent" , " \[Rule] ", d}]], {d, 2, 5}] // Partition[#, 2] & |
I now repeat some of the above visualizations for the extrema instead of the zeros.
✕
findExtremas[f_, n_] := findZeros[f', n] |
✕
extremasGolden = Prepend[findExtremas[fGolden, 100000], 0.]; |
Here is a plot of fGolden together with the extrema, marked at the just-calculated positions.
✕
Plot[fGolden[x], {x, 0, 10 Pi}, Epilog -> {Darker[Red], Point[{#, fGolden[#]} & /@ Take[ extremasGolden, 30]]}] |
Plotting the extrema in the (x, fGolden(x))-plane shows that the extrema values cluster around a few horizontal lines.
✕
Graphics[{PointSize[0.001], RGBColor[0.36, 0.51, 0.71], Opacity[0.1], Point[N[{#, fGolden[#]} & /@ extremasGolden]]}, AspectRatio -> 1/2, ImageSize -> 400, Frame -> True] |
The distribution of the function values at the extrema is already visible in the last graphic. The peaks are at –3/2, –1 and 3. The following histogram shows how pronouncedly the density increases at these three values.
✕
Histogram[fGolden /@ extremasGolden, 1000, GridLines -> {{-3/2, -1, 3}, {}}, PlotRange -> All] |
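The range of these extremal values can be cross-checked numerically. A Python sketch (not the blog's code) that finds the extrema as sign changes of f′ confirms that all extremal values lie in [−3/2, 3], with both endpoints nearly attained:

```python
import math

PHI = (1 + math.sqrt(5)) / 2

def f(x):
    return math.cos(x) + math.cos(PHI * x) + math.cos(PHI**2 * x)

def fp(x):
    return -(math.sin(x) + PHI * math.sin(PHI * x) + PHI**2 * math.sin(PHI**2 * x))

# extrema of f = sign changes of f', refined by bisection
extrema, x, step = [0.0], 0.005, 0.01   # x = 0 is a maximum with f(0) = 3
while x < 3000:
    if (fp(x) > 0) != (fp(x + step) > 0):
        a, b = x, x + step
        for _ in range(60):
            m = 0.5 * (a + b)
            if (fp(a) > 0) != (fp(m) > 0):
                b = m
            else:
                a = m
        extrema.append(0.5 * (a + b))
    x += step

values = [f(e) for e in extrema]
print(min(values), max(values))
```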
In a 3D histogram, one sees a strong correlation between the position of the extrema mod 2π and the function value at the extrema.
✕
Histogram3D[{Mod[#, 2 Pi], fGolden[#]} & /@ extremasGolden, 100, AxesLabel -> {\[CapitalDelta]x, Subscript[f, \[Phi]], None}] |
If one separates the minima and the maxima, one obtains the following distributions.
✕
Function[lg, Histogram3D[{Mod[#, 2 Pi], fGolden[#]} & /@ Select[extremasGolden, lg[fGolden''[#], 0] &], 100, AxesLabel -> {\[CapitalDelta]x, Subscript[f, \[Phi]], None}]] /@ {Less, Greater} |
The function values at the extrema are strongly correlated with the function values at successive extrema.
✕
Table[Histogram3D[{#[[1]], #[[-1]]} & /@ Partition[fGolden /@ extremasGolden, j, 1], 100, PlotLabel -> Row[{"shift:", j - 1}]], {j, 2, 4}] |
A periodogram of the function values at the extrema shows a lot of structure. The various periodicities arising from the three cosine terms and their interference terms become visible. (The equivalent curve for the slope at the zeros is much noisier.) For generic α, β, the periodogram of the function values at the extrema is a relatively smooth curve with only a few structures.
✕
Periodogram[fGolden /@ extremasGolden, PlotStyle -> RGBColor[0.88, 0.61, 0.14], Filling -> Axis] |
Here again are two graphics that show how the three terms contribute to the function value at the extrema. The value of the cos(ϕ² x) term at the extrema is quite limited.
✕
summandValuesE = {Cos[#], Cos[GoldenRatio #], Cos[GoldenRatio^2 #]} & /@ extremasGolden; |
✕
Graphics3D[{PointSize[0.002], RGBColor[0.36, 0.51, 0.71], Point[Union@Round[summandValuesE, 0.01]]}, PlotRange -> {{-1, 1}, {-1, 1}, {-1, 1}}, Axes -> True, AxesLabel -> {Subscript[f, \[Phi]], Cos[GoldenRatio x], Cos[GoldenRatio^2 x]}] |
✕
Histogram[#, 200] & /@ Transpose[summandValuesE] |
The two disconnected supports arise from the contributions at the minima and maxima. The contribution at the maxima is quite localized.
✕
Histogram[Cos[GoldenRatio^2 #], 200, PlotRange -> {{-1, 1}, All}] & /@ {Select[extremasGolden, fGolden''[#] > 0 &], Select[extremasGolden, fGolden''[#] < 0 &]} |
I zoom into the second graphic of the last result.
✕
Histogram[ Cos[GoldenRatio^2 #] & /@ Select[extremasGolden, fGolden''[#] < 0 &], 200, PlotRange -> {{0.94, 1}, All}] |
Here is the distribution of the function values at the minima and maxima.
✕
{Histogram[fGolden /@ Select[extremasGolden, fGolden''[#] > 0 &], 200], Histogram[fGolden /@ Select[extremasGolden, fGolden''[#] < 0 &], 200]} |
Reduced to the interval [0, 2π), one sees a small, smooth peak.
✕
Histogram[Mod[extremasGolden, 2 Pi], 200] |
And this is the overall distribution of the distances between successive extrema. Compared with the zeros, it does not show much unexpected structure.
✕
Histogram[Differences[extremasGolden], 1000, PlotRange -> All] |
Higher differences do show some interesting structure.
✕
Histogram[Differences[extremasGolden, 2], 1000, PlotRange -> All] |
In an analogy to the zeros, I also show the pair correlation function. (In this context, this is also called a peak-to-peak plot.)
✕
Histogram3D[{#2 - #1, #3 - #2} & @@@ Partition[extremasGolden, 3, 1], 100, PlotRange -> All] |
And here are the shapes of fGolden near the extrema for 1,000 randomly selected extrema.
✕
Show[Plot[fGolden[# + t], {t, -2, 0 + 5}, PlotRange -> All, PlotStyle -> Directive[RGBColor[0.36, 0.51, 0.71], Thickness[0.001], Opacity[0.3]]] & /@ \ RandomSample[extremasGolden, 1000], PlotRange -> All, Frame -> True, GridLines -> {{0}, {3}}, ImageSize -> 400] /. l_Line :> Mouseover[l, {Red, Thickness[0.002], Opacity[1], l}] |
For use in the next section and in the next example, I form a list of the zeros and extrema in the order in which they occur on the real axis.
✕
zerosAndExtremaGolden = Sort[Join[{#, "zero"} & /@ zerosGolden, {#, "extrema"} & /@ extremasGolden]]; |
Using this list, one can make a correlation plot of the slope at a zero with the height of the following maximum. One sees quite a strong correlation.
✕
Histogram3D[{fGolden'[#1[[1]]], fGolden[#2[[1]]]} & @@@ Cases[Partition[zerosAndExtremaGolden, 2, 1], {{_, "zero"}, {_, "extrema"}}], 120] |
There is even a strong correlation between the slope at a zero and the value at the second-next extremum.
✕
Histogram3D[{fGolden'[#[[1, 1]]], fGolden[Cases[#, {_, "extrema"}][[2, 1]]]} & /@ Take[Cases[ Partition[zerosAndExtremaGolden, 5, 1], {{_, "zero"}, __}], 60000], 120] |
All zeros of f2Terms are on the real axis, while fGolden does have complex zeros—located near the middle maxima of the negative minimum-maximum-minimum triples. A plot of −log|fGolden(x + iy)| shows the zeros as peaks. These pairs of zeros with nonvanishing imaginary parts result in relatively large spacings between successive roots on the real axis. (The appearance of these complex-valued zeros does not depend on α, β being irrational; it also occurs for rational α, β.)
✕
Plot3D[Evaluate[-Log[Abs[fGolden[x + I y]]]], {x, 0, 12}, {y, -1, 1}, PlotPoints -> {60, 60}, BoxRatios -> {3, 1, 1}, MeshFunctions -> {#3 &}, ViewPoint -> {1.9, -2.4, 1.46}, MeshStyle -> RGBColor[0.36, 0.51, 0.71], AxesLabel -> {x, y, Subscript[f, \[Phi]][x + I y]}] |
A contour plot showing the curves of vanishing real and imaginary parts has the zeros as crossings of curves of different color.
✕
ContourPlot[ Evaluate[# == 0 & /@ ReIm[fGolden[x + I y]]], {x, 0, 24}, {y, -1, 1}, PlotPoints -> {60, 60}, AspectRatio -> 1/3, Epilog -> {Purple, PointSize[0.012], Point[ ReIm[ z /. Solve[ fGolden[z] == 0 \[And] -2 < Im[z] < 2 \[And] 0 < Re[z] < 24]]]}] |
Similar to the above case along the real axis, here are the paths of convergence of the Newton iterations.
✕
Module[{ppx = 111, ppy = 21, f = Evaluate[(# - fGolden[#]/fGolden'[#])] &, data, vals, color}, data = Table[ NestList[f, N[x0 + I y0], 20], {y0, -1, 1, 2/ppy}, {x0, 0, 30, 30/ppx}]; vals = First /@ Select[Reverse[ SortBy[Tally[Flatten[Round[Map[Last, data, {2}], 0.001]]], Last]], #[[2]] > 50 &]; (color[#] = Blend[{{0, RGBColor[0.88, 0.61, 0.14]}, {1/2, RGBColor[0.36, 0.51, 0.71]}, {1, RGBColor[0.561, 0.69, 0.19]}}, RandomReal[]]) & /@ vals; Graphics3D[{Thickness[0.001], Opacity[0.5], Map[If[MemberQ[vals, Round[Last[#], 0.001]], {color[Round[Last[#], 0.001]], BSplineCurve[MapIndexed[Append[#, #2[[1]]] &, ReIm[#]]]}, {}] &, data, {2}]}, BoxRatios -> {3, 1, 2}, ViewPoint -> {0.95, -3.17, 0.70}]] |
Now that we have the positions of the zeros and extrema, we can also have a look at the complex zeros in more detail. In the above plots of fGolden over the complex plane, we saw that complex zeros occur when we have the situation minimum-maximum-minimum with all three function values at these extrema negative. Using the zeros and the extrema, we can easily find the positions of these extrema triples.
✕
tripleExtrema = SequenceCases[ zerosAndExtremaGolden, {{_, "extrema"}, {_, "extrema"}, {_, "extrema"}}][[All, 2, 1]]; |
About one-seventh of all these roots have a nonvanishing imaginary part.
✕
Length[tripleExtrema]/(Length[tripleExtrema] + Length[zerosGolden]) // N |
The function values at the middle maxima of all of these consecutive extrema triples are indeed all negative and never smaller than –1.
✕
Histogram[fGolden /@ tripleExtrema, 100] |
A log-log histogram of the absolute values of these middle maxima suggests that over a large range, a power law relation between their frequencies and their function values holds.
✕
Histogram[ Sort[-Select[fGolden /@ tripleExtrema, Negative]], {"Log", 100}, {"Log", "Count"}] |
Interestingly, the distance between the middle maxima is narrowly concentrated at three different distances.
✕
tripleExtremaDifferences = Differences[tripleExtrema]; |
✕
Histogram[tripleExtremaDifferences, 100] |
The next three plots zoom into the last three localized structures.
✕
Function[x, Histogram[Select[tripleExtremaDifferences, x - 1 < # < x + 1 &], 100]] /@ {5, 7, 12} |
Here are the three typical shapes that belong to these three classes of distances. I show 100 randomly selected and shifted pieces of fGolden.
✕
extremaGroups = With[{L = Sort[Transpose[{tripleExtremaDifferences, Most[tripleExtrema]}]]}, Function[x, {x, Select[L, x - 1 < #[[1]] < x + 1 & ]}] /@ {5, 7, 12}]; |
✕
Function[{\[CapitalDelta], l}, Show[Plot[Evaluate[fGolden[# + t]], {t, -2, \[CapitalDelta] + 2}, PlotRange -> All, PlotLabel -> Row[{HoldForm[\[CapitalDelta]x == \[CapitalDelta]]}], PlotStyle -> Directive[Thickness[0.001], RGBColor[0.36, 0.51, 0.71], Opacity[0.6]]] & /@ RandomSample[l[[All, 2]], 100]]] @@@ extremaGroups |
I use the positions of all middle maxima as starting values for a numerical root-finding procedure to get the nearby complex roots.
✕
cRoots = If[Im[#] < 0, Conjugate[#]] & /@ z /. FindRoot[Evaluate[fGolden[z] == 0], {z, # + 1/2 I}] & /@ tripleExtrema; |
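The same root search can be sketched in Python (a minimal, hedged stand-in for FindRoot, not the blog's code): locate a triple of consecutive extrema with all three function values negative by scanning the zeros of f′, then run a complex Newton iteration from the middle (negative) maximum shifted by i/2:

```python
import math, cmath

PHI = (1 + math.sqrt(5)) / 2

def f(z):
    return cmath.cos(z) + cmath.cos(PHI * z) + cmath.cos(PHI**2 * z)

def fp(z):
    return -(cmath.sin(z) + PHI * cmath.sin(PHI * z) + PHI**2 * cmath.sin(PHI**2 * z))

def fp_real(x):
    return fp(x).real

# real extrema of f = sign changes of f' on the real axis
extrema, x, step = [], 0.005, 0.01
while x < 500:
    if (fp_real(x) > 0) != (fp_real(x + step) > 0):
        a, b = x, x + step
        for _ in range(80):
            m = 0.5 * (a + b)
            if (fp_real(a) > 0) != (fp_real(m) > 0):
                b = m
            else:
                a = m
        extrema.append(0.5 * (a + b))
    x += step

# middle extremum of the first all-negative extrema triple
middle = next(e2 for e1, e2, e3 in zip(extrema, extrema[1:], extrema[2:])
              if f(e1).real < 0 and f(e2).real < 0 and f(e3).real < 0)

z = middle + 0.5j                        # start half a unit above the middle extremum
for _ in range(100):
    delta = f(z) / fp(z)                 # Newton step
    z -= delta
    if abs(delta) < 1e-14 or abs(z.imag) > 50:
        break
z = z if z.imag > 0 else z.conjugate()   # pick the upper-half-plane member of the pair
print(z, abs(f(z)))
```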
Large imaginary parts of the complex zeros occur at real parts near multiples of π.
✕
Histogram3D[{Mod[Re[#], 2 Pi], Im[#]} & /@ cRoots, 100, AxesLabel -> {Im[x], Mod[Re[x], 2 Pi]}] |
The function value at a middle maximum is strongly correlated with the magnitude of the imaginary part of the nearby complex root.
✕
Histogram3D[Transpose[{fGolden /@ tripleExtrema, Im[cRoots]}], 100, AxesLabel -> {Subscript[f, \[Phi]], Im[z]}] |
And here are the distributions of imaginary parts and the differences of the imaginary parts of consecutive (along the real axis) zeros.
✕
{Histogram[Im[cRoots], 100], Histogram[Differences[Im[cRoots]], 100]} |
If one splits the complex roots into the three groups from above, one obtains the following distributions.
✕
Column[GraphicsRow[{Histogram[Im[#], 100], Histogram[Differences[Im[#]], 100]} &[ If[Im[#] < 0, Conjugate[#]] & /@ z /. FindRoot[Evaluate[fGolden[z] == 0], {z, # + 1/2 I}] & /@ #2[[ All, 2]]], ImageSize -> 360, PlotLabel -> Row[{HoldForm[\[CapitalDelta]x == #1]}]] & @@@ extremaGroups] |
As a function of α, the complex zeros of f_α periodically join the real axis. The following graphic shows the surface Re(f_α(x + iy)) = 0 in (x, y, α)-space in yellow/brown and the complex zeros as blue tubes. We first plot the surface and then plot the curves as mesh lines on this surface.
✕
ContourPlot3D[ Re[Cos[x + I y] + Cos[\[Alpha] (x + I y)] + Cos[\[Alpha]^2 (x + I y)]], {x, 0, 12}, {y, 0, 1}, {\[Alpha], 1, 2}, Contours -> {0}, BoxRatios -> {3, 1, 1}, ContourStyle -> {Directive[RGBColor[0.88, 0.61, 0.14], Opacity[0.3], Specularity[RGBColor[0.36, 0.51, 0.71], 10]]}, BoundaryStyle -> None, ImageSize -> 600, PlotPoints -> {160, 80, 80}, MaxRecursion -> 0, MeshStyle -> Directive[RGBColor[0.36, 0.51, 0.71], Thickness[0.006]], Mesh -> {{0}}, MeshFunctions -> {Function[{x, y, \[Alpha]}, Im[Cos[x + I y] + Cos[\[Alpha] (x + I y)] + Cos[\[Alpha]^2 (x + I y)]]]}, AxesLabel -> {x, y, \[Alpha]}] |
The next natural step (and the last one for this section) would be to look at the function z(α) defined implicitly through f_α(z(α)) = 0. As for any value of α we will have (possibly infinitely) many solutions z, we want to construct the Riemann surface for z(α). A convenient way to calculate Riemann surfaces is through solving the differential equation for the defining function. Through differentiation, we immediately get the differential equation for z(α).
✕
f\[Alpha][z_, \[Alpha]_] := Cos[z] + Cos[\[Alpha] z] + Cos[\[Alpha]^2 z] |
✕
Solve[D[f\[Alpha][z[\[Alpha]], \[Alpha]], \[Alpha]] == 0, Derivative[1][z][\[Alpha]]] // Simplify |
✕
rhs[z_, \[Alpha]_] := -(( z (Sin[\[Alpha] z] + 2 \[Alpha] Sin[\[Alpha]^2 z]) )/( Sin[z] + \[Alpha] (Sin[\[Alpha] z] + \[Alpha] Sin[\[Alpha]^2 z]))) |
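The implicit-differentiation step can be verified numerically. The following Python sketch (an independent check using a hand-rolled RK4 integrator rather than NDSolve) tracks the first real zero from α = ϕ to α = 1.7 with the right-hand side above and confirms that f_α stays zero along the trajectory:

```python
import math

PHI = (1 + math.sqrt(5)) / 2

def f(z, a):
    return math.cos(z) + math.cos(a * z) + math.cos(a * a * z)

def rhs(z, a):
    # dz/d(alpha) from d/d(alpha) f(z(alpha), alpha) = 0
    return -(z * (math.sin(a * z) + 2 * a * math.sin(a * a * z))
             / (math.sin(z) + a * (math.sin(a * z) + a * math.sin(a * a * z))))

# starting zero of f(., PHI) in (0.5, 1.5), by bisection
lo, hi = 0.5, 1.5
for _ in range(100):
    m = 0.5 * (lo + hi)
    if (f(lo, PHI) > 0) != (f(m, PHI) > 0):
        hi = m
    else:
        lo = m
z = 0.5 * (lo + hi)

# classical RK4 continuation in alpha
target, steps = 1.7, 1000
h = (target - PHI) / steps
for i in range(steps):
    a0 = PHI + i * h
    k1 = rhs(z, a0)
    k2 = rhs(z + 0.5 * h * k1, a0 + 0.5 * h)
    k3 = rhs(z + 0.5 * h * k2, a0 + 0.5 * h)
    k4 = rhs(z + h * k3, a0 + h)
    z += (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
residual = f(z, target)
print(z, residual)
```

The small residual at the end shows that the trajectory stays on the zero set without any projection step; for paths that pass near a branch point, a projection back onto f = 0 (as in the NDSolve call below) would be needed.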
As starting values, I use solution pairs (x, α) with f_α(x) = 0 for x along a segment of the real axis.
✕
ICs\[Alpha]x = Cases[Flatten[ Table[{#, x} & /@ (\[Alpha] /. Quiet[Solve[ f\[Alpha][N[1, 40] x, \[Alpha]] == 0 \[And] 0 < \[Alpha] < 2, \[Alpha]]]), {x, 0, 6, 6/100}], 1], {_Real, _}]; |
These are the points that we will use for the numerical differential equation solving.
✕
cp = ContourPlot[ f\[Alpha][x, \[Alpha]] == 0, {x, 0, 6}, {\[Alpha], 0, 1.9}, Epilog -> {Purple, Point[Reverse /@ ICs\[Alpha]x]}, FrameLabel -> {x, \[Alpha]}] |
Starting at a point (x, α), we move on a semicircle around the origin of the complex α-plane. To make sure we stay on the defining Riemann surface, we use the projection method for the numerical solution. And we change variables from α to the angle φ, where α = r e^(iφ).
✕
Monitor[rsf\[Alpha]Data = Table[With[{r = ICs\[Alpha]x[[k, 1]]}, nds\[Alpha] = NDSolveValue[{Derivative[1][z][\[CurlyPhi]] == I r Exp[I \[CurlyPhi]] rhs[z[\[CurlyPhi]], r Exp[I \[CurlyPhi]]], z[0] == ICs\[Alpha]x[[k, 2]]}, z, {\[CurlyPhi], 0 , Pi }, WorkingPrecision -> 30, PrecisionGoal -> 6, AccuracyGoal -> 6, Method -> {"Projection", Method -> "StiffnessSwitching", "Invariants" -> f\[Alpha][z[\[CurlyPhi]], r Exp[I \[CurlyPhi]]]}, MaxStepSize -> 0.01, MaxSteps -> 10^5]; {{r, nds\[Alpha][0], nds\[Alpha][Pi]}, ParametricPlot3D[ Evaluate[{r Cos[\[CurlyPhi]], r Sin[\[CurlyPhi]], #[nds\[Alpha][\[CurlyPhi]]]}], Evaluate[Flatten[{\[CurlyPhi], nds\[Alpha][[1]]}]], BoxRatios -> {1, 1, 1}, Axes -> True, PlotRange -> All, PlotStyle -> Directive[Thickness[0.002], Darker[Blue]], ColorFunctionScaling -> False, ColorFunction -> (ColorData["DarkRainbow"][ Abs[1 - #4/ Pi]] &)] & /@ {Re, Im}}], {k, Length[ICs\[Alpha]x]}] // Quiet;, Text@ Row[{"path ", k, " of ", Length[ICs\[Alpha]x]}]] |
Here is a plot of the real part of z(α). One clearly sees how the sets of initially nearby zeros split at branch points in the complex α-plane.
✕
rsfRe = Show[rsf\[Alpha]Data[[All, 2, 1]], PlotRange -> {All, {0, All}, 10 {-1, 1}}, AxesLabel -> {Re[\[Alpha]], Im[\[Alpha]], Re[z]}, ViewPoint -> {-0.512, -3.293, 0.581}] |
One can calculate the branch points numerically as the simultaneous solutions of f_α(z, α) = 0 and ∂f_α(z, α)/∂z = 0.
✕
branchPointEqs[z_, \[Alpha]_] = {f\[Alpha][z, \[Alpha]], D[f\[Alpha][z, \[Alpha]], z]} |
✕
branchPoints = Union[Round[Select[#, Total[Abs[branchPointEqs @@ #]] < 10^-10 &], 10.^-6]] &@ ( Table[{z, \[Alpha]} /. FindRoot[Evaluate[branchPointEqs[z, \[Alpha]] == {0, 0}] , {z, RandomReal[{-8, 8}] + I RandomReal[{-8, 8}] }, {\[Alpha], RandomReal[{-5, 5}] + I RandomReal[{-5, 5}] }, PrecisionGoal -> 10] // Quiet, {20000}]); |
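A scaled-down version of this random-start search can be written in Python (a hedged sketch: a two-variable complex Newton iteration on (f, ∂f/∂z) with a hand-computed Jacobian and simple step damping; convergence from any individual start is not guaranteed, so many starts are tried):

```python
import cmath, random

def F(z, a):
    """The branch-point system (f, df/dz)."""
    f  = cmath.cos(z) + cmath.cos(a * z) + cmath.cos(a * a * z)
    fz = -(cmath.sin(z) + a * cmath.sin(a * z) + a**2 * cmath.sin(a**2 * z))
    return f, fz

def jacobian(z, a):
    # partial derivatives of (f, f_z) with respect to (z, a)
    fz  = -(cmath.sin(z) + a * cmath.sin(a * z) + a**2 * cmath.sin(a**2 * z))
    fa  = -(z * cmath.sin(a * z) + 2 * a * z * cmath.sin(a**2 * z))
    fzz = -(cmath.cos(z) + a**2 * cmath.cos(a * z) + a**4 * cmath.cos(a**2 * z))
    fza = -(cmath.sin(a * z) + a * z * cmath.cos(a * z)
            + 2 * a * cmath.sin(a**2 * z) + 2 * a**3 * z * cmath.cos(a**2 * z))
    return fz, fa, fzz, fza

random.seed(1)
solutions = []
for _ in range(500):
    z = complex(random.uniform(-5, 5), random.uniform(-5, 5))
    a = complex(random.uniform(-3, 3), random.uniform(-3, 3))
    for _ in range(80):
        F1, F2 = F(z, a)
        res = abs(F1) + abs(F2)
        if res < 1e-12:
            break
        j11, j12, j21, j22 = jacobian(z, a)
        det = j11 * j22 - j12 * j21
        if abs(det) < 1e-16:
            break
        dz = (F1 * j22 - F2 * j12) / det   # Cramer's rule for J.(dz, da) = (F1, F2)
        da = (j11 * F2 - j21 * F1) / det
        t, accepted = 1.0, False           # damped Newton: backtrack until residual drops
        for _ in range(6):
            zn, an = z - t * dz, a - t * da
            if abs(zn) < 8 and abs(an) < 4.5:
                Fn1, Fn2 = F(zn, an)
                if abs(Fn1) + abs(Fn2) < res:
                    accepted = True
                    break
            t *= 0.5
        if not accepted:
            break
        z, a = zn, an
    if abs(z) < 8 and abs(a) < 4.5:
        F1, F2 = F(z, a)
        if abs(F1) + abs(F2) < 1e-10:
            solutions.append((z, a))
print(len(solutions))
```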
Here are the branch points near the origin of the complex plane.
✕
Graphics[{RGBColor[0.36, 0.51, 0.71], Point[ReIm /@ Last /@ branchPoints]}, Frame -> True, PlotRange -> 3.5, FrameLabel -> {Re[\[Alpha]], Im[\[Alpha]]}, PlotRangeClipping -> True] |
Representing the positions of the branch points in the above plot as vertical cylinders shows that the splitting indeed occurs at the branch points (we do not include the branch points with nearly real α, to have a better view of this complicated surface).
✕
Show[{rsfRe, Graphics3D[{CapForm[None], GrayLevel[0.3], Specularity[Purple, 20], Cylinder[{Append[ReIm[#], -10], Append[ReIm[#], 10]}, 0.02] & /@ Select[Last /@ branchPoints, Im[#] > 10^-2 &]}]}, PlotRange -> {{-2, 2}, {0, 2}, 8 {-1, 1}}, ViewPoint -> {-1.46, 2.87, 1.05}] |
The possible “shapes” of fGolden near points where the function has an extremum with value ≈ –1 are quite limited. The three possible curves arise from the different ways to form the sum –1 from the values ±1 of the three summands.
✕
maxMinus1Value = Select[extremasGolden, Abs[fGolden[#] - (-1)] < 10^-4 &]; Length[maxMinus1Value] |
✕
Take[maxMinus1Value, 12] |
✕
Show[Plot[fGolden[# + t], {t, 0 - 2, 0 + 5}, PlotRange -> All, PlotStyle -> Directive[RGBColor[0.36, 0.51, 0.71], Thickness[0.001], Opacity[0.3]]] & /@ maxMinus1Value, PlotRange -> All, Frame -> True, GridLines -> {{0}, {3}}, ImageSize -> 400] |
We compare these three curves with the possible curves that could be obtained from considering all possible phases between the three summands, meaning all phase-shifted sums of the three cosines whose value at the origin is –1 and whose derivative vanishes there.
✕
gGolden[x_, {\[CurlyPhi]1_, \[CurlyPhi]2_, \[CurlyPhi]3_}] := Cos[x + \[CurlyPhi]1] + Cos[GoldenRatio x + \[CurlyPhi]2] + Cos[GoldenRatio^2 x + \[CurlyPhi]3]; |
✕
phaseList = Table[ Block[{\[CurlyPhi]1 = RandomReal[{-Pi, Pi}]}, Check[Flatten[{\[CurlyPhi]1, {\[CurlyPhi]2, \[CurlyPhi]3} /. FindRoot[{Cos[\[CurlyPhi]1] + Cos[\[CurlyPhi]2] + Cos[\[CurlyPhi]3] == -1, -Sin[\[CurlyPhi]1] - GoldenRatio Sin[\[CurlyPhi]2] - GoldenRatio^2 Sin[\[CurlyPhi]3] == 0}, {{\[CurlyPhi]2, RandomReal[{-Pi, Pi}]}, {\[CurlyPhi]3, RandomReal[{-Pi, Pi}]}}]}], {}]], {600}]; // Quiet |
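The two phase conditions used in this FindRoot call can also be cross-checked outside of Mathematica. Here is a small stdlib-only Python sketch (the 2×2 Newton iteration and the helper name solve_phases are mine, not from the notebook) that, for a given φ1, solves for φ2 and φ3:

```python
import math
import random

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

def solve_phases(p1, attempts=500):
    """Given phi1, find (phi2, phi3) with
    cos(p1) + cos(p2) + cos(p3) == -1      (value -1 at x = 0) and
    sin(p1) + PHI sin(p2) + PHI^2 sin(p3) == 0  (extremum at x = 0),
    via Newton iteration from random starting points."""
    for _ in range(attempts):
        p2 = random.uniform(-math.pi, math.pi)
        p3 = random.uniform(-math.pi, math.pi)
        for _ in range(60):
            f1 = math.cos(p1) + math.cos(p2) + math.cos(p3) + 1
            f2 = math.sin(p1) + PHI * math.sin(p2) + PHI**2 * math.sin(p3)
            j11, j12 = -math.sin(p2), -math.sin(p3)
            j21, j22 = PHI * math.cos(p2), PHI**2 * math.cos(p3)
            det = j11 * j22 - j12 * j21
            if abs(det) < 1e-12:
                break  # singular Jacobian; restart from a new random point
            p2 -= (f1 * j22 - f2 * j12) / det
            p3 -= (f2 * j11 - f1 * j21) / det
        if abs(f1) + abs(f2) < 1e-10:
            return p2, p3
    raise RuntimeError("no phase solution found")

random.seed(1)
p2, p3 = solve_phases(1.0)
```

As in the Mathematica code, random restarts are used because Newton's method converges only locally for this trigonometric system.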
The next graphic shows the two curve families together.
✕
Show[ {Plot[Evaluate[ gGolden[t, #] & /@ DeleteCases[phaseList, {}]], {t, -2, 0 + 5}, PlotRange -> All, PlotStyle -> Directive[RGBColor[0.36, 0.51, 0.71], Opacity[0.3], Thickness[0.001]]], Plot[fGolden[t - #], {t, -2, 0 + 5}, PlotRange -> All, PlotStyle -> Directive[RGBColor[0.88, 0.61, 0.14], Thickness[0.003], Opacity[0.3]]] & /@ maxMinus1Value}] |
These curves now allow us to answer the original question about the location of the singularities in the distribution of the distances of successive zeros. Similar to the largest peak identified above arising from the bunching of curves with function values , the other three maxima arise from curves with local minima or maxima . We calculate some of these zeros numerically.
✕
minEnvelopeZeros = Union[Round[ Sort[\[Delta] /. Quiet[FindRoot[ fGolden[# + \[Delta]] == 0, {\[Delta], 1/2}]] & /@ Take[maxMinus1Value, 100]], 0.025]] |
Indeed, the gridlines match the singularities precisely.
✕
Histogram[Differences[zerosGolden], 1000, PlotRange -> All, GridLines -> {Flatten[{2 zerosGolden[[1]], 2 minEnvelopeZeros}], {}}] |
The unifying feature of all four singularities is their location at the zeros of envelope curves. Here are the curves of fGolden around 100 zeros for each of the four singularities.
✕
With[{L = {#[[1, 1]], Last /@ #} & /@ Reverse[SortBy[Split[Sort[Transpose[ {Round[differencesGolden, 0.002], Most[zerosGolden]}]], #1[[ 1]] === #2[[1]] &], Length]]}, Partition[Function[p, Show[ Plot[fGolden[# + t], {t, -2, 0 + 5}, PlotRange -> All, PlotLabel -> Row[{"x", "\[TildeTilde]", p}], GridLines -> {None, {-1, 3}}, PlotStyle -> Directive[RGBColor[0.36, 0.51, 0.71], Thickness[0.001], Opacity[0.3]]] & /@ Take[Select[L, Abs[#[[1]] - p] < 0.1 &, 1][[1, 2]], UpTo[100]]]] /@ {1, 1.5, 1.8, 3}, 2]] // Grid |
These graphics show strikingly the common feature of these four groups of zero distances: either maxima cluster around or minima cluster around .
The curves in the plane that fulfill the two conditions are shown in the next contour plot.
✕
ContourPlot[ Evaluate[Function[sign, Derivative[1, {0, 0, 0}][gGolden][ 0, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == 0 /. (Solve[ gGolden[0, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == sign 1, \[CurlyPhi]1] /. ConditionalExpression[x_, _] :> x /. _C :> 0)] /@ {-1, 1}], {\[CurlyPhi]2, 0, 2 Pi}, {\[CurlyPhi]3, 0, 2 Pi}, PlotPoints -> 60, ContourStyle -> RGBColor[0.36, 0.51, 0.71]] |
Now we can also calculate a numerical approximation to the exact value of the position of the first singularity. We consider the envelope of all curves with the properties , . The envelope property is represented through the vanishing partial derivative with respect to .
✕
(Table[Round[ Mod[{\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3} /. FindRoot[ Evaluate[{gGolden[ 0, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == -1, Derivative[1, {0, 0, 0}][gGolden][ 0, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == 0, Derivative[0, {1, 0, 0}][gGolden][ 0, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == 0}], {\[CurlyPhi]1, RandomReal[{-Pi, Pi}]}, {\[CurlyPhi]2, RandomReal[{-Pi, Pi}]}, {\[CurlyPhi]3, RandomReal[{-Pi, Pi}]}, PrecisionGoal -> 12] , 2 Pi] /. x_?(Abs[#] < 10^-6 || Abs[2 Pi - #] < 10.^-6 &) :> 0, 10.^-6], {100}] // Quiet // Tally) /. x_Real?(Abs[# - Pi] < 0.01 &) :> Pi |
The envelope curve of this family is a small modification of our original function—the sign of the first term is inverted.
✕
cosEqs = {gGolden[x, {Pi, Pi, 0}], gGolden[x, {0, Pi, Pi}], gGolden[x, {Pi, 0, Pi}]} |
The positions of the first roots of the last equations are , and . Multiplying these numbers by two to account for their symmetric distance from the origin yields the zero distances seen above at , and .
✕
FindRoot[# == 0, {x, 1/2}] & /@ cosEqs |
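The three phase triples behind cosEqs can also be sanity-checked independently of the notebook; a stdlib-only Python sketch confirming that each triple gives the value –1 and slope 0 at the origin:

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

def g(x, p1, p2, p3):
    """Three-term cosine sum with explicit phases (gGolden)."""
    return (math.cos(x + p1) + math.cos(PHI * x + p2)
            + math.cos(PHI**2 * x + p3))

def dg(x, p1, p2, p3):
    """x-derivative of g."""
    return (-math.sin(x + p1) - PHI * math.sin(PHI * x + p2)
            - PHI**2 * math.sin(PHI**2 * x + p3))

# the three phase triples of cosEqs
triples = [(math.pi, math.pi, 0), (0, math.pi, math.pi), (math.pi, 0, math.pi)]
for tr in triples:
    assert abs(g(0, *tr) + 1) < 1e-12   # value -1 at x = 0
    assert abs(dg(0, *tr)) < 1e-12      # extremum at x = 0
```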
And, using the higher-precision root, we zoom into the neighborhood of the first peak to confirm our conjecture.
✕
With[{z0 = 2 x /. FindRoot[ Evaluate[Cos[x] - Cos[GoldenRatio x] - Cos[GoldenRatio^2 x] == 0], {x, 1/2}, PrecisionGoal -> 12]}, Histogram[ Select[Differences[ zerosGoldenRefined], -0.0003 < z0 - # < 0.0004 &] - z0, 100, "PDF", GridLines -> {{0}, {}}]] |
The envelope arises from the fact that a small variation in can be compensated by appropriate changes in and .
✕
((-\[Alpha] + \[Beta]*Sqrt[\[Alpha]^2 + \[Beta]^2 - 1])*Sin[x*\[Alpha]])/(\[Alpha]^2 + \[Beta]^2) + ((-\[Beta] - \[Alpha]*Sqrt[\[Alpha]^2 + \[Beta]^2 - 1])*Sin[x*\[Beta]])/ (\[Alpha]^2 + \[Beta]^2) // FullSimplify |
✕
((-\[Alpha] + \[Beta] Sqrt[\[Alpha]^2 + \[Beta]^2 - 1]) Sin[x \[Alpha]] - (\[Beta] + \[Alpha] Sqrt[\[Alpha]^2 + \[Beta]^2 - 1]) Sin[x \[Beta]])/(\[Alpha]^2 + \[Beta]^2) |
✕
Manipulate[ Plot[Evaluate[{-Cos[x] + Cos[\[Alpha] x] + Cos[\[Beta] x], -Cos[x] + Cos[x \[Alpha]] + Cos[x \[Beta]] + (((-\[Alpha] + \[Beta] Sqrt[\[Alpha]^2 + \[Beta]^2 \ - 1]) Sin[x \[Alpha]] - (\[Beta] + \[Alpha] Sqrt[\[Alpha]^2 + \[Beta]^2 - 1]) Sin[ x \[Beta]])/(\[Alpha]^2 + \[Beta]^2)) \ \[CurlyPhi]1MinusPi} /. { \[Alpha] -> GoldenRatio, \[Beta] -> GoldenRatio^2} /. \[CurlyPhi]1 -> Pi + 0.1], {x, 0, 1}], {{\[CurlyPhi]1MinusPi, 0.3, HoldForm[\[CurlyPhi]1 - Pi]}, -0.5, 0.5, Appearance -> "Labeled"}, TrackedSymbols :> True] |
These zeros of the equation do indeed match the position of the above singularities in the distribution of the zero distances.
We end this section by noting that the envelope condition, meaning the vanishing of derivatives with respect to the phases, is a very convenient method for finding special families of curves. For instance, there are families of solutions such that many curves meet at a zero.
✕
findSolutions[eqs_] := Module[{fm}, While[fm = FindRoot[Evaluate[eqs], {\[CurlyPhi]1, RandomReal[{0, 2 Pi}]}, {\[CurlyPhi]2, RandomReal[{0, 2 Pi}]}, {\[CurlyPhi]3, RandomReal[{0, 2 Pi}]}, {d, RandomReal[{1/2, 5}]}] // Quiet; Not[ Total[Abs[(Subtract @@@ eqs) /. fm]] < 10^-6 \[And] 10^-3 < Abs[d /. fm] < 5] ] ; fm /. (\[CurlyPhi] : (\[CurlyPhi]1 | \[CurlyPhi]2 | \[CurlyPhi]3) \ -> \[Xi]_) :> (\[CurlyPhi] -> Mod[\[Xi], 2 Pi])] |
Modulo reflection symmetry, there is a unique solution to this problem.
✕
SeedRandom[1]; fs1 = findSolutions[{gGolden[ 0, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == 0, Derivative[0, {1, 0, 0}][gGolden][ 0, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == 0, gGolden[d, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == 0, Derivative[0, {1, 0, 0}][gGolden][ d, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == 0}] |
Plotting a family of curves with different values of shows that at the two zeros, the family of curves bunches up.
✕
Plot[Evaluate[ Table[gGolden[ x, {\[CurlyPhi]1 + \[Delta]\[CurlyPhi]1, \[CurlyPhi]2, \ \[CurlyPhi]3}], {\[Delta]\[CurlyPhi]1, -0.2, 0.2, 0.4/6}] /. fs1], {x, -3, 6}, GridLines -> ({{d}, {}} /. fs1)] |
And here are some solutions that show curves that bunch at a zero and at an extremum.
✕
Function[sr, SeedRandom[sr]; fs2 = findSolutions[{gGolden[ 0, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == 0, Derivative[0, {1, 0, 0}][gGolden][ 0, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == 0, Derivative[1, {0, 0, 0}][gGolden][ d, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == 0, Derivative[0, {1, 0, 0}][gGolden][ d, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == 0}]; Plot[Evaluate[ Table[gGolden[ x, {\[CurlyPhi]1 + \[Delta]\[CurlyPhi]1, \[CurlyPhi]2, \ \[CurlyPhi]3}], {\[Delta]\[CurlyPhi]1, -0.2, 0.2, 0.4/6}] /. fs2], {x, -3, 6}, GridLines -> ({{d}, {}} /. fs2)]] /@ {1, 6, 38} |
We also look at a second family of envelope solutions: what are the most general types of curves that fulfill the envelope condition at an extremum? This means we have to simultaneously fulfill the following two equations.
✕
{Derivative[1, {0, 0, 0}][gGolden][ 0, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == 0, Derivative[0, {1, 0, 0}][gGolden][ 0, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == 0} |
This in turn means we must have Sin[\[CurlyPhi]1] == 0 and GoldenRatio Sin[\[CurlyPhi]2] + GoldenRatio^2 Sin[\[CurlyPhi]3] == 0.
✕
cpExtEnv = ContourPlot[-GoldenRatio Sin[\[CurlyPhi]2] - GoldenRatio^2 Sin[\[CurlyPhi]3] == 0, {\[CurlyPhi]2, 0, 2 Pi}, {\[CurlyPhi]3, 0, 2 Pi}, PlotPoints -> 40] |
Here is a contour plot of the curves in the plane where the extremum envelope condition is fulfilled.
✕
linesExtEnv = Cases[Normal[cpExtEnv], _Line, \[Infinity]]; |
✕
Show[ Plot[ Cos[x + 0] + Cos[GoldenRatio x + #1] + Cos[GoldenRatio^2 x + #2], {x, -5, 5}, PlotStyle -> Directive[Thickness[0.001], RGBColor[0.36, 0.51, 0.71], Opacity[0.5]], PlotRange -> All, GridLines -> {{}, {-3, -1, 1, 3}}] & @@@ \ RandomSample[Level[linesExtEnv, {-2}], UpTo[300]]] |
We add a mouseover to the contour plot that shows a family of curves close to fulfilling the envelope conditions.
✕
make\[CurlyPhi]2\[CurlyPhi]3Plot[{\[CurlyPhi]2_, \[CurlyPhi]3_}] := Column[{Plot[ Evaluate[ Table[gGolden[ x, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}], {\[CurlyPhi]1, \ -0.15, 0.15, 0.3/6}]], {x, -4, 4}, PlotLabel -> Subscript[\[CurlyPhi], 1] \[TildeTilde] 0], Plot[ Evaluate[ Table[gGolden[ x, {Pi + \[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}], {\ \[CurlyPhi]1, -0.15, 0.15, 0.3/6}]], {x, -4, 4}, PlotLabel -> Subscript[\[CurlyPhi], 1] \[TildeTilde] Pi]}, Dividers -> Center] |
✕
Normal[cpExtEnv] /. Line[l_] :> (Tooltip[Line[ #], Dynamic[make\[CurlyPhi]2\[CurlyPhi]3Plot[Mean[#]]]] & /@ Partition[l, 2, 1] ) |
And the next graphic shows the distance from the origin to the nearest positive or negative root.
✕
addHeight[{\[CurlyPhi]2_, \[CurlyPhi]3_}, pm_] := Module[{roots, \[Rho] }, roots = Sort[x /. Solve[gGolden[x, {0, \[CurlyPhi]2, \[CurlyPhi]3}] == 0 \[And] -6 < x < 6, x]] // Quiet; \[Rho] = If[pm === 1, Select[roots, # > 0 &, 1], -Select[Sort[-roots], # > 0 &, 1] ][[ 1]]; {{\[CurlyPhi]2, \[CurlyPhi]3, 0}, {\[CurlyPhi]2, \[CurlyPhi]3, \[Rho]}}] |
✕
Graphics3D[{Opacity[0.2], Gray, Polygon[{{0, 0, 0}, {2 Pi, 0, 0}, {2 Pi, 2 Pi, 0}, {0, 2 Pi, 0}}], EdgeForm[], Opacity[1], Transpose[{{RGBColor[0.88, 0.61, 0.14], RGBColor[0.36, 0.51, 0.71]}, Table[ Polygon[Join[#1, Reverse[#2]] & @@@ Partition[addHeight[#, s] & /@ #[[1]], 2, 1]] & /@ linesExtEnv, {s, {1, -1}}]}]}, PlotRange -> All, Axes -> True, Lighting -> "Neutral", AxesLabel -> {Subscript[\[CurlyPhi], 2], Subscript[\[CurlyPhi], 3], \!\(\*SubscriptBox[\(\[Rho]\), \("\<\[PlusMinus]\>"\)]\)} ] |
Now I can implement the following function, maximaPositions, to find the singularities of the distribution of the distances of successive zeros for functions of the form a1 cos(x) + a2 cos(α x) + a3 cos(β x) for arbitrary real coefficients.
✕
maximaPositions[ a1_. Cos[x_] + a2_. Cos[\[Alpha]_ x_] + a3_. Cos[\[Beta]_ x_], x_] := 2*(Function[{s1, s2, s3}, Min[N[ x /. Solve[ s1 a1 Cos[x] + s2 a2 Cos[\[Alpha] x] + s3 a3 Cos[\[Beta] x] == 0 \[And] 0 < x < 10, x]]]] @@@ {{1, 1, 1}, {-1, 1, 1}, {1, -1, 1}, {1, 1, -1}}) |
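The same four numbers can be reproduced independently with a stdlib-only Python sketch (a scan-and-bisect root finder; the helper names are mine), which serves as a sanity check on the Solve-based implementation:

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

def smallest_positive_root(f, hi=10.0, n=20000):
    """First positive root of f via a sign-change scan plus bisection."""
    a, fa = 1e-9, f(1e-9)
    for k in range(1, n + 1):
        b = hi * k / n
        fb = f(b)
        if fa * fb <= 0:
            for _ in range(80):            # bisect down to double precision
                m = 0.5 * (a + b)
                if fa * f(m) <= 0:
                    b = m
                else:
                    a, fa = m, f(m)
            return 0.5 * (a + b)
        a, fa = b, fb
    raise ValueError("no sign change found below hi")

# twice the first positive root of each sign-flipped cosine sum
signs = [(1, 1, 1), (-1, 1, 1), (1, -1, 1), (1, 1, -1)]
peaks = [2 * smallest_positive_root(
             lambda x, s=s: (s[0] * math.cos(x) + s[1] * math.cos(PHI * x)
                             + s[2] * math.cos(PHI**2 * x)))
         for s in signs]
# peaks ≈ [1.81362, 1.04397, 1.49906, 3.01144]
```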
Here are the positions of the peaks for .
✕
peaksGolden = maximaPositions[fGolden[x], x] |
Zooming into the histogram shows again that the predicted peak positions match the observed ones well.
✕
Partition[Histogram[Differences[zerosGolden], 2000, PlotRange -> {{# - 0.02, # + 0.02}, All}, GridLines -> {peaksGolden, {}}, PlotRangeClipping -> True] & /@ peaksGolden, 2] // Grid |
Here is a quick numerical/graphical check for a “random” function of the prescribed form with coefficients not identically one.
✕
testFunction[x_] := Cos[x] + 2 Cos[Sqrt[2] x] + Sqrt[3] Cos[Pi x] |
✕
zerosTestFunction = findZeros[testFunction, 10^5]; |
✕
singPosFunction = maximaPositions[testFunction[x], x] |
✕
Histogram[Differences[zerosTestFunction], 100, GridLines -> {singPosFunction, {}}] |
We should check whether the conjectured formula for the positions of the singularities really holds over an extended range of parameters. So for a few hundred values of α in [1/2, 2], we calculate the histograms of the zero distances. This calculation will take a few hours. (In the interactive version, the cell is set to unevaluatable.)
✕
With[{p\[Alpha] = 200}, Monitor[ \[Alpha]Data = SparseArray[ Flatten[Table[({j, Round[p\[Alpha] #1] + 1} -> #2) & @@@ #[[ j]][[2]], {j, Length[#]}], 1]] & @ Table[ f\[Alpha][x_] := Cos[x] + Cos[\[Alpha] x] + Cos[\[Alpha]^2 x]; zList = findZeros[f\[Alpha], 20000, 0]; {\[Alpha], Tally[Round[Differences[zList], N[1/p\[Alpha]]]]}, {\[Alpha], 1/2, 2, 1/(120 Pi)}], Row[{"\[Alpha]=", N[\[Alpha]]}]]] |
✕
rPlot\[Alpha] = ReliefPlot[Log[1 + Take[\[Alpha]Data, {2, -1}, {1, 1200}]], DataRange -> {{0, 1200 0.005}, {1/2, 2}}, FrameTicks -> True, FrameLabel -> {"\[CapitalDelta] zeros", "\[Alpha]"}, AspectRatio -> 1/3] |
I overlay the relief plot with the predicted positions of the singularities; they match perfectly.
✕
Monitor[\[Alpha]Sings = Table[{#, \[Alpha]} & /@ maximaPositions[N[Cos[x] + Cos[\[Alpha] x] + Cos[\[Alpha]^2 x]], x], {\[Alpha], 0.5, 2, 1/5/201}];, \[Alpha]] |
Now, what do the positions of the singularities in the case of two parameters , in look like? Plotting the locations of a few thousand singularity positions in space clearly shows four intersecting surfaces.
✕
singGraphics3D = Graphics3D[{RGBColor[0.36, 0.51, 0.71], Sphere[Flatten[ Table[{\[Alpha], \[Beta], #} & /@ Quiet[maximaPositions[ N[Cos[x] + Cos[\[Alpha] x] + Cos[\[Beta] x]], x]], {\[Beta], 1/2 + Pi/1000, 5/2 + Pi/1000, 2/31}, {\[Alpha], 1/2 + Pi/1001, 5/2 + Pi/1001, 2/31}], 2], 0.015]}, PlotRange -> {{0.5, 2.5}, {0.5, 2.5}, {0, 6}}, BoxRatios -> {1, 1, 1}, Axes -> True, ViewPoint -> {2.83, 1.76, -0.61}, Method -> {"SpherePoints" -> 6}] |
Instead of calculating many concrete tuples of positions, we can consider the positions of the singularities as functions of , in . Differentiating the last expression gives a partial differential equation for each of the four sheets that, given boundary conditions, we can solve numerically and then plot. Or, instead of solving a partial differential equation, we can fix one of the two parameters and obtain a family of parametrized ordinary differential equations. We carry out this procedure for the two sheets that don’t cross each other.
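Concretely, writing the tracked smallest root as x(α) with β held fixed (the singularity position is then 2x(α)), implicit differentiation of the defining equation gives the ordinary differential equation behind the NDSolveValue calls below; a sketch of the derivation:

```latex
% implicit differentiation of \cos x(\alpha) + \cos(\alpha\,x(\alpha)) + \cos(\beta\,x(\alpha)) = 0
-\sin(x)\,x'(\alpha) \;-\; \sin(\alpha x)\,\bigl(x + \alpha\,x'(\alpha)\bigr) \;-\; \beta\,\sin(\beta x)\,x'(\alpha) \;=\; 0
\quad\Longrightarrow\quad
x'(\alpha) \;=\; \frac{-\,x\,\sin(\alpha x)}{\sin x + \alpha\,\sin(\alpha x) + \beta\,\sin(\beta x)} .
```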
✕
smallestRoot[{pm1_, pm2_, pm3_}, \[Alpha]_, \[Beta]_] := Min@Select[DeleteCases[Table[ Quiet@ Check[x /. FindRoot[ pm1 Cos[x] + pm2 Cos[\[Alpha] x] + pm3 Cos[\[Beta] x] == 0, {x, RandomReal[{0, Pi}]}, PrecisionGoal -> 12], {}], {40}], {}], # > 0 &] |
✕
makeGrid[{pm1_, pm2_, pm3_}, M_] := Show[Cases[Flatten[ {Table[ nds\[Alpha] = NDSolveValue[{D[ Cos[x[\[Alpha]]] + Cos[\[Alpha] x[\[Alpha]]] + Cos[\[Beta] x[\[Alpha]]], \[Alpha]] == 0, x[E] == smallestRoot[{pm1, pm2, pm3}, E, \[Beta]]}, x, {\[Alpha], 3/4, 3}, PrecisionGoal -> 12]; ParametricPlot3D[{\[Alpha], \[Beta], 2 nds\[Alpha][\[Alpha]]}, Evaluate[Flatten[{\[Alpha], nds\[Alpha][[1]]}]], PlotRange -> All, PlotStyle -> Gray], {\[Beta], 3/4 + Pi/1000, 3 + Pi/1000, 2/M}], Table[ nds\[Beta] = NDSolveValue[{D[ Cos[x[\[Beta]]] + Cos[\[Alpha] x[\[Beta]]] + Cos[\[Beta] x[\[Beta]]], \[Beta]] == 0, x[E] == smallestRoot[{pm1, pm2, pm3}, \[Alpha], E]}, x, {\[Beta], 3/4, 3}, PrecisionGoal -> 12]; ParametricPlot3D[{\[Alpha], \[Beta], 2 nds\[Beta][\[Beta]]}, Evaluate[Flatten[{\[Beta], nds\[Beta][[1]]}]], PlotRange -> All, PlotStyle -> Directive[RGBColor[0.88, 0.61, 0.14], Thickness[0.002]]], {\[Alpha], 3/4 + Pi/1000, 3 + Pi/1000, 2/M}] }], _Graphics3D], PlotRange -> All, AxesLabel -> {"\[Alpha]", "\[Beta]", None}] // Quiet |
✕
Show[{singGraphics3D, makeGrid[{1, 1, 1}, 20], makeGrid[{-1, 1, 1}, 20]}] |
The other two sheets could also be calculated, but things would become a bit more complicated because the initial conditions calculated with smallestRoot are no longer continuous functions of and . The following graphic visualizes this situation when new zeros suddenly appear that didn’t exist for the blue curve due to a small change in .
✕
Module[{\[Alpha] = 2.02, \[Beta] = 2.704}, Plot[{Cos[x] - Cos[\[Alpha] x] + Cos[\[Beta] x], Cos[x] - Cos[(\[Alpha] - 0.1) x] + Cos[\[Beta] x]}, {x, 0, 3}]] |
Interestingly, one can generalize the above formula for the peaks in the distance between successive zeros even further and allow arbitrary phases in the three cosine functions. Numerical experiments indicate that for many such cosine sums, one can just ignore the phases and find the smallest zeros of exactly the same equations as above.
✕
maximaPositionsGeneralized[ a1_. Cos[x_ + \[CurlyPhi]1_.] + a2_. Cos[\[Alpha]_ x_ + \[CurlyPhi]2_.] + a3_. Cos[\[Beta]_ x_ + \[CurlyPhi]3_.], x_] := maximaPositions[ a1 Cos[x] + a2 Cos[\[Alpha] x] + a3 Cos[\[Beta] x], x] |
Here is another random example.
✕
fRandom[x_] := 1/2 Cos[x + 1] + Sqrt[2] Cos[GoldenRatio x + Pi/5] + Sqrt[3] Cos[Pi x + 2] zerosRandom = findZeros[fRandom, 10^5]; |
✕
singPosFunction = maximaPositionsGeneralized[fRandom[x], x] |
✕
Histogram[Differences[zerosRandom], 100, GridLines -> {singPosFunction, {}}] |
I modify the phases in the above function fRandom, and obtain the same position for the singularities.
✕
fRandom2[x_] := 1/2 Cos[x + Cos[3]] + Sqrt[2] Cos[GoldenRatio x + Log[2]] + Sqrt[3] Cos[Pi x + E] zerosRandom2 = findZeros[fRandom2, 10^5]; |
✕
Histogram[Differences[zerosRandom2], 100, GridLines -> {singPosFunction, {}}] |
In degenerate cases, such as all phases being , meaning , the positions of the peaks in the distribution of the zero distances will sometimes differ from the just-given conjecture. This means that, e.g., the peak positions in the distribution of the spacings of the extrema of the sine equivalent of are not described by the above equations.
In the last section, we established the positions of the peaks of the original MathOverflow post as twice the smallest zeros of . And for many random instances of and , these four numbers indeed characterize the peaks visible in the distribution of successive zero distances of . We saw this clearly in the plots above that spanned various parameter ranges for and .
Earlier, I remarked that the original MathOverflow plot that used the function has some features special to this function, like the gaps in the distribution of the zero distances. At the same time, the function is generic in the sense that the zero-spacing distribution has exactly four peaks, as we expect for generic and : the four red curves in the above graphic for rPlotα. We consider a slight modification of fGolden where instead of , we use .
✕
\[Phi]Brass = (1 + Sqrt[3])/2; fBrass[x_] := Cos[x] + Cos[\[Phi]Brass x] + Cos[\[Phi]Brass^2 x] |
We again calculate the first 100k zeros and their distances.
✕
zerosBrass = findZeros[fBrass, 10^5]; |
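The function findZeros was defined earlier in this blog; outside of Mathematica, a minimal Python stand-in (sign-change scan plus bisection; a simplified sketch that, unlike the notebook's implementation, misses tangential double zeros) lets one reproduce the zero spacings:

```python
import math

PHI_BRASS = (1 + math.sqrt(3)) / 2

def f_brass(x):
    return (math.cos(x) + math.cos(PHI_BRASS * x)
            + math.cos(PHI_BRASS**2 * x))

def find_zeros(f, n, step=0.01):
    """First n positive zeros via sign-change scan plus bisection.
    A sketch only: zeros where f touches 0 without a sign change are missed."""
    zeros, a, fa = [], step, f(step)
    while len(zeros) < n:
        b = a + step
        fb = f(b)
        if fa * fb < 0:
            lo, hi, flo = a, b, fa
            for _ in range(60):
                m = 0.5 * (lo + hi)
                if flo * f(m) <= 0:
                    hi = m
                else:
                    lo, flo = m, f(m)
            zeros.append(0.5 * (lo + hi))
        a, fa = b, fb
    return zeros

zeros_brass = find_zeros(f_brass, 1000)
diffs = [b - a for a, b in zip(zeros_brass, zeros_brass[1:])]
```

Binning diffs then gives the same six-peaked histogram as below, just with fewer zeros.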
✕
differencesBrass = Differences[zerosBrass]; |
Interestingly, we now get a distribution with six maxima. Four of the maxima are again well described by the peak positions conjectured above.
✕
peaksBrass = maximaPositions[fBrass[x], x]; |
✕
Histogram[differencesBrass, 1000, GridLines -> {peaksBrass, {}}] |
The square root term makes all these examples special. The function ultimately depends on only two, rather than three, algebraically independent terms.
✕
Cos[x] + Cos[(1 + Sqrt[n])/2 x] + Cos[((1 + Sqrt[n])/2)^2 x] // ExpandAll // TrigExpand |
✕
Collect[Subtract @@ Eliminate[{fSqrtN == %, Cos[x/4]^2 + Sin[x/4]^2 == 1, Cos[Sqrt[n] x/4]^2 + Sin[Sqrt[n] x/4]^2 == 1}, {Sin[x/4], Sin[Sqrt[n] x/4]}] , fP, Factor] |
This is reflected in the fact that the triples , similar to the above equivalent for the golden ratio, are located on relatively simple 2D surfaces.
✕
Show[{ContourPlot3D[(-1 + x + 2 y^2)^2 + 4 (-1 + x - 2 x y^2) z^2 + 4 z^4 == 0, {x, -1, 1}, {y, -1, 1}, {z, -1, 1}, PlotPoints -> 40, Mesh -> None, ContourStyle -> Opacity[0.5]], Graphics3D[{PointSize[0.005], Point[ Table[{Cos[x] , Cos[\[Phi]Brass x] , Cos[\[Phi]Brass^2 x]}, {x, 1., 10000., 1.}]]}]}] |
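That the triples really satisfy this quartic surface equation is easy to confirm numerically; here is a stdlib-only Python spot check (independent of the notebook), which works because the brass ratio satisfies φ² = φ + 1/2:

```python
import math

phi = (1 + math.sqrt(3)) / 2  # "brass" ratio; note phi**2 == phi + 1/2

def surface(x, y, z):
    """Left-hand side of the implicit surface from the ContourPlot3D."""
    return ((-1 + x + 2 * y**2)**2
            + 4 * (-1 + x - 2 * x * y**2) * z**2
            + 4 * z**4)

# sample the curve t -> (cos t, cos(phi t), cos(phi^2 t))
for k in range(1, 500):
    t = 0.137 * k  # arbitrary sample points
    assert abs(surface(math.cos(t), math.cos(phi * t),
                       math.cos(phi**2 * t))) < 1e-9
```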
A tempting and easy generalization for the positions of the maxima might be the zero distances that follow from all roots of the previously calculated four equations, not just their smallest ones. This generalization is easy to implement.
✕
rootDistances[ a1_. Cos[x_] + a2_. Cos[\[Alpha]_ x_] + a3_. Cos[\[Beta]_ x_], x_, xMax_] := Function[{s1, s2, s3}, DeleteDuplicates[Differences[ Sort[ N[x /. Solve[s1 a1 Cos[x] + s2 a2 Cos[\[Alpha] x] + s3 a3 Cos[\[Beta] x] == 0 \[And] -xMax < x < xMax, x]]]]]] @@@ {{1, 1, 1}, {-1, 1, 1}, {1, -1, 1}, {1, 1, -1}} |
Interestingly, that gives nearly the correct positions, but not quite.
✕
rdBrass = rootDistances[fBrass[x], x, 12] |
✕
Function[c, Histogram[Select[differencesBrass, c - 0.005 < # < c + 0.005 &], 100, GridLines -> {Flatten[rdBrass], {}}, PlotRangeClipping -> True]] /@ {1.0977, 2.366} |
Where are these additional maxima? We will not calculate their exact positions here. But a quick look at the neighborhood of zeros that have the peak distances shows clearly that there are additional families of envelope curves involved. Interestingly again, families with occur, and this time reflected and translated versions of the curve arise.
✕
With[{L = {#[[1, 1]], Last /@ #} & /@ Reverse[SortBy[Split[Sort[Transpose[ {Round[differencesBrass, 0.001], Most[zerosBrass]}]], #1[[ 1]] === #2[[1]] &], Length]]}, Function[p, Show[Plot[fBrass[# + t], {t, -10, 0 + 10}, PlotRange -> All, PlotLabel -> Row[{"x", "\[TildeTilde]", p}], GridLines -> {None, {-1, 1, 3}}, PlotStyle -> Directive[RGBColor[0.36, 0.51, 0.71], Thickness[0.001], Opacity[0.5]]] & /@ Take[Select[L, Abs[#[[1]] - p] < 0.1 &, 1][[ 1, 2]], UpTo[50]]]] /@ {1.098, 2.367}] /. l_Line :> Mouseover[l, {Red, Thick, l}] |
We will not try to interpret the family of curves that form these singularities here; in the next section, we will develop some general methods to identify the positions of the singularities in the distribution of the zero distances.
For larger square roots, even stranger zero-spacing distributions with even more maxima can occur. Here is the case (1 + Sqrt[13])/2:
✕
\[Phi]Platinum = (1 + Sqrt[13])/2; fPlatinum[x_] := Cos[x] + Cos[\[Phi]Platinum x] + Cos[\[Phi]Platinum^2 x] zerosPlatinum = findZeros[fPlatinum, 10^5]; |
The zeros mod 2π have multiple gaps.
✕
Histogram[Mod[zerosPlatinum, 2 Pi], 1000] |
✕
PolarPlot[1 + fPlatinum[t]^2/6, {t, 0, 200 2 Pi}, PlotStyle -> Directive[Opacity[0.4], RGBColor[0.36, 0.51, 0.71], Thickness[0.001]], PlotPoints -> 1000, Prolog -> {RGBColor[0.88, 0.61, 0.14], Disk[]}] |
✕
differencesPlatinum = Differences[zerosPlatinum]; peaksPlatinum = maximaPositions[fPlatinum[x], x]; Histogram[differencesPlatinum, 1000, GridLines -> {peaksPlatinum, {}}] |
We end this section with the square root of 11, which has an even more complicated zero-spacing distribution.
✕
\[Phi]Rhodium = (1 + Sqrt[11]); fRhodium[x_] := Cos[x] + Cos[\[Phi]Rhodium x] + Cos[\[Phi]Rhodium^2 x] zerosRhodium = findZeros[fRhodium, 10^5]; peaksRhodium = maximaPositions[fRhodium[x], x] Histogram[Differences[zerosRhodium], 1000, GridLines -> {peaksRhodium, {}}, PlotRange -> All] |
So far, we have looked at the distribution of the distances between successive zeros. In this section, we will generalize to the distribution of distances between arbitrary pairs of zeros. While this seems like a more general problem, the equations describing the possible distances will easily allow us to determine the peak positions for successive zero distances.
If we consider not only the distance between consecutive zeros but also between more distant zeros, we get a more complicated distribution of zero distances.
✕
hgd = Histogram[ zeroDistancesAll = Select[(zerosGolden[[#2]] - zerosGolden[[#1]]) & @@@ Sort[Flatten[Table[{k, k + j}, {k, 20000}, {j, 1, 50}], 1]], # < 20 &], {0.001}] |
All of the peaks are sums of the four peaks seen in the distances between consecutive zeros. But the reverse is not true—not all sums of consecutive zero distances show up as peaks. We identify the sums that are observed.
✕
observedPeaks = Sort[First /@ Select[Tally[ Round[zeroDistancesAll, 0.001]], Last[#] > 50 &]]; |
✕
calculatedPeaks = Flatten[Table[Union[{Total[First /@ #], Sort[Last /@ #]} & /@ Tuples[ Transpose[{{1.8136187, 1.04397, 1.49906, 3.01144}, Range[4]}], {j}]], {j, 1, 10}], 1]; |
✕
nf = Nearest[(#[[1]] -> #) & /@ calculatedPeaks]; |
✕
peakSums = DeleteDuplicates[Sort[nf[#][[1]] & /@ observedPeaks]] |
Here is the left half of the last histogram shown together with the peak sum positions.
✕
Show[{hgd, ListPlot[Callout[ {#1, 20 + 12 #}, Row[#2]] & @@@ peakSums, Filling -> Axis, PlotStyle -> PointSize[0.008]]}] |
The variable ranges from 0 to infinity in and so is not well suited as a parametrization variable for showing large ranges. The three expressions , and are all in the interval . Above, we plotted the summands in these cosine terms; now we plot the triples , where is the distance between the ith and the jth zero. This gives some interesting-looking curves.
✕
zeroPairIndices = Sort[Flatten[Table[{k, k + j}, {k, 25000}, {j, 1, 30}], 1]]; |
✕
Graphics3D[{PointSize[0.001], RGBColor[0.36, 0.51, 0.71], Point[ {Cos[GoldenRatio zerosGolden[[#1]]], Cos[GoldenRatio^2 zerosGolden[[#1]]], zerosGolden[[#2]] - zerosGolden[[#1]]} & @@@ zeroPairIndices]}, BoxRatios -> {1, 1, 2}, Axes -> True, PlotRange -> {{-1, 1}, {-1, 1}, {0, 8}}] |
In the case of , we just get a curve; in the case of general and , we obtain an intricate-looking surface—for instance, for .
✕
zerosEPi = findZeros[Function[x, Cos[x] + Cos[E x] + Cos[Pi x]], 4 10^4]; |
✕
Graphics3D[{PointSize[0.001], RGBColor[0.36, 0.51, 0.71], Point[ {Cos[ E zerosEPi[[#1]]], Cos[Pi zerosEPi[[#1]]], zerosEPi[[#2]] - zerosEPi[[#1]]} & @@@ zeroPairIndices]}, BoxRatios -> {1, 1, 2}, Axes -> True, PlotRange -> {{-1, 1}, {-1, 1}, {0, 8}}, ViewPoint -> {1.37, -3.08, 0.28}, AxesLabel -> {Cos[\[Alpha] Subscript[z, i]], Cos[\[Beta] Subscript[z, i]], Subscript[z, j] - Subscript[z, i]}] |
Now, can we find a closed form of this surface? It turns out we can. For generic and , there will be no algebraic relation between and , so these two expressions are natural independent variables. Assume that is a zero of and that another zero has distance to it. Then the vanishing of the sum of the three cosines implies a relation between , and . We can calculate this relation by eliminating the unwanted variables from and .
✕
f\[Alpha]\[Beta][x_] := Cos[x] + Cos[\[Alpha] x] + Cos[\[Beta] x] |
✕
f\[Alpha]\[Beta][x0 + \[Delta]] // ExpandAll // TrigExpand |
✕
c\[Alpha]c\[Beta]\[Delta]Equation = GroebnerBasis[ {f\[Alpha]\[Beta][x0], f\[Alpha]\[Beta][x0 + \[Delta]] // ExpandAll // TrigExpand, (* algebraic identities for Cos[...], Sin[...] *) Cos[x0]^2 + Sin[x0]^2 - 1, Cos[\[Alpha] x0]^2 + Sin[\[Alpha] x0]^2 - 1, Cos[\[Beta] x0]^2 + Sin[\[Beta] x0]^2 - 1}, {}, {Cos[x0], Sin[x0], Sin[\[Alpha] x0], Sin[\[Beta] x0]}, MonomialOrder -> EliminationOrder][[ 1]] /. {Cos[x0 \[Alpha]] -> c\[Alpha], Cos[x0 \[Beta]] -> c\[Beta]} // Factor; |
The resulting polynomial (in , ) is pretty large, with more than 2,500 terms.
✕
{Exponent[c\[Alpha]c\[Beta]\[Delta]Equation, {c\[Alpha], c\[Beta]}], Length[c\[Alpha]c\[Beta]\[Delta]Equation]} |
Here is a snippet of the resulting equation.
✕
Short[c\[Alpha]c\[Beta]\[Delta]Equation, 8] |
The displayed curve and surface are special cases of this equation. Because of its size, we compile the equation for faster plotting.
✕
cf\[Alpha]\[Beta] = Compile[{\[Alpha], \[Beta], c\[Alpha], c\[Beta], \[Delta]}, Evaluate[c\[Alpha]c\[Beta]\[Delta]Equation], CompilationOptions -> {"ExpressionOptimization" -> True}] |
We cut part of the surface open to get a better look at the inside of it. (Because of the complicated nature of the surface, we plot over a smaller vertical range compared to the above plot that used points for the zeros.)
✕
Module[{pp = 160, \[CurlyEpsilon] = 10^-1, data}, Monitor[ data = Table[ cf\[Alpha]\[Beta][GoldenRatio, GoldenRatio^2, c\[Alpha], c\[Beta], \[Delta]], {\[Delta], \[CurlyEpsilon], 4, (4 - \[CurlyEpsilon])/pp}, {c\[Beta], -1, 1, 2/pp}, {c\[Alpha], -1, 1, 2/pp}];, N[\[Delta]]]; ListContourPlot3D[data, DataRange -> {{-1, 1}, {-1, 1}, {\[CurlyEpsilon], 4}}, Contours -> {0}, RegionFunction -> (Not[#1 > 0 \[And] #2 < 0] &), MeshFunctions -> {Norm[{#1, #2}] &, #3 &}, BoundaryStyle -> Directive[RGBColor[0.36, 0.51, 0.71], Thickness[0.004]], ViewPoint -> {2.49, -2.22, 0.53}, BoxRatios -> {1, 1, 2}, AxesLabel -> {Cos[\[Alpha] Subscript[z, i]], Cos[\[Beta] Subscript[z, i]], Subscript[z, j] - Subscript[z, i]}]] |
In the case of and , we have the above algebraic equation of the Cayley surface at our disposal to remove the term.
✕
auxEq = Cos[x0]^2 + Cos[\[Alpha] x0]^2 + Cos[\[Beta] x0]^2 - (1 + 2 Cos[x0] Cos[\[Alpha] x0] Cos[\[Beta] x0]) /. Cos[x0] -> -Cos[\[Alpha] x0] - Cos[\[Beta] x0] /. {Cos[ x0 \[Alpha]] -> c\[Alpha], Cos[x0 \[Beta]] -> c\[Beta]} |
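The underlying Cayley relation holds identically along the whole curve (t, αt, βt) because GoldenRatio² = GoldenRatio + 1, so the third argument is the sum of the first two; a quick stdlib-only Python consistency check (again a cross-check, not notebook code):

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # PHI**2 == PHI + 1

def cayley(u, v, w):
    """cos^2 u + cos^2 v + cos^2 w - 1 - 2 cos u cos v cos w;
    identically zero whenever w = u + v."""
    cu, cv, cw = math.cos(u), math.cos(v), math.cos(w)
    return cu * cu + cv * cv + cw * cw - 1 - 2 * cu * cv * cw

for k in range(1, 300):
    t = 0.231 * k  # PHI**2 * t == t + PHI * t
    assert abs(cayley(t, PHI * t, PHI**2 * t)) < 1e-11
```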
(Note that auxEq vanishes only at the zeros, not for all values of . Eliminating the variable cβ gives us an even larger implicit equation for the distances of nonsuccessive zeros, with more than 74,000 terms.)
✕
c\[Alpha]\[Delta]Equation = Resultant[c\[Alpha]c\[Beta]\[Delta]Equation, auxEq, c\[Beta]]; |
✕
{Exponent[c\[Alpha]\[Delta]Equation, c\[Alpha]], Length[c\[Alpha]\[Delta]Equation]} |
Luckily, this large equation factors in about 10 minutes into four much smaller equations, each having “only” about 300 summands.
✕
(c\[Alpha]\[Delta]EquationFactored = Factor[c\[Alpha]\[Delta]Equation];) // Timing |
✕
{Length[c\[Alpha]\[Delta]EquationFactored], Length /@ (List @@ c\[Alpha]\[Delta]EquationFactored)} |
Here is a simplified form of the first factor. (For brevity of the resulting expression, we don’t yet substitute and , but will do so in a moment.)
✕
firstFactor = Collect[c\[Alpha]\[Delta]EquationFactored[[1]], c\[Alpha], FullSimplify] |
✕
cpc\[Phi] = ContourPlot[ Evaluate[Block[{\[Alpha] = GoldenRatio, \[Beta] = GoldenRatio^2}, (# == 0) & /@ (List @@ c\[Alpha]\[Delta]EquationFactored)]], {c\[Alpha], -1, 1}, {\[Delta], 0.1, 6}, PlotPoints -> 60, AspectRatio -> 3/2] /. Tooltip[x_, _] :> x |
We overlay a few thousand randomly selected distances from the 100k zeros of calculated above. The points all fall on the blue curve, meaning the equation firstFactor describes the position of the nonsuccessive zero distances. We indicate the positions of the maxima in the successive zero distances by horizontal gridlines. (As we used generic , in the derivation of the four functions of cαδEquationFactored, we could also use and obtain a similar image.)
✕
c\[Phi]Points = Module[{i, j}, Select[ Table[i = RandomInteger[{1, 99900}]; j = RandomInteger[{1, 10}]; {Cos[GoldenRatio zerosGolden[[i]]], zerosGolden[[i + j]] - zerosGolden[[i]]}, {50000}], #[[2]] < 6 &]]; |
✕
Show[{cpc\[Phi], Graphics[{Black, Opacity[0.2], PointSize[0.005], Point[c\[Phi]Points]}]}, GridLines -> {{}, {1.8136, 1.0439, 1.4990, 3.0114`} }] |
We see that the peak positions of the successive zero distances (gridlines) are horizontal tangents on the curves. Intuitively, this is to be expected: at a horizontal tangent, many zero distances have approximately equal values, and so the singularities in the distribution form. This horizontal tangent observation now gives us a purely algebraic method to determine the peak positions of the zero distances. The condition of horizontal tangents on the given curve described by firstFactor gives two curves whose intersection points determine the peak positions we are looking for. Here are the curves for firstFactor and the curves that represent the conditions of horizontal tangents plotted together. The intersections of the blue and the yellow/brown curves are the relevant points.
✕
ContourPlot[ Evaluate[Block[{\[Alpha] = GoldenRatio, \[Beta] = GoldenRatio^2}, {firstFactor == 0, D[firstFactor == 0 /. \[Delta] -> \[Delta][c\[Alpha]], c\[Alpha] ] /. \[Delta]'[c\[Alpha]] -> 0 /. \[Delta][ c\[Alpha]] -> \[Delta]}]], {c\[Alpha], -1, 1}, {\[Delta], 0, 3.5}, PlotPoints -> 60, GridLines -> {{}, {1.8136, 1.044, 1.5, 3.011}}] /. Tooltip[x_, _] :> x |
By eliminating the variable cα, we would be left with a univariate equation in the zero spacing δ. But the resulting intermediate polynomials would be quite large (~877k terms!). So instead, we calculate numerically high-precision approximations of the positions of the horizontal tangents.
✕
firstFactorTangents = FindRoot[Evaluate[ Block[{\[Alpha] = GoldenRatio, \[Beta] = GoldenRatio^2}, {firstFactor == 0, D[firstFactor == 0 /. \[Delta] -> \[Delta][c\[Alpha]], c\[Alpha] ] /. \[Delta]'[c\[Alpha]] -> 0 /. \[Delta][ c\[Alpha]] -> \[Delta]}]], {c\[Alpha], #1}, {\[Delta], #2}, WorkingPrecision -> 50, Method -> "Newton"] & @@@ {{0.3, 1.8}, {-0.6, 1.04}, {0.3, 1.5}, {0.8, 3.01}} |
The four peak position values agree perfectly with the previously calculated values.
✕
peakFunctions = Function[x, #1 Cos[x] + #2 Cos[GoldenRatio x] + #3 Cos[ GoldenRatio^2 x]] & @@@ {{1, 1, 1}, {-1, 1, 1}, {1, -1, 1}, {1, 1, -1}}; |
✕
#1[\[Delta]/2 /. #2] & @@@ Transpose[{peakFunctions, firstFactorTangents}] |
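The four peak positions can also be cross-checked outside the Wolfram Language session with a stdlib-only Python sketch: each peak is twice the smallest positive zero of one of the sign variants ±cos(x) ± cos(φx) ± cos(φ²x). The scan-and-bisect helper is my own, not the post's findZeros:

```python
from math import cos, sqrt

phi = (1 + sqrt(5)) / 2  # golden ratio, as in the rest of the post

def f(x, signs):
    # sign-variant three-term sum s1*cos(x) + s2*cos(phi x) + s3*cos(phi^2 x)
    s1, s2, s3 = signs
    return s1 * cos(x) + s2 * cos(phi * x) + s3 * cos(phi**2 * x)

def smallest_positive_zero(signs, step=1e-3, tol=1e-12):
    # scan for the first sign change, then bisect it down to tol
    a = step
    while f(a, signs) * f(a + step, signs) > 0:
        a += step
    b = a + step
    while b - a > tol:
        m = (a + b) / 2
        if f(a, signs) * f(m, signs) <= 0:
            b = m
        else:
            a = m
    return (a + b) / 2

# conjectured peak positions = 2 x smallest positive zero of each sign variant
peaks = sorted(2 * smallest_positive_zero(s)
               for s in [(1, 1, 1), (-1, 1, 1), (1, -1, 1), (1, 1, -1)])
print(peaks)  # ≈ 1.0440, 1.4990, 1.8136, 3.0114
```

These agree with the gridline values 1.0439, 1.4990, 1.8136, 3.0114 used in the plots above.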
As a function of the zero distance δ, the function firstFactor describes not only the distance between successive zeros, but also the distances between more distant zeros. In the following graphic, we connect the points {z_(i+k) − z_i, cos(φ z_i)} for k = 1, …, 6 for the first 2,000 zeros. The graphic shows how firstFactor does indeed describe the zero distances.
✕
Show[{ Graphics[{Thickness[0.001], Opacity[0.2], Table[{{RGBColor[0.368417, 0.506779, 0.709798], RGBColor[ 0.880722, 0.611041, 0.142051], RGBColor[ 0.560181, 0.691569, 0.194885], RGBColor[ 0.922526, 0.385626, 0.209179], RGBColor[ 0.528488, 0.470624, 0.701351], RGBColor[ 0.772079, 0.431554, 0.102387], RGBColor[ 0.363898, 0.618501, 0.782349], RGBColor[1, 0.75, 0], RGBColor[ 0.647624, 0.37816, 0.614037], RGBColor[ 0.571589, 0.586483, 0.], RGBColor[0.915, 0.3325, 0.2125], RGBColor[0.40082222609352647`, 0.5220066643438841, 0.85]}[[ k - 1]], Line[{ #[[-1]] - #1[[1]], Cos[GoldenRatio #[[1]]]} & /@ Take[Partition[zerosGolden, k, 1], 2000]]}, {k, 2, 7}]}, Axes -> True, PlotRange -> {{0, 10}, {-1, 1}}], ContourPlot[ Evaluate[ Block[{\[Alpha] = (1 + Sqrt[5])/ 2, \[Beta] = ((1 + Sqrt[5])/2)^2}, firstFactor == 0]], {\[Delta], 0.1, 10}, {c\[Alpha], -1, 1}, PlotPoints -> {80, 20}, ContourStyle -> Black] /. Tooltip[x_, _] :> x}, Frame -> True, Axes -> False, PlotRangeClipping -> True, AspectRatio -> 1/4] |
The graphic also shows the strong correlation between successive zeros that we saw in the previous pair correlation histogram.
✕
Histogram[ VectorAngle[#1 - #2, #3 - #2] & @@@ Partition[{ #[[-1]] - #1[[1]], Cos[GoldenRatio #[[1]]]} & /@ Partition[zerosGolden, 2, 1], 3, 1], 1000, {"Log", "Count"}] |
Above, we treated the zero distance δ as real. The expression firstFactor also allows us to plot the value of cα over the complex δ plane. The next graphic shows the real part of cα over a part of the first quadrant.
✕
Module[{pp = 60, pts, c\[Alpha]Zeros}, (c\[Alpha]Zeros[\[Delta]_] := c\[Alpha] /. Solve[Block[{\[Alpha] = GoldenRatio, \[Beta] = GoldenRatio^2}, # == 0], c\[Alpha]]) &[firstFactor]; pts = Cases[ Flatten[Table[ N@{\[Delta]x, \[Delta]y, Re[#]} & /@ c\[Alpha]Zeros[N[\[Delta]x + I \[Delta]y]], {\[Delta]y, -0, 2, 2/pp}, {\[Delta]x, 0, 2, 2/pp}], 2], {_Real, _Real, _Real}]; Graphics3D[{RGBColor[0.36, 0.51, 0.71], Sphere[pts, 0.01]}, BoxRatios -> {1, 1, 3/2}, Axes -> True, PlotRange -> {All, All, 6 {-1, 1}}, AxesLabel -> {Re[\[Delta]], Im[\[Delta]], Re[c\[Alpha]]}, ViewPoint -> {1.98, -2.66, 0.65}, Method -> {"SpherePoints" -> 6}]] |
Now we have all equations at hand to determine the two remaining peak positions of the function fBrass, which had the approximate values 1.098 and 2.367. We use the implicit equation obeyed by cα and cβ at zeros of fBrass.
✕
c\[Alpha]c\[Beta]Brass[c\[Alpha]_, c\[Beta]_] = Collect[1 + 2 c\[Alpha] - 3 c\[Alpha]^2 - 4 c\[Alpha]^3 + 4 c\[Alpha]^4 + 2 c\[Beta] + 2 c\[Alpha] c\[Beta] - 4 c\[Alpha]^2 c\[Beta] - 3 c\[Beta]^2 - 4 c\[Alpha] c\[Beta]^2 + 8 c\[Alpha]^3 c\[Beta]^2 - 4 c\[Beta]^3 + 8 c\[Alpha]^2 c\[Beta]^3 + 4 c\[Beta]^4, c\[Beta]] |
Rather than eliminating the variable symbolically, for each cα, we numerically calculate all possible values for cβ and substitute these values into cαcβδEquation. For faster numerical evaluation, we compile the resulting expression.
✕
c\[Alpha]\[Delta]BrassCompiled = Compile[{{c\[Alpha], _Complex}, {\[Delta], _Complex}}, Evaluate[Block[{\[Alpha] = (1 + Sqrt[3])/ 2, \[Beta] = ((1 + Sqrt[3])/2)^2}, Block[{c\[Beta] = #}, c\[Alpha]c\[Beta]\[Delta]Equation] & /@ (c\[Beta] /. Solve[c\[Alpha]c\[Beta]Brass[c\[Alpha], c\[Beta]] == 0, c\[Beta]])]], CompilationOptions -> {"ExpressionOptimization" -> True}] |
Calculating the values of cαδBrassCompiled on a dense set of (cα, δ) grid points and plotting the zero contours as blue curves graphically represents all possible zero distances as a function of cα. We overlay some of the previously calculated zero distances as orange points, the four already-identified peak positions as dark green lines, and the two outstanding peak positions as red lines.
✕
Module[{ppc\[Alpha] = 400, pp\[Delta] = 600, data, cp}, Monitor[data = Table[c\[Alpha]\[Delta]BrassCompiled[ c\[Alpha], \[Delta]], {\[Delta], 0, 4, 4/pp\[Delta]}, {c\[Alpha], -1, 1, 2/ppc\[Alpha]}];, N[{\[Delta], c\[Alpha]}]]; cp = Show[ ListContourPlot[Re[#], Contours -> {0}, ContourShading -> None, DataRange -> {{-1, 1}, {0, 4}}, ContourStyle -> Directive[RGBColor[0.36, 0.51, 0.71], Thickness[0.004]], PlotRange -> {{-1, 1}, {0.1, 4}}] & /@ Transpose[data, {2, 3, 1}]]; Show[{Graphics[{PointSize[0.004], Opacity[0.3], Orange, Point[{Cos[\[Phi]Brass #1], #2} & @@@ RandomSample[Transpose[{Most[zerosBrass], differencesBrass}], 10000]], Darker[Green], Opacity[1], Thickness[0.002], Line[{{-1, #}, {1, #}} & /@ {2.2299, 1.46396, 2.0722, 3.5761}], Red, Line[{{-1, #}, {1, #}} & /@ {1.098, 2.367}]}, AspectRatio -> 3/2, Frame -> True, PlotRangeClipping -> True, PlotRange -> {{-1, 1}, {0.1, 4}}], cp}]] |
We clearly see that the two remaining peak positions also occur at points where the zero distance curve has a horizontal tangent. Expressing the condition of a horizontal tangent to the curve cαcβδEquation (with the constraint cαcβBrass) numerically allows us to calculate high-precision values for the two remaining peak positions. (We also include the differentiated form of the constraint to have four equations in four variables.)
✕
c\[Alpha]c\[Beta]Brass = (-1 + x + 2 y^2)^2 + 4 (-1 + x - 2 x y^2) z^2 + 4 z^4 /. x -> -y - z /. {y -> c\[Alpha], z -> c\[Beta]} // Simplify |
✕
horizontalTangentsBrass = Block[{\[Alpha] = \[Phi]Brass, \[Beta] = \[Phi]Brass^2}, {c\[Alpha]c\[Beta]\[Delta]Equation /. { c\[Beta] -> c\[Beta][c\[Alpha]], \[Delta] -> \[Delta][c\[Alpha]]}, D[c\[Alpha]c\[Beta]\[Delta]Equation /. { c\[Beta] -> c\[Beta][c\[Alpha]], \[Delta] -> \[Delta][c\[Alpha]]}, c\[Alpha]], c\[Alpha]c\[Beta]Brass /. c\[Beta] -> c\[Beta][c\[Alpha]], D[c\[Alpha]c\[Beta]Brass /. c\[Beta] -> c\[Beta][c\[Alpha]], c\[Alpha]]} /. {c\[Beta][c\[Alpha]] -> c\[Beta], Derivative[1][c\[Beta]][c\[Alpha]] -> c\[Beta]P, \[Delta][c\[Alpha]] -> \[Delta], \[Delta]'[ c\[Alpha]] -> 0}]; |
✕
Length[Expand[#]] & /@ horizontalTangentsBrass |
✕
FindRoot[Evaluate[horizontalTangentsBrass], {c\[Alpha], -4305/100000}, {\[Delta], 10966/10000}, {c\[Beta], 97/100}, {c\[Beta]P, 78/100}, WorkingPrecision -> 30, PrecisionGoal -> 10, Method -> "Newton", MaxIterations -> 100] |
✕
FindRoot[Evaluate[horizontalTangentsBrass], {c\[Alpha], 39/100}, {\[Delta], 2366/1000}, {c\[Beta], -38/100}, {c\[Beta]P, -71/100}, WorkingPrecision -> 30, PrecisionGoal -> 10, Method -> "Newton", MaxIterations -> 100] |
And now the calculated peak positions of the zero distances agree perfectly with the numerically observed ones.
✕
Function[c, Histogram[Select[differencesBrass, c - 0.005 < # < c + 0.005 &], 100, GridLines -> {{1.0966482948, 2.3673493597}, {}}, PlotRangeClipping -> True, Method -> {"GridLinesInFront" -> True}]] /@ {1.0977, 2.366} |
Now that we have found equations for the positions of the singularities, how does the density of the zero distances behave near such a singularity? To find out, we select bins near the singularities and count the number of distances in these bins.
✕
singData = Table[Module[{sel}, sel = If[j == 2 || j == 3 || j == 4, Select[differencesGolden, Evaluate[peaksGolden[[j]] <= # <= peaksGolden[[j]] + 0.2] &], Select[differencesGolden, Evaluate[peaksGolden[[j]] - 0.2 <= # <= peaksGolden[[j]]] &]]; {Abs[#1 - peaksGolden[[j]]], #2} & @@@ Tally[Round[sel, 0.001]]], {j, 4}]; |
A log-log plot of the counts versus the distance to the singularities shows straight lines. This encourages us to conjecture that the density near a singularity behaves as a power law in the distance to the singularity.
✕
ListLogLogPlot[singData, PlotRange -> {{5 10^-3, 1.8 10^-1}, All}, PlotLegends -> (Row[{"x", "\[TildeTilde]", NumberForm[#, 4]}] & /@ peaksGolden)] |
The numerical value of the exponent depends on the concrete singularity. Numerical experiments with other sums of three trigonometric functions show power-law behavior near the singularities in general; the values of the exponents vary.
✕
Coefficient[ Fit[Log[Select[#, 10^-3 <= #[[1]] <= 10^-1 &]], {1, x}, x], x] & /@ singData |
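The exponent extraction done by Fit[Log[...]] is an ordinary least-squares line in log-log coordinates. A minimal Python sketch (on synthetic counts following an exact power law with exponent 0.6, an arbitrary test value rather than one of the fitted exponents) recovers the slope:

```python
from math import log

def loglog_slope(pairs):
    # least-squares slope of log(count) versus log(distance)
    xs = [log(d) for d, _ in pairs]
    ys = [log(c) for _, c in pairs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# synthetic counts following an exact power law: count ~ distance^0.6
data = [(d / 1000, (d / 1000) ** 0.6) for d in range(5, 180, 5)]
print(loglog_slope(data))  # recovers 0.6
```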
We can now also model what would happen if the phases in cos(x + φ₁) + cos(φ x + φ₂) + cos(φ² x + φ₃) were random (but fulfilling the conditions for the envelopes). Here we do this for a local extremum at x = 0 with function value –1 at this point.
✕
gGolden[x_, {\[CurlyPhi]1_, \[CurlyPhi]2_, \[CurlyPhi]3_}] := Cos[x + \[CurlyPhi]1] + Cos[GoldenRatio x + \[CurlyPhi]2] + Cos[GoldenRatio^2 x + \[CurlyPhi]3]; |
We select a random value for φ₁ and calculate values for φ₂ and φ₃ so that gGolden(0, {φ₁, φ₂, φ₃}) = −1 and the derivative of gGolden vanishes at 0.
✕
phases = DeleteCases[ Table[Check[Block[{\[CurlyPhi]1 = RandomReal[{-Pi, Pi}]}, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3} /. FindRoot[ Evaluate[{gGolden[ 0, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == -1, \ Derivative[1, {0, 0, 0}][gGolden][ 0, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] == 0}], {\[CurlyPhi]2, RandomReal[{-Pi, Pi}]}, {\[CurlyPhi]3, RandomReal[{-Pi, Pi}]}]], {}], {100000}] // Quiet, {}]; |
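The same constrained solve can be sketched without FindRoot: fix φ₁ and apply a two-dimensional Newton iteration with an analytic Jacobian to the conditions g(0) = −1, g′(0) = 0. The sample value φ₁ = 0.3 and the starting point (2.9, −2.9) are illustrative choices of mine:

```python
from math import cos, sin, sqrt

phi = (1 + sqrt(5)) / 2

def solve_phases(p1, p2, p3, iters=50):
    # Newton iteration for (p2, p3) so that g(0) = -1 and g'(0) = 0,
    # where g(x) = cos(x + p1) + cos(phi x + p2) + cos(phi^2 x + p3)
    for _ in range(iters):
        e1 = cos(p1) + cos(p2) + cos(p3) + 1                # g(0) + 1
        e2 = -sin(p1) - phi * sin(p2) - phi**2 * sin(p3)    # g'(0)
        # analytic Jacobian of (e1, e2) with respect to (p2, p3)
        j11, j12 = -sin(p2), -sin(p3)
        j21, j22 = -phi * cos(p2), -phi**2 * cos(p3)
        det = j11 * j22 - j12 * j21
        p2 -= (j22 * e1 - j12 * e2) / det
        p3 -= (j11 * e2 - j21 * e1) / det
    return p2, p3

# fixed phi1 = 0.3 (an arbitrary sample value); start near the expected branch
p2, p3 = solve_phases(0.3, 2.9, -2.9)
print(p2, p3)
```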
Due to the two constraint equations, the phases are correlated.
✕
Histogram3D[Mod[phases, 2 Pi][[All, {##}]], 100, AxesLabel -> {Subscript["\[CurlyPhi]", #1], Subscript[ "\[CurlyPhi]", #2]}] & @@@ \ \ {{1, 2}, {1, 3}, {2, 3}} |
Two of the three remaining envelope zeros are clearly visible. (With the numerical root-finding method, one catches only a few curves that are near the green curve.) While the overall graphic looks symmetric with respect to the line x = 0, the individual curves are not. Under the assumption of totally random phases φ₁, φ₂ and φ₃, and in contrast to the behavior of fGolden, the sums with random phases quickly take on values smaller than –3/2.
✕
Plot[Evaluate[gGolden[x, #] & /@ RandomSample[phases, 250]], {x, -2, 5}, PlotStyle -> Directive[Thickness[0.001], Opacity[0.2], Gray], Prolog -> {Plot[ Cos[x] - Cos[GoldenRatio x] - Cos[GoldenRatio^2 x], {x, -2, 5}, PlotStyle -> Directive[RGBColor[0.36, 0.51, 0.71], Opacity[0.5]]][[1]], Plot[-Cos[x] + Cos[GoldenRatio x] - Cos[GoldenRatio^2 x], {x, -2, 5}, PlotStyle -> Directive[RGBColor[0.88, 0.61, 0.14], Opacity[0.5]]][[1]], Plot[-Cos[x] - Cos[GoldenRatio x] + Cos[GoldenRatio^2 x], {x, -2, 5}, PlotStyle -> Directive[ RGBColor[0.56, 0.69, 0.19], Opacity[0.5]]][[1]]}, GridLines -> {{}, {-1, 3}}, Epilog -> {Directive[Purple, PointSize[0.01]], Point[{#/2, 0} & /@ {1.04398, 1.49906, 3.01144}]}] /. l_Line :> Mouseover[l, {Red, Thickness[0.002], Opacity[1], l}] |
✕
gGoldenZeros = Quiet[x /. FindRoot[Evaluate[gGolden[x, #]], {x, 1/2}] & /@ phases]; |
The estimated exponents γ obtained from the zeros of these curves with random phases agree remarkably well with the two values from the above histograms. This suggests that, to a certain degree, the assumption of random local phases is justified.
✕
phasesmpp = With[{\[Xi] = peaksGolden[[2]]/2}, Select[gGoldenZeros, \[Xi] < # < \[Xi] + 0.1 &] - \[Xi]]; \[Gamma]2 = Coefficient[ Fit[N[Cases[Log[Tally[Round[phasesmpp, 0.001]]], {_Real, _}]], {1, x}, x], x]; Histogram[phasesmpp, 100, PlotLabel -> Row[{"\[Gamma]", "\[ThinSpace]\[TildeTilde]\[ThinSpace]", NumberForm[\[Gamma]2, 2]}]] |
✕
phasespmp = With[{\[Xi] = peaksGolden[[3]]/ 2}, -(Select[gGoldenZeros, \[Xi] - 0.03 < # < \[Xi] &] - \[Xi])]; \[Gamma]3 = Coefficient[ Fit[N[Cases[Log[Tally[Round[phasespmp, 0.001]]], {_Real, _}]], {1, x}, x], x]; Histogram[phasespmp, 100, PlotLabel -> Row[{"\[Gamma]", "\[ThinSpace]\[TildeTilde]\[ThinSpace]", NumberForm[\[Gamma]3, 2]}]] |
The appendix contains some calculations based on the computed zero density (assuming uniform distribution of phases) to see if the power law holds exactly or is approximate.
For completeness, I now repeat for the inflection points the distance investigations that were carried out for the zeros and extrema.
The inflection points are the zeros of the second derivative.
✕
findInflectionPoints[f_, n_] := findZeros[f'', n] |
✕
inflectionsGolden = Prepend[findInflectionPoints[fGolden, 100000], 0.]; |
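The same construction is easy to mirror in plain Python: differentiate fGolden twice by hand and feed f″ to a scan-and-bisect zero finder. This is a sketch of the idea, not the post's findZeros implementation:

```python
from math import cos, sqrt

phi = (1 + sqrt(5)) / 2

def f2(x):
    # second derivative of cos(x) + cos(phi x) + cos(phi^2 x)
    return -(cos(x) + phi**2 * cos(phi * x) + phi**4 * cos(phi**2 * x))

def find_zeros(g, x_max, step=1e-3, tol=1e-12):
    # all sign-change zeros of g in (0, x_max), located by bisection
    zeros, x = [], step
    while x < x_max:
        if g(x) * g(x + step) <= 0:
            a, b = x, x + step
            while b - a > tol:
                m = (a + b) / 2
                if g(a) * g(m) <= 0:
                    b = m
                else:
                    a = m
            zeros.append((a + b) / 2)
        x += step
    return zeros

inflections = find_zeros(f2, 20.0)
print(inflections[:5])
```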
The following plots are the direct equivalent to the case of extrema, so we skip commenting on them individually.
✕
Plot[fGolden[x], {x, 0, 10.3 Pi}, Epilog -> {Darker[Red], Point[{#, fGolden[#]} & /@ Take[ inflectionsGolden, 30]]}] |
✕
Graphics[{PointSize[0.001], RGBColor[0.36, 0.51, 0.71], Opacity[0.2], Point[N[{#, fGolden[#]} & /@ inflectionsGolden]]}, AspectRatio -> 1/2, ImageSize -> 400, Frame -> True] |
✕
Histogram[Mod[inflectionsGolden, 2 Pi], 200] |
✕
Histogram[Differences[inflectionsGolden], 1000, PlotRange -> All] |
✕
Histogram[Differences[inflectionsGolden, 2], 1000, PlotRange -> All] |
✕
Histogram3D[{#2 - #1, #3 - #2} & @@@ Partition[inflectionsGolden, 3, 1], 100, PlotRange -> All] |
✕
summandValuesI = {Cos[#], Cos[GoldenRatio #], Cos[GoldenRatio^2 #]} & /@ inflectionsGolden; |
✕
Graphics3D[{PointSize[0.002], RGBColor[0.36, 0.51, 0.71], Point[Union@Round[summandValuesI, 0.01]]}, PlotRange -> {{-1, 1}, {-1, 1}, {-1, 1}}, Axes -> True, AxesLabel -> {Cos[x], Cos[GoldenRatio x], Cos[GoldenRatio^2 x]}] |
Having found the positions of the peaks of the distribution of the distances of the zeros, another natural question to ask about the zero distribution is: what is the largest possible distance between two successive roots? The largest distance will occur in the following situation: starting at a zero, the function will increase or decrease, then have a first extremum, then a second and a third extremum, and then will have another zero. When the middle extremum barely touches the real axis, the distance between the two zeros will be largest. Here are some plots of the function around zeros that are the furthest apart. Note that while the curves look, at first glance, symmetric around x ≈ 1.6, the low maxima on the left side belong to the curves with the high maxima on the right side, and vice versa.
✕
Show[Plot[fGolden[# + t], {t, -2, 0 + 5}, PlotRange -> All, PlotStyle -> Directive[RGBColor[0.36, 0.51, 0.71], Thickness[0.001], Opacity[0.3]]] & /@ Take[Sort[ Transpose[{differencesGolden, Most[zerosGolden]}]], -500][[All, 2]], PlotRange -> All, Frame -> True, GridLines -> {{0}, {-1, 3}}, ImageSize -> 400] /. l_Line :> Mouseover[l, {Red, Thickness[0.002], Opacity[1], l}] |
For the random phase model, I calculate the phases that make the middle extrema just touch the real axis.
✕
h[x_] = Cos[x + \[CurlyPhi]1] + Cos[\[Alpha] x + \[CurlyPhi]2] + Cos[\[Beta] x + \[CurlyPhi]3] ; \[CurlyPhi]2\[CurlyPhi]3MaxD\[CurlyPhi]1[] = (Solve[{h[x] == 0, D[h[x], x] == 0, D[h[x], \[CurlyPhi]1] == 0} /. x -> 0, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] /. _C -> 0 // Simplify); {Length[\[CurlyPhi]2\[CurlyPhi]3MaxD\[CurlyPhi]1[]], \[CurlyPhi]2\ \[CurlyPhi]3MaxD\[CurlyPhi]1[][[1]]} |
Here we select the solutions for which the middle extremum touches the real axis.
✕
touchingSolutions\[CurlyPhi]1[x_] = h[x] /. Select[ \[CurlyPhi]2\[CurlyPhi]3MaxD\[CurlyPhi]1[], Im[h[x] /. # /. {\[Alpha] -> GoldenRatio, \[Beta] -> GoldenRatio^2} /. x -> 5.] == 0 &] /. {\[Alpha] -> GoldenRatio, \[Beta] -> GoldenRatio^2}; |
Here is one of the solutions.
✕
touchingSolutions\[CurlyPhi]1[x][[1]] // FullSimplify |
Here is a plot of the solutions with the middle extremum touching the real axis. The four solutions each give the same maximal zero distance.
✕
Plot[touchingSolutions\[CurlyPhi]1[x], {x, -3, 3}] |
The so-obtained maximal distance between two zeros agrees well with the observed value. The calculated maximum is slightly larger than the observed one. (The reverse situation would be bad.)
✕
(Min[x /. Solve[touchingSolutions\[CurlyPhi]1[x][[1]] == 0 \[And] 1/10 < x < 4, x, Reals]] - Max[x /. Solve[touchingSolutions\[CurlyPhi]1[x][[1]] == 0 \[And] -4 < x < -1/10, x, Reals]]) // N[#, 5] & |
✕
Max[differencesGolden] |
Finding the envelopes with respect to φ₂ and φ₃ does not give a larger result.
✕
\[CurlyPhi]2\[CurlyPhi]3MaxD\[CurlyPhi]2[] = (Solve[{h[x] == 0, D[h[x], x] == 0, D[h[x], \[CurlyPhi]2] == 0} /. x -> 0, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] /. _C -> 0 // Simplify); touchingSolutions\[CurlyPhi]2[x_] = With[{R = {\[Alpha] -> GoldenRatio, \[Beta] -> GoldenRatio^2}}, h[x] /. Select[ \[CurlyPhi]2\[CurlyPhi]3MaxD\[CurlyPhi]2[], Im[h[x] /. # /. R /. x -> 5.] == 0 &] /. R]; |
✕
\[CurlyPhi]2\[CurlyPhi]3MaxD\[CurlyPhi]3[] = (Solve[{h[x] == 0, D[h[x], x] == 0, D[h[x], \[CurlyPhi]3] == 0} /. x -> 0, {\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}] /. _C -> 0 // Simplify); touchingSolutions\[CurlyPhi]3[x_] = With[{R = {\[Alpha] -> GoldenRatio, \[Beta] -> GoldenRatio^2}}, h[x] /. Select[ \[CurlyPhi]2\[CurlyPhi]3MaxD\[CurlyPhi]3[], Im[h[x] /. # /. R /. x -> 5.] == 0 &] /. R] ; |
✕
(Quiet[Min[x /. Solve[N[#] == 0 \[And] 0 < x < 4, x, Reals]] - Max[ x /. Solve[N[#] == 0 \[And] -4 < x < 0, x, Reals]]] & /@ #) & /@ {touchingSolutions\[CurlyPhi]1[x], touchingSolutions\[CurlyPhi]2[x], touchingSolutions\[CurlyPhi]3[x]} |
Here are the three envelope curve families together with fGolden near x ≈ 97,858.4.
✕
With[{x0 = 97858.38930}, Show[{Plot[fGolden[x0 + x], {x, -3, 3}, PlotStyle -> Directive[Thickness[0.01], Gray, Opacity[0.4]]], Plot[touchingSolutions\[CurlyPhi]1[x], {x, -3, 3}, PlotStyle -> RGBColor[0.88, 0.61, 0.14]], Plot[touchingSolutions\[CurlyPhi]2[x], {x, -3, 3}, PlotStyle -> RGBColor[0.36, 0.51, 0.71] ], Plot[touchingSolutions\[CurlyPhi]3[x], {x, -3, 3}, PlotStyle -> RGBColor[0.56, 0.69, 0.19] ]}, PlotRange -> All]] |
Interestingly, the maximal distance between two successive zeros of cos(x + φ₁) + cos(φ x + φ₂) + cos(φ² x + φ₃) over all phases is slightly larger.
✕
rootDistance[{\[CurlyPhi]1_Real, \[CurlyPhi]2_Real, \ \[CurlyPhi]3_Real}] := (Min[Select[#, Positive]] - Max[Select[#, Negative]]) &[ N[x /. Quiet[ Solve[Cos[x + \[CurlyPhi]1] + Cos[GoldenRatio x + \[CurlyPhi]2] + Cos[GoldenRatio^2 x + \[CurlyPhi]3] == 0 \[And] -5 < x < 5, x]]]] |
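A stdlib-only Python analogue of rootDistance scans outward from 0 in both directions for the first sign change and then bisects. For zero phases the function is even, so the two bracketing zeros are at ±0.9068 and the distance reproduces the 1.8136 peak value:

```python
from math import cos, sqrt

phi = (1 + sqrt(5)) / 2

def root_distance(p1, p2, p3, step=1e-3, tol=1e-12):
    # distance between the largest negative and smallest positive zero of
    # cos(x + p1) + cos(phi x + p2) + cos(phi^2 x + p3)
    f = lambda x: cos(x + p1) + cos(phi * x + p2) + cos(phi**2 * x + p3)

    def first_zero(direction):
        # walk away from 0 until a sign change, then bisect
        x = 0.0
        while f(x) * f(x + direction * step) > 0:
            x += direction * step
        a, b = x, x + direction * step
        while abs(b - a) > tol:
            m = (a + b) / 2
            if f(a) * f(m) <= 0:
                b = m
            else:
                a = m
        return (a + b) / 2

    return first_zero(+1) - first_zero(-1)

print(root_distance(0, 0, 0))  # ≈ 1.8136 for zero phases (an even function)
```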
If one lets the three phases range over the domain [0, 2π), then one finds a slightly larger maximal distance between zeros.
✕
With[{pp = 24}, Monitor[Table[{{\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}, rootDistance[ N@{\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}]}, {\[CurlyPhi]1, 0, 2 Pi (1 - 1/pp), 2 Pi/pp}, {\[CurlyPhi]2, 0, 2 Pi (1 - 1/pp), 2 Pi/pp}, {\[CurlyPhi]3, 0, 2 Pi (1 - 1/pp), 2 Pi/pp}];, N@{\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}]] |
✕
FindMaximum[ rootDistance[{\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}], {{\ \[CurlyPhi]1, 23 Pi/12}, {\[CurlyPhi]2, 7 Pi/6}, {\[CurlyPhi]3, Pi/3}}] // Quiet |
This maximum distance is realized when a minimum (maximum) between two zeros barely touches the real axis.
✕
Plot[Cos[x + \[CurlyPhi]1] + Cos[GoldenRatio x + \[CurlyPhi]2] + Cos[GoldenRatio^2 x + \[CurlyPhi]3] /. %[[2]], {x, -3, 3}] |
Calculating the first million zeros of fGolden does not yield a value larger than the previously calculated value of 3.334. This suggests that the last curve configuration is not realized along the real axis in fGolden.
Here is a list of record-shallow maxima slightly below the real axis, found by calculating the first few million extrema of fGolden. (This list was found using a version of the above function findZeros applied to the first derivative, keeping only the record-shallow maxima.)
✕
shallowMaxima = {2.4987218706691797180876160929177, 33.704071215892008970482202830654, 98.293712744702474592256555931685, 443.88844400968497878246638591063, 3388.8528289090563871971140906274, 12846.898421437481029976313761124, 55352.647183638537564573877897525, 124704.59412664060098149321964301, 166634.14221987979291743435707392, 304761.83543691954802508678830822, 457972.87856640025996046175960675, 776157.81309371886102983541391071, 1220707.5925697200786171039302735}; |
The next plot shows how record maxima approach the real axis from below.
✕
ListLogLogPlot[{#, -fGolden[#]} & /@ shallowMaxima, AxesLabel -> {x, -Subscript[f, \[Phi]][x]}] |
Here is a plot of fGolden at the shallowest of these maxima.
✕
{Plot[fGolden[shallowMaxima[[-1]] + x], {x, -2, 2}], Plot[fGolden[shallowMaxima[[-1]] + x], {x, -0.001, 0.001}]} |
The root difference for this “near” zero is about 3.326058.
✕
Differences[ N[x /. Solve[ fGolden[Rationalize[shallowMaxima[[-1]], 0] + x] == 0 \[And] -3 < x < 3, x], 15]] |
Interpolating the data from the last graphic gives an estimate of how shallow the maxima become up to a given x.
The distances between the nearest roots to the right and left of these maxima seem to approach an upper bound below the 3.3340 value from above.
✕
Differences[ Function[\[CapitalDelta], x /. FindRoot[ Evaluate[fGolden[x] == 0], {x, # + \[CapitalDelta]}, WorkingPrecision -> 100, PrecisionGoal -> 30]] /@ {-2, 2}] & /@ \ shallowMaxima // Flatten // N[#, 10] & |
So is it a special property of fGolden that the maximal root distance assuming arbitrary phases is not realized, or is it a more general property of all such three-term cosine sums? Numerical experiments with various transcendental values of α and β suggest that generically the maximal possible root distance obtained from the envelope method agrees with the maximum observed.
Unfortunately, this time the algebraic formulation of the distances between the zeros does not provide the ultimate answer. The following graphic shows the four branches of the factored cαδEquation together with the observed points. No special structure is visible in the curves at the maximal observed zero distances. The two “strands” of points correspond to the two curves shown in the first plot in this section.
✕
ContourPlot[ Evaluate[Block[{\[Alpha] = GoldenRatio, \[Beta] = GoldenRatio^2}, # == 0 & /@ (List @@ c\[Alpha]\[Delta]EquationFactored)]], {c\[Alpha], -1, 1}, {\[Delta], 3.25, 3.4}, PlotPoints -> 50, Epilog -> {Black, PointSize[0.01], Point[{Cos[GoldenRatio #1] , #2 - #1} & @@@ Select[Partition[zerosGolden, 2, 1], #[[2]] - #[[1]] > 3.25 &]]}, GridLines -> {{}, {1.813, 1.044, 1.5, 3.011, 3.334}}] /. Tooltip[x_, _] :> x |
In the algebraic formulation, the new root that appears at the maximal root distance is visible as a vertical tangent of the curve. Plotting the vertical tangents at the positions of the largest observed zero distances together with the observed zero distances shows this clearly.
✕
ContourPlot[ Evaluate[Block[{\[Alpha] = (1 + Sqrt[5])/ 2, \[Beta] = ((1 + Sqrt[5])/2)^2}, {firstFactor == 0}]], {c\[Alpha], -1, 1}, {\[Delta], 0, 3.6}, PlotPoints -> 50, Epilog -> {Black, PointSize[0.003], Point[{Cos[GoldenRatio #1] , #2 - #1} & @@@ RandomSample[Partition[zerosGolden, 2, 1], 15000]]}, GridLines -> {Mean /@ Split[Sort[ Cos[GoldenRatio #1] & @@@ Select[Partition[zerosGolden, 2, 1], #[[2]] - #[[1]] > 3.32 &]], Abs[#1 - #2] < 0.4 &], {} }] /. Tooltip[x_, _] :> x |
Finding the points of vertical tangents and “lifting” the cα values of these points to the curve near the observed intersections gives the algebraic prediction for maximal root distances.
✕
verticalTangentsc\[Alpha]\[Delta] = FindRoot[Evaluate[ Block[{\[Alpha] = (1 + Sqrt[5])/ 2, \[Beta] = ((1 + Sqrt[5])/2)^2}, {firstFactor == 0, D[firstFactor == 0 /. c\[Alpha] -> c\[Alpha][\[Delta]], \[Delta]] /. c\[Alpha]'[\[Delta]] -> 0 /. c\[Alpha][\[Delta]] -> c\[Alpha]}]], {c\[Alpha], #1}, {\[Delta], #2}, WorkingPrecision -> 50, PrecisionGoal -> 20, MaxIterations -> 200] & @@@ {{-0.15, 1.63}, {0.68, 1.7}} |
✕
(FindRoot[ Evaluate[ Block[{\[Alpha] = (1 + Sqrt[5])/ 2, \[Beta] = ((1 + Sqrt[5])/2)^2}, firstFactor == 0 /. #[[1]] ]], {\[Delta], 3.3}, WorkingPrecision -> 40, PrecisionGoal -> 20, MaxIterations -> 200] // N[#, 20] &) & /@ verticalTangentsc\[Alpha]\[Delta] |
Both zero distances agree; this means the maximal root distance is about 3.32606.
Through graphical explorations, statistical tests and numerical checks, we have been able to conjecture the answer to the original MathOverflow question: the positions of the peaks in the distribution of zero distances of fGolden are two times the positions of the smallest positive zeros of the four functions ±cos(x) ± cos(φ x) ± cos(φ² x).
The concrete function exhibits an interesting mixture of generic and nongeneric properties due to the golden ratio factors.
Many other structures within the zeros, extrema and inflection points of the sum of three trigonometric functions could be investigated, as well as the relations between the zeros, extrema and other special points. Here are a few examples.
For instance, we could visualize the cosine arguments of the zeros, extrema and inflection points. Here are the argument values modulo 2π.
✕
Graphics3D[{PointSize[0.002], RGBColor[0.36, 0.51, 0.71], Point[Mod[# {1, GoldenRatio, GoldenRatio^2}, 2 Pi] & /@ zerosGolden], RGBColor[0.88, 0.61, 0.14], Point[Mod[# {1, GoldenRatio, GoldenRatio^2}, 2 Pi] & /@ extremasGolden], RGBColor[0.56, 0.69, 0.19], Point[Mod[# {1, GoldenRatio, GoldenRatio^2}, 2 Pi] & /@ inflectionsGolden]}, PlotRange -> {{0, 2 Pi}, {0, 2 Pi}, {0, 2 Pi}}, Axes -> True] // Legended[#, LineLegend[{RGBColor[0.36, 0.51, 0.71], RGBColor[0.88, 0.61, 0.14], RGBColor[0.56, 0.69, 0.19]}, {"zeros", "extrema", "inflection points"}]] & |
Or we could look in more detail at how faithfully the zero distances are reproduced if one takes the random phases seriously and looks at all functions of the form cos(x + φ₁) + cos(φ x + φ₂) + cos(φ² x + φ₃). For a grid of φ₁, φ₂, φ₃ values, we calculate the distance between the smallest positive and the largest negative zero.
✕
zeroDistance0[ f_] := (Min[Select[#, Positive]] - Max[Select[#, Negative]]) &[ Table[x /. FindRoot[f, {x, x0}], {x0, -3, 3, 6/17}]] // Quiet |
The resulting distribution of the zero distances looks quantitatively different from the zero distance distribution of fGolden. But because the peak positions arise from the envelopes, we see the same peak positions as in fGolden.
✕
With[{pp = 40}, Monitor[ zeroData = Table[zeroDistance0[ Cos[x + \[CurlyPhi]1] + Cos[GoldenRatio x + \[CurlyPhi]2] + Cos[GoldenRatio^2 x + \[CurlyPhi]3]], {\[CurlyPhi]3, 0, 2 Pi (1 - 1/pp), 2 Pi/pp}, {\[CurlyPhi]2, 0, 2 Pi (1 - 1/pp), 2 Pi/pp}, {\[CurlyPhi]1, 0, 2 Pi (1 - 1/pp), 2 Pi/pp}];, N@{\[CurlyPhi]3, \[CurlyPhi]2, \[CurlyPhi]1}]] |
✕
Histogram[Flatten[zeroData], 200, GridLines -> {peaksGolden, {}}] |
Or we could model the function value distribution in the general case. The function value distribution for generic α, β is quite different from the one observed above for fGolden. Generically (see the examples in the postlude) it has a characteristic flat middle part in the interval −1 ≤ y ≤ 1. Assuming again that locally the function looks like cos(x + φ₁) + cos(α x + φ₂) + cos(β x + φ₃) with uniformly distributed φ₁, φ₂ and φ₃, we can express the probability to see the function value y as:
✕
P(y) = 1/(2π)³ ∫₀^{2π} ∫₀^{2π} ∫₀^{2π} δ(y − (cos(φ₁) + cos(φ₂) + cos(φ₃))) dφ₁ dφ₂ dφ₃ |
Integrating out the delta function, we obtain the following.
✕
P(y) ~ ∫₀^{2π} ∫₀^{2π} θ(1 − (y − cos(φ₁) − cos(φ₂))²) / √(1 − (y − cos(φ₁) − cos(φ₂))²) dφ₁ dφ₂ |
Carrying out the integral numerically gives exactly the function value distribution shown below.
✕
Pfv[y_] := NIntegrate[ Piecewise[{{1/(\[Pi] Sqrt[ 1 - (y - Cos[\[CurlyPhi]1] - Cos[\[CurlyPhi]2])^2]), 1 - (y - Cos[\[CurlyPhi]1] - Cos[\[CurlyPhi]2])^2 >= 0}}], {\[CurlyPhi]1, 0, 2 Pi}, {\[CurlyPhi]2, 0, 2 Pi}, PrecisionGoal -> 3]/(2 Pi)^2 |
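As a sanity check of this integral (independent of the quadrature), one can Monte Carlo sample y = cos(φ₁) + cos(φ₂) + cos(φ₃) with independent uniform phases in Python; the samples are symmetric, supported on [−3, 3], and put roughly half of the mass into the flat middle part |y| ≤ 1:

```python
import random
from math import cos, pi

random.seed(1)  # fixed seed for reproducibility
samples = [sum(cos(random.uniform(0, 2 * pi)) for _ in range(3))
           for _ in range(200_000)]

mean = sum(samples) / len(samples)
frac_flat = sum(1 for y in samples if abs(y) <= 1) / len(samples)
print(mean, frac_flat)  # mean ≈ 0; a bit more than half the mass in |y| <= 1
```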
✕
lpPfv = ListPlot[Join[Reverse[{-1, 1} # & /@ #], #] &[ Monitor[Table[{y, Pfv[y]}, {y, 0, 3, 0.05}], y]], Joined -> True, Filling -> Axis] |
This distribution agrees quite well with the value distribution observed for fPi.
✕
Show[{Histogram[Table[fPi[x], {x, 0, 1000, 0.001}], 100, "PDF"], lpPfv}] |
The observed mean spacing between zeros is ≈1.78.
✕
meanGoldenZeroSpacing = Mean[differencesGolden] |
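The mean spacing can be estimated independently with a crude sign-change count on a fine grid in Python (a grid-based estimate of mine; exceptionally shallow crossings could in principle be missed, but they are rare):

```python
from math import cos, sqrt

phi = (1 + sqrt(5)) / 2
f = lambda x: cos(x) + cos(phi * x) + cos(phi**2 * x)

# count sign changes of f on a fine grid over [0, X];
# the mean zero spacing is then approximately X / count
X, n = 1000.0, 1_000_000
count, prev = 0, f(0.0)
for i in range(1, n + 1):
    cur = f(i * X / n)
    if prev * cur < 0:
        count += 1
    prev = cur

spacing = X / count
print(spacing)  # ≈ 1.78
```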
Here is a plot showing how the average mean spacing evolves with an increasing number of zeros taken into account. The convergence is approximately proportional to x^(−0.88).
✕
{ListPlot[Take[#, 100] &@Transpose[{ Most[zerosGolden], MapIndexed[#1/#2[[1]] &, Accumulate[differencesGolden]]}], PlotRange -> All, GridLines -> {{}, {meanGoldenZeroSpacing}}], Show[{ListLogLogPlot[Transpose[{ Most[zerosGolden], Abs[ MapIndexed[#1/#2[[1]] &, Accumulate[differencesGolden]] - meanGoldenZeroSpacing]}]], LogLogPlot[5 x^-0.88, {x, 1, 2 10^5}, PlotStyle -> Gray]}]} |
If one uses the random phase approximation and averages over all zero distances with at least one zero in the interval [0, 2π], using a 32×32×32 grid of values for the phases, one gets a value that agrees with the empirically observed spacing to better than 1%.
✕
zeroSpacings[{\[CurlyPhi]1_, \[CurlyPhi]2_, \[CurlyPhi]3_}] := Module[{sols, pos1, pos2}, sols = Quiet[ Sort[x /. Solve[N[Cos[x + \[CurlyPhi]1] + Cos[GoldenRatio x + \[CurlyPhi]2] + Cos[GoldenRatio^2 x + \[CurlyPhi]3]] == 0 \[And] -6 < x < 12, x]]]; pos1 = Max[Position[sols, _?(# < 0 &)]]; pos2 = Min[Position[sols, _?(# > 2 Pi &)]]; Differences[Take[sols, {pos1, pos2}]]] |
✕
Module[{pp = 32, sp}, Monitor[ spacingArray = Table[ sp = zeroSpacings[{\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}], {\ \[CurlyPhi]1, 0, 2 Pi (1 - 1/pp), 2 Pi/pp}, {\[CurlyPhi]2, 0, 2 Pi (1 - 1/pp), 2 Pi/pp}, {\[CurlyPhi]3, 0, 2 Pi (1 - 1/pp), 2 Pi/pp}];, {N@{\[CurlyPhi]1, \[CurlyPhi]2, \[CurlyPhi]3}, sp}]] |
✕
Mean[Mean /@ Flatten[spacingArray, 2]] |
We could also look in more detail at the distribution of the distances to the nearest zero from the extrema.
✕
nfZerosGolden = Nearest[zerosGolden] |
✕
Histogram[Abs[# - nfZerosGolden[#][[1]]] & /@ extremasGolden, 1000] |
We can look at the distribution of the (unsigned) areas under a curve between successive zeros.
✕
area[{a_, b_}] = Integrate[fGolden[x], {x, a, b}] |
✕
Histogram[Abs[area /@ Partition[zerosGolden, 2, 1]], 500] |
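Because fGolden = cos(x) + cos(φx) + cos(φ²x) has an elementary antiderivative, the area integral used above has a simple closed form. Here is a stdlib-only Python sketch (the helper names are mine):

```python
from math import sin, cos, sqrt

phi = (1 + sqrt(5)) / 2

def F(x):
    # antiderivative of fGolden(x) = cos(x) + cos(phi x) + cos(phi^2 x)
    return sin(x) + sin(phi * x) / phi + sin(phi**2 * x) / phi**2

def area(a, b):
    # signed area under fGolden between a and b
    return F(b) - F(a)

print(area(0.0, 1.0))  # ≈ 1.65
```

A finite-difference check that F′ really equals fGolden makes a convenient unit test for the sketch.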
We should check whether natural generalizations of the conjectured peak positions hold. For instance, here is a sum of four cosine terms that uses the plastic constant.
✕
P = N@Root[-1 - # + #^3 &, 1] // ToRadicals; fPlastic[x_] = Cos[x] + Cos[P x] + Cos[P^2 x] + Cos[P^3 x]; |
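For reference, the plastic constant P used here is the unique real root of x³ = x + 1 (that is what Root[-1 - # + #^3 &, 1] denotes); a stdlib-only bisection in Python confirms its numerical value:

```python
# the plastic constant is the unique real root of x^3 = x + 1;
# it lies between 1 and 2, so bisection on that bracket suffices
def plastic_constant(tol=1e-12):
    g = lambda x: x**3 - x - 1
    a, b = 1.0, 2.0
    while b - a > tol:
        m = (a + b) / 2
        if g(a) * g(m) <= 0:
            b = m
        else:
            a = m
    return (a + b) / 2

P = plastic_constant()
print(P)  # ≈ 1.3247179572
```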
We again calculate 100k zeros. We also calculate the conjectured peak positions and plot both together. One of the predicted peaks turns out to be an edge; the plastic constant is nongeneric for four cosine terms in the same sense as the golden ratio is for three terms.
✕
zerosPlastic = findZeros[fPlastic, 10^5]; |
✕
peaksPlastic = (2 x /. FindRoot[{Cos[x], Cos[P x] , Cos[P^2 x], Cos[P^3 x]}.#1, {x, #2}]) & @@@ {{{1, 1, 1, 1}, 1}, {{-1, 1, 1, 1}, 0.8}, {{1, -1, 1, 1}, 0.8}, {{1, 1, -1, 1}, 1}, {{1, 1, 1, -1}, 1.6}} |
✕
Histogram[Differences[zerosPlastic], 1000, PlotRange -> All, GridLines -> {peaksPlastic, {}}] |
The peak positions again agree perfectly. Plotting the function near the zeros with the peak distances shows envelopes similar to those in the three-term case.
✕
With[{aux = Sort@Transpose[{Differences[zerosPlastic], Most[zerosPlastic]}]}, Function[v, Show[Plot[fPlastic[# + t], {t, -2, 0 + 5}, PlotRange -> All, PlotStyle -> Directive[RGBColor[0.36, 0.51, 0.71], Thickness[0.001], Opacity[0.3]]] & /@ Take[Sort[{Abs[#1 - v], #2} & @@@ aux], 100][[All, 2]], PlotLabel -> Row[{"x", "\[ThinSpace]\[Equal]\[ThinSpace]", NumberForm[v, 3]}], PlotRange -> All, Frame -> True, GridLines -> {{0}, {-2, 4}}] /. l_Line :> Mouseover[l, {Red, Thickness[0.002], Opacity[1], l}]] /@ peaksPlastic] |
A plot of the extrema invites us to make visibility graphs from the extrema. Remember that a visibility graph can be constructed from time series–like data by connecting all points that are “visible” to each other. The following graphic is mostly self-explanatory. We consider the vertical lines from the axis as visibility blockers (rather than the actual function graph). The function visibleQ determines if the point p1 is visible from the point p2 (and vice versa), where “visible” means that no line from the axis to any point between p1 and p2 blocks sight.
✕
visibleQ[{p1_, p2_}] := True visibleQ[{p1_, middle__, p2_}] := (And @@ (C[{p1, #, p2}] & /@ {middle})) /. C :> tf |
✕
tf[{{t1_, v1_}, {s_, u_}, {t2_, v2_}}] := With[{U = v1 + (t1 - s)/(t1 - t2) (v2 - v1)}, If[u < 0, U > 0 || U < u, U > u || U < 0]] |
✕
visibilityEdges[pts_] := Monitor[Table[ If[visibleQ[Take[pts, {i, j}]], i \[UndirectedEdge] j, {}], {i, Length[pts] - 1}, {j, i + 1, Length[pts]}], {i, j}] // Flatten |
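For readers who prefer a language-independent sketch of the visibility test: the following Python function implements the standard "natural visibility" criterion for positive data (a point pair is connected if every intermediate point lies strictly below the straight line joining them). Note that this is a simplification; the blog's tf rule above additionally handles negative values via blockers drawn from the axis.

```python
def visible(points, i, j):
    """Natural visibility criterion for time-series points (t_k, v_k):
    i and j see each other if every intermediate point lies strictly
    below the straight line connecting them."""
    (ti, vi), (tj, vj) = points[i], points[j]
    for k in range(i + 1, j):
        tk, vk = points[k]
        if vk >= vi + (vj - vi) * (tk - ti) / (tj - ti):
            return False
    return True

def visibility_edges(points):
    """All visible pairs (i, j), i < j; consecutive points always see each other."""
    n = len(points)
    return [(i, j) for i in range(n - 1)
            for j in range(i + 1, n) if visible(points, i, j)]

# tiny example: the middle peak at (1, 3) blocks 0 from seeing 2 and 3
edges = visibility_edges([(0, 1.0), (1, 3.0), (2, 1.0), (3, 2.0)])
```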
✕
extremasGoldenPoints[n_] := {#, fGolden[#]} & /@ Take[extremasGolden, n] |
✕
With[{pts = {#, fGolden[#]} & /@ Take[extremasGolden, 10]}, Show[{Plot[fGolden[x], {x, 0, 11}, PlotStyle -> RGBColor[0.36, 0.51, 0.71]], ListPlot[pts, Filling -> Axis, FillingStyle -> RGBColor[0.88, 0.61, 0.14]], Graphics[{RGBColor[0.36, 0.51, 0.71], PointSize[0.02], MapIndexed[{Point[{#, fGolden[#]}], Black , Text[ #2[[1]], {#, fGolden[#] + 0.2}]} &, Take[ extremasGolden, 10]], Gray, Line[Map[pts[[#]] &, List @@@ visibilityEdges[extremasGoldenPoints[10]], {-1}]]}]}]] |
Here is a larger graph. The maxima with are responsible for the far-distance edges (e.g. 1–179).
✕
visGr = Graph[ visibilityEdges[{#, fGolden[#]} & /@ Take[extremasGolden, 200]], VertexLabels -> "Name", EdgeLabels -> Placed["Name", Tooltip]] |
The relevant information is contained in the degree distribution of the visibility graph.
✕
ListLogLogPlot[Tally[VertexDegree[visGr]]] |
In addition to more summands, we could also look at products of cosine functions.
✕
f\[CapitalPi][x_] = Cos[x]*Product[If[IntegerQ[Sqrt[k]], 1, Cos[Sqrt[k] x]], {k, 2, 6}] |
✕
Plot[f\[CapitalPi][x], {x, 0, 20}, PlotRange -> All] |
✕
cosZeros[\[Alpha]_, n_] := Join[Table[(Pi/2 + 2 \[Pi] k)/\[Alpha], {k, 0, n}], Table[(-Pi/2 + 2 \[Pi] k)/\[Alpha], {k, n}]] |
✕
zeros\[CapitalPi] = With[{n = 10000}, Sort[N@Flatten[{cosZeros[1, n], Table[If[IntegerQ[Sqrt[k]], {}, cosZeros[Sqrt[k], n]], {k, 2, 6}]}]]]; |
✕
Histogram3D[ Partition[Select[Differences[zeros\[CapitalPi]], # < 3 &], 2, 1], 100] |
Modulo an increase in the degree of the polynomials involved, such products should also be amenable to the algebraic approach used above for the nonsuccessive zero distances.
If instead of we had used , the distribution of the zero differences would be much less interesting. (For generic , , there is no substantial difference in the zero-distance distribution between , but , is a nongeneric situation).
Calculating the distribution of spacings gives a mostly uniform distribution. The small visible structures on the right are numerical artifacts, and calculating the zeros with a higher precision makes most of them go away.
✕
fGoldenSin[x_] := Sin[x] + Sin[GoldenRatio x] + Sin[GoldenRatio^2 x] zerosGoldenSin = findZeros[fGoldenSin, 10^5]; Histogram[Differences[zerosGoldenSin], 1000, PlotRange -> All] |
Visually, the spacing of + + does not seem to depend on .
✕
ContourPlot[ Cos[x + \[CurlyPhi]] + Cos[Pi x + \[CurlyPhi]] + Cos[Pi^2 x + \[CurlyPhi]] == 0, {x, -4 Pi, 4 Pi}, {\[CurlyPhi], 0, 2 Pi}, PlotPoints -> 120, AspectRatio -> 1/2, GridLines -> {{}, {Pi/2, Pi, 3/2 Pi}}] |
Let’s end with the example mentioned in the introduction: + + .
✕
fSqrtSin[x_] := Sin[x] + Sin[Sqrt[2] x] + Sin[Sqrt[3] x] zerosSqrtSin = findZeros[fSqrtSin, 10^5]; |
The positions of the peaks are described by the formulas conjectured above.
✕
{peak1, peak2, peak3} = {2 x /. Solve[-Cos[x] + Cos[Sqrt[2] x] + Cos[Sqrt[3] x] == 0 \[And] 0 < x < 1, x][[1]], 2 x /. Solve[+Cos[x] + Cos[Sqrt[2] x] + Cos[Sqrt[3] x] == 0 \[And] 1 < x < 4, x][[1]], 2 x /. Solve[+Cos[x] + Cos[Sqrt[2] x] - Cos[Sqrt[3] x] == 0 \[And] 1 < x < 2, x][[1]]} |
✕
Histogram[Differences[zerosSqrtSin], 1000, {"Log", "Count"}, PlotRange -> All, GridLines -> {{peak1, peak2, peak3}, {}}] |
We note that one of the roots of is very near one of the observed peaks, but it does not describe the peak position.
✕
peak2Alt = x /. Solve[ Sin[x] + Sin[Sqrt[2] x] + Sin[Sqrt[3] x] == 0 \[And] 1 < x < 4, x][[1]] |
✕
Histogram[Select[Differences[zerosSqrtSin], 2.2 < # < 2.35 &], 200, PlotRange -> All, GridLines -> {{{peak2, Red}, {peak2Alt, Blue}}, {}}] |
We will end our graphical/numerical investigation of sums of three cosines for today.
A natural question is the following: can one find a symbolic representation of the density of the zero distances along the algebraic curve derived above? Based on the calculated zeros, we have the following empirical density along the curve.
✕
sdkc\[Alpha]\[Delta] = SmoothKernelDistribution[{Cos[GoldenRatio #], #2 - #1} & @@@ Take[Partition[zerosGolden, 2, 1], All], 0.03, PerformanceGoal -> "Quality"] |
✕
Plot3D[PDF[ sdkc\[Alpha]\[Delta], {c\[Alpha], \[Delta]}], {c\[Alpha], -1.1, 1.1}, {\[Delta], 0, 4}, PlotRange -> All, Exclusions -> {}, PlotPoints -> 120, MeshFunctions -> {#3 &}, AxesLabel -> {c\[Alpha], \[Delta]}] |
How can the behavior of a concrete sum of three cosines be classified around their zeros?
For the sums for which an algebraic relation between the three cosine terms exists, the zeros form curves in the plane. Along these curves, the shape of the function around their zeros changes smoothly. And at the self-intersection of the curve, shapes “split.” The next graphic plots the zeros for the above example . Mouse over the points to see shifted versions of near the zero under consideration.
✕
nfc\[Alpha]\[Delta]Brass = Nearest[c\[Alpha]\[Delta]ListBrass = ({Cos[\[Phi]Brass #], #2 - #1} \ -> #1) & @@@ Take[Partition[zerosBrass, 2, 1], All]]; |
✕
makeZeroPlotBrass[{c\[Alpha]_, \[Delta]_}] := Module[{pts = nfc\[Alpha]\[Delta]Brass[{c\[Alpha], \[Delta]}, {All, 0.01}]}, Plot[Evaluate[(Cos[# + x] + Cos[\[Phi]Brass (# + x)] + Cos[\[Phi]Brass^2 (# + x)]) & /@ RandomSample[pts, UpTo[10]]], {x, -4, 4}]] |
✕
Graphics[{PointSize[0.002], Point[First[#]] & /@ RandomSample[c\[Alpha]\[Delta]ListBrass, 25000]}, Frame -> True, PlotRange -> {{-1, 1}, {0, 2.6}}] /. Point[l_] :> (Tooltip[Point[ l], Dynamic[makeZeroPlotBrass[l]]] ) |
For generic , with no algebraic relation between them, the zeros do not form curves in the plane. One could display the zeros in space, where they form a surface (see the above example of the function ), or project them into the plane. While the projection gives only point clouds, it is still instructive to see the possible curve shapes.
✕
zerosSqrt23 = findZeros[fSqrt, 100000]; |
✕
nfc\[Alpha]\[Delta]Sqrt23 = Nearest[c\[Alpha]\[Delta]ListSqrt23 = ({Cos[ Sqrt[2] #], #2 - #1} -> #1) & @@@ Take[Partition[zerosSqrt23, 2, 1], All]]; |
✕
makeZeroPlotSqrt23[{c\[Alpha]_, \[Delta]_}] := Module[{pts = nfc\[Alpha]\[Delta]Sqrt23[{c\[Alpha], \[Delta]}, {All, 0.01}]}, Plot[Evaluate[ fSqrt[# + x] & /@ RandomSample[pts, UpTo[10]]], {x, -4, 4}]] |
✕
Graphics[{PointSize[0.002], Point[First[#]] & /@ RandomSample[c\[Alpha]\[Delta]ListSqrt23, 25000]}, Frame -> True, PlotRange -> {{-1, 1}, {0, 5}}, AspectRatio -> 2] /. Point[l_] :> (Tooltip[Point[ l], Dynamic[makeZeroPlotSqrt23[l]]] ) |
Above, based on the observed distribution of the distances between consecutive zeros, a power-law-like decay of the distribution was conjectured. In this appendix, we will give some numerical evidence for the power law based on the envelope that defines the smallest zero.
We remind ourselves that we are interested in the smallest zero of + + subject to the constraints , .
Before modeling the power law, let us have a closer look at the data, especially the minima with function value approximately equal to –1 and slope 0, together with the zero that follows each such minimum. We use the list zerosAndExtremaGolden from above to find such pairs.
✕
minimaAtMinus1 = Select[Cases[ Partition[zerosAndExtremaGolden, 2, 1], {{_, "extrema"}, {_, "zero"}}], (Abs[fGolden[#[[1, 1]]] + 1] < 0.02 \[And] Abs[fGolden'[#[[1, 1]]]] < 0.02) &]; |
These minima split visibly into two groups.
✕
Show[Plot[Evaluate[fGolden[# + t]], {t, -1, 3}, PlotStyle -> Directive[Thickness[0.001], RGBColor[0.36, 0.51, 0.71]]] & /@ RandomSample[minimaAtMinus1, 100][[All, 1, 1]], PlotRange -> All] |
We select the zeros smaller than 0.6.
✕
minimaAtMinus1B = Select[minimaAtMinus1, #[[2, 1]] - #[[1, 1]] < 0.6 &]; |
✕
minimumAndZeroDistances = {#[[1, 1]], #[[2, 1]] - #[[1, 1]]} & /@ minimaAtMinus1B; |
At the minimum , we can locally write + +, which defines three phases ,
✕
\[CurlyPhi]1\[CurlyPhi]2\[CurlyPhi]3Triples = ({Mod[-#1, 2 Pi], Mod[-GoldenRatio #1, 2 Pi], Mod[-GoldenRatio^2 #1, 2 Pi], #2} & @@@ minimumAndZeroDistances) /. x_?(# > 5 &) :> (2 Pi - x); |
Plotting and shows an approximate linear relationship.
✕
ListPlot[{{#1, #2} & @@@ \[CurlyPhi]1\[CurlyPhi]2\[CurlyPhi]3Triples, \ {#1, #3} & @@@ \[CurlyPhi]1\[CurlyPhi]2\[CurlyPhi]3Triples}, PlotLegends -> {Subscript["\[CurlyPhi]", 2], Subscript["\[CurlyPhi]", 3]}, AxesLabel -> {Subscript["\[CurlyPhi]", 1]}] |
Here are the triples observed in space.
✕
Graphics3D[{RGBColor[0.99, 0.81, 0.495], Sphere[#1, 0.0001 #2] & @@@ Tally[Round[{#1, #2, #3} & @@@ \ \[CurlyPhi]1\[CurlyPhi]2\[CurlyPhi]3Triples, 0.01]]}, PlotRange -> All, Axes -> True, AxesLabel -> {Subscript["\[CurlyPhi]", 1], Subscript["\[CurlyPhi]", 2], Subscript["\[CurlyPhi]", 3]}] |
As a function of the position of the first zero after , where is the smallest root of , we obtain the following graphic. Because near we expect the relative phases to be , and , we display , and .
✕
ListPlot[{{#4, #1} & @@@ \[CurlyPhi]1\[CurlyPhi]2\[CurlyPhi]3Triples, \ {#4, #2 - Pi} & @@@ \[CurlyPhi]1\[CurlyPhi]2\[CurlyPhi]3Triples, {#4, #3 - Pi} & @@@ \[CurlyPhi]1\[CurlyPhi]2\[CurlyPhi]3Triples}, PlotLegends -> {Subscript["\[CurlyPhi]", 1], Subscript["\[CurlyPhi]", 2], Subscript["\[CurlyPhi]", 3]}, AxesLabel -> {Style["z", Italic]}] |
The distributions of the observed phases near the minima with function value –1 are all approximately uniform. (We will use this fact below to model the power-law decay.)
✕
Histogram[#1, 50, PlotLabel -> Subscript["\[CurlyPhi]", #2]] & @@@ Transpose[{Take[ Transpose[\[CurlyPhi]1\[CurlyPhi]2\[CurlyPhi]3Triples], 3], {1, 2, 3}}] |
Now let us compare this observed phase distribution with all mathematically possible solutions from the envelope conditions.
✕
\[CurlyPhi]1\[CurlyPhi]2Of\[CurlyPhi]3Solution = Solve[{Cos[\[CurlyPhi]1] + Cos[\[CurlyPhi]2] + Cos[\[CurlyPhi]3] == -1, -Sin[\[CurlyPhi]1] - GoldenRatio Sin[\[CurlyPhi]2] - GoldenRatio^2 Sin[\[CurlyPhi]3] == 0}, {\[CurlyPhi]1, \[CurlyPhi]2}]; |
✕
\[CurlyPhi]1\[CurlyPhi]2Of\[CurlyPhi]3CF = Compile[{{\[CurlyPhi]3, _Complex}}, Evaluate[{\[CurlyPhi]1, \[CurlyPhi]2} /. \[CurlyPhi]1\[CurlyPhi]2Of\ \[CurlyPhi]3Solution /. ConditionalExpression[\[Zeta]_, _] :> \[Zeta] /. _C :> 0]] |
✕
\[CurlyPhi]1\[CurlyPhi]2Of\[CurlyPhi]3[\[CurlyPhi]3_Real] := Append[#, \[CurlyPhi]3] & /@ Cases[Chop[\[CurlyPhi]1\[CurlyPhi]2Of\[CurlyPhi]3CF[\[CurlyPhi]3]], \ {_Real, _Real}] |
For a given value of , we have either no, two or four solutions for . The following interactive demonstration allows us to explore the solutions as a function of . We see the two types of solutions represented by the list minimaAtMinus1 above.
✕
Manipulate[ With[{phases = \ \[CurlyPhi]1\[CurlyPhi]2Of\[CurlyPhi]3[\[CurlyPhi]3]}, Plot[Evaluate[(Cos[x + #[[1]]] + Cos[GoldenRatio x + #[[2]]] + Cos[GoldenRatio^2 x + #[[3]]]) & /@ phases], {x, -1, 3}, PlotLegends -> Placed[({Subscript["\[CurlyPhi]", 1], Subscript["\[CurlyPhi]", 2], Subscript["\[CurlyPhi]", 3]} == NumberForm[#, 3] & /@ phases), Below]]] // Quiet, {{\[CurlyPhi]3, Pi + 0.2, Subscript["\[CurlyPhi]", 3]}, 0, 2 Pi, Appearance -> "Labeled"}, TrackedSymbols :> True] |
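The two envelope conditions can also be solved numerically outside the Wolfram Language. Here is a hedged Python sketch (a stand-in for the Solve call above, with illustrative parameters): a coarse grid search for a starting point, polished by Newton iterations with the analytic Jacobian.

```python
import numpy as np

PHI = (1 + 5**0.5) / 2  # golden ratio

def envelope_phases(phi3, grid=400):
    """Solve the envelope conditions
        cos(p1) + cos(p2) + cos(phi3) == -1
        sin(p1) + PHI sin(p2) + PHI^2 sin(phi3) == 0
    for (p1, p2): coarse grid search, then Newton polishing."""
    def F(p1, p2):
        return np.array([np.cos(p1) + np.cos(p2) + np.cos(phi3) + 1,
                         np.sin(p1) + PHI * np.sin(p2) + PHI**2 * np.sin(phi3)])
    # coarse search for a starting point
    ps = np.linspace(0, 2 * np.pi, grid)
    P1, P2 = np.meshgrid(ps, ps)
    R = (np.cos(P1) + np.cos(P2) + np.cos(phi3) + 1)**2 + \
        (np.sin(P1) + PHI * np.sin(P2) + PHI**2 * np.sin(phi3))**2
    idx = np.unravel_index(np.argmin(R), R.shape)
    p1, p2 = P1[idx], P2[idx]
    # Newton iterations with the analytic Jacobian
    for _ in range(50):
        J = np.array([[-np.sin(p1), -np.sin(p2)],
                      [np.cos(p1), PHI * np.cos(p2)]])
        step = np.linalg.solve(J, F(p1, p2))
        p1, p2 = p1 - step[0], p2 - step[1]
    return p1, p2, np.abs(F(p1, p2)).max()

p1, p2, residual = envelope_phases(np.pi + 0.2)
```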
Here are the possible phases that are compatible with the envelope conditions as a function of , the position of the first zero.
✕
\[CurlyPhi]1\[CurlyPhi]2\[CurlyPhi]3zList = Table[{#, x /. Quiet[ Solve[gGolden[x, #] == 0 \[And] 0.4 < x < 0.6, x]]} & /@ \ \[CurlyPhi]1\[CurlyPhi]2Of\[CurlyPhi]3[\[CurlyPhi]3], {\[CurlyPhi]3, Pi - 0.5, Pi + 0.5, 1/1001}]; |
✕
\[Pi]ize[x_] := If[x < -Pi/2, 2 Pi + x, x]; \[CurlyPhi]1\[CurlyPhi]2\[CurlyPhi]3zListB = {\[Pi]ize /@ #1, #2} & @@@ Cases[Flatten[\[CurlyPhi]1\[CurlyPhi]2\[CurlyPhi]3zList, 1], {_, {_Real}}]; |
✕
ListPlot[{{#2[[1]], #1[[1]] - 0} & @@@ \[CurlyPhi]1\[CurlyPhi]2\[CurlyPhi]3zListB, {#2[[1]], #1[[2]] - Pi} & @@@ \[CurlyPhi]1\[CurlyPhi]2\[CurlyPhi]3zListB, {#2[[1]], #1[[3]] - Pi} & @@@ \[CurlyPhi]1\[CurlyPhi]2\[CurlyPhi]3zListB}] |
Last week, I read Michael Berry’s paper, “Laplacian Magic Windows.” Over the years, I have read many interesting papers by this longtime Mathematica user, but this one stood out for its maximizing of the product of simplicity and unexpectedness. Michael discusses what he calls the magic window. For 70+ years, we have known about holograms, and now we know about magic windows. So what exactly is a magic window? Here is a sketch of the optics of one:
Parallel light falls onto a glass sheet that is planar on one side and has some gentle surface variation on the other (the bumps in the above image are vastly overemphasized; the bumps of a real magic window would be minuscule). The light gets refracted by the magic window (the deviation angles of the refracted rays are also overemphasized in the graphic) and falls onto a wall. Although the window's bumpiness shows no recognizable shape or pattern, the light-density variations on the wall show a clearly recognizable image. Starting with the image that one wants to see on the wall, one can always construct a window that shows the selected image. The variations in the thickness of the glass are assumed to be quite small, and the imaging plane is assumed to be not too far away, so that the refracted light does not form caustics—as one sees them, for instance, at the bottom of a swimming pool in sunny conditions.
Now, how should the window surface look to generate any pre-selected image on the wall? It turns out that the image visible on the wall is the Laplacian of the window surface. Magic windows sound like magic, but they are just calculus (differentiation, to be precise) in action. Isn’t this a neat application of multivariate calculus? Schematically, these are the mathematical steps involved in a magic window.
Implementation-wise, the core steps are the following:
And while magic windows are a 2017 invention, their roots go back hundreds of years to so-called magic mirrors. Magic mirrors are the mirror equivalent of magic windows: they too can act as optical Laplace operators (see the following).
Expressed more mathematically: let the height of the bumpy side of the glass surface be f(x,y). Then the intensity of the light on the wall is approximately Δ_{x,y} f(x,y), where Δ is the Laplacian ∂^{2}/∂x^{2}+∂^{2}/∂y^{2}. Michael calls such a window a “magic window.” It is magic because the glass surface height f(x,y) does not in any way resemble Δ_{x,y} f(x,y).
It sounds miraculous that a window can operate as a Laplace operator. So let's do some numerical experiments to convince ourselves that this really does work. Let's start with a goat that we want to use as the image to be modeled. We just import a cute-looking Oberhasli dairy goat from the internet.
The gray values of the pixels can be viewed as a function h: ℝ×ℝ → [0,1]. Interpolation allows us to use this function constructively.
Here is a 3D plot of the goat function ifGoat.
And we can solve the Poisson equation with the image as the right-hand side: Δ f = image using NDSolveValue.
We will use Dirichlet boundary conditions for now. (But the boundary conditions will not matter for the main argument.)
The Poisson equation solution is quite a smooth function; the inverse of the Laplace operator is a smoothing operation. No visual trace of the goat seems to be left.
Overall it is smooth, and it is also still smooth when zoomed in.
Even when repeatedly zoomed in.
The overall shape of the Poisson equation solution can be easily understood through the Green’s function of the Laplace operator.
We calculate and visualize the first few terms (individually) of the double sum from the Green’s function.
Taking 25^{2} terms into account, we have the following approximations for the Poisson equation solution and its Laplacian. The overall shape is the same as the previous numerical solution of the Poisson equation.
For this small number of Fourier modes, the outline of the goat is recognizable, but its details aren't.
Applying the Laplace operator to the PDE solutions recovers (by construction) a version of the goat. Due to finite element discretization and numerical differentiation, the resulting goat is not quite the original one.
A faster and less discretization-dependent way to solve the Poisson equation uses the fast Fourier transform (FFT).
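The FFT approach translates directly to other languages. Here is a minimal Python sketch of the same idea (not the blog's Wolfram Language code): divide the Fourier transform of the right-hand side by the symbol of the Laplacian, zeroing the free constant mode. The check uses a right-hand side with a known closed-form solution.

```python
import numpy as np

def poisson_fft(rhs, length=2 * np.pi):
    """Solve the periodic Poisson equation Laplacian(f) = rhs on a square
    grid via the FFT. rhs must have zero mean (periodic solvability)."""
    n = rhs.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=length / n)  # integer wavenumbers
    kx, ky = np.meshgrid(k, k, indexing="ij")
    ksq = kx**2 + ky**2
    ksq[0, 0] = 1.0                  # avoid 0/0; this mode is zeroed below
    f_hat = np.fft.fft2(rhs) / (-ksq)
    f_hat[0, 0] = 0.0                # fix the free additive constant
    return np.fft.ifft2(f_hat).real

# check: Laplacian(-sin(x) cos(y) / 2) == sin(x) cos(y)
n = 64
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
rhs = np.sin(X) * np.cos(Y)
f = poisson_fft(rhs)
err = np.abs(f - (-rhs / 2)).max()
```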
This solution recovers the goat more faithfully. Here is the recovered goat after interpolating the function values.
Taking into account that any physical realization of a magic window made from glass will unavoidably have imperfections, a natural question to ask is: What happens if one adds small perturbations to the solution of the Poisson equation?
The next input modifies each grid point randomly by a perturbation of relative size 10^{-p}. We see that for this goat, the relative precision of the surface has to be on the order of 10^{-6} or better—a challenging but realizable mechanical accuracy.
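Why is such extreme precision needed? The Laplacian amplifies grid-scale noise by roughly 1/h², so the error in the recovered image grows linearly with the perturbation amplitude on top of a large amplification factor. Here is a minimal Python sketch of this scaling (with an illustrative stand-in surface, not the goat data):

```python
import numpy as np

def laplacian5(f, h):
    """5-point finite-difference Laplacian on a periodic grid."""
    return (np.roll(f, 1, 0) + np.roll(f, -1, 0) +
            np.roll(f, 1, 1) + np.roll(f, -1, 1) - 4 * f) / h**2

n = 64
h = 2 * np.pi / n
x = np.arange(n) * h
X, Y = np.meshgrid(x, x, indexing="ij")
f = np.sin(X) * np.sin(Y)            # stand-in for the window surface
rng = np.random.default_rng(0)
noise = rng.standard_normal((n, n))  # one fixed noise pattern, rescaled below

def recovery_error(amplitude):
    """Max error of the recovered 'image' under a surface perturbation."""
    return np.abs(laplacian5(f + amplitude * noise, h) - laplacian5(f, h)).max()

# error grows linearly with the perturbation: ~100x for 100x larger noise
ratio = recovery_error(1e-3) / recovery_error(1e-5)
```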
To see how the goat emerges after differentiation (Δ = ∂^{2}/∂x^{2}+∂^{2}/∂y^{2}), here are the partial derivatives.
And because we have ∂^{2}/∂x^{2}+∂^{2}/∂y^{2} = (∂/∂x+ⅈ ∂/∂y)(∂/∂x–ⅈ ∂/∂y), we also look at the Wirtinger derivatives.
We could also just use a simple finite difference formula to get the goat back. This avoids any interpolation artifacts and absolutely faithfully reproduces the original goat.
The differentiation can even be carried out as an image processing operation.
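The image-processing view works because the 5-point Laplacian is just a convolution with a small kernel (in the Wolfram Language, ListConvolve would do this). A hedged Python sketch, checking that kernel convolution and the roll-based stencil agree:

```python
import numpy as np

# the 3x3 kernel of the 5-point discrete Laplacian (unit grid spacing)
kernel = np.array([[0, 1, 0],
                   [1, -4, 1],
                   [0, 1, 0]])

def convolve_periodic(img, ker):
    """Periodic 3x3 convolution: out[i,j] = sum ker[di+1,dj+1]*img[i+di,j+dj]."""
    out = np.zeros_like(img, dtype=float)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            out += ker[di + 1, dj + 1] * np.roll(np.roll(img, -di, 0), -dj, 1)
    return out

rng = np.random.default_rng(2)
img = rng.random((32, 32))
stencil = (np.roll(img, 1, 0) + np.roll(img, -1, 0) +
           np.roll(img, 1, 1) + np.roll(img, -1, 1) - 4 * img)
diff = np.abs(convolve_periodic(img, kernel) - stencil).max()
```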
So far, nothing really interesting. We integrated and differentiated a function. Let’s switch gears and consider the refraction of a set of parallel light rays on a glass sheet.
We consider the lower part of the glass sheet planar and the upper part slightly wavy, with an explicit description height = f(x,y). The index of refraction is n, and we follow the light rays (coming from below) up to the imaging plane at height = Z. Here is a small Manipulate that visualizes this situation for the example surface f(x,y) = 1 + ε (cos(α x) + cos(β y)).
We do want the upper surface of the glass nearly planar, so we use the factor ε in the previous equation.
The reason we want the upper surface mostly planar is that we want to avoid rays that “cross” near the surface and form caustics. We want to be in a situation where the density of the rays is position dependent, but the rays do not yet cross. This restricts the values of n, Z and the height of the surface modulation.
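The per-ray refraction step used in such a ray trace is just the vector form of Snell's law. Here is a self-contained Python sketch of that single step (an illustration of the construction, not the blog's lightRay function):

```python
import numpy as np

def refract(d, nrm, n1, n2):
    """Refract unit direction d at a surface with unit normal nrm
    (oriented against the incoming ray), via the vector form of
    Snell's law. Returns None on total internal reflection."""
    mu = n1 / n2
    cos_i = -np.dot(d, nrm)
    sin_t_sq = mu**2 * (1 - cos_i**2)
    if sin_t_sq > 1:
        return None
    cos_t = np.sqrt(1 - sin_t_sq)
    return mu * d + (mu * cos_i - cos_t) * nrm

# air -> glass, 30 degrees incidence on a horizontal surface
theta_i = np.radians(30)
d = np.array([np.sin(theta_i), 0.0, np.cos(theta_i)])  # upward-going ray
nrm = np.array([0.0, 0.0, -1.0])                       # normal toward the ray
t = refract(d, nrm, 1.0, 1.53)
sin_t = np.hypot(t[0], t[1])        # Snell: sin(theta_t) = sin(30°)/1.53
```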
Now let’s do the refraction experiment with the previous solution of the Laplace equation as the height of the upper glass surface. To make the surface variations small, we multiply that solution by 0.0001.
We use the median refractive index of glass, n = 1.53.
Instead of using lightRay, we will use a compiled version for faster numerical evaluation.
In absolute units, say the variations in glass height are at most 1 mm; we look at the refracted rays a few meters behind the glass window. We will use about 3.2 million light rays (4^{2} per pixel).
Displaying all endpoints of the rays gives a rather strong Moiré effect. But the goat is visible—a true refraction goat!
If we accumulate the number of points that arrive in a small neighborhood of the given points {X,Y} in the plane height = Z, the goat becomes much more visible. (This is what would happen if we observed the brightness of the light that goes through the glass sheet, assuming that the intensities are additive.) To do the accumulation efficiently, we use the function Nearest.
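The accumulation idea can be sketched compactly in Python with a 2D histogram: deflect a dense grid of rays with the first-order displacement model {X,Y} ≈ {x,y} − (n−1)Z ∇h (a simplification; the hypothetical surface h and the parameter values here are illustrative, not the goat data), then bin the arrival points. The bin counts should track 1 + (n−1)Z Δh, i.e. be brighter where the Laplacian is larger.

```python
import numpy as np

eps, c = 0.01, 2.0       # surface amplitude and (n-1)*Z, illustrative values
m, bins = 400, 20
x = np.linspace(0, 2 * np.pi, m, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
# h(x,y) = eps*sin(x)*sin(y); deflect rays by -c*grad(h), wrap periodically
Xd = np.mod(X - c * eps * np.cos(X) * np.sin(Y), 2 * np.pi)
Yd = np.mod(Y - c * eps * np.sin(X) * np.cos(Y), 2 * np.pi)
counts, xe, ye = np.histogram2d(Xd.ravel(), Yd.ravel(), bins=bins,
                                range=[[0, 2 * np.pi], [0, 2 * np.pi]])
# predicted relative intensity 1 + c*Laplacian(h) at the bin centers
xc = (xe[:-1] + xe[1:]) / 2
Xc, Yc = np.meshgrid(xc, xc, indexing="ij")
predicted = 1 + c * (-2 * eps * np.sin(Xc) * np.sin(Yc))
corr = np.corrcoef(counts.ravel(), predicted.ravel())[0, 1]
```

The positive correlation between binned ray counts and the predicted intensity is the qualitative point; a finer grid sharpens the agreement.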
Note that looking into the light that comes through the window would not show the goat because the light that would fall into the eye would mostly come from a small spatial region due to the mostly parallel light rays.
The appearance of the Laplacian of the glass sheet's surface is not restricted to parallel light. In the following, we use a point light source instead of parallel light. This means the effect would also be visible with artificial light sources, rather than only with sunlight shining through a magic window.
So why is the goat visible in the density of rays after refraction? At first, it seems quite surprising, whether a parallel or a point source shines on the window.
On second thought, one remembers Maxwell’s geometric meaning of the Laplace operator:
… where indicates the average of f on a sphere centered at with radius ρ. Here is a quick check of the last identity for two and three dimensions.
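For a language-independent version of such a check, here is a minimal 2D Python sketch: in two dimensions the average of f over a circle of radius ρ is f + (ρ²/4) Δf + O(ρ⁴), so 4(⟨f⟩ − f)/ρ² estimates the Laplacian.

```python
import numpy as np

def f(x, y):
    return x**3 + y**2          # test function, Laplacian = 6x + 2

def circle_mean(x0, y0, rho, m=1000):
    """Average of f over a circle of radius rho (the 2D 'sphere')."""
    theta = np.linspace(0, 2 * np.pi, m, endpoint=False)
    return np.mean(f(x0 + rho * np.cos(theta), y0 + rho * np.sin(theta)))

# 2D Maxwell identity: mean over circle ~ f + rho^2/4 * Laplacian(f)
x0, y0, rho = 1.0, 2.0, 1e-2
laplacian_estimate = 4 * (circle_mean(x0, y0, rho) - f(x0, y0)) / rho**2
# exact value at (1, 2): 6*1 + 2 = 8
```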
At a given point in the imaging plane, we add up the light rays from different points of the glass surface. This means we carry out some kind of averaging operation.
So let’s go back to the general refraction formula and have a closer look. Again we assume that the upper surface is mostly flat and that the parameter ε is small. The position {X,Y} of the light ray in the imaging plane can be calculated in closed form as a function of the surface g(x,y), the starting coordinates of the light ray {x,y}, the index of refraction n and the distance of the imaging plane Z.
That is a relatively complicated-looking formula. For a nearly planar upper glass surface (small ε), we have the following approximate coordinates for the {X,Y} coordinates of the imaging plane where we observe the light rays in terms of the coordinate {x,y} of the glass surface.
This means that to zeroth order we have {X,Y} ≈ {x,y}, and the deviation of the light-ray position in the imaging plane is proportional to (n–1)Z. (Higher-order corrections to {X,Y} ≈ {x,y} could be obtained from Newton iterations, but we do not need them here.)
The density of rays is the inverse of the Jacobian of the map from {x,y} to {X,Y}. (Think of the change-of-variables formula for one-to-one transformations in multivariate integration.)
Quantifying the size of the resulting expression shows that it is indeed a large, complex formula. For a quadratic function in x and y, we can get some feeling for the density as a function of the physical parameters ε, n and Z, as well as the parameters that describe the surface, by varying them in an interactive demonstration. For large values of n, Z and ε, we see how caustics arise.
For nearly planar surfaces (first order in ε), the density is equal to the Laplacian of the surface heights (in x,y coordinates). This is the main “trick” in the construction of magic windows.
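This first-order claim is easy to verify numerically. The following Python sketch uses the small-slope model {X,Y} ≈ {x,y} − (n−1)Z ∇h (sign conventions and the test surface are illustrative assumptions) and compares the exact inverse Jacobian with its linearization 1 + (n−1)Z Δh:

```python
import numpy as np

eps, n, Z = 1e-3, 1.53, 2.0
c = (n - 1) * Z              # deflection prefactor

# test surface h(x,y) = eps*sin(x)*sin(y) and its second derivatives
x0, y0 = 0.7, 1.1
hxx = -eps * np.sin(x0) * np.sin(y0)
hyy = -eps * np.sin(x0) * np.sin(y0)
hxy = eps * np.cos(x0) * np.cos(y0)

# exact Jacobian determinant of the map {x,y} -> {x - c h_x, y - c h_y}
jac_det = (1 - c * hxx) * (1 - c * hyy) - (c * hxy)**2
density = 1 / jac_det                      # ray density in the imaging plane
linearized = 1 + c * (hxx + hyy)           # 1 + (n-1) Z Laplacian(h)
gap = abs(density - linearized)            # should be O(eps^2)
```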
This explains why the goat appears as the intensity pattern of the light rays after refraction. This means glass sheets act effectively as a Laplace operator.
Using Newton’s root-finding method, we could calculate the intensity in X,Y coordinates, but the expression explains heuristically why refraction on a nearly planar surface behaves like an optical Laplace operator. For more details, see this article.
Now we can model a better picture of the light-ray density by pre-generating a matrix of points in the imaging plane using, say, 10 million rays, and recording where they fall within the imaging plane. This time we model the solution of the Poisson equation using ListDeconvolve.
The approximate solution of the Poisson equation is not quite as smooth as the global solutions, but the goat is nevertheless invisible.
We adjust the brightness/darkness through a power law (a crude approximation for a Weber–Fechner perception).
If the imaging plane is too far away, we do get caustics (that remind me of the famous cave paintings from Lascaux).
If the image plane is even further away, the goat slowly becomes unrecognizable.
Although not practically realizable, we also show what the goat would look like for negative Z; now it seems much more sheep-like.
Here is a small animation showing the shape of the goat as a function of the distance Z of the imaging plane from the upper surface.
Even if the image is just made from a few lines (rather than each pixel having a non-white or non-black value), the solution of the Poisson equation is a smooth function, and the right-hand side is not recognizable in a plot of the solution.
But after refraction on a glass sheet (or applying the Laplacian), we see Homer quite clearly.
Despite the very localized curve-like structures that make the Homer image, the resulting Poisson equation solution again looks quite smooth. Here is the solution textured with its second derivative (the purple line will be used in the next input).
The next graphic shows a cross-section of the Poisson equation solution together with its (scaled) first and second derivatives with respect to x along the purple line of the last graphic. The lines show up quite pronounced in the second derivatives.
Let’s repeat a modification of the previous experiment to see how precise the surface would have to be to show Homer. We add some random waves to the Homer solution.
Again we see that the surface would have to be correct at the 10^{-6} level or better.
Or one can design a nearly planar window that will project one's favorite physics equation on the wall when the Sun is shining.
When looking at the window, one will not notice any formulas. But this time, the solution of the Poisson equation has more overall structures.
But the refracted light will display physics equations. The resulting window is perfect for the entrance of, say, a physics department building.
Now that we’re at the end of this post, let us mention that one can also implement the Laplacian through a mirror, rather than a window. See Michael Berry’s paper from 2006, “Oriental Magic Mirrors and the Laplacian Image” (see this article as well). Modifying the above function for refracting a light ray to reflecting a light ray and assuming a mostly flat mirror surface, we see the Laplacian of the mirror surface in the reflected light intensity.
Making transparent materials and mirrors of arbitrary shape, now called free-form optics, is considered the next generation of modern optics and will have wide applications in science, technology, architecture and art (see here). I think that a few years from now, when the advertising industry recognizes their potential, we will see magic windows, with their unexpected images, everywhere.
In my recent Wolfram Community post, “How many animals can one find in a random image?,” I looked into the pareidolia phenomenon from the viewpoints of pixel clusters in random (2D) black-and-white images. Here are some of the shapes I found, extracted, rotated, smoothed and colored from the connected black pixel clusters of a single 800×800 image of randomly chosen, uncorrelated black-and-white pixels.
For an animation of such shapes arising, changing and disappearing in a random gray-level image with slowly time-dependent pixel values, see here. By looking carefully at a selected region of the image, at the slowly changing, appearing and disappearing shapes, one frequently can “see” animals and faces.
The human mind quickly sees faces, animals, animal heads and ghosts in these shapes. Human evolution has optimized our vision system to recognize predators and identify food. Our recognition of an eye (or a pair of eyes) in the above shapes is striking. For the neuropsychological basis of seeing faces in a variety of situations where actual faces are absent, see Martinez-Conde2016.
A natural question: is this feature of our vision specific to 2D silhouette shapes, or does the same thing happen for 3D shapes? So here, I will look at random shapes in 3D images and the 2D projections of these 3D shapes. Several of the region-related functions added in recent versions of the Wolfram Language make this task possible, straightforward and fun.
I should explain the word Arp-imals from the title. With the term “Arp-imals” I refer to objects in the style of the sculptures by Jean Arp, meaning smooth, round, randomly curved biomorphic forms. Here are some examples.
Forms such as these frequently hide in 3D images made from random black-and-white voxels. Here is a quick preview of shapes we will extract from random images.
We will also encounter what I call Moore-iens, in the sense of the sculptures by the slightly later artist Henry Moore.
With some imagination, one can also see forms of possible aliens in some of the following 2D shapes. (See Domagal-Goldman2016 for a discussion of possible features of alien life forms.)
As in the 2D case, we start with a random image: this time, a 3D image of voxels of values 0 and 1. For reproducibility, we will seed the random number generator. The Arp-imals are so common that virtually any seed produces them. And we start with a relatively small image. Larger images will contain many more Arp-imals.
Hard to believe at first, but the blueprints of the above-shown 3D shapes are in the last 3D cube. In the following, we will extract them and make them more visible.
As in the 2D case, we again use ImageMesh to extract connected regions of white cells. The regions still look like a random set of connected polyhedra. After smoothing the boundaries, nicer shapes will arise.
Here are the regions, separated into non-touching ones, using the function ConnectedMeshComponents. The function makeShapes3D combines the image creation, the finding of connected voxel regions, and the region separation.
For demonstration purposes, in the next example, we use a relatively low density of white voxels to avoid the buildup of a single large connected region that spans the whole cube.
Here are the found regions individually colored in their original positions in the 3D image.
To smooth the outer boundaries, thereby making the shapes more animal-, Arp-imal- and alien-like, we use the function smooth3D (defined in the accompanying notebook), a quick-and-dirty implementation of the Loop subdivision algorithm. (As the 3D shapes might have a higher genus, we cannot directly use BSplineSurface, which would have been the direct equivalent of the 2D case.) Here are successive smoothings of the third of the above-extracted regions.
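The details of the subdivision scheme matter little, as long as corners get rounded. As a hedged illustration of the idea, here is the simple 2D analogue, Chaikin corner cutting, in Python (not the notebook's smooth3D code): each edge of a closed polygon is replaced by two points at its 1/4 and 3/4 positions, and iterating converges to a smooth curve.

```python
def chaikin(points, iterations=1):
    """Chaikin corner cutting on a closed 2D polygon; the point count
    doubles each round, and the limit curve is a quadratic B-spline."""
    pts = [tuple(p) for p in points]
    for _ in range(iterations):
        new = []
        for k in range(len(pts)):
            (x1, y1), (x2, y2) = pts[k], pts[(k + 1) % len(pts)]
            new.append((0.75 * x1 + 0.25 * x2, 0.75 * y1 + 0.25 * y2))
            new.append((0.25 * x1 + 0.75 * x2, 0.25 * y1 + 0.75 * y2))
        pts = new
    return pts

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
smoothed = chaikin(square, iterations=3)   # 4 * 2^3 = 32 points
```

Because every new point is a convex combination of old ones, the smoothed curve stays inside the original polygon's bounding box.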
Using the region plot theme "SmoothShading" of the function BoundaryMeshRegion, we can add normals to get the feeling of a genuinely smooth boundary.
And for less than $320 one can obtain this Arp-inspired piece in brass. A perfect, unique, stunning post-Valentine’s gift. For hundreds of alternative shapes to print, see below. We use ShellRegion to reduce the price and save some internal material by building a hollow region.
Here is the smoothing procedure shown for another of the above regions.
And for three more.
Many 3D shapes can now be extracted from random and nonrandom 3D images. The next input calculates the region corresponding to lattice points with coprime coordinates.
In the above example, we start with a coarse 3D region, which feels polyhedral due to the obvious triangular boundary faces. It is only after the smoothing procedure that we obtain “interesting-looking” 3D shapes. The details of the applied smoothing procedure do not matter, as long as sharp edges and corners are softened.
Human perception is optimized for smooth shapes, and most plants and animals have smooth boundaries. This is why we don’t see anything interesting in the collection of regions returned from ImageMesh applied to a 3D image. This is quite similar to the 2D case. In the following visualization of the 2D case, we start with a set of randomly selected points. Then we connect these points through a curve. Filling the curve yields a deformed checkerboard-like pattern that does not remind us of a living being. Rasterizing the filled curve in a coarse-grained manner still does not remind us of organic shapes. The connected region, and especially the smoothed region, do remind most humans of living beings.
The following Manipulate (available in the notebook) allows us to explore the steps and parameters involved in an interactive session.
And here is a corresponding 3D example.
In her reply to my community post, Marina Shchitova showed some examples of faces and animals in shadows of hands and fingers. Some classic examples from the Cassel1896 book are shown here.
So, what do projections/shadows of the above two 3D shapes look like? (For a good overview of the use of shadows in art at the time and place of the young Arp, see Forgione1999.)
The projections of these 3D shapes are exactly the types of shapes I encountered in the connected smoothed components of 2D images. The function projectTo2D takes a 3D graphic complex and projects it into a thin slice parallel to the three coordinate planes. The result is still a Graphics3D object.
These are the 2×3 projections of the above two 3D shapes. Most people recognize animal shapes in the projections.
We get exactly these projections if we just look at the 3D shape from a larger distance with a viewpoint and direction parallel to the coordinate axes.
For comparison, here are three views of the first object from very far away, effectively showing the projections.
By rotating the 3D shapes, we can generate a large variety of different shapes in the 2D projections. The following Manipulate allows us to explore the space of projection shapes interactively. Because we need the actual rotated coordinates, we define a function rotate, rather than using the built-in function Rotate.
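A rotation of the coordinates followed by dropping one coordinate is all these projections require. Here is a small Python sketch of the two steps; the helper names rotate and project are our own, loosely mirroring the notebook's functions.

```python
import math

def rotate(points, axis, angle):
    """Rotate 3D points about a coordinate axis (0, 1, or 2) by angle
    (radians). We need explicit coordinates (unlike a graphics-level
    Rotate), so we apply the rotation matrix entries directly."""
    c, s = math.cos(angle), math.sin(angle)
    i, j = [(1, 2), (2, 0), (0, 1)][axis]  # the plane of rotation
    out = []
    for p in points:
        q = list(p)
        q[i], q[j] = c * p[i] - s * p[j], s * p[i] + c * p[j]
        out.append(tuple(q))
    return out

def project(points, plane_normal_axis):
    """Project into a coordinate plane by dropping one coordinate."""
    return [tuple(c for k, c in enumerate(p) if k != plane_normal_axis)
            for p in points]

pts = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
rotated = rotate(pts, 2, math.pi / 2)   # quarter turn about the z axis
shadow = project(rotated, 2)            # the x-y "shadow"
```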
Here is an array of 16 projections into the x-z plane for random orientations of the 3D shape.
The initial 3D image does not have to be completely random. In the next example, we randomly place circles in 3D and color a voxel white if the circle intersects the voxel. As a result, the 3D shapes corresponding to the connected voxel regions have a more network-like shape.
2D projection shapes of 3D animals typically have no symmetry. Even if an animal has a symmetry, the visible shape from a given viewpoint and a given animal posture does not. But most animals do have a bilateral symmetry. I will now use random images that have a bilateral symmetry. As a result, many of the resulting shapes will also have a bilateral symmetry; not all of them, because some regions do not intersect the symmetry plane. Bilateral symmetry is important for the classic Rorschach inkblot test: “The mid-line appears to attract the patient’s attention with a sort of magical power,” noted Rorschach (Schott2013). The function makeSymmetricShapes3D will generate regions with bilateral symmetry.
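The mirroring idea behind such a generator is simple: fill one half of the voxel array at random and copy it to the other half. This Python stand-in (with invented names; makeSymmetricShapes3D itself is defined in the notebook) only illustrates the construction.

```python
import random

def make_symmetric_voxels(n, density, seed=0):
    """Random n x n x n 0/1 voxel array mirrored across the plane
    j = (n - 1)/2, so every connected component that intersects the
    mirror plane gains a bilateral symmetry."""
    rng = random.Random(seed)
    half = (n + 1) // 2
    v = [[[0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(half):
            for k in range(n):
                bit = 1 if rng.random() < density else 0
                v[i][j][k] = bit
                v[i][n - 1 - j][k] = bit  # the mirror image
    return v

v = make_symmetric_voxels(8, 0.3)
```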
Here are some examples.
And here are smoothed and colored versions of these regions. The viewpoint is selected in such a way as to make the bilateral symmetry most obvious.
To get a better feeling for the connection between the pixel values of the 3D image and the resulting smoothed shape, the next Manipulate allows us to specify each pixel value for a small-sized 3D image. The grids/matrices of checkboxes represent the voxel values of one-half of a 3D image with bilateral symmetry.
Randomly and independently selecting the voxel values of a 3D image makes it improbable that very large connected components without many holes form. Deriving the voxel values from random continuous functions instead yields different-looking types of 3D shapes with a larger uniformity over the voxel range. Effectively, the voxel values are no longer totally uncorrelated.
Here are some examples of the resulting regions, as well as their smoothed versions.
The initialization section of our notebook contains more than 400 selected regions of “interesting” shapes, classified into five types (mostly arbitrarily, but based on human feedback).
Let’s look at some examples of these regions. Here is a list of some selected ones. Many of these shapes found in random 3D images could be candidates for Generation 8 Pokémon or even some new creatures, tentatively dubbed Mathtubbies.
Many of the shapes are reminiscent of animals, even if the number of legs and heads is not always the expected number.
To see all of the 400+ shapes from the initialization cells, one could carry out the following.
Do[
 Print[Framed[Style[t, Bold, Gray], FrameStyle -> Gray]];
 Do[Print[Rasterize @ makeRegion @ r], {r, types[t]}],
 {t, Keys[types]}]
The shapes in the list above were manually selected. One could now go ahead and partially automate the finding of interesting animal-looking shapes and “natural” orientations using machine learning techniques. In the simplest case, we could just use ImageIdentify.
This seems to be a stegosaurus-poodle crossbreed. We will not pursue this direction here, but rather return to the 2D projections. (For using software to find faces in architecture and general equipment, see Hong2014.)
Before returning to the 2D projections, we will play for a moment with the 3D shapes generated and modify them for a different visual appearance.
For instance, we could tetrahedralize the regions and fill the tetrahedra with spheres.
Or with smaller tetrahedra.
Or add some spikes.
Or fill the shapes with cubes.
Or thicken or thin the shapes.
Or thicken and add thin bands.
Or just add a few stripes as camouflage.
Or model the inside through a wireframe of cylinders.
Or build a stick figure.
Or fill the surface with a tube.
Or apply a Kelvin inversion.
If we look at the 2D projections of some of these 3D shapes, we can see again (with some imagination) a fair number of faces, witches, kobolds, birds and other animals. Here are some selected examples. We show the 3D shape in the original orientation, a randomly oriented version of the 3D shape, and the three coordinate-plane projections of the randomly rotated 3D shape.
Unsurprisingly, some are recognizable 3D shapes, like these projections that look like bird heads.
Others are much more surprising, like the two heads in the projections of the two-legged-two-finned frog-dolphin.
Different orientations of the 3D shape can yield quite different projections.
For the reader’s amusement, here are some more projections.
Now that we have looked at 2D projections of 3D shapes, the next natural step would be to look at 3D projections of 4D shapes. And while there is currently no built-in function Image4D, it is not too difficult to find the connected components of white 4D voxels ourselves. We implement this through the graph theory function ConnectedComponents and consider two 4D voxels as being connected by an edge if they share a common 3D cube face. As an example, we use a 10×10×10×10-voxel 4D image. makeVoxels4D makes the 4D image data, and whitePositionQ marks the positions of the white voxels for quick lookup.
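The stated adjacency rule (two white 4D voxels are connected when they differ by ±1 in exactly one coordinate, i.e. share a 3D cube face) can be sketched outside the Wolfram Language with a union-find structure; all names here are ours.

```python
import random

def connected_components_4d(white):
    """Union-find over a set of white 4D voxel positions; voxels sharing
    a common 3D cube face get merged into one component."""
    parent = {p: p for p in white}
    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p
    for p in white:
        for d in range(4):  # only +1 neighbors needed; -1 is symmetric
            q = list(p); q[d] += 1; q = tuple(q)
            if q in white:
                parent[find(p)] = find(q)
    comps = {}
    for p in white:
        comps.setdefault(find(p), []).append(p)
    return list(comps.values())

random.seed(2)
n = 10
white = {(i, j, k, l)
         for i in range(n) for j in range(n)
         for k in range(n) for l in range(n)
         if random.random() < 0.15}
components = connected_components_4d(white)
```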
The 4D image contains quite a few connected components.
Here are the four canonical projections of the 4D complex.
We package the finding of the connected components into a function getConnected4DVoxels.
We also define a function rotationMatrix4D for conveniently carrying out rotations in the six 2D coordinate planes of 4D space.
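In 4D there is no rotation axis; rotations act in the six coordinate 2D planes (xy, xz, xw, yz, yw, zw). Here is a Python sketch of such a rotation matrix, a hypothetical helper loosely mirroring what rotationMatrix4D does.

```python
import math

def rotation_matrix_4d(plane, angle):
    """4x4 rotation matrix acting in one coordinate 2D plane of 4D space,
    given as a pair of axes like (0, 3); identity on the other two axes."""
    i, j = plane
    c, s = math.cos(angle), math.sin(angle)
    m = [[1.0 if r == col else 0.0 for col in range(4)] for r in range(4)]
    m[i][i], m[i][j], m[j][i], m[j][j] = c, -s, s, c
    return m

def apply(m, v):
    """Matrix-vector product for the 4x4 case."""
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))

# a quarter turn in the (0, 3) plane rotates axis 0 into axis 3
v = apply(rotation_matrix_4d((0, 3), math.pi / 2), (1.0, 0.0, 0.0, 0.0))
```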
Once we have the 3D projections, we can again use the above function to smooth the corresponding 3D shapes.
In the absence of Tralfamadorian vision, we can visualize a 4D connected voxel complex, rotate this complex in 4D, then project into 3D, smooth the shapes and then project into 2D. For a single 4D shape, this yields a large variety of possible 2D projections. The function projectionGrid3DAnd2D projects the four 3D projections canonically into 2D. This means we get 12 projections. Depending on the shape of the body, some might be identical.
We show the 3D shape in a separate graphic so as not to cover up the projections. Again, many of the 2D projections, and also some of the 3D projections, remind us of animal shapes.
The following Manipulate allows us to rotate the 4D shape. The human mind sees many animal shapes and faces.
Here is another example, with some more scary animal heads.
We could now go to 5D images, but this will very probably bring no new insights. To summarize some of the findings: After rotation and smoothing, a few percent of the connected regions of white voxels in random 3D images have an animal-like shape, or at least an artistic rendering of one. A large fraction (~10%) of the 2D projections of these 3D shapes pronouncedly show the pareidolia phenomenon, in the sense that we believe we can recognize animals and faces in them. 4D images, due to a voxel count that increases exponentially with dimension, yield an even larger number of possible animal and face shapes.
To download this post as a CDF, click here. New to CDF? Get your copy for free with this one-time download.
In this blog, I will carefully calculate explicit numerical examples of gravitational crystal movements. The “really” in the title should be interpreted as a high-precision, numerical solution to an idealized model problem. It should not be interpreted as “real world.” No retardation, special or general relativistic effects, stability against perturbation, tidal effects, or so on are taken into account in the following calculations. More precisely, we will consider the simplest case of a gravitational crystal: two gravitationally interacting, rigid, periodic 2D planar arrays of masses embedded in 3D (meaning a 1/distance^{2} force law) that can move translationally with respect to each other (no rotations between the two lattices). Each infinite array can be considered a crystal, so we are looking at what could be called the two-crystal problem (parallel to, and at the same time in distinction to, the classical gravitational two-body problem).
Crystals have been considered for centuries as examples of eternal, never-changing objects. Interestingly, various other time-dependent versions of crystals have been suggested over the last few years. Shapere and Wilczek suggested space-time crystals in 2012, and Boyle, Khoo, and Smith suggested so-called choreographic crystals in 2014.
In the following, I will outline the detailed asymptotic calculation of the force inside a periodic array of point masses and the numerical methods to find periodic orbits in such a force field. Readers not interested in the calculation details should fast-forward to the interactive demonstration in the section “The resulting gravitational crystals.”
Within an infinite crystal-like array of point masses, no net force is exerted on any of the point masses due to symmetry cancellation of the forces of the other point masses. This means we can consider the whole infinite array of point masses as rigid. But the space between the point masses has a nontrivial force field.
To calculate orbits of masses, we will have to solve Newton’s famous equation of motion m x″(t) = F(x(t)). So, we need the force of an infinite array of 1/r potentials. We will consider the simplest possible case, namely a square lattice of point masses with lattice constant L. The force at a point {x,y} is given by the following double sum:
Unfortunately, we can’t sum this expression in closed form. Working with the potential instead is not easier either; it adds the complication that the potential sum diverges. (Although deriving and subtracting the leading divergent term is possible: if we truncate the sums at ±M, we have a linearly divergent term 8 M sinh^{-1}(1).)
So one could consider a finite 2D array of (2M+1)×(2M+1) point masses in the limit M→∞.
But the convergence of the double sum is far too slow to get precise values for the force. (We want the orbit periodicity to be correct to, say, 7 digits. This means we need to solve the differential equation to about 9 digits, and for this we need the force to be correct to at least 12 digits.)
Because the force is proportional to 1/distance^{2}, while the number of point masses grows quadratically with distance, taking all points into account is critical for a precise force value. No approximation can use just a finite number of point masses; it must instead account for all of them.
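This slow convergence is easy to observe numerically. The following Python sketch (unit masses, unit gravitational constant, and a symmetric square truncation at |i|, |j| ≤ M, all our own choices) computes the x component of the truncated double sum for increasing M; the successive differences shrink, but only polynomially, far too slowly to reach the 12+ digits needed above.

```python
import math

def force_x(x, y, M, L=1.0):
    """x component of the gravitational force at (x, y) from unit point
    masses on the square lattice {(i L, j L)}, truncated at |i|, |j| <= M."""
    fx = 0.0
    for i in range(-M, M + 1):
        for j in range(-M, M + 1):
            dx, dy = x - i * L, y - j * L
            r = math.hypot(dx, dy)
            fx += -dx / r**3  # attraction toward the mass at (i L, j L)
    return fx

# doubling M repeatedly: the truncation error decays only like 1/M
values = [force_x(0.3, 0.2, M) for M in (10, 20, 40, 80)]
```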
Borrowing some ideas from York and Wang, Lindbo and Tornberg, and Bleibel for calculating the Madelung constant to high precision, we can make use of one of the most popular integrals in mathematics, the Gaussian integral identity 1/r = 2/√π ∫₀^∞ exp(−r² t²) dt.
This allows us to write the force as:
Exchanging integration and summation, we can carry out the double sum over all (2∞+1)^{2} lattice points in terms of elliptic theta functions.
Here we carry out the gradient operation under the integral sign:
We obtain the following converging integral:
While the integral does converge, numerical evaluation is still quite time consuming, and is not suited for a right-hand-side calculation in a differential equation.
Now let’s remind ourselves about some properties of the Jacobi elliptic theta function 3. The two properties of relevance to us are its sum representation and its inversion formula.
The first identity shows that for t→0, the theta function (and its derivative) vanishes exponentially. The second identity shows that exponential decay can also be achieved at t→∞.
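Both properties are easy to check numerically for the z = 0 heat-kernel form of ϑ₃, that is θ(t) = Σₙ exp(−π n² t); its inversion formula reads θ(1/t) = √t θ(t). A small Python sketch (our own specialization, not the notebook's general-z formulas):

```python
import math

def theta(t, terms=60):
    """theta(t) = sum_{n = -inf}^{inf} exp(-pi n^2 t): the z = 0
    heat-kernel form of Jacobi theta_3; the sum converges exponentially
    for t > 0, faster the larger t is."""
    return 1.0 + 2.0 * sum(math.exp(-math.pi * n * n * t)
                           for n in range(1, terms + 1))

# Jacobi's inversion formula theta(1/t) = sqrt(t) theta(t) lets us switch
# to whichever side of t = 1 converges fastest
t = 0.7
lhs, rhs = theta(1.0 / t), math.sqrt(t) * theta(t)
```

This switching between the two exponentially convergent regimes is exactly what makes the split of the integration interval below work.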
Using the sum representation, we can carry out the t integration in closed form after splitting the integration interval in two parts. As a result, we obtain for the force a sum representation that is exponentially convergent.
After some lengthy algebra, as one always says (which isn’t so bad when using the Wolfram Language, but is still too long for this short note), one obtains a formula for the force when using the above identities for ϑ_{3} and similar identities for ϑ´_{3}. Here is the x component of the force. Note that M is now the limit of the sum representation of the elliptic theta function, not the size of the point mass lattice. The resulting expression for the force components is pretty large, with a leaf count of nearly 4,000. (Open the cell in the attached notebook to see the full expression.)
Here is a condensed form for the force in the x direction that uses the abbreviation
r_{i j} = (x + i L)^{2} + (y + j L)^{2}:
Truncating the exponentially convergent sums shows that truncation at around 5 terms gives about 17 correct digits for the force.
The convergence speed is basically independent of the position {x, y}. In the next table, we use a point on the diagonal near to the point mass at the origin of the coordinate system.
For points near to a point mass, we recover, of course, the 1/distance^{2} law.
For an even faster numerical calculation of the force, we drop higher-order terms in the double sums and compile the force.
All digits of the force are correct to machine precision.
And the calculation of a single force value takes about a tenth of a millisecond, which is well suited for further numerical calculations.
For further use, we define the function forceXY that for approximate position values returns the 2D force vector.
So now that we have a fast-converging series expansion for the force for the full infinite array of point masses, we are in good shape to calculate orbits.
The simplest possible situation is two square lattices with identical lattice spacing and the same orientation, moving relative to each other. In this situation, every point mass of lattice 1 experiences the same cumulative force from lattice 2, and vice versa. And within each lattice, the total force on each point mass vanishes because of symmetry.
Similar to the well-known central force situation, we can also separate the center of mass from the relative motion. The result is the equation of motion for a single mass point in the field of one lattice.
Here is a plot of the magnitude of the resulting force.
And here is a plot of the direction field of the force. The dark red dots symbolize the positions of the point masses.
How much does the field strength of the periodic array differ from the field strength of a single point mass? The following graphic shows the relative difference. On the horizontal and vertical lines in the middle of the rows and columns of the point masses, the difference is maximal. Due to the singularity of the force at the point masses, the force of a single point mass and that of the lattice become identical in the vicinity of a point mass.
The next plot shows the direction field of the difference between a single point mass and the periodized version.
Once we have the force field, numerically inverting the relation F = −grad ϕ (the force is obviously conservative) allows us to calculate the potential surface of the infinite square array of point masses.
Now let us look at actual orbits in the potential shown in the last two images.
The following Manipulate allows us to interactively explore the motion of a particle in the gravitational field of the lattice of point masses.
The relatively large (five-dimensional) space of possible orbits becomes more manageable if we look for some especially symmetric orbits, e.g. we enforce that the orbit crosses the line x = 0 or x = 1/2 horizontally. Many orbits that one would intuitively expect to exist that move around 1, 2, or 3 point masses fall into this category. We use a large 2D slider to allow more fine-grained control of the initial conditions.
Another highly symmetric situation is a starting point along the diagonal with an initial velocity perpendicular to it.
For the type of motion we are looking for, we demand that after one period, the particle either returns to its original position with its original velocity vector or arrives at an equivalent lattice position.
Given an initial position, velocity, mass, and approximate period, it is straightforward to write a simple root-finding routine to zoom into an actual periodic orbit. We implement this simply by solving the differential equation for a time greater than the approximate orbit time, and find the time where the sum of the difference of the initial and final positions and initial and final velocities is minimal. The function findPeriodicOrbit carries out the search. This method is well suited for orbits whose periods are not too long. This will yield a nice collection of orbits. For longer orbits, errors in the solution of the differential equation will accumulate, and more specialized methods could be employed, e.g. relaxation methods.
Given some starting values, findPeriodicOrbit attempts to find a periodic orbit, and returns the corresponding initial position and velocity.
Given initial conditions and a maximal solution time, the function minReturnData determines the exact time at which the differences between the initial and final positions and velocities are minimal. The most time-consuming step in the search process is the solution of the differential equation. To avoid repeating work, we do not include the period time as an explicit search variable, but rather solve the differential equation for a fixed time T and then carry out a one-dimensional minimization to find the time at which the sum of the position and velocity differences becomes minimal.
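The same strategy (solve once for a fixed time, then minimize the return mismatch over time) can be sketched in Python on a toy force field. Here a single point mass at the origin stands in for the lattice force of the post, all names are ours, and we use a fixed-step RK4 integrator; for a circular Kepler orbit the detected return time should be the known period 2π.

```python
import math

def rhs(state):
    # single point mass at the origin with unit GM: a stand-in force
    # field (the lattice force of the post would slot in here instead)
    x, y, vx, vy = state
    r3 = (x * x + y * y) ** 1.5
    return (vx, vy, -x / r3, -y / r3)

def rk4_trajectory(state, t_max, dt=1e-3):
    """Classical Runge-Kutta 4 with fixed step; returns [(t, state), ...]."""
    traj, t = [(0.0, state)], 0.0
    while t < t_max:
        k1 = rhs(state)
        k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)))
        state = tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
        t += dt
        traj.append((t, state))
    return traj

def min_return_data(state0, t_max, t_min=0.5):
    """Solve once up to t_max, then find the time (>= t_min) minimizing
    the sum of position and velocity mismatches with the initial state."""
    def mismatch(s):
        return math.hypot(s[0] - state0[0], s[1] - state0[1]) + \
               math.hypot(s[2] - state0[2], s[3] - state0[3])
    return min(((t, mismatch(s)) for t, s in rk4_trajectory(state0, t_max)
                if t >= t_min), key=lambda p: p[1])

# circular Kepler orbit of radius 1: the true period is 2 pi
t_best, d_best = min_return_data((1.0, 0.0, 0.0, 1.0), 7.0)
```

Keeping the period out of the search variables and recovering it from a one-dimensional minimization afterwards is the same design choice described above.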
As the search will take about a minute per orbit, we monitor the current orbit shape to entertain ourselves while we wait. Typically, after a couple hundred steps, we either find a periodic orbit or know that the search has failed. In the latter case, the local minimum of the function to be minimized (the sum of the norms of the initial versus final positions and velocities) has a finite value and so does not correspond to a periodic orbit.
Here is a successful search for a periodic orbit. We get the initial conditions for the search either interactively from the above Manipulate or from a random search over viable candidate initial conditions.
Here is a successful search for an orbit that ends at an equivalent lattice position.
So what kind of periodic orbits can we find? As the result of about half a million solutions with random initial positions, velocities, masses, and solution times of the equations of motion, we find the following types of solutions:
1. Closed orbits around a single point mass
2. Closed orbits around a finite number (≥ 2) of point masses
3. “Traveling orbits” that don’t return to the initial position but to an equivalent position in another lattice cell
(In this classification, we ignore “head-on” collision orbits and separatrix-like orbits along the symmetry lines between rows and columns of point masses.)
Here is a collection of initial values and periods for periodic orbits found in the carried-out searches. The small summary table gives the counts of the orbits found.
Using minReturnDistance, we can numerically check the accuracy of the orbits. At the “return time” (the last element of the sublists of orbitData), the sum of the differences of the position and velocity vectors is quite small.
Now let’s make some graphics showing the orbits from the list orbitData using the function showOrbit.
1. Orbits around a single point mass
In the simplest case, these are just topologically equivalent to a circle. This type of solution is not unexpected; for initial conditions close to a point mass, the influence of the other lattice point masses will be small.
2. Orbits around two point masses
In the simplest case, these are again topologically equivalent to a circle, but more complicated orbits exist. Here are some examples.
3. “Traveling orbits” (open orbits) that don’t return to the initial position but to an equivalent position in another lattice cell
These orbits come in self-crossing and non-self-crossing versions. Here are some examples.
Individually, the open orbits look quite different from the closed ones. When plotting the continuations of the open orbits, their relation to the closed orbits becomes much more obvious.
For instance, the following open orbit reminds me of the last closed orbit.
The last graphic suggests that closed orbits around a finite number of points could become traveling orbits after small perturbations by “hopping” from a closed orbit around a single or finite number of point masses to the next single or finite group of point masses.
But there are also situations where one intuitively might expect closed orbits to exist, but numerically one does not succeed in finding a precise solution. One example is a simple rounded-corner, triangle-shaped orbit that encloses three point masses.
Showing 100 orbits with slightly disturbed initial conditions gives an idea of why a smooth match of the initial point and the final point does not work out. While we can make the initial and final point match, the velocity vectors do not agree in this case.
Another orbit that seems not to exist, although one can make the initial and final points and velocities match pretty well, is the following double-slingshot orbit. But reducing the residual further by small modifications of the initial position and velocity seems not to be possible.
Here are a third and fourth type of orbit that nearly match up, but the function findPeriodicOrbit can’t find parameters that bring the difference below 10^{-5}.
Here are two graphics of the last two orbits.
There are many more periodic orbits. The above is just a small selection of all possible orbits. Exploring a family of trajectories at once shows the wide variety of orbits that can arise. We let all orbits start at the line segment {{x, 1/2}|-1/2 ≤ x ≤ 1/2} with an angle α(x) = 𝜋(1/2-|x|).
If we plot sufficiently many orbits and select the ones that do not move approximately uniformly, we can construct an elegant gravitational crystal church.
The last image nicely shows the “branching” of the trajectories at point masses where the overall shape of the trajectory changes discontinuously. Displaying the flow in the three-dimensional x-t-y space shows the branching even better.
We were looking for concrete periodic orbits in the field of an infinite square array of point masses. For more general results on trajectories in such a potential, see Knauf, who proves that the behavior of typical orbits is diffusive. Periodic orbits are the exception in the space of initial conditions; almost all orbits wander around randomly. So let’s have a quick look at a larger number of orbits. The following calculation will take about six hours, and evaluates the final points and velocities of masses starting at {x, 0.5} with a velocity {0, v} on a dense x-v grid with 0 ≤ x ≤ 1 and 1 ≤ v ≤ 3.
If we display all final positions, we get the following graphic that gives a feeling of the theoretically predicted diffusive behavior of the orbits.
While diffusive on average, since we are solving a differential equation, we expect the final positions to depend (at least piecewise) continuously on the initial conditions. So we burn another six hours of CPU time and calculate the final positions of 800,000 test particles that start radially from a circle around a lattice point mass. (Because of the symmetry of the force field, we have only 100,000 different initial value problems to solve numerically.)
Here are the points of the final positions of the 800,000 points. We again see how nicely the point masses of the lattice temporarily deflect the test masses.
We repeat a variation of the last calculation and determine the minimum value of |p(t) − p(0)| + |v(t) − v(0)| in the x-v plane, where x and v are the initial conditions of the particle starting at y = 0.5 perpendicularly upward.
We solve the equations of motion for 0 ≤ t ≤ 2.5 and display the minimum value of |p(t) − p(0)| + |v(t) − v(0)| in the time range 0.5 ≤ t ≤ 2.5. If the minimum occurs at t = 0.5, we use a light gray color; if the minimum occurs at t = 2.5, a dark gray color; and for 0.5 < t < 2.5, we color the sum of norms from pink to green. Not unexpectedly, the distance sum shows a fractal-like behavior, meaning the periodic orbits form a thinly spaced subset of initial conditions.
Now that we have the force field of a square array of point masses, we can also use this force to model electrostatic problems, as these obey the same force law.
Identical charges would form a Wigner crystal, which is hexagonal. Two interlaced square lattices of opposite charges would make a model for a 2D NaCl salt crystal.
By summing the (signed) forces of the four sublattices, we can again calculate a resulting force of a test particle.
The trajectories of a test charge become more irregular as compared with the gravitational model considered above. The following Manipulate allows us to get a feeling for the resulting trajectories. The (purple) test charge is attracted to the green charges of the lattice and repelled from the purple charges of the lattice.
We can now combine all the elements together to visualize the resulting gravitational crystals. We plot the resulting lattice movements in the reference frame of one lattice (the blue lattice). The red lattice moves with respect to the blue lattice.
Using detailed numerical calculations, we verified the existence of the suggested gravitational crystals. For the simplest case, two square lattices, many periodic orbits of short period were found. More extensive searches would surely return more, longer-period solutions.
Using the general form of the Poisson summation formula for general lattices, the above calculations could be extended to different lattices, e.g. hexagonal lattices or 3D lattices.
Download this post as a Computable Document Format (CDF) file. New to CDF? Get your copy for free with this one-time download.
Please allow me to introduce myself
I’m a man of precision and science
I’ve been around for a long, long time
Stole many a man’s pound and toise
And I was around when Louis XVI
Had his moment of doubt and pain
Made damn sure that metric rules
Through platinum standards made forever
Pleased to meet you
Hope you guess my name
In case you can’t guess: I am Jean-Charles de Borda, sailor, mathematician, scientist, and member of the Académie des Sciences, born on May 4, 1733, in Dax, France. Two weeks ago would have been my 283rd birthday. This is me:
In my hometown of Dax there is a statue of me. Please stop by when you visit. In case you do not know where Dax is, here is a map:
In Europe when I was a boy, France looked basically like it does today. We had a bit less territory on our eastern border. On the American continent, my country owned a good fraction of land:
I led a diverse earthly life. At 32 years old I carried out a lot of military and scientific work at sea. As a result, in my forties I commanded several ships in the American Revolutionary War. Most of the rest of my life I devoted to the sciences.
But today nobody even knows where my grave is, as my physical body died on February 19, 1799, in Paris, France, in the upheaval of the French Revolution. (Of course, I know where it is, but I can’t communicate it anymore.) My name is the twelfth listed on the northeast side of the Eiffel Tower:
Over the centuries many of my fellow Frenchmen who joined me up here told me that I deserved a place in the Panthéon. But you will not find me there, nor at the Père Lachaise, Montparnasse, or Montmartre cemeteries.
But this is not why I still cannot rest in peace. I am a humble man; it is the kilogram that keeps me up at night. But soon I will be able to rest in peace at night for all time and approach new scientific challenges.
Let me tell you why I will soon find a good night’s sleep.
All my life, I was into mathematics, geometry, physics, and hydrology. And overall, I loved to measure things. You might have heard of substitution weighing (also called Borda’s method)—yes, this was my invention, as was the Borda count method. I also substantially improved the repeating circle. Here is where the story starts. The repeating circle was crucial in making a high-precision determination of the size of the Earth, which in turn defined the meter. (A good discussion of my circle can be found here.)
I lived in France when it was still a monarchy. Times were difficult for many people—especially peasants—partially because trade and commerce were difficult due to the lack of measures all over the country. If you enjoy reading about history, I highly recommend Kula’s Measures and Men to understand the weights and measurements situation in France in 1790. The state of the weights and measures were similar in other countries; see for instance Johann Georg Tralles’ report about the situation in Switzerland.
In August 1790, I was made the chairman of the Commission of Weights and Measures as a result of a 1789 initiative from Louis XVI. (I still find it quite miraculous that 1,000 years after Charlemagne’s initiative to unify weights and measures, the next big initiative in this direction would be started.) Our commission created the metric system that today is the International System of Units, often abbreviated as SI (le Système international d’unités in French).
In the commission were, among others, Pierre-Simon Laplace (think the Laplace equation), Adrien-Marie Legendre (Legendre polynomials), Joseph-Louis Lagrange (think Lagrangian), Antoine Lavoisier (conservation of mass), and the Marquis de Condorcet. (I always told Adrien-Marie that he should have some proper portrait made of him, but he always said he was too busy calculating. Only for the last 10 years has the politician Louis Legendre’s portrait no longer been mistakenly used in math books in place of Adrien-Marie’s. Over the last decades, Adrien-Marie befriended Jacques-Louis David, and Jacques-Louis has made a whole collection of paintings of Adrien-Marie; unfortunately, mortals will never see them.) Lagrange, Laplace, Monge, Condorcet, and I were on the original team. (And, in the very beginning, Jérôme Lalande was also involved; later, some others were as well, such as Louis Lefèvre‑Gineau.)
Three of us (Monge, Lagrange, and Condorcet) are today interred or commemorated at the Panthéon. It is my strong hope that Pierre-Simon is one day added; he really deserves it.
As I said before, things were difficult for French citizens in this era. Laplace wrote:
The prodigious number of measures in use, not only among different people, but in the same nation; their whimsical divisions, inconvenient for calculation, and the difficulty of knowing and comparing them; finally, the embarrassments and frauds which they produce in commerce, cannot be observed without acknowledging that the adoption of a system of measures, of which the uniform divisions are easily subjected to calculation, and which are derived in a manner the least arbitrary, from a fundamental measure, indicated by nature itself, would be one of the most important services which any government could confer on society. A nation which would originate such a system of measures, would combine the advantage of gathering the first fruits of it with that of seeing its example followed by other nations, of which it would thus become the benefactor; for the slow but irresistible empire of reason predominates at length over all national jealousies, and surmounts all the obstacles which oppose themselves to an advantage, which would be universally felt.
All five of the mathematicians (Monge, Lagrange, Laplace, Legendre, and Condorcet) have made historic contributions to mathematics. Their names are still used for many mathematical theorems, structures, and operations:
In 1979, Ruth Inez Champagne wrote a detailed thesis about the influence of my five fellow citizens on the creation of the metric system. For Legendre’s contribution especially, see C. Doris Hellman’s paper. Today it seems to me that most mathematicians no longer care much about units and measures and that physicists are the driving force behind advancements in units and measures. But I did like Theodore P. Hill’s arXiv paper about the method of conflations of probability distributions that allows one to consolidate knowledge from various experiments. (Yes, before you ask, we do have instant access to arXiv up here. Actually, I would say that the direct arXiv connection has been the greatest improvement here in the last millennium.)
Our task was to make standardized units of measure for time, length, volume, and mass. We needed measures that were easily extensible and could be useful for both tiny things and astronomical scales. The principles of our approach were nicely summarized by John Quincy Adams, Secretary of State of the United States, in his 1821 book Report upon the Weights and Measures.
Originally we (we being the metric men, as we call ourselves up here) had suggested just a few prefixes: kilo-, deca-, hecto-, deci-, centi-, milli-, and the no-longer-used myria-. In some old books you can find the myria- units.
We had the idea of using prefixes quite early in the process of developing the new measurements. Here are our original proposals from 1794:
Side note: in my time, we also used the demis and the doubles, such as a demi-hectoliter (=50 liters) or a double dekaliter (=20 liters).
As inhabitants of the twenty-first century know, times, lengths, and masses are measured in physics, chemistry, and astronomy over ranges spanning more than 50 orders of magnitude. And the units we created in the tumultuous era of the French Revolution stood the test of time:
In the future, the SI might need some more prefixes. In a recent LIGO discovery, the length of the interferometer arms changed on the order of 10 yoctometers; one yoctometer equals 10^{–24} meter. Yoctogram-resolution mass sensors exist. Mankind can already measure tiny forces on the order of zeptonewtons.
On the other hand, astronomy needs prefixes larger than 10^{24}. One day, these prefixes might become official.
I am a man of strict rules, and it drives me nuts when I see people in the twenty-first century not obeying the rules for using SI prefixes. Recently I saw somebody writing on a whiteboard that a year is pretty much exactly 𝜋 dekamegaseconds (𝜋 daMs):
While it’s a good approximation (only 0.4% off), when will this person learn that one shouldn’t concatenate prefixes?
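The whiteboard claim is easy to check. Here is a quick sketch in Python (the post's own computations are in the Wolfram Language; using the Julian year of 365.25 days is an assumption made here for the check):

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600     # Julian year, in seconds
pi_dekamegaseconds = math.pi * 10 * 1e6   # "daM" = deka (10) times mega (10^6)

relative_error = abs(pi_dekamegaseconds - SECONDS_PER_YEAR) / SECONDS_PER_YEAR
print(f"pi daMs = {pi_dekamegaseconds:.4e} s, year = {SECONDS_PER_YEAR:.4e} s")
print(f"relative error: {relative_error:.2%}")  # about 0.4%
```

So the approximation is indeed off by less than half a percent, even if the double prefix makes a unit purist wince.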
The technological progress of mankind has occurred quickly in the last two centuries. And mega-, giga-, tera- or nano-, pico-, and femto- are common prefixes in the twenty-first century. Measured in meters per second, here is the probability distribution of speed values used by people. Some speeds (like speed limits, the speed of sound, or the speed of light) are much more common than others, but many local maxima can be found in the distribution function:
Here is the report we delivered in March of 1791 that started the metric system and gave the conceptual meaning of the meter and the kilogram, signed by myself, Lagrange, Laplace, Monge, and Condorcet (now even available through what the modern world calls a “digital object identifier,” or DOI, like 10.3931/e-rara-28950):
Today most people think that base 10 and the meter, second, and kilogram units are intimately related. But only on October 27, 1790, did we decide to use base 10 for subdividing the units. We were seriously considering a base-12 subdivision, because divisibility by 2, 3, 4, and 6 is a nice feature for trading objects. It is clear today, though, that we made the right choice. Lagrange's insistence on base 10 was the right thing. At the time of the French Revolution, we made no compromises. On November 5, 1792, I even suggested changing clocks to a decimal system. (D'Alembert had suggested this in 1754; for the detailed history of decimal time, see this paper.) Mankind was not ready yet; maybe in the twenty-first century, decimal clocks and clock readings will finally be recognized as much better than 24 hours, 60 minutes, and 60 seconds. I loved our decimal clocks—they were so beautiful. So it's a real surprise to me today that mankind still divides the right angle into 90 degrees. In my repeating circle, I divided the right angle into 100 grades.
We wanted to make the new (metric) units truly equal for all people, not base them, for instance, on the length of the forearm of a king. Rather, “For all time, for all people” (“À tous les temps, à tous les peuples”). Now, in just a few years, this dream will be achieved.
And I am sure there will come the day where Mendeleev’s prediction (“Let us facilitate the universal spreading of the metric system and thus assist the common welfare and the desired future rapprochement of the peoples. It will come not yet, slowly, but surely.”) will come true even in the three remaining countries of the world that have not yet gone metric:
The SI units have been legal for trade in the USA since the mid-twentieth century, when United States customary units became derived from the SI definitions of the base units. Citizens can choose which units they want for trade.
We also introduced the decimal subdivision of money, and our franc was in use from 1793 to 2002. At least today all countries divide their money on the basis of base 10—no coins with label 12 are in use anymore. Here is the coin label breakdown by country:
We took the “all” in “all people” quite seriously, and worked together with our archenemy Britain and the new United States (through Thomas Jefferson personally) to make a new system of units for all the major countries of my time. But, as is still so often the case today, politics won over reason.
I died on February 19, 1799, just a few months before the completion of our group's efforts. On June 22, 1799, my dear friend Laplace gave a speech about the finished efforts to build new units of length and mass before the new prototypes were delivered to the Archives of the Republic (where they still are today).
In case the reader is interested in my eventful life, Jean Mascart wrote a nice biography about me in 1919, and it is now available as a reprint from the Sorbonne.
Two of my friends, Jean Baptiste Joseph Delambre and Pierre Méchain, were sent out to measure distances in France and Spain from mountain to mountain to define the meter as one ten-millionth of the distance from the North Pole to the equator of the Earth. Historically, I am glad the mission was approved. Louis XVI was already under arrest when he approved the financing of the mission. My dear friend Lavoisier called their task “the most important mission that any man has ever been charged with.”
If you haven't done so, you must read the book The Measure of All Things by Ken Alder. There is even a German movie about the adventures of my two old friends. Equipped with a special instrument that I had built for them, they did the work that resulted in the meter. We wanted the length of the meter to be one ten-millionth of the length of the half-meridian through Paris from pole to equator; even today, I think this is a conceptually beautiful definition. What we did not know at the time is that the Earth isn't quite as round as we had hoped, and a miscalculation of the flattening of the Earth resulted in a small, regrettable error of 0.2 mm. Here is the length of the half-meridian through Paris, expressed through meters along an ellipsoid that approximates the Earth:
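In the notebook this is computed in the Wolfram Language; as a rough cross-check, here is a sketch in Python that numerically integrates the pole-to-equator meridian arc of the modern WGS84 ellipsoid (an assumption made for illustration; it is not the 1799 reference figure of the Earth):

```python
import math

# WGS84 ellipsoid defining parameters (a modern approximation of the Earth)
a = 6378137.0               # equatorial radius in meters
f = 1 / 298.257223563       # flattening
e2 = f * (2 - f)            # eccentricity squared

def meridian_arc_integrand(phi):
    # ds/dphi along a meridian of an ellipse of revolution at latitude phi
    return a * (1 - e2) / (1 - e2 * math.sin(phi) ** 2) ** 1.5

# Simpson's rule from the equator (0) to the pole (pi/2)
n = 10000                   # even number of subintervals
h = (math.pi / 2) / n
s = meridian_arc_integrand(0) + meridian_arc_integrand(math.pi / 2)
for i in range(1, n):
    s += (4 if i % 2 else 2) * meridian_arc_integrand(i * h)
half_meridian = s * h / 3   # the pole-to-equator "half-meridian" of the text

print(f"half-meridian: {half_meridian:,.1f} m")
meter_error_mm = (half_meridian / 1e7 - 1) * 1000
print(f"one ten-millionth exceeds 1 m by {meter_error_mm:.2f} mm")
```

The result is about 10,001,966 m, so one ten-millionth of it is roughly 0.2 mm longer than the meter that was actually forged, consistent with the error mentioned above.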
If they had taken elevation into account (which they did not do—Delambre and Méchain would have had to travel the whole meridian to catch every mountain and hill!) and had used 3D coordinates (meaning including the elevation of the terrain) every few kilometers, they would have ended up with a meter that was 0.4 mm too short:
Here is the elevation profile along the Paris meridian:
And the meter would be another 0.9 mm longer if measured with a yardstick the length of a few hundred meters:
Because of the fractality of the Earth’s surface, an even smaller yardstick would have given an even longer half-meridian.
It’s more realistic to follow the sea-level height. The difference between the length of the sea-level meridian meter and the ellipsoid approximation meter is just a few micrometers:
But at least the meridian had to go through Paris (not London, as some British scientists of my time proposed). But anyway, the meridian length was only a stepping stone to make a meter prototype. Once we had the meter prototype, we didn’t have to refer to the meridian anymore.
Here is a sketch of the triangulation carried out by Pierre and Jean Baptiste in their adventurous six-year expedition. Thanks to the internet and various French digitization projects, the French-speaking reader interested in metrology and history can now read the original results online and reproduce our calculations:
The part of the meridian through Paris (and especially through the Paris Observatory, marked in red) is today marked with the Arago markers—do not miss them during your next visit to Paris! After Méchain joined me up here in 1804, Laplace got the go-ahead (and the money) from Napoléon to remeasure the meridian and to verify and improve our work; it was François Arago who remeasured the Paris meridian:
The second we derived from the length of a year. And the kilogram as a unit of mass we wanted to (and did) derive from a liter of water. If any liquid is special, it is surely water. Lavoisier and I had many discussions about the ideal temperature. The two temperatures that stand out are 0 °C and 4 °C. Originally we were thinking about 0 °C, as it is easy to realize with ice water. But because of the maximal density of water at 4 °C, we later thought that would be the better choice. The switch to 4 °C was suggested by Louis Lefèvre-Gineau. The liter as a volume in turn we defined as one cubic decimeter, the cube of one-tenth of a meter. As it turns out, compared with high-precision measurements of distilled water, 1 kg equals the mass of 1.000028 dm^{3} of water. The interested reader can find many more details of the process of the water measurements here and about making the original metric system here. A shorter history in English can be found in the recent book by Williams and the ten-part series by Chisholm.
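The 1.000028 dm^{3} figure follows directly from the modern value for the maximum density of water, about 999.972 kg/m^{3} near 4 °C. A one-line Python check (the density value is assumed here, not taken from the post):

```python
# Maximum density of water (near 4 degrees C), in kg per cubic decimeter (liter)
rho_water = 0.999972  # kg/dm^3, approximate modern value

volume_of_one_kg = 1 / rho_water  # volume in dm^3 occupied by exactly 1 kg
print(f"1 kg of water occupies {volume_of_one_kg:.6f} dm^3")  # ~1.000028 dm^3
```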
I don't want to brag, but we also came up with the name “meter” (derived from the Greek metron and the Latin metrum), which we suggested on July 11, 1792, as the name of the new unit of length. And then we had the are (=100 m^{2}) and the stere (=1 m^{3}).
And I have to mention this for historical accuracy: until I entered the heavenly spheres, I always thought our group was the first to carry out such an undertaking. How amazed and impressed I was when shortly after my arrival up here, I-Hsing and Nankung Yüeh introduced themselves to me and told me about their expedition from the years 721 to 725, more than 1,000 years before ours, to define a unit of length.
I am so glad we defined the meter this way. Originally the idea was to define the meter through a pendulum of such a length that its half-period is one second. But I didn't want any potential change in the second to affect the length of the meter. While dependencies will be unavoidable in a complete unit system, they should be minimized.
Basing the meter on the Earth's shape and the second on the Earth's movement around the Sun seemed like a good idea at the time. Actually, it was the best idea that we could technologically realize at the time. We did not know how tides and time change the shape of the Earth, or how continents drift apart. We believed in the future of mankind and in ever-increasing measurement precision, but we did not know what, concretely, would change. It all started with our initial steps of precisely measuring distances in France. Today there are high-precision geopotential maps given as high-order series of Legendre polynomials:
With great care, the finest craftsmen of my time melted platinum, and we forged a meter bar and a kilogram. It was an exciting time. Twice a week I would stop by Janety's place when he was forging our first kilograms. Melting and forming platinum was still a very new process. And Janety, Louis XVI's goldsmith, was a true master of forming platinum—to be precise, a spongelike eutectic made of platinum and arsenic. Just a few years earlier, on June 6, 1782, Lavoisier showed the melting of platinum in a hydrogen-oxygen flame to (the future) Tsar Paul I at a garden party at Versailles; Tsar Paul I was visiting Marie Antoinette and Louis XVI. And Étienne Lenoir made our platinum meter, and Jean Nicolas Fortin our platinum kilogram. For the reader interested in the history of platinum, I recommend McDonald's and Hunt's book.
Platinum is a very special metal; it has a high density and is chemically very inert. It is also not as soft as gold. The best kilogram realizations today are made from a platinum-iridium mixture (10% iridium), as adding iridium to platinum does improve its mechanical properties. Here is a comparison of some physical characteristics of platinum, gold, and iridium:
This sounds easy, but at the time the best scientists spent countless hours calculating and experimenting to find the best materials, the best shapes, and the best conditions to define the new units. But both the new meter bar and the new kilogram cylinder were macroscopic bodies. And the meter has two markings of finite width. All macroscopic artifacts are difficult to transport (we developed special travel cases); they change by very small amounts over a hundred years through usage, absorption, desorption, heating, and cooling. In the amazing technological progress of the nineteenth and twentieth centuries, measuring time, mass, and length with precisions better than one in a billion has become possible. And measuring time can even be done a billion times better.
I still vividly remember when, after we had made and delivered the new meter and the mass prototypes, Lavoisier said, “Never has anything grander and simpler and more coherent in all its parts come from the hands of man.” And I still feel so today.
Our goal was to make units that truly belonged to everyone. “For all time, for all people” was our motto. We put copies of the meter all over Paris to let everybody know how long it was. (If you have not done so, next time you visit Paris, make sure to visit the mètre étalon near the Luxembourg Palace.) Here is a picture I recently found, showing an interested German tourist studying the history of one of the few remaining mètres étalons:
It was an exciting time (even if I was no longer around when the committee’s work was done). Our units served many European countries well into the nineteenth and large parts of the twentieth century. We made the meter, the second, and the kilogram. Four more base units (the ampere, the candela, the mole, and the kelvin) have been added since our work. And with these extensions, the metric system has served mankind very well for 200+ years.
How the metric system took off after 1875, the year of the Metre Convention, can be seen by plotting how often the words kilogram, kilometer, and kilohertz appear in books:
We defined only the meter, the second, the liter, and the kilogram. Today many more named units belong to the SI: becquerel, coulomb, farad, gray, henry, hertz, joule, katal, lumen, lux, newton, ohm, pascal, siemens, sievert, tesla, volt, watt, and weber. Here is a list of the dimensional relations (no physical meaning implied) between the derived units:
Many new named units have been added since my death, often related to electrical and magnetic phenomena that were not yet known when I was alive. And although I am a serious person in general, I am often open to a joke or a pun; I just don't like it when fun is made of units. Take Don Knuth's Potrzebie system of units, with units such as the potrzebie, ngogn, blintz, whatmeworry, cowznofski, vreeble, hoo, and hah. Not only are their names nonsensical, but so are their values:
Or look at Max Pettersson’s proposal for units for biology. The names of the units and the prefixes might sound funny, but for me units are too serious a subject to make fun of:
These unit names do not even rhyme with any of the proper names:
To reiterate, I am all in favor of having fun, even with units, but it must be clear that it is not meant seriously:
And explicitly nonscientific units, such as helens for beauty, puppies for happiness, or darwins for fame, are fine with me:
I am so proud that the SI units are not just dead paper symbols, but tools that govern the modern world in an ever-increasing way. Although I am not a comics guy, I love the recent promotion of the base units to superheroes by the National Institute of Standards and Technology:
Note that, to honor the contributions of the five great mathematicians to the metric system, the curves in the rightmost column of the unit-representing characters are given as mathematical formulas, e.g. for Dr. Kelvin we have the following purely trigonometric parametrization:
So we can plot Dr. Kelvin:
Having the characters in parametric form is handy: when my family has reunions, the little ones’ favorite activity is coloring SI superheroes. I just print the curves, and then the kids can go crazy with the crayons. (I got this idea a couple years ago from a coloring book by the NCSA.)
And whenever a new episode comes out, all us “measure men” (George Clooney, if you see this: hint, hint for an exciting movie set in the 1790s!) come together to watch it. As you can imagine, the last episode is our all-time favorite. Rumor has it up here that there will be a forthcoming book The Return of the Metrologists (2018 would be a perfect year) complementing the current book.
And I am glad to see that the importance of measuring and the underlying metric system is in modern times honored through the World Metrology Day on May 20, which is today.
In my lifetime, most of what people measured were goods: corn, potatoes, and other foods, wine, fabric, and firewood, etc. So all my country really needed were length, area, volume, angles, and, of course, time units. I always knew that the importance of measuring would increase over time. But I find it quite remarkable that only 200 years after I entered the heavenly spheres, hundreds and hundreds of different physical quantities are measured. Today even the International Organization for Standardization (ISO) lists, defines, and describes what physical quantities to use. Below is an image of an interactive Demonstration (download the notebook at the bottom of this post to interact with it) showing graphically the dimensions of physical quantities for subsets of selectable dimensions. First select two or three dimensions (base units). Then the resulting graphics show spheres with sizes proportional to the number of different physical quantities with these dimensions. Mouse over the spheres in the notebook to see the dimensions. For example, with “meter”, “second”, and “kilogram” checked, the diagram shows the units of physical quantities like momentum (kg^{1} m^{1} s^{–1}) or energy (kg^{1} m^{2} s^{–2}):
Here is an excerpt of the code that I used to make these graphics. These are all physical quantities that have the dimensions L^{2} M^{1} T^{–1}. The last one is a slightly exotic electrodynamic observable:
Today, with smartphones and wearable devices, a large number of physical quantities are measured all the time by ordinary people. “Measuring rules,” as I like to say. Or, as my (since 1907) dear friend William Thomson liked to say:
… when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science, whatever the matter may be.
Here is a graphical visualization of the physical quantities that are measured by the most common measurement devices:
Electrical and magnetic phenomena were just starting to become popular when I was around. Electromagnetic effects related to physical quantities that are expressed through the electric current only became popular much later:
I remember how excited I was when in the second half of the nineteenth century and the beginning of the twentieth century the various physical quantities of electromagnetism were discovered and their connections were understood. (And, not to be forgotten: the recent addition of memristance.) Here is a diagram showing the most important electric/magnetic physical quantities q_{k} that have a relation of the form q_{k}=q_{i} q_{j} with each other:
On the other hand, I was sure that temperature-related phenomena would soon be fully understood after my death. And indeed just 25 years later, Carnot proved that heat and mechanical work are equivalent. Now I also know about time dilation and length contraction due to Einstein’s theories. But mankind still does not know if a moving body is colder or warmer than a stationary body (or if they have the same temperature). I hear every week from Josiah Willard about the related topic of negative temperatures. And recently, he was so excited about a value for a maximal temperature for a given volume V expressed through fundamental constants:
For one cubic centimeter, the maximal temperature is about 5 PK:
Long after my physical death, some of the giants of physics of the nineteenth century and early twentieth century, foremost among them James Clerk Maxwell, George Johnstone Stoney, and Max Planck (and Gilbert Lewis) were considering units for time, length, and mass that were built from unchanging properties of microscopic particles and the associated fundamental constants of physics (speed of light, gravitational constant, electron charge, Planck constant, etc.):
Maxwell wrote in 1870:
Yet, after all, the dimensions of our Earth and its time of rotation, though, relative to our present means of comparison, very permanent, are not so by any physical necessity. The earth might contract by cooling, or it might be enlarged by a layer of meteorites falling on it, or its rate of revolution might slowly slacken, and yet it would continue to be as much a planet as before.
But a molecule, say of hydrogen, if either its mass or its time of vibration were to be altered in the least, would no longer be a molecule of hydrogen.
If, then, we wish to obtain standards of length, time, and mass which shall be absolutely permanent, we must seek them not in the dimensions, or the motion, or the mass of our planet, but in the wavelength, the period of vibration, and the absolute mass of these imperishable and unalterable and perfectly similar molecules.
When we find that here, and in the starry heavens, there are innumerable multitudes of little bodies of exactly the same mass, so many, and no more, to the grain, and vibrating in exactly the same time, so many times, and no more, in a second, and when we reflect that no power in nature can now alter in the least either the mass or the period of any one of them, we seem to have advanced along the path of natural knowledge to one of those points at which we must accept the guidance of that faith by which we understand that “that which is seen was not made of things which do appear.”
At the time when Maxwell wrote this, I was already a man’s lifetime up here, and when I read it I applauded him (although at this time I still had some skepticism toward all ideas coming from Britain). I knew that this was the path forward to immortalize the units we forged in the French Revolution.
There are many physical constants. And they are not all known to the same precision. Here are some examples:
Converting the values of constants with uncertainties into arbitrary-precision numbers is convenient for the following computations. The connection between the intervals and the number of digits is given as follows: the arbitrary-precision number that corresponds to v ± δ is the number v with precision –log_{10}(2 δ/v). Conversely, given an arbitrary-precision number, we can recover the v ± δ form:
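The conversion and its inverse are simple enough to sketch in Python (the notebook does this in the Wolfram Language; the G value and its uncertainty below are an assumed example in the CODATA 2014 style):

```python
import math

def to_precision(v, delta):
    """Number of significant digits corresponding to v +/- delta."""
    return -math.log10(2 * delta / abs(v))

def from_precision(v, precision):
    """Recover the half-width delta from a value and its precision."""
    return abs(v) * 10 ** (-precision) / 2

# Example: a gravitational constant value G = 6.67408(31) x 10^-11
v, delta = 6.67408e-11, 0.00031e-11
p = to_precision(v, delta)
print(f"precision: {p:.2f} digits")                 # roughly 4 digits
print(f"recovered delta: {from_precision(v, p):.2e}")  # round-trips to delta
```

The two functions are exact inverses of each other by construction, which is what makes the interval form and the precision form interchangeable.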
After the exactly defined constants, the Rydberg constant, with 11 known digits, stands out as a very precisely known constant. On the other end of the spectrum is G, the gravitational constant. At least once a month, Henry Cavendish stops at my place with yet another idea on how to build a tabletop device to measure G. Sometimes his ideas are based on cold atoms, sometimes on superconductors, and sometimes on high-precision spheres. If he could still communicate with the living, he would write a comment to Nature every week. A little over a year ago, Henry was worried that he should have done his measurements in winter as well as in summer, but he was relieved to see that no seasonal dependence of G's value seems to exist. The preliminary proposal deadline for the NSF's Big G Challenge was just four days ago. I think sometime next week I will take a heavenly peek at the program officer's preselected experiments.
There are more physical constants, and they are not all equal. Some are more fundamental than others, but for reasons of length I don’t want to get into a detailed discussion about this topic now. A good start for interested readers is Lévy-Leblond’s papers (also here), as well as this paper, this paper, and the now-classic Duff–Okun–Veneziano paper. For the purpose of making units from physical constants, the distinction of the various classes of physical constants is not so relevant.
The absolute values of the constants and their relations to heaven, hell, and Earth is an interesting subject on its own. It is a hot topic of discussion for mortals (also see this paper), as well as up here. Some numerical coincidences (?) are just too puzzling:
Of course, using modern mathematical algorithms, such as lattice reduction, we can indulge in the numerology of the numerical part of physical constants:
For instance, how can we form 𝜋 out of fundamental constant products?
Or let’s look at my favorite number, 10, the mathematical basis of the metric system:
And given a set of constants, there are many ways to form a quantity with a given unit. There are so many physical constants in use today that you have to be really interested to keep up with them. Here are some of the lesser-known constants:
Physical constants appear in so many equations of modern physics. Here is a selection of 100 simple physics formulas that contain the fundamental constants:
Of course, more complicated formulas also contain the physical constants. For instance, the gravitational constant appears (of course!) in the formula of the gravitational potentials of various objects, e.g. for the potential of a line segment and of a triangle:
My friend Maurits Cornelis Escher loves these kinds of formulas. He recently showed me some variations of a few of his 3D pictures that show the equipotential surfaces of all objects in the pictures by triangulating all surfaces, then using the above formula—like his Escher solid. The graphic shows a cut version of two equipotential surfaces:
I frequently stop by at Maurits Cornelis’, and often he has company—usually, it is Albrecht Dürer. The two love to play with shapes, surfaces, and polyhedra. They deform them, Kelvin-invert them, everse them, and more. Albrecht also likes the technique of smoothing with gravitational potentials, but he often does this with just the edges. Here is what a Dürer solid’s equipotential surfaces look like:
And here is a visualization of formulas that contain c^{α}·h^{β}·G^{γ} in the α–β–γ exponent space. The size of the spheres is proportional to the number of formulas containing c^{α}·h^{β}·G^{γ}; mousing over the balls in the attached notebook shows the actual formulas. We treat positive and negative exponents identically:
One of my all-time favorite formulas is for the quantum-corrected gravitational force between two bodies, which contains my three favorite constants: the speed of light, the gravitational constant, and the Planck constant:
Another of my favorite formulas is the one for the entropy of a black hole. It contains the Boltzmann constant in addition to c, h, and G:
And, of course, the second-order correction to the speed of light in a vacuum in the presence of an electric or magnetic field due to photon-photon scattering (ignoring a polarization-dependent constant). Even in very large electric and magnetic fields, the changes in the speed of light are very small:
In my lifetime, we did not yet understand the physical world well enough to have come up with the idea of natural units. That took until 1874, when Stoney proposed natural units for the first time in his lecture to the British Science Association. And then, in his 1906–07 lectures, Planck made extensive use of what are now called the Planck units, which he had already introduced in his famous 1900 article in Annalen der Physik. Unfortunately, both of these unit systems use the gravitational constant G prominently. It is a constant that we today cannot measure very accurately. As a result, the values of the Planck units in the SI are also known to only about four digits:
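The Planck units follow from simple dimensional combinations of c, ħ, and G. A short Python sketch (the constant values are assumed CODATA-style inputs; the notebook computes these in the Wolfram Language):

```python
import math

c = 299792458.0         # speed of light, m/s (exact by definition)
hbar = 1.054571817e-34  # reduced Planck constant, J s
G = 6.674e-11           # gravitational constant, m^3 kg^-1 s^-2 (~4 digits known)

planck_length = math.sqrt(hbar * G / c**3)  # ~1.6e-35 m
planck_time = math.sqrt(hbar * G / c**5)    # ~5.4e-44 s
planck_mass = math.sqrt(hbar * c / G)       # ~2.2e-8 kg

print(f"Planck length: {planck_length:.3e} m")
print(f"Planck time:   {planck_time:.3e} s")
print(f"Planck mass:   {planck_mass:.3e} kg")
```

Because G enters each of these under a square root, the roughly four known digits of G limit the SI values of the Planck units to about the same number of digits.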
These units were never intended for daily use because they are either far too small or far too large compared to the typical lengths, areas, volumes, and masses that humans deal with on a daily basis. But why not base the units of daily use on such unchanging microscopic properties?
(Side note: The funny thing is that in the last 20 years Max Planck again doubts if his constant h is truly fundamental. He had hoped in 1900 to derive its value from a semi-classical theory. Now he hopes to derive it from some holographic arguments. Or at least he thinks he can derive the value of h/k_{B} from first principles. I don’t know if he will succeed, but who knows? He is a smart guy and just might be able to.)
Many exact and approximate relations between fundamental constants are known today. Some more might be discovered in the future. One of my favorites is the following identity—within a small integer factor, is the value of the Planck constant potentially related to the size of the universe?
Another one is Beck’s formula, showing a remarkable coincidence (?):
Nevertheless, in my time we never thought it would be possible to express the height of a giraffe through the fundamental constants. So how amazed I was nearly ten years ago when, looking through the newly arrived arXiv preprints, I found a closed form for the height of the tallest running, breathing organism, derived by Don Page. Within a factor of two, he obtained the height of a giraffe (Brachiosaurus and Sauroposeidon don’t count because they can’t run) in terms of fundamental constants—I find this just amazing:
I should not have been surprised, as in 1983 Press, Lightman, Peierls, and Gold expressed the maximal running speed of a human (see also Press’ earlier paper):
In the same spirit, I really liked Burrows’ and Ostriker’s work on expressing the sizes of a variety of astronomical objects through fundamental constants only. For instance, for a typical galaxy mass we obtain the following expression:
This value is within a small factor of the mass of the Milky Way:
But back to units, and fast forward another 100+ years to the second half of the twentieth century: the idea of basing units on microscopic properties of objects gained more and more ground.
Since 1967, the second has been defined through 9,192,631,770 periods of the light from the transition between the two hyperfine levels of the ground state of the cesium-133 atom, and the meter has been defined since 1983 as the distance light travels in one second when we define the speed of light as the exact quantity 299,792,458 meters per second. To be precise, this definition is to be realized at rest, at a temperature of 0 K, and at sea level, as motion, temperature, and the gravitational potential influence the oscillation period and (proper) time. Ignoring the sea-level condition can lead to significant measurement errors; the center of the Earth, for instance, is about 2.5 years younger than its surface due to differences in the gravitational potential.
Now, these definitions of the units second and meter are truly equal for all people. Equal not just for people on Earth right now, but also for people in the future, and for any alien far, far away from Earth. (One day, the 9,192,631,770 periods of cesium might be replaced by a larger number of periods of another element, but that will not change its universal character.)
But if we wanted to ground all units in physical constants, which ones should we choose? There are often many, many ways to express a base unit through a set of constants. Using the constants from the table above, there are thirty (thirty!) ways to combine them to make a mass dimension:
Because of the varying precision of the constants, the combinations are also of varying precision (and of course, of different numerical values):
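Such combinations can be found mechanically by solving a small linear system for the exponents of each candidate set of constants. Here is a sketch in Python (the blog itself uses the Wolfram Language; the restricted constant list and the helper names below are my own illustration, not the blog's full table, so only the familiar Planck-mass combination (h c / G)^{1/2} survives):

```python
from fractions import Fraction as F
from itertools import combinations

# SI dimension vectors (M, L, T, I, Theta, N) -- an illustrative subset,
# not the full table of constants used in the blog
dims = {
    "c":  (0, 1, -1, 0, 0, 0),   # speed of light: m s^-1
    "h":  (1, 2, -1, 0, 0, 0),   # Planck constant: kg m^2 s^-1
    "G":  (-1, 3, -2, 0, 0, 0),  # gravitational constant: m^3 kg^-1 s^-2
    "e":  (0, 0, 1, 1, 0, 0),    # elementary charge: A s
    "kB": (1, 2, -2, 0, -1, 0),  # Boltzmann constant: kg m^2 s^-2 K^-1
    "NA": (0, 0, 0, 0, 0, -1),   # Avogadro constant: mol^-1
}

MASS = (1, 0, 0, 0, 0, 0)  # target dimension: plain kilogram

def solve(names):
    """Exact Gaussian elimination: exponents x with sum_i x_i * dim_i = MASS,
    or None if the system is inconsistent."""
    a = [[F(dims[n][r]) for n in names] + [F(MASS[r])] for r in range(6)]
    piv, row = [], 0
    for col in range(len(names)):
        p = next((r for r in range(row, 6) if a[r][col] != 0), None)
        if p is None:
            continue
        a[row], a[p] = a[p], a[row]
        a[row] = [v / a[row][col] for v in a[row]]
        for r in range(6):
            if r != row and a[r][col] != 0:
                a[r] = [u - a[r][col] * v for u, v in zip(a[r], a[row])]
        piv.append(col)
        row += 1
    if any(a[r][-1] != 0 for r in range(row, 6)):
        return None              # inconsistent: no combination gives a mass
    x = [F(0)] * len(names)
    for r, c in enumerate(piv):
        x[c] = a[r][-1]
    return x

hits = {names: sol for names in combinations(sorted(dims), 3)
        if (sol := solve(names)) is not None}
for names, sol in hits.items():
    print(names, sol)            # e.g. the Planck mass: G^-1/2 c^1/2 h^1/2
```

With the blog's larger table of constants, the same elimination yields the thirty mass combinations mentioned above.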
Now the question is: which constants should be selected to define the units of the metric system? Many aspects, from precision to practicality to overall coherence (meaning there is no need for various prefactors in equations to compensate for unit factors), must be kept in mind. We want our formulas to look like F = m a, rather than containing explicit numbers such as in the Thanksgiving turkey cooking time formulas (assuming a spherical turkey):
Or in the PLANK formula (Max hates this name) for the calculation of indicated horsepower:
Here in the clouds of heaven, we can’t use physical computers, so I am glad that I can use the more virtual Wolfram Open Cloud to do my calculations and mathematical experimentation. I have played for many hours with the interactive units-constants explorer below, and agree fully with the choices made by the International Bureau of Weights and Measures (BIPM), meaning the speed of light, the Planck constant, the elementary charge, the Avogadro constant, and the Boltzmann constant. I showed a preliminary version of this blog to Edgar, and he was very pleased to see this table based on his old paper:
I want to mention that the most popular physical constant, the fine-structure constant, is not really useful for building units. Just by its special status as a unitless physical quantity, it can’t be directly connected to a unit. But it is, of course, one of the most important physical constants in our universe (and is probably only surpassed by the simple integer constant describing how many spatial dimensions our universe has). Often various dimensionless combinations can be found from a given set of physical constants because of relations between the constants, such as c^{2}=1/(ε_{0} μ_{0}). Here are some examples:
But there is probably no other constant that Paul Adrien Maurice Dirac and I have discussed more over the last 32 years than the fine-structure constant α=e^{2}/(4 𝜋 ε_{0} ħ c). Although up here we meet with the Lord regularly in a friendly and productive atmosphere, he still refuses to tell us a closed form of α. And he will not even tell us if he selected the same value for all times and all places. For the related topic of the values of the constants chosen, he also refuses to discuss fine tuning and alternative values. He says that he chose a beautiful expression, and one day we will find out. He gave some bounds, but they were not much sharper than the ones we know from the Earth’s existence. So, like living mortals, for now we must just guess mathematical formulas:
Or guess combinations of constants:
And here is one of my favorite coincidences:
And a few more:
The rise in importance and usage of the physical constants is nicely reflected in the scientific literature. Here is a plot of how often (in publications per year) the most common constants appear in scientific publications from the publishing company Springer. The logarithmic vertical axis shows the exponential increase in how often physical constants are mentioned:
While the fundamental constants are everywhere in physics and chemistry, one does not see them as much as they deserve in newspapers, movies, or advertisements. I was very pleased to see the recent introduction of the Measures for Measure column in Nature.
To give the physical constants the presence they deserve, I hope that before (or at least not long after) the redefinition we will see some interesting video games released that allow players to change the values of at least c, G, and h to see how the world around us would change if the constants had different values. It makes me want to play such a video game right now. With large values of h, not only could one build a world with macroscopic Schrödinger cats, but interpersonal correlations would also become much stronger. This could make the constants known to children at a young age. Such a video game would be a kind of twenty-first-century Mr. Tompkins adventure:
It will be interesting to see how quickly and efficiently the human brain will adapt to a possible life in a different universe. Initial research seems to be pretty encouraging. But maybe our world and our heaven are really especially fine-tuned.
The modern system of units, the current SI, has, in addition to the second, the meter, and the kilogram, four other units. The ampere is defined through the force between two infinitely long wires, the kelvin through the triple point of water, the mole through the kilogram and carbon-12, and the candela through blackbody radiation. If you have never read the SI brochure, I strongly encourage you to do so.
Two infinitely long wires are surely macroscopic objects and do not fulfill Maxwell’s demand (though they are at least an idealized system), and de facto this definition fixes the magnetic constant. And the triple point of water needs a macroscopic amount of water. This is not perfect, but it’s OK. Carbon-12 atoms are already microscopic objects. Blackbody radiation is again an ensemble of microscopic objects, but a very reproducible one. So some of the current SI fulfills, in some sense, Maxwell’s goals.
But most of my insomnia over the last 50 years has been caused by the kilogram. It caused me real headaches, and sometimes even nightmares, when we could not put it on the same level as the second and the meter.
In the year of my physical death (1799), the first prototype of a kilogram, a little platinum cylinder, was made. About 39.7 mm in height and 39.4 mm in diameter, this was for 75 years “the” kilogram. It was made from the forged platinum sponge made by Janety. Miller gives a lot of the details of this kilogram. It is today in the Archives nationales. In 1879, Johnson Matthey (in Britain—the country I fought with my ships!), using new melting techniques, made the material for three new kilogram prototypes. Because of a slightly higher density, these kilograms were slightly smaller in size, at 39.14 mm in height. The cylinder called KIII became the current international prototype kilogram. Here is the last sentence from the preface of the mass determination of the international prototype kilogram from 1885, introducing it:
A few kilograms were selected and carefully compared to our original kilogram; for the detailed measurements, see this book. All three kilograms had a mass less than 1 mg different from the original kilogram. But one stood out: it had a mass difference of less than 0.01 mg compared to the original kilogram. For a detailed history of the making of KIII, see Quinn. And so, still today, per definition, a kilogram is the mass of a small metal cylinder sitting in a safe at the International Bureau of Weights and Measures near Paris. (Technically it’s actually not on French soil, but that is another issue.) In the safe, which needs three keys to be opened, under three glass domes, is a small platinum-iridium cylinder that defines what a kilogram is. For the reader’s geographical orientation, here is a map of Paris with the current kilogram prototype (in the southwest), our original one (in the northeast), both with a yellow border, and some other Paris visitor essentials:
In addition to being an artifact, the kilogram was always very difficult to get access to (which made me unhappy). Once a year, a small group of people checks if it is still there, and every few years its weight (mass) is measured. Of course, the result is, per definition and the agreement made at the first General Conference on Weights and Measures in 1889, exactly one kilogram.
Over the years the original kilogram prototype gained dozens of siblings in the form of other countries’ national prototypes, all of the same size, material, and weight (up to a few micrograms, which are carefully recorded). (I wish the internet had been invented earlier, so that I had a way to learn what happened with the stolen Argentine prototype 45; since then, it has been melted down.) At least when they were made, they all had the same weight. Same material, same size, similarly stored—one would expect that all these cylinders would keep their weight. But this is not what history showed. Rather than all staying at the same weight, repeated measurements showed that virtually all other prototypes got heavier and heavier over the years. Or, more probably, the international prototype has gotten lighter.
From my place here in heaven I have watched many of these comparisons with both great interest and concern. Comparing their weights (a.k.a. masses) is a big ordeal. First you must get the national prototypes to Paris. I have silently listened in on long discussions with TSA members (and other countries’ equivalents) when a metrologist comes with a kilogram of platinum, worth north of $50k in materials—and add another $20k for the making—in its cute, golden, shiny, special travel container (which should only be opened in a clean room with gloves and mouth guard, and never ever touched by a human hand), and explains all of this to the TSA. An official letter is of great help here. The instances that I have watched from up here were even funnier than the scene in the movie 1001 Grams.
Then comes a complicated cleaning procedure with hot water, alcohol, and UV light. The kilograms all lose weight in this process. And they are all carefully compared with each other. And the result is that with very high probability, “the” kilogram, our beloved international prototype kilogram (IPK), loses weight. This fact steals my sleep.
Here are the results from the third periodic verification (1988 to 1992). The graphic shows the weight difference compared to the international prototype:
For some newer measurements from the last two years, see this paper.
What I mean by “the” kilogram losing weight is the following. Per definition (independent of its “real objective” mass), the international prototype has a mass of exactly 1 kg. Compared with this mass, most other kilogram prototypes of the world seem to gain weight. As the other prototypes were made, using different techniques over more than 100 years, very likely the real issue is that the international prototype is losing weight. (And no, it is not because of Ceaușescu’s greed and theft of platinum that Romania’s prototype is so much lighter; in 1889 the Romanian prototype was already 953 μg lighter than the international prototype kilogram.)
Josiah Willard Gibbs, who has been my friend up here for more than 110 years, always mentions that his home country is still using the pound rather than the kilogram. His vote in this year’s election would clearly go to Bernie. But at least the pound is an exact fraction of the kilogram, so anything that will happen to the kilogram will affect the pound the same way:
But soon all my dreams and centuries-long hopes will come true and I can find sleep again. In 2018, two years from now, the greatest change in the history of units and measures since my work with my friend Laplace and the others will happen.
All units will be based on things that are accessible to everybody everywhere (assuming access to some modern physical instruments and devices).
The so-called new SI will reduce all of the seven base units to seven fundamental constants of physics or basic properties of microscopic objects. Down on Earth, they started calling them “reference constants.”
Some people also call the new SI the quantum SI because of its dependence on the Planck constant h and the elementary charge e. In addition to the importance of the Planck constant h in quantum mechanics, the following two quantum effects connect h and e: the Josephson effect, with its associated Josephson constant K_{J} = 2 e / h, and the quantum Hall effect, with the von Klitzing constant R_{K} = h / e^{2}. The quantum metrological triangle (connecting frequency and electric current through a single-electron tunneling device, frequency and voltage through the Josephson effect, and voltage and electric current through the quantum Hall effect) will be a beautiful realization of the electric quantities. (One day in the future, as Penin has pointed out, we will have to worry about second-order QED effects, but this will be many years from now.)
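The two sides of the triangle determine h and e numerically: algebraically, K_{J}^{2} R_{K} = 4/h and K_{J} R_{K} = 2/e. A quick numerical check (a Python sketch with 2014 CODATA values, outside the blog's Wolfram Language; variable names are mine):

```python
# 2014 CODATA values
h = 6.626070040e-34     # Planck constant, J s
e = 1.6021766208e-19    # elementary charge, C

KJ = 2 * e / h          # Josephson constant, ~ 4.836e14 Hz/V
RK = h / e**2           # von Klitzing constant, ~ 25,812.8 ohms
print(KJ, RK)

# the two effects together pin down h and e:
# KJ^2 * RK = 4/h   and   KJ * RK = 2/e
print(4 / (KJ**2 * RK))   # recovers h
print(2 / (KJ * RK))      # recovers e
```

The recovered values agree with the inputs up to floating-point roundoff, which is why measuring the two solid-state effects fixes both constants at once.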
The BIPM already has a new logo for the future International System of Units:
Concretely, the proposal is:
1. The second will continue to be defined through an exactly defined cesium hyperfine transition frequency.
2. The meter will continue to be defined through an exactly defined speed of light.
3. The kilogram will be defined through an exactly defined value of the Planck constant.
4. The ampere will be defined through an exactly defined value of the elementary charge.
5. The kelvin will be defined through an exactly defined value of the Boltzmann constant.
6. The mole will be defined through an exact (counting) value.
7. The candela will be defined through an exact value of the candela steradian-to-watt ratio at a fixed frequency (already now the case).
I highly recommend a reading of the draft of the new SI brochure. Laplace and I have discussed it a lot here in heaven, and (modulo some small issues) we love it. Here is a quick word cloud summary of the new SI brochure:
Before I forget, and before continuing the kilogram discussion, some comments on the other units.
I still remember when we discussed introducing metric time in the 1790s: a 10-hour day, with 100 minutes per hour, and 100 seconds per minute, and we were so excited by this prospect. In hindsight, this wasn’t such a good idea. The habits of people are sometimes too hard to change. And I am so glad I could get Albert Einstein interested in the whole subject of metrology over the past 50 years. We have had so many discussions about the meaning of time, the fact that the second measures local time, and the difference between measurable local time and coordinate time. But this is a discussion for another day. The uncertainty of a second is today less than 10^{−16}. Maybe one day in the future, cesium will be replaced by aluminum or other elements to achieve 100 to 1,000 times smaller uncertainties. But this does not alter the spirit of the new SI; it’s just a small technical change. (For a detailed history of the second, see this article.)
Clearly, today’s definition of second is much better than one that depends on the Earth. At a time when stock market prices are compared at the microsecond level, the change of the length of a day due to earthquakes, polar melting, continental drift, and other phenomena over a century is quite large:
I have heard some chemists complain that their beloved unit, the mole, introduced into the SI only in 1971, will become trivialized. In the currently used SI, the mole relates to an actual chemical, carbon-12. In the new SI, it will be just a count of objects: a true chemical equivalent of a baker’s dozen, the chemist’s dozen. Based on the Avogadro constant, the mole is crucial in connecting the micro world with the macro world. A more down-to-Earth definition of the mole matters for quantitative values such as pH. The second is the SI base unit of time; the mole is the SI base unit of the physical quantity “amount of substance”:
But not everybody likes the term “amount of substance.” Even this year (2016), alternative names are being proposed, e.g. stoichiometric amount. Over the last decades, a variety of names have been proposed to replace “amount of substance.” Here are some examples:
But the SI system only defines the unit “mole.” The naming of the physical quantity that is measured in moles is up to the International Union of Pure and Applied Chemistry.
For recent discussions from this year, see the article by Leonard, “Why Is ‘Amount of Substance’ So Poorly Understood? The Mysterious Avogadro Constant Is the Culprit!”, and the article by Giunta, “What’s in a Name? Amount of Substance, Chemical Amount, and Stoichiometric Amount.”
Wouldn’t it be nice if we could have made a “perfect cube” (number) that would represent the Avogadro number? Such a representation would be easy to conceptualize. This was suggested a few years back, and at the time was compatible with the value of the Avogadro constant, and would have been a cube of edge length 84,446,888 items. I asked Srinivasa Ramanujan, while playing a heavenly round of cricket with him and Godfrey Harold Hardy, his longtime friend, what’s special about 84,446,888, but he hasn’t come up with anything deep yet. He said that 84,446,888=2^3*17*620933, and that 620,933 appears starting at position 1,031,622 in the decimal digits of 𝜋, but I can’t see any metrological relevance in this. With the latest value of the Avogadro constant, no third power of an integer number falls into the possible values, so no wonder there is nothing special.
Here is the latest CODATA (Committee on Data for Science and Technology) value from the NIST Reference on Constants, Units, and Uncertainty:
The candidate number 84,446,885 cubed is too small, and adding one gives too large a number:
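This check can be redone with exact integer arithmetic (a Python sketch rather than the blog's Wolfram Language, using the 2014 CODATA value 6.022 140 857(74) × 10^{23}; the variable names are mine):

```python
# 2014 CODATA Avogadro constant, 6.022 140 857(74) x 10^23 / mol,
# kept as exact integers to avoid any floating-point rounding
NA = 602214085700000000000000     # central value
U  = 7400000000000000             # one standard uncertainty
lo, hi = NA - U, NA + U

m = round(NA ** (1 / 3))          # float estimate of the cube edge, ~ 84,446,885
candidates = {k: k**3 for k in (m - 1, m, m + 1)}   # exact integer cubes
for k, cube in candidates.items():
    if cube < lo:
        verdict = "too small"
    elif cube > hi:
        verdict = "too large"
    else:
        verdict = "fits"
    print(f"{k:,}^3 = {cube:,} -> {verdict}")
```

The float cube root is accurate to far better than one unit in the edge length, so checking the three neighboring integers exactly settles the question: no perfect cube lands inside the uncertainty interval.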
Interestingly, if we settled for a body-centered lattice, with one additional atom per unit cell, then we could still maintain a cube interpretation:
A face-centered lattice would not work, either:
But a diamond (silicon) lattice would work:
To summarize:
Here is a little trivia:
Sometime at the height of the Cold War, the accepted value of the Avogadro constant suddenly changed in the third digit! This was quite a change, considering that there is currently a lingering controversy regarding a discrepancy in the sixth digit. Can you explain the sudden decrease in the Avogadro constant during the Cold War?
Do you know the answer? If not, see here or here.
But I am digressing from my main thread of thought. As I am more interested in the mechanical units anyway, I will let my old friend Antoine Lavoisier judge the new mole definition, as he was the chemist on our team.
Josiah Willard Gibbs even convinced me that temperature should be defined mechanically. I am still trying to understand John von Neumann’s opinion on this subject, but because I never fully understand his evening lectures on type II and type III factors, I don’t have a firm opinion on the kelvin. Different temperatures correspond to inequivalent representations of the algebras. As I am currently still working my way through Ruetsche’s book, I haven’t made up my mind on how to best define the kelvin from an algebraic quantum field theory point of view. I had asked John for his opinion on a first-principles evaluation of h/k_{B} based on KMS states and Tomita–Takesaki theory, and even he wasn’t sure about it. He told me some things about thermal time and diamond temperature that I didn’t fully understand.
And then there is the possibility of deriving the value of the Boltzmann constant. Even 40 years after the Koppe–Huber paper, it is not clear if it is possible. It is a subject I am still pondering, and I am taking various options into account. As mentioned earlier, the meaning of temperature and how to define its units are not fully clear to me. There is no question that the new definition of the kelvin will be a big step forward, but I don’t know if it will be the end of the story.
This is one of the most direct, intuitive, and beautiful definitions in the new SI: the current is just the number of electrons that flow per second. Defining the value of the ampere through the number of elementary charges moved around is just a stroke of genius. When it was first suggested, Robert Andrews Millikan up here was so happy he invited many of us to an afternoon gathering in his yard. In practice (and in theoretical calculations), we have to exercise a bit more care, as we mainly measure the electric current of electrons in crystalline objects, and electrons are no longer “bare” electrons, but quasiparticles. But we’ve known since 1959, thanks to Walter Kohn, that we shouldn’t worry too much about this, and expect the charge of the electron in a crystal to be the same as the charge of a bare electron. As an elementary charge is a pretty small charge, the issue of measuring fractional charges as currents is not a practical one for now. I personally feel that Robert’s contributions to determining the values of the physical constants at the beginning of the twentieth century are not pointed out enough (Robert Andrews really knew what he was doing).
No, you will not get me started on my opinion of the candela. Does it deserve to be a base unit? The whole story of human-centered physiological units is a complicated one. Obviously they are enormously useful. We all see and hear every day, even every second. But what if the human race continues to develop (in Darwin’s sense)? How will such units fit together with our “for all time” mantra? I have my thoughts on this, but laying them out here and now would sidetrack me from my main discussion topic for today.
I also want to mention that originally I was very concerned about the introduction of some of the additional units that are in use today. In endless discussions with my chess partner Carl Friedrich Gauss here in heaven, he had originally convinced me that we can reduce all measurements of electric quantities to measurements of mechanical properties, and I was already pretty fluent in his CGS system, so at first I did not like the additional units at all. But as a human-created unit system, it should be as useful as possible, and if seven units do the job best, it should be seven. In principle one could even eliminate a mass unit and express a mass through time and length. In addition to just being impractical, I strongly believe this is conceptually not the right approach. I recently discussed this with Carl Friedrich. He said he had the idea of just using time and length in the late 1820s, but abandoned such an approach. While alive, Carl Friedrich never had the opportunity to discuss the notion of mass as a synthetic a priori with Immanuel, but over the last century the two (Carl Friedrich and Immanuel) have agreed on mass as an a priori (at least in this universe).
Our motto for the original metric system was, “For all time, for all people.” The current SI already realizes “for all people,” and by grounding the new SI in the fundamental constants of physics, the first promise, “for all time,” will finally become true. You cannot imagine what this means to me. If they change at all, fundamental constants seem to change at rates of at most about 10^{–18} per year. This is many orders of magnitude away from the currently realized precisions for most units.
Granted, some things will get a bit more numerically cumbersome in the new SI. If we take the current CODATA values as exact values, then, for instance, the von Klitzing constant h/e^{2} will be a big fraction:
The integer part of the last result is, of course, 25,812 (in ohms). Now, is this a periodic decimal fraction or a terminating fraction? The prime factorization of the denominator tells us that it is periodic:
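The same check can be sketched with exact rational arithmetic in Python (the blog itself uses the Wolfram Language; the 2014 CODATA decimal values are taken as exact, and the variable names are mine). A reduced fraction has a terminating decimal expansion exactly when its denominator contains no prime factors other than 2 and 5:

```python
from fractions import Fraction

# 2014 CODATA values, taken as exact decimal numbers
h = Fraction(6626070040, 10**43)    # Planck constant, 6.626 070 040 x 10^-34 J s
e = Fraction(16021766208, 10**29)   # elementary charge, 1.602 176 6208 x 10^-19 C

rk = h / e**2                       # von Klitzing constant h/e^2, in ohms
print(rk.numerator, "/", rk.denominator)
print("integer part:", int(rk))     # 25812

# strip all factors of 2 and 5 from the (already reduced) denominator;
# anything left over means the decimal expansion is periodic
d = rk.denominator
for p in (2, 5):
    while d % p == 0:
        d //= p
print("periodic decimal expansion:", d != 1)
```

Since `Fraction` reduces automatically, the leftover factor in the denominator proves the expansion repeats rather than terminates.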
Progress is good, but as happens so often, it comes at a price. While the new constant-based definitions of the SI units are beautiful, they are a bit harder to understand, and physics and chemistry teachers will have to come up with some innovative ways to explain the new definitions to pupils. (For recent first attempts, see this paper and this paper.)
And in how many textbooks have I seen that the value of the magnetic constant (permeability of the vacuum) μ_{0} is 4 𝜋 × 10^{–7} N / A^{2}? In the new SI, the magnetic and the electric constants will become measured quantities with an error term. Concretely, from the current exact value:
With the Planck constant h and the elementary charge e both defined exactly, the value of μ_{0} would incur the uncertainty of the fine-structure constant α. Fortunately, the dimensionless fine-structure constant α is one of the best-known constants:
But so what? Textbook publishers will not mind having a reason to print new editions of all their books; after all, it gives them a reason to sell more new books.
With μ_{0} becoming a measured quantity in the future, I predict we will see many more uses of the current underdog among the fundamental constants, the impedance of the vacuum Z:
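The chain of relations can be sketched numerically (a Python sketch with 2014 CODATA values, rather than the blog's Wolfram Language; variable names are mine). From α = e^{2}/(4 𝜋 ε_{0} ħ c) and μ_{0} = 1/(ε_{0} c^{2}) one gets μ_{0} = 2 α h/(c e^{2}), and the vacuum impedance is Z = μ_{0} c:

```python
import math

# 2014 CODATA values (c, h, and e will be exact in the new SI)
c     = 299792458               # speed of light, m/s (exact)
h     = 6.626070040e-34         # Planck constant, J s
e     = 1.6021766208e-19        # elementary charge, C
alpha = 7.2973525664e-3         # fine-structure constant (measured)

# in the new SI the magnetic constant becomes a derived, measured quantity
mu0 = 2 * alpha * h / (c * e**2)
print(mu0)                      # ~ 1.2566e-6 N/A^2

# relative deviation from the old exact value 4 pi x 10^-7
old = 4 * math.pi * 1e-7
print(abs(mu0 - old) / old)     # tiny: the uncertainty of alpha

# impedance of the vacuum
Z = mu0 * c
print(Z)                        # ~ 376.73 ohms
```

The deviation from 4 𝜋 × 10^{–7} is at the level of the rounding of the input constants, which is exactly the point: in the new SI, μ_{0} inherits the (very small) uncertainty of α.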
I applaud all physicists and metrologists for the hard work they’ve carried out in continuation of my committee’s work over the last 225 years, which culminated in the new, physical constant-based definitions of the units. So do my fellow original committee members. These definitions are beautiful and truly forever.
(I know it is a bit indiscreet to reveal this, but Joseph Louis Lagrange told me privately that he regrets a bit that we did not introduce base and derived units as such in the 1790s. Now, with the Planck constant being so important for the new SI, he thought we should have had a named base unit for the action (the time integral over his Lagrangian), and then made mass a derived quantity. While this would be the high road of classical mechanics, he does understand that a base unit for the action would not have become popular with farmers and peasants as a daily unit needed for masses.)
I don’t have the time today to go into any detailed discussion of the quarterly garden fests that Percy Williams Bridgman holds. As my schedule allows, I try to participate in every single one of them. It is also so intellectually stimulating to listen to the general discussions about the pros and cons of alternative unit systems. As you can imagine, Julius Wallot, Jan de Boer, Edward Guggenheim, William Stroud, Giovanni Giorgi, Otto Hölder, Rudolf Fleischmann, Ulrich Stille, Hassler Whitney, and Chester Page are, not unexpectedly, most outspoken at these parties. The discussion about coherence and completeness of unit systems and what is a physical quantity go on and on. At the last event, the discussion of whether probability is or is not a physical quantity went on for six hours, with no decision at the end. I suggested inviting Richard von Mises and Hans Reichenbach the next time. They might have something to contribute. At the parties, Otto always complains that mathematicians do not care enough anymore about units and unit systems as they did in the past, and he is so happy to see at least theoretical physicists pick up the topic from time to time, like the recent vector-based differentiation of physical quantities or the recent paper on the general structure of unit systems. And when he saw in an article from last year’s Dagstuhl proceedings that modern type theory met units and physical dimensions, he was the most excited he had been in decades.
Interestingly, basically the same discussions came up three years ago (and since then regularly) in the monthly mountain walks that Claude Shannon organizes. Leo Szilard argues that the “bit” has to become a base unit of the SI in the future. In his opinion, information as a physical quantity has been grossly underrated.
Once again: the new SI will be just great! There are a few more details that I would like to see changed. One is the current status of the radian and the steradian, which SP 811 now defines as derived units, saying, “The radian and steradian are special names for the number one that may be used to convey information about the quantity concerned.” But I see with satisfaction that the experts have recently been discussing this topic in quite some detail.
To celebrate the upcoming new SI here in heaven, we held a crowd-based fundraiser. We raised enough funds to actually hire the master himself, Michelangelo. He will be making a sculpture. Some early sketches shown to the committee (I am fortunate to have the honorary chairmanship) are intriguing. I am sure it will be an eternal piece rivaling the David. One day every human will have the chance to see it (may it be a long time until then, dependent on your current age and your smoking habits). In addition to the constants and the units on their own, he plans to also work Planck himself, Boltzmann, and Avogadro into the sculpture, as theirs are the only three constants named after a person. Max was immediately available to model, but we are still having issues getting permission for Boltzmann to leave hell for a while to be a model. (Millikan and Fletcher were, understandably, a bit disappointed.) Ironically, it was Paul Adrien Maurice Dirac who came up with a great idea on how to convince Lucifer to get Boltzmann a Sabbath-ical. Ironically—because Paul himself is not so keen on the new SI because of the time dependence of the constants themselves over billions of years. But anyway, Paul’s clever idea was to point out that three fundamental constants, the Planck constant (6.62… × 10^{–34} J · s), the Avogadro constant (6.02… × 10^{23} / mol), and the gravitational constant (6.6… × 10^{–11} m^{3} / (kg · s^{2})), all start with the digit 6. And forming the number of the beast, 666, through three fundamental constants really made an impression on Lucifer, and I expect him to approve Ludwig’s temporary leave.
As an ex-mariner with an affinity for the oceans, I also pointed out to Lucifer that the mean ocean depth is exactly 66% of his height (2,443 m, according to a detailed re-analysis of Dante’s Divine Comedy). He liked this cute fact so much that he owes me a favor.
So far, Lucifer insists on having the combination G(m_{e} / (h k))^{1/2} on the sculpture. For obvious reasons:
We will see how this discussion turns out. As there is really nothing wrong with this combination, even if it is not physically meaningful, we might agree to his demands.
All of the new SI 2018 committee up here has also already agreed on the music: we will play Wojciech Kilar’s Sinfonia de motu, which uniquely represents the physical constants as a musical composition using only the notes c, g, e, h (b-flat in the English-speaking world), and a (where a represents the cesium atom). And we could convince Rainer Maria Rilke to write a poem for the event. Needless to say, Wojciech, who has now been with us for more than two years, agreed, and even offered to compose an exact version.
Down on Earth, the arrival of the constants-based units will surely also be celebrated in many ways and many places. I am looking forward especially to the documentary The State of the Unit, which will be about the history of the kilogram and its redefinition through the Planck constant.
As I already touched on, the most central point of the new SI will be the new definition of the kilogram. After all, the kilogram is the one artifact still present in the current SI that should be eliminated. In addition to the kilogram itself, many more derived units depend on it, say, the volt: 1 volt = 1 kilogram meter^{2}/(ampere second^{3}). Redefining the kilogram will make many (at least the theoretically inclined) electricians happy. Electricians have been using their own exact conventional values for 25 years.
The value resulting from the conventional values for the von Klitzing constant and the Josephson constant is very near to the latest CODATA value of the Planck constant:
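As a sketch of that arithmetic (not the post’s own code, which is in the Wolfram Language), the conventional 1990 electrical values imply a Planck constant via the standard relation h = 4 / (K_J² R_K), which follows from K_J = 2e/h and R_K = h/e²:

```python
# Conventional (1990) electrical metrology values.
K_J_90 = 483597.9e9   # conventional Josephson constant, Hz/V
R_K_90 = 25812.807    # conventional von Klitzing constant, ohm

# K_J^2 * R_K = (2e/h)^2 * (h/e^2) = 4/h, so:
h_90 = 4 / (K_J_90**2 * R_K_90)
print(h_90)  # ≈ 6.62607e-34 J·s, very near the CODATA value
```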
A side note on the physical quantity that the kilogram represents: The kilogram is the SI base unit for the physical quantity mass. Mass is most relevant for mechanics. Through Newton’s second law, F = m a, mass is intimately related to force. Assume we have understood length and time (and so also acceleration). What is next in line, force or mass? William Francis Magie wrote in 1912:
It would be very improper to dogmatize, and I shall accordingly have to crave your pardon for a frequent expression of my own opinion, believing it less objectionable to be egotistic than to be dogmatic…. The first question which I shall consider is that raised by the advocates of the dynamical definition of force, as to the order in which the concepts of force and mass come in thought when one is constructing the science of mechanics, or in other words, whether force or mass is the primary concept…. He [Newton] further supplies the measurement of mass as a fundamental quantity which is needed to establish the dynamical measure of force…. I cannot find that Lagrange gives any definition of mass…. To get the measure of mass we must start with the intuitional knowledge of force, and use it in the experiments by which we first define and then measure mass…. Now owing to the permanency of masses of matter it is convenient to construct our system of units with a mass as one of the fundamental units.
And Henri Poincaré in his Science and Method says, “Knowing force, it is easy to define mass; this time the definition should be borrowed from dynamics; there is no way of doing otherwise, since the end to be attained is to give understanding of the distinction between mass and weight. Here again, the definition should be led up to by experiments.”
While I always had an intuitive feeling for the meaning of mass in mechanics, until the middle of the twentieth century I was never able to put it into a crystal-clear statement. Only over the last decades, with the help of Valentine Bargmann and Jean-Marie Souriau, did I fully understand the role of mass in mechanics: mass is an element of the second cohomology group of the Lie algebra of the Galilei group.
Mass as a physical quantity manifests itself in different domains of physics. In classical mechanics it is related to dynamics, in general relativity to the curvature of space, and in quantum field theory mass occurs as one of the Casimir operators of the Poincaré group.
In our weekly “Philosophy of Physics” seminar, this year led by Immanuel himself, Hans Reichenbach, and Carl Friedrich von Weizsäcker (Pascual Jordan suggested this Dreimännerführung of the seminars), we discuss the nature of mass in five seminars. The topics for this year’s series are mass superselection rules in nonrelativistic and relativistic theories, the concept and uses of negative mass, mass-time uncertainty relations, non-Higgs mechanisms for mass generation, and mass scaling in biology and sports. I need at least three days of preparation for each seminar, as the recommended reading list is more than nine pages—and this year they emphasize the condensed matter appearance of these phenomena a lot! I am really looking forward to this year’s mass seminars; I am sure that I will learn a lot about the nature of mass. I hope Ehrenfest, Pauli, and Landau don’t constantly interrupt the speakers, as they did last year (the talk on mass in general relativity was particularly bad). In the last seminar of the series, I have to give my talk. In addition to metabolic scaling laws, my favorite example is the following:
I also intend to speak about the recently found predator-prey power laws.
For sports, I already have a good example inspired by Texier et al.: the relation between the mass of a sports ball and its maximal speed. The following diagram lets me conjecture speed_{max}~ln(mass). In the downloadable notebook, mouse over to see the sport, the mass of the ball, and the top speeds:
For the negative mass seminar, we had some interesting homework: visualize the trajectories of a classical point particle with complex mass in a double-well potential. As I had seen some of Bender’s papers on complex energy trajectories, the trajectories I got for complex masses did not surprise me:
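As a sketch of what such homework might look like (the concrete potential V(z) = z⁴ − z² and the integrator are my assumptions, not the seminar’s actual specification), one can integrate Newton’s equation m z″ = −V′(z) for a complex mass m, letting the position z itself become complex:

```python
import numpy as np

def trajectory(m, z0, v0, dt=1e-3, steps=2000):
    """Integrate m z'' = -V'(z) with the double-well V(z) = z**4 - z**2
    for a complex mass m, using a simple RK4 scheme. For complex m the
    position z leaves the real axis and traces a curve in the complex plane."""
    def acc(z):
        return -(4 * z**3 - 2 * z) / m
    z, v = complex(z0), complex(v0)
    path = [z]
    for _ in range(steps):
        k1z, k1v = v, acc(z)
        k2z, k2v = v + dt/2 * k1v, acc(z + dt/2 * k1z)
        k3z, k3v = v + dt/2 * k2v, acc(z + dt/2 * k2z)
        k4z, k4v = v + dt * k3v, acc(z + dt * k3z)
        z += dt/6 * (k1z + 2*k2z + 2*k3z + k4z)
        v += dt/6 * (k1v + 2*k2v + 2*k3v + k4v)
        path.append(z)
    return np.array(path)

# Real initial data, complex mass: the trajectory immediately acquires
# an imaginary part.
path = trajectory(m=1 + 0.2j, z0=0.5, v0=0.0)
```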
End side note.
The complete new definition reads thus: The kilogram, kg, is the unit of mass; its magnitude is set by fixing the numerical value of the Planck constant to be equal to exactly 6.62606X × 10^{–34} when it is expressed in the unit s^{–1} · m^{2} · kg, which is equal to J · s. Here X stands for some digits, soon to be explicitly stated, that will represent the latest experimental values.
And the kilogram cylinder can finally retire as the world’s most precious artifact. I expect soon after this event the international kilogram prototype will finally be displayed in the Louvre. As the Louvre had been declared “a place for bringing together monuments of all the sciences and arts” in May 1791 and opened in 1793, all of us on the committee agreed that one day, when the original kilogram was to be replaced with something else, it would end up in the Louvre. Ruling the kingdom of mass for more than a century, IPK deserves its eternal place as a true monument of the sciences. I will make a bet—in a few years the retired kilogram, under its three glass domes, will become one of the Louvre’s most popular objects. And the queue that physicists, chemists, mathematicians, engineers, and metrologists will form to see it will, in a few years, be longer than the queue for the Mona Lisa. I would also make a bet that the beautiful miniature kilogram replicas will within a few years become the best-selling item in the Louvre’s museum store:
At the same time, speaking as a metrologist, I think the international kilogram prototype should maybe stay where it is for another 50 years, so that it can be measured against a post-2018 kilogram made from an exact value of the Planck constant. Then we would finally know for sure if the international kilogram prototype is/was really losing weight.
Let me quickly recapitulate the steps toward the new “electronic” kilogram.
Intuitively, one could have thought to define the kilogram through the Avogadro constant as a certain number of atoms of, say, ^{12}C. But because of binding energies and surface effects in a pile of carbon (e.g. diamond, graphene) made up from n = round(1 kg / m (^{12}C)) atoms to realize the mass of one kilogram, all the n carbon-12 atoms would have to be well separated. Otherwise we would have a mass defect (remember Albert’s famous E = m c^{2} formula), and the relative mass difference between one kilogram of compact carbon and the same number of individual, well-separated atoms is on the order of 10^{–10}. Using the carbon-carbon bond energy, here is an estimation of the mass difference:
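A back-of-the-envelope version of this estimate (the post’s own computation is in the Wolfram Language; the roughly 7 eV of binding energy per carbon atom, about two C–C bonds per atom in diamond, is my assumed input):

```python
# Rough "mass defect" of 1 kg of bulk carbon relative to the same
# number of free, well-separated atoms.
N_A = 6.02214076e23     # Avogadro constant, 1/mol
c = 299792458.0         # speed of light, m/s
eV = 1.602176634e-19    # joule per electronvolt

n_atoms = 1000.0 / 12.0 * N_A        # atoms in 1 kg of carbon-12
E_binding = n_atoms * 7.0 * eV       # total binding energy, J (assumed 7 eV/atom)
delta_m = E_binding / c**2           # mass equivalent, kg
print(delta_m)  # ~6e-10 kg, i.e. a relative effect of order 1e-10
```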
A mass difference of this size can, for a 1 kg weight, be detected without problems with a modern mass comparator.
To give a sense of scale, this would be equivalent to the (Einsteinian) relativistic mass conversion of the energy expenditure of fencing for most of a day:
This does not mean one could not define a kilogram through the mass of an atom or a fraction of it. Given the mass of a carbon atom m (^{12}C), the atomic mass constant u = m (^{12}C) / 12 follows, and using u we can easily connect to the Planck constant:
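Here is an illustrative Python version of that chain, using CODATA-level values (the route via the fine-structure and Rydberg constants, R_∞ = α² m_e c / (2h), is the standard one; the post itself works in the Wolfram Language):

```python
# From the atomic mass constant u to the Planck constant h:
# m_e = A_r(e) * u, and R_inf = alpha**2 * m_e * c / (2*h)
# => h = alpha**2 * m_e * c / (2 * R_inf).
u = 1.66053906660e-27      # atomic mass constant m(12C)/12, kg
A_r_e = 5.48579909065e-4   # relative atomic mass of the electron
alpha = 7.2973525693e-3    # fine-structure constant
c = 299792458.0            # speed of light, m/s
R_inf = 10973731.568160    # Rydberg constant, 1/m

m_e = A_r_e * u
h = alpha**2 * m_e * c / (2 * R_inf)
print(h)  # ≈ 6.62607e-34 J·s
```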
I read with great interest the recent comparison of using different sets of constants for the kilogram definition. Of course, if the mass of a ^{12}C atom were the defined value, then the Planck constant would become a measured, meaning nonexact, value. For me, having an exact value for the Planck constant is aesthetically preferable.
I have been so excited over the last decade following the steps toward the redefinition of the kilogram. For more than 20 years now, there has been a light visible at the end of the tunnel, promising to topple the kilogram artifact from its throne.
And when I read, 11 years ago, the article by Ian Mills, Peter Mohr, Terry Quinn, Barry Taylor, and Edwin Williams entitled “Redefinition of the Kilogram: A Decision Whose Time Has Come” in Metrologia (my second-favorite, late-morning Tuesday monthly read, after the daily New Arrivals, a joint publication of Hell’s Press, the Heaven Publishing Group, Jannah Media, and Deva University Press), I knew that soon my dreams would come true. The moment I read the Appendix A.1 Definitions that fix the value of the Planck constant h, I knew that was the way to go. While the idea had been floating around for much longer, it now became a real program to be implemented within a decade (give or take a few years).
James Clerk Maxwell wrote in his 1873 A Treatise on Electricity and Magnetism:
In framing a universal system of units we may either deduce the unit of mass in this way from those of length and time already defined, and this we can do to a rough approximation in the present state of science; or, if we expect soon to be able to determine the mass of a single molecule of a standard substance, we may wait for this determination before fixing a universal standard of mass.
Until around 2005, James Clerk thought that mass should be defined through the mass of an atom, but he came around over the last decade and now favors the definition through Planck’s constant.
In a discussion with Albert Einstein and Max Planck (I believe this was in the early seventies) in a Vienna-style coffee house (Max loves the Sachertorte and was so happy when Franz and Eduard Sacher opened their now-famous HHS (“Heavenly Hotel Sacher”)), Albert suggested using his two famous equations, E = m c^{2} and E = h f, to solve for m to get m = h f / c^{2}. So, if we define h as was done with c, then we know m because we can measure frequencies pretty well. (Compton was arguing that this is just his equation rewritten, and Niels Bohr was remarking that we cannot really trust E = m c^{2} because of its relatively weak experimental verification, but I think he was just mocking Einstein, retaliating for some of the Solvay Conference Gedankenexperiment discussions. And of course, Bohr could not resist bringing up Δm Δt ~ h / c^{2} as a reason why we cannot define the second and the kilogram independently, as one implies an error in the other for any finite mass measurement time. But Léon Rosenfeld convinced Bohr that this is really quite remote, as for a day measurement time this limits the mass measurement precision to about 10^{–52} kg for a kilogram mass m.)
An explicit frequency equivalent f = m c^{2} / h is not practical for a mass of a kilogram, as it would mean f ~ 1.35 × 10^{50} Hz, which is far, far too large for any experiment, dwarfing even the Planck frequency by about seven orders of magnitude. But some recent experiments from Berkeley will maybe allow the use of such techniques at the microscopic scale. For more than 25 years now, in every meeting of the HPS (Heavenly Physical Society), Louis de Broglie has insisted on these frequencies being real physical processes, not just convenient mathematical tools.
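The two numbers quoted above are easy to check; here is a Python sketch (CODATA-level values; the Planck frequency is taken as √(c⁵/ħG)):

```python
import math

c = 299792458.0        # speed of light, m/s
h = 6.62607015e-34     # Planck constant, J·s
G = 6.674e-11          # gravitational constant, m^3/(kg s^2)
hbar = h / (2 * math.pi)

f_kg = 1.0 * c**2 / h                    # frequency equivalent of 1 kg
f_planck = math.sqrt(c**5 / (hbar * G))  # Planck frequency
print(f_kg)              # ≈ 1.36e50 Hz
print(f_kg / f_planck)   # ≈ 7e6, i.e. about seven orders of magnitude
```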
So we need to know the value of the Planck constant h. Still today, the kilogram is defined as the mass of the IPK. As a result, we can measure the value of h using the current definition of the kilogram. Once we know the value of h to a few times 10^{–8} (this is basically where we are right now), we will then define a concrete value of h (very near or at the measured value). From then on, the kilogram will become implicitly defined through the value of the Planck constant. At the transition, the two definitions overlap in their uncertainties, and no discontinuities arise for any derived quantities. Over the last 100 years, the international prototype has lost on the order of 50 μg, a relative change of 5 × 10^{–8}, so a value of the Planck constant with an error of less than 2 × 10^{–8} guarantees that the mass of objects will not change in a noticeable manner.
Looking back over the last 116 years, the value of the Planck constant gained about seven digits in precision. A real success story! In his paper “Ueber das Gesetz der Energieverteilung im Normalspectrum,” Max Planck for the first time used the symbol h, and gave for the first time a numerical value for the Planck constant (in a paper published a few months earlier, Max used the symbol b instead of h):
(I had asked Max why he chose the symbol h, and he said he could not remember anymore. Anyway, he said it was a natural choice in conjunction with the symbol k for the Boltzmann constant. Sometimes one reads today that h was used to express the German word Hilfsgrösse (auxiliary quantity); Max said that this was possible, and that he really didn’t remember.)
In 1919, Raymond Thayer Birge published the first detailed comparison of various measurements of the Planck constant:
From Planck’s value 6.55 × 10^{–34} J · s to the 2016 value 6.626070073(94) × 10^{–34} J · s, amazing measurement progress has been made.
The next interactive Demonstration allows you to zoom in and see the progress in measuring h over the last century. Mouse over the Bell curves (indicating the uncertainties of the values) in the notebook to see the experiment (for detailed discussions of many of the experiments for determining h, see this paper):
There have been two major experiments carried out over the last few years that my original group eagerly followed from the heavens: the watt balance experiment (actually, there is more than one of them—one at NIST, two in Paris, one in Bern…) and the Avogadro project. As a person who built mechanical measurement instruments when I was alive, I personally love the watt balance experiment. Building a mechanical device that through a clever trick by Bryan Kibble eliminates an unknown geometric quantity gets my applause. The recent do-it-yourself LEGO home version is especially fun. With an investment of a few hundred dollars, everybody can measure the Planck constant at home! The world has come a long way since my lifetime. You could perhaps even check your memory stick before and after you put a file on it and see if its mass has changed.
But my dear friend Lavoisier, not unexpectedly, always loved the Avogadro project that determines the value of the Avogadro constant to high precision. Having 99.995% pure silicon makes the heart of a chemist beat faster. I deeply admire the efforts (and results) in making nearly perfect spheres out of them. The product of the Avogadro constant with the Planck constant N_{A} h is related to the Rydberg constant. Fortunately, as we saw above, the Rydberg constant is known to about 11 digits; this means that knowing N_{A} h to a high precision allows us to find the value of our beloved Planck constant h to high precision. In my lifetime, we started to understand the nature of the chemical elements. We knew nothing about isotopes yet—if you had told me that there are more than 20 silicon isotopes, I would not even have understood the statement:
I am deeply impressed how mankind today can even sort the individual atoms by their neutron count. The silicon spheres of the Avogadro project are 99.995 % silicon 28—much, much more than the natural fraction of this isotope:
While the highest-end beam balances and mass comparators achieve precisions of 10^{–11}, they can only compare masses, not realize one. Once the Planck constant has a fixed value, a mass can be constructively realized using the watt balance.
I personally think the Planck constant is one of the most fascinating constants. It reigns in the micro world and is barely visible at macroscopic scales directly, yet every macroscopic object holds together just because of it.
A few years ago I was getting quite concerned that our dream of eternal unit definitions would never be realized. I could not get a good night’s sleep when the values for the Planck constant from the watt balance experiments and the Avogadro silicon sphere experiments were far apart. How relieved I was to see that over the last few years the discrepancies were resolved! And now the working mass is again in sync with the international prototype.
Before ending, let me say a few words about the Planck constant itself. The Planck constant is the archetypal quantity that one expects to appear in quantum-mechanical phenomena. And when the Planck constant goes to zero, we recover classical mechanics (in a singular limit). This is what I myself thought until recently. But since I go to the weekly afternoon lectures of Vladimir Arnold, which he started giving in the summer of 2010 after getting settled up here, I now have strong reservations against such simplistic views. In his lecture about high-dimensional geometry, he covered the symplectic camel; since then, I view the Heisenberg uncertainty relations more as a classical relic than a quantum property. And since Werner Heisenberg recently showed me the Brodsky–Hoyer paper on ħ expansions, I have a much more reserved view on the BZO cube (the Bronshtein–Zelmanov–Okun cGh physics cube). And let’s not forget recent attempts to express quantum mechanics without reference to Planck’s constant at all. While we understand a lot about the Planck constant, its obvious occurrences and uses (such as a “conversion factor” between frequency and energy of photons in a vacuum), I think its deepest secrets have not yet been discovered. We will need a long ride on a symplectic camel into the deserts of hypothetical multiverses to unlock it. And Paul Dirac thinks that the role of the Planck constant in classical mechanics is still not well enough understood.
For the longest time, Max himself thought that in phase space (classical or through a Wigner transform), the minimal volume would be on the order of his constant h. As one of the fathers of quantum mechanics, Max still follows the conceptual developments today, especially the decoherence program. How amazed was he when sub-h structures were discovered 15 years ago. Eugene Wigner told me that he had conjectured such fine structures since the late 1930s. Since then, he has loved to play around with plotting Wigner functions for all kinds of hypergeometric potentials and quantum carpets. His favorite is still the Duffing oscillator’s Wigner function. A high-precision solution of the time-dependent Schrödinger equation followed by a fractional Fourier transform-based Wigner function construction can be done in a straightforward and fast way. Here is how a Gaussian initial wavepacket looks after three periods of the external force. The blue rectangle is a region of the x–p plane with area h:
Here are some zoomed-in (colored according to the sign of the Wigner function) images of the last Wigner function. Each square has an area of 4 h and shows a variety of sub-Planckian structures:
For me, the forthcoming definition of the kilogram through the Planck constant is a great intellectual and technological achievement of mankind. It represents two centuries of hard work at metrological institutes, and cements some of the deepest physical truths found in the twentieth century into the foundations of our unit system. At once a whole slew of units, unit conversions, and fundamental constants will be known with greater precision. (Make sure you get a new CODATA sheet after the redefinition and have the pocket card with the new constant values with you always until you know all the numbers by heart!) This will open a path to new physics and new technologies. In case you make your own experiments determining the values of the constants, keep in mind that the deadline for the inclusion of your values is July 1, 2017.
The transition from the platinum-iridium kilogram to the kilogram based on the Planck constant h can be nicely visualized graphically as a 3D object that contains both characters. Rotating it shows a smooth transition of the projection shape from the old kilogram symbol to the letter h, representing over 200 years of progress in metrology and physics:
The interested reader can order a beautiful, shiny, 3D-printed version here. It will make a perfect gift for your significant other (or ask your significant other to get you one) for Christmas to be ready for the 2018 redefinition, and you can show public support for it as a pendant or as earrings. (Available in a variety of metals; platinum is, obviously, the most natural choice, and it is under $5k—but the $82.36 polished silver version looks pretty nice too.)
Here are some images of golden-looking versions of Toh3D (up here, gold, not platinum, is the preferred metal color):
I realize that not everybody is (or can be) as excited as I am about these developments. But I look forward to the year 2018 when, after about 225 years, the kilogram as a material artifact will retire and a fundamental constant will replace it. The new SI will base our most important measurement standards on twenty-first-century technology.
If the reader has questions or comments, don’t hesitate to email me at jeancharlesdeborda@gmail.com; based on recent advances in the technological implications of EPR=ER, we now have a much faster and more direct connection to Earth.
À tous les temps, à tous les peuples! (For all times, for all peoples!)
Download this post as a Computable Document Format (CDF) file. New to CDF? Get your copy for free with this one-time download.
There is a vast amount of literature on the appearance of the golden ratio in nature, in physiology and psychology, and in human artifacts (see this page on the golden ratio; these articles on the golden ratio in art, in nature, and in the human body; and this paper on the structure of the creative process in science and art). In the past thirty years, there has been increasing skepticism about the prevalence of the golden ratio in these domains. Earlier studies have been revisited or redone. See, for example, Foutakis, Markowsky on Greek temples, Foster et al., Holland, Benjafield, and Svobodova et al. for human physiology.
In my last blog, I analyzed the aspect ratios of more than one million old and new paintings. Based on psychological experiments from the second half of the nineteenth century, especially by Fechner in the 1870s, one would expect many paintings to have a height-to-width ratio equal to the golden ratio or its inverse. But the large sets of paintings analyzed did not confirm such a conjecture.
While we did not find the expected prevalence of the golden ratio in external measurements of paintings, maybe looking “inside” will show signs of the golden ratio (or its inverse)?
In today’s blog, we will analyze collections of paintings, photographs, and magazine covers that feature human faces. We will also analyze where human faces appear in a few selected movies.
The literature on art history and the aesthetics of photography puts forward a theory of dividing the canvas into thirds, horizontally and vertically. And when human faces are portrayed, two concrete rules for the position of the eyeline are often mentioned:
The rule of thirds is often abbreviated as ROT. In 1998 Frascari and Ghirardini—in the spirit of Adolf Zeising, the father of the so-called golden numberism—coined the term “ϕaithful” (making clever use of the Greek symbol ϕ that is used to denote the golden ratio) to label the unrestricted belief in the primacy of the golden ratio. Some consider the rule of thirds an approximation of the golden ratio rule; “ROT on steroids” and similar phrases are used. Various photograph-related websites contain a lot of discussion about the relation of these two rules. For early uses of the rule of thirds, see Nafisi. For the more modern use starting in the eighteenth century, see this history of the rule of thirds. For a recent human-judgment-based evaluation of the rule of thirds in paintings and photographs, see Amirshahi et al.
So because we cannot determine which rule is more common by first-principle mathematical computations, let’s again look at some data. At what height, measured from the bottom, are the eyes in paintings showing human faces?
Let’s start with paintings. As with the previous blog, we will use a few different data sources. We will look at four painting collections: Wikimedia, the Smithsonian, Britain’s Your Paintings, and Saatchi.
If we want to analyze the positions of faces within a painting, we must first locate the faces. The function FindFaces comes in handy. While typically used for photographs, it works pretty well on (representational) paintings too. Here are a few randomly selected paintings of people from Wikimedia. First, the images are imported and the faces located and highlighted by a yellow, translucent rectangle. We see potentially different amounts of horizontal space around a face, but the vertical extension is pretty uniform from the chin to the bottom of the forehead hairs.
A more detailed look reveals that the eyeline is approximately at 60% of the height of the selected face area. (Note that this is approximately 1/ϕ ≈ 0.618.) To demonstrate the correctness of the 60%-of-the-face-height rule for some randomly selected images from Wikipedia, we show the resulting eyeline in red and the two lines ±5% above and below.
Independent of gender and haircut, the 60% height seems to be a good approximation for the eyeline. Of course, not all faces that we encounter in paintings and photographs are perfectly straightened. For tilted heads, the two eyes will not lie on a horizontal line. But on average, the 60% rule works well.
Overall we see that the eyeline can be located within a few percent of the vertical height of the face rectangle. The error of the resulting estimation of the eyeline height in a painting/photograph in most collections should be about ≤2% for a typical ratio of face height to painting/photograph height. Plus or minus 2% should be small enough such that for a large enough painting/photograph collection we can discriminate the golden ratio height 1/ϕ from the rule of thirds 2/3. On the range [0,1], the distance between 1/ϕ and 2/3 is about 5%. (Using a specialized eye detection method to determine the vertical height of the eyes we leave for a later blog.)
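The bookkeeping that turns a detected face rectangle into a relative eyeline height is simple; here is a Python sketch (the original analysis uses FindFaces in the Wolfram Language; the function name and the top-left box convention are my assumptions):

```python
PHI = (1 + 5**0.5) / 2   # golden ratio, 1/PHI ≈ 0.618

def eyeline_height(face_top, face_height, image_height):
    """Relative height of the estimated eyeline, measured from the bottom
    of the image. The eyeline sits at 60% of the face box height from the
    chin, i.e. 40% below the top of the box (image coordinates: 0 = top)."""
    eyeline_from_top = face_top + 0.4 * face_height
    return 1.0 - eyeline_from_top / image_height

# A 300 px face whose top edge sits 200 px below the top of a 1000 px
# tall image: eyeline at 1 - 320/1000 = 0.68, just above the
# rule-of-thirds value 2/3 and clearly above 1/PHI.
rel = eyeline_height(200, 300, 1000)
```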
We start with images of paintings from Wikimedia.
Using the 0.6 factor for the eyeline heights, we get the following distribution of the faces identified. About 12,000 faces were found in 8,000 images. The blue curve shows the probability density of the position of the eyelines of all faces, and the red curve the faces whose bounding rectangles occupy more than 1/12 of the total area of the painting. (While somewhat arbitrary, here and in the following, we will use 1/12 as the relative face rectangle area, above which a face will be considered to be a larger part of the whole image.) We see a clear single maximum at 2/3 from the bottom, as predicted by the ROT. (The two black vertical lines are at 2/3 and 1/ϕ).
Because we determine the faces from potentially cropped images rather than ruler-based measurements on the actual paintings, we get some potential errors in our data. As analyzed in the last blog, these effects seem to average out and introduce final errors well under 1% for over 10,000 paintings.
Here are two heat maps: one for all faces, and the other for larger faces only. We place face-enclosing rectangles over each other, and the color indicates the fraction of all faces at a given position. One sees that human faces appear as frequently in the left half as in the right half. To allow comparisons of the face positions of paintings with different aspect ratios, the widths and heights of all paintings were rescaled to fit into a square. The centers of the faces fall nicely into the [2/3,1/ϕ] range. (The Wolfram Language code to generate the PDF and heat map plots is given below.)
Here is a short animation showing how the peak of the face distributions forms as more and more paintings are laid over each other.
Repeating the Wikimedia analysis with 4,000 portrait paintings from the portrait collection of the Smithsonian yields a similar result. This time, because we selected portrait paintings from the very beginning, the blue curve already shows a more located peak.
The British Your Paintings website has a much larger collection of paintings. We find 58,000 paintings with a total of 76,000 faces.
The mean and standard deviation for all eyeline heights is 0.64±0.19, and the median is 0.69.
In the eyeline position/relative face size plane, we obtain the following distribution showing that larger faces are, on average, positioned lower. Even for very small relative face sizes, the most common eyeline height is between 1/ϕ and 2/3.
The last image also begs for a plot of the PDF of the relative size of the faces in a painting. The mean area of a face rectangle is 3.9% of the whole painting area, with a standard deviation of 5.5%.
Here is the corresponding cumulative distribution of all eyeline positions of faces larger than a given relative size. The two planes in the yz plane are at 1/ϕ and 2/3.
Did the fraction of paintings obeying the ROT or ϕ change over time? Looking at the data, the answer is no. For instance, here is the distribution of the eyeline heights for all nineteenth- and twentieth-century paintings from our dataset. (There are some claims that even Stone Age paintings already took the ROT into account.)
As paintings often contain more than one person, we repeat the analysis with the paintings that just have a single face. Now we see a broader maximum that spans the range from 1/ϕ to 2/3.
Looking at the binned rather than the smoothed data in the range of the global maximum, we see two well-resolved maxima: one according to the ROT and one according to the golden ratio.
Now that we have gone through all the work to locate the faces, we might as well do something with them. For instance, we could superimpose them. And as a result, here is the average face from 11,000 large faces from nineteenth-century British paintings. The superimposed images of tens of thousands of faces also gives us some confidence in the robustness and quality of the face extraction process.
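The superposition itself is just a pixelwise mean; here is a minimal Python sketch (with synthetic arrays standing in for the face crops, which in the post are extracted and resized in the Wolfram Language):

```python
import numpy as np

def average_face(faces):
    """Pixelwise mean of a list of equally sized grayscale images
    (2D arrays with values in [0, 1])."""
    stack = np.stack([np.asarray(f, dtype=float) for f in faces])
    return stack.mean(axis=0)

# Two synthetic gradients stand in for real face crops; their average
# is the constant image 0.5.
a = np.linspace(0, 1, 64 * 64).reshape(64, 64)
b = 1 - a
avg = average_face([a, b])
```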
Given a face from a nineteenth-century painting, which (famous) living person looks similar? Using Classify["NotablePerson",…], we can quickly find some unexpected facial similarities of living celebrities to people shown in older British paintings. The function findSimilarNotablePerson takes as the argument the abbreviated URL of a page from the Your Paintings website, imports the painting, extracts the face, and then finds the most similar notable person from the built-in database.
Here is a Demonstration that shows a few more similar pairs (please see the attached notebook to look through the different pairings).
Now let us look at some more modern paintings. We find 15,000 modern portraits at Saatchi. Faces in modern portraits can look quite abstract, but FindFaces is still able to locate a fair number of them. Here are some concrete examples.
And here is an array of 144 randomly selected faces in modern art paintings. From a distance, one recognizes human faces, but deviations due to stylistic differences become less visible.
If we again superimpose all faces, we get a quite normal-looking human face. Compared to the nineteenth-century British paintings, the average face has more female characteristics (e.g. a softer jawline and fuller lips). The fact that the average face looks quite “normal” is surprising given the stylistic variety in the above 12×12 matrix of faces.
If we add not just all color values but also random positive and negative weights, we get much more modern-art-like average faces.
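One way to realize this is sketched below; the function name is hypothetical, and faces is assumed to be a list of face images already conformed to a common size and color space.

```wolfram
(* Sketch: combine faces with random positive and negative weights
   instead of a plain mean; 'faces' is assumed to be a list of
   same-sized RGB face images. *)
randomWeightFace[faces : {__Image}] :=
 ImageAdjust@Image[
   Total[RandomReal[{-1, 1}] ImageData[#] & /@ faces]]
```

Because the weights can be negative, individual features cancel or invert, which produces the more modern-art-like composites.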
Now concerning the main question of this blog: what are the face positions in these modern portraits? It turns out they again follow the golden ratio much more frequently than the ROT: about 30% more paintings have the eyeline at 1/ϕ±1% compared to 2/3±1%.
The mean and standard deviation for all eyeline heights are 0.60±0.16, and the median is 0.62; this distribution is clearly lower-centered and narrower.
And if we plot the PDF of the eyeline height versus the relative face size, we clearly see a sweet spot at eyeline height 2/3 and relative face area 1/5. Smaller faces with relative size of about 5% occur higher, at eyeline height about 3/4.
And here is again the corresponding 3D graphic that shows the 1/ϕ eyeline height for larger relative faces is quite pronounced.
We should check with another data source to confirm that more modern paintings have a more ϕaithful eyeline. The site Fine Art America offers thousands of modern paintings of celebrities. Here is the average of 5,000 such celebrity paintings (equal numbers of politicians, actors and actresses, musicians, and athletes). Again we clearly see the maximum of the PDF at 1/ϕ rather than at 2/3.
For individual celebrities, the distribution might be different. Here is a small piece of code that uses some functions defined in the last section to analyze portrait paintings of individual persons.
Here are some examples. (We used about 150 paintings per person.)
Perhaps unexpectedly, Jimi Hendrix is nearly perfectly ϕaithful, while Mick Jagger seems perfectly ROTen. Obama and Jesus obey nearly exactly the rule of thirds in its classic form.
Now, for comparison to the eyeline positions in paintings, let us look at some sets of photographs and determine the positions of the faces in these. Let’s start with professional portrait photographs. The Getty Image collection is a premier collection of good photographs. In contrast to the paintings, the maximum for large faces is much closer to 2/3 (ROT) than to 1/ϕ for a random selection of 200,000 portrait photographs.
And here is again the distribution in the eyeline height/relative face size plane. For very large relative face sizes, the most common eyeline height even drops below 1/ϕ.
And here is the corresponding heat map arising from overlaying 300,000 head rectangles.
So what about other photographs, those aesthetically less polished than Getty Images? The Shutterstock website has many photos. Selecting photos with various subject tags, we quite robustly (meaning independently of the concrete tags) see the maximum of the eyeline height PDF near 2/3. This time, we display the results for portraits showing groups of identically tagged people.
These are the eyeline height distributions and the average faces of 100,000 male and female portraits. (The relatively narrow peak in the twin-peak structure of the distribution between 0.5 and 0.55 comes from photos that are close-up headshots that don’t show the entire face.)
Restricting the photograph selection even more, e.g. to more than 10,000 photographs of persons tagged with nerd or beard, again shows ROTen-ness.
The next two rows show photos tagged with happy or sad.
All of the last six tag types (male, female, nerd, beard, happy, sad) of photographs show a remarkable robustness of the position of the eyeline maximum. It is always in the interval [1/ϕ,2/3], with a trend toward 2/3 (ROT).
But where are the babies (the baby eyeline, to be precise)? The two peaks are now even more pronounced, with the first peak even bigger than the second—the reason being that many more baby pictures are just close-ups of the baby’s whole face.
Next we’ll have a look at the eyeline height PDFs for two professional photographers: Peggy Sirota and Mario Testino. Because both artists often photograph models, the whole human body will be in the photograph, which shifts the eyeline height well above 2/3. (We will come back to this phenomenon later.)
After looking at professionally made photos, we should, of course, also have a look at the pinnacle of modern amateur portraiture: the selfie. (For a nice summary of the history of the selfie, see Saltz; for a detailed study of the nearly three-orders-of-magnitude increase in selfie popularity over the last three years, see Souza et al.) Using some of the service connections, e.g. the “Flickr” connection, we can immediately download a sample of selfies. Here are five selfies from the last week of September taken around the Eiffel Tower. Not all images tagged as “selfies” are just close-ups of faces.
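Such a download can be sketched with a service connection; the request name "ImageSearch" and its parameters as written here are assumptions about the Flickr connection's API, not verified calls from the blog.

```wolfram
(* Sketch of downloading tagged photos via the Flickr service
   connection; request and parameter names are assumptions. *)
flickr = ServiceConnect["Flickr"];
selfies = flickr["ImageSearch",
   "Keywords" -> "selfie",
   "Location" -> GeoPosition[{48.858, 2.294}], (* Eiffel Tower *)
   MaxItems -> 5]
```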
Every day, more than 100,000 selfies are added to Instagram (one can easily browse them here)—this is a perfect source for selfies. Here are the eyeline height distributions for 100,000 selfie thumbnails.
Compared with the professional photographs, we see that the maximum of the eyeline height distributions is clearly above 2/3 for photos that contain a face larger than 1/12 of the total photo. So the next time you take a selfie, position your face a bit lower in the picture to better obey the ROT and ϕ. (Systematic deviations of selfies from established photographic aesthetic principles have already been observed by Bruno et al.)
The eyeline height in a selfie changes much less with the total face area as compared to professional photographs.
And again, the corresponding heat map.
Not unexpectedly, the camera-to-face distance in a selfie is bounded by about one meter, the reach of the human arm or of a typical telescopic selfie stick, and this puts a lower bound on the relative face size. So selfies with very small faces are scarcer than photographs or paintings with small faces.
What does the average selfie face look like? The left image is the average over all faces, the middle image the average over all male faces, and the right image the average over all female faces. (Genders were heuristically determined by matching the genders associated with given first names to user names.) The average selfie looks female simply because a larger number of selfies are of female faces. This was also found in the recent study by Manovich et al.
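The name-based gender heuristic can be sketched with the built-in "NameGender" classifier; the function name and the assumption that the first token of a user name is a first name are ours, not the blog's.

```wolfram
(* Sketch of the heuristic gender assignment: classify the (assumed)
   first name contained in a user name. *)
genderFromUserName[userName_String] :=
 Classify["NameGender", First[StringSplit[userName]]]
```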
Now, it could be that the relative height of the eyeline depends on the concrete person portrayed. We give the full code in case the reader wants to experiment with people not investigated here. We measure eyeline heights in images from the Getty website that are tagged with the keywords specified in the function positionSummary.
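The core measurement, the relative eyeline height, can be sketched as follows; placing the eyeline 40% below the top of each face bounding box, and FindFaces returning corner-pair rectangles {{xmin, ymin}, {xmax, ymax}}, are assumptions of this sketch.

```wolfram
(* Sketch: relative eyeline heights of all faces in an image, measured
   from the bottom edge; the eyeline is approximated to lie 40% below
   the top of each face rectangle returned by FindFaces. *)
eyelineHeights[img_Image] :=
 With[{h = Last[ImageDimensions[img]]},
  (#[[2, 2]] - 0.4 (#[[2, 2]] - #[[1, 2]]))/h & /@ FindFaces[img]]
```

Collecting these values over a set of images and applying a smooth kernel distribution gives the eyeline height PDFs shown throughout.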
Now it takes just a minute to get the average eyeline height of people seen in the news, each based on analyzing 600 portrait shots of Lady Gaga, Taylor Swift, Brad Pitt, and Donald Trump. Lady Gaga’s eyeline is, on average, clearly higher, quite similar to typical selfie positions. On the other hand, Taylor Swift’s eyeline is peaked at the modern painting-like maximum at 1/ϕ.
Many more types of photographs could be analyzed; we will look at a few more curated sources before leaving further exploration and more playtime to the reader.
Many LinkedIn profile pages have photographs of the page owners. These photographs are another data source for our eyeline height investigations. Taking 25,000 male and 25,000 female profile photos, we obtain the following results. Because the vast majority of LinkedIn photographs are close-up shots, the curve for faces occupying more than 1/12 of the whole area is quite similar to the curve of all faces, and so we show only the distribution of all faces. This time, the yellow curve shows all faces that occupy between 10% and 30% of the total area.
Here are the eyeline height PDF, the bivariate PDF, and the average face for 10,000 male members from LinkedIn. Based on the frequency of male first names in the US, Bing image searches restricted to the LinkedIn domain were carried out, and the images found were collected.
Interestingly, the global maximum of the eyeline height distribution occurs clearly below 1/ϕ, the opposite effect compared to the selfies analyzed above. The center graph shows the distribution of the eyeline height as a function of the face area. The global maximum appears at a face area of 1/5 and at an eyeline height quite close to 1/ϕ. This means the low global maximum is mostly caused by photographs whose face rectangles occupy more than 30% of the total area. The most typical LinkedIn photograph has a face rectangle occupying 1/5 of the total area, with the eyeline at a height of 1/ϕ.
The corresponding distribution over all female US first names is quite similar to the corresponding curve for males. But for faces that occupy a larger fraction of the image, the female distribution is visibly different. The average eyeline height of these photos of women on LinkedIn is a few percent smaller than the corresponding male curve.
With the large number of members on LinkedIn, it even becomes feasible to look for eyeline height distribution for individual names. We carry out a facial profiling for three names: Josh, Raj, and Mei. Taking 2,500 photos for each name, we obtain the following distributions and average faces.
The distributions agree quite well with the corresponding gender distributions above.
After observing the remarkable peak of the eyeline height PDF at 1/ϕ, I wondered which of my Wolfram Research or Wolfram|Alpha coworkers obey the ϕaithful rule. And indeed I found that more of my male coworkers than female coworkers have the 1/ϕ eyeline height. Not unexpectedly, our design director is among the ϕaithful. The next input imports photos from the LinkedIn pages of other Wolfram employees and draws a red line at height 1/ϕ.
Let us compare the peak distribution with the one from the current members of Congress. We import photos of all members of Congress.
Here are some example photos.
Similar to the LinkedIn profile photos, the maximum of the eyeline PDF is slightly lower than 2/3. We also show the face of the averaged member of Congress.
After having analyzed the face positions of amateur and professional photographs, a next natural area for exploration is magazine covers: their photographs are carefully made, selected, and placed. TIME magazine maintains a special website for their 4,800 covers, spanning more than ninety years of published issues. (For a quick view of all covers, see Manovich’s cover analysis from a few years ago.)
It is straightforward to download the covers, and then find and extract the faces.
These are the two resulting distributions for the eyelines.
The maximum occurs at a height smaller than 1/2. This is mostly caused by the title “TIME” on top of the cover. Newer editions have partial overlaps between the magazine title and the image. The following plot shows the yearly average of the eyeline height over time. Since the 1980s, there has been a trend for higher eyeline positions on the cover.
If we calculate the PDFs of the eyeline positions of all issues from the last twenty-five years, we see quite a different distribution with a bimodal structure. One of the peaks is nearly exactly at 1/ϕ.
And here are the average faces per decade. We also see that the covers of the first two decades were in black and white.
For a second example, we will look at the German magazine SPIEGEL. It is again straightforward to download all the covers, locate the faces, and extract the eyelines.
Again, because of the title text “SPIEGEL” on top of the cover, the maximum of the PDF of the eyeline height on the cover occurs at relatively low heights (≈0.56).
A heat map of the face positions shows this clearly.
Taking into account both that the magazine title “SPIEGEL” is typically 13% of the cover height and that there is whitespace at the bottom, the renormalized peak of the eyeline height is nearly exactly at 1/ϕ.
For a third, not-so-politically-oriented magazine, we chose the biweekly Rolling Stone. They too have a collection of their covers (through 2013) online. The eyeline height distribution is again bimodal, with the largest peak at 1/ϕ. So Rolling Stone is a ϕaithful magazine.
By year, the average eyeline height shows some regularities within an eight-year period.
The cumulative mean of the eyeline heights is very near to 1/ϕ, and the average through 2013 deviates only 0.4% from 1/ϕ.
To parallel the earlier two magazines, here are the averaged faces by decade.