Water: A Multi-Objective Benchmark Problem


This is a write-up that really should go into the documentation of PGAPack, but I was unable to include plotly graphics in a github README.rst file, even using the proxy site raw.githack.com (a service that serves HTML and JavaScript files with the correct content type so that they can be displayed in a web browser – github chooses to serve these files with content-type text/plain). Github refuses to render an iframe inside a github page.

The "Water Resource Planning" problem [RTS01] is an engineering problem that has been used as a benchmark for multi-objective optimization in the literature, including classical papers like the original article on NSGA-II [DPAM02]. The problem consists of seven constraints and a five-dimensional objective function.

I have used this for quite some time to test my NSGA-II implementation in PGAPack. It is in the examples/nsgaii directory and can (after compilation) be called with test-problem index 12:

examples/nsgaii/optimize 12

The output of this program can also be found in the file

test/nsgaii_optimize_12.data

For testing I'm shipping a script called crowdingplot to plot objectives in a multi-objective problem. It is in the same directory as the code for the example.

The "water" problem has a strong correlation between objectives 0 and 2 (as also found by Singh et al. [SIR11]); we can see this when we plot the two objectives using:

crowdingplot -x=0 -y=2 test/nsgaii_optimize_12.data

The call to crowdingplot above uses the matplotlib backend; if you want the plotly backend (as in the graphics above), you can specify the -S option.

We see in this plot that a better evaluation for objective 0 also means a better evaluation for objective 2 and vice versa, i.e. the objectives are not contradictory. Furthermore, objective 3 is redundant [SIR11] with the other objectives and can be left out when plotting objectives. So the 5-dimensional Pareto front can be reduced to a 3-dimensional front and can be plotted with crowdingplot in three dimensions, showing quite similar results when scaled to the same visible range:

crowdingplot -3 -x=0 -y=1 -z=4 test/nsgaii_optimize_12.data
crowdingplot -3 -x=2 -y=1 -z=4 test/nsgaii_optimize_12.data

On the one hand this means that this example is not really a five-dimensional problem – on the other hand it is nice that we can visualize the data, which is not so easy for five dimensions.

[DPAM02]

Kalyanmoy Deb, Amrit Pratap, Sameer Agarwal, and T. Meyarivan. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2):182–197, April 2002.

[RTS01]

Tapabrata Ray, Kang Tai, and Kin Chye Seow. Multiobjective design optimization by an evolutionary algorithm. Engineering Optimization, 33(4):399–424, 2001.

[SIR11]

Hemant Kumar Singh, Amitay Isaacs, and Tapabrata Ray. A Pareto corner search evolutionary algorithm and dimensionality reduction in many-objective optimization problems. IEEE Transactions on Evolutionary Computation, 15(4):539–556, August 2011.

Update on non-dominated sorting measurements


This is an update on a previous post [Sch25] about a new implementation of non-dominated sorting for PGAPack. Non-dominated sorting is a central component of most multi-objective optimization implementations.

In the previous post I did not mention how many generations my tests ran. The number of generations in the last and in the following tests was 200. In the last test the code was compiled without optimization; the following measurements were made with -O3, the highest standard optimization level gcc offers. I noticed that my implementation of the evaluation function for the DTLZ-1 test [DTLZ05] is quadratic in the number of objectives, so I feared this would influence the tests. It turns out it didn't: the overhead even for 20 objectives is still too low to be noticeable.
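As an aside, the standard DTLZ-1 definition can be evaluated in time linear in the number of objectives by keeping a running product. The following is a minimal Python sketch of the standard definition for minimization (my own sketch, not the PGAPack implementation):

```python
import math

def dtlz1(x, m):
    """Standard DTLZ-1 test problem with m objectives.

    The first m-1 variables are 'position' parameters, the remaining
    k variables enter the distance function g.
    """
    k = len(x) - (m - 1)
    g = 100 * (k + sum((xi - 0.5) ** 2 - math.cos(20 * math.pi * (xi - 0.5))
                       for xi in x[m - 1:]))
    f = [0.0] * m
    prod = 0.5 * (1 + g)
    # Build objectives from the last to the first using a running product,
    # so the whole evaluation is linear in the number of objectives.
    for j in range(m - 1, 0, -1):
        f[j] = prod * (1 - x[m - 1 - j])
        prod *= x[m - 1 - j]
    f[0] = prod
    return f
```

A handy check for any implementation: with all variables at 0.5 the distance function \(g\) is zero and the objectives sum to 0.5, i.e. the point lies on the Pareto front.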

In addition to the overall time measured in the first tests, I'm now also measuring the time just for the non-dominated sorting. So in the following we have two types of graphics: the ones whose title starts with "Overall" include the time of the evaluation function and whatever the algorithm is doing for bookkeeping; the ones starting with "Non-dominated" measure only the time for the non-dominated sorting.

So let's first look at the overall times, compared to my last post [Sch25], this time with optimization turned on.

I've also plotted the absolute difference between measurements in the following. Note that the absolute differences when measuring only the non-dominated sorting times vs. the overall times are visually identical; they differ only by a few milliseconds (as we would expect). We see that the absolute differences grow faster than linearly, which we would expect since the old non-dominated sorting algorithm grows quadratically with the population size and this drowns out the smaller growth of the new algorithm.

Finally, the following graphics display the speedups I'm getting. Note that now, similar to Jensen [Jen03], I'm getting higher times for the new algorithm for large numbers of objectives and small numbers of individuals. The absolute difference is very small; in the previous graphics you can see the difference only when you zoom in on the first two measurements with population sizes 100 and 200. The absolute difference is below 30ms for 200 generations. So I'm not going to code special cases for different population sizes – this would also be difficult, as the population size where this would make sense varies with CPU speed, as we can see in my previous blog post [Sch25] where we did not see a slowdown (with optimization turned off) with the new algorithm. In addition, you typically need larger populations when using more objectives to cover the Pareto front [Jen03].

The following graphics contain the measurements for just the computation of the non-dominated sorting. When looking at the speedup (both in the previous plot and in the one at the end of the following graphics), especially when we turn off the traces for 2, 3, and 5 objectives (you can do this yourself in the legend on the right of the graphics), we see that the growth is sub-linear. We're dividing times that grow quadratically by times that grow (a little) more than linearly, so the fraction grows sub-linearly, as expected. But don't underestimate the efficacy of the new algorithm: the absolute difference (as seen in one of the graphics above) grows faster than linearly.

The script used for the measurements, the script for plotting, and the measured data (and the old measured data) are available if someone wants to reproduce this. The example is in PGAPack in the directory examples/nsgaii, but the version with the new non-dominated sorting algorithm and timing instrumentation for the test is not yet released as of this writing.

[DTLZ05]

Kalyanmoy Deb, Lothar Thiele, Marco Laumanns, and Eckart Zitzler. Scalable test problems for evolutionary multiobjective optimization. In Ajith Abraham, Lakhmi Jain, and Robert Goldberg, editors, Evolutionary Multiobjective Optimization, Theoretical Advances and Applications, Advanced Information and Knowledge Processing, pages 105–145. Springer, 2005.

[Jen03]

Mikkel T. Jensen. Reducing the run-time complexity of multiobjective EAs: The NSGA-II and other algorithms. IEEE Transactions on Evolutionary Computation, 7(5):503–515, October 2003.

[Sch25]

Ralf Schlatterbeck. Non-dominated sorting for multi-objective optimization. Blog post, Open Source Consulting, June 2025.

Non-dominated Sorting for Multi-Objective Optimization


Some of you know that I'm maintaining PGAPack, a genetic algorithm package, and a corresponding Python wrapper, PGApy. PGAPack implements multi-objective optimization using NSGA-II [DPAM02] and NSGA-III [DJ14], [JD14].

Multi-objective optimization means that we have more than one objective (or fitness-) function. Since typically these functions can contradict each other – when a candidate solution becomes better in one objective it might become worse in another objective – we need to establish what better means in the context of multi-objective optimization. An individual \(A\) is strictly better than another individual \(B\) – we say individual \(A\) dominates individual \(B\) – when \(A\) is better or equal to \(B\) in all objectives and strictly better in at least one of them.

Since we do not have a single fitness for individuals, multi-objective algorithms typically establish a ranking of individuals by dominance. Individuals which are not dominated by any other individual get dominance rank 0. Then we remove all rank-0 individuals and all individuals that are then non-dominated get rank 1 etc. This procedure is typically called non-dominated ranking or non-dominated sorting. This rank is then one of the components of the new fitness function.

The NSGA-II paper [DPAM02] implements this by checking each individual in a population against every other individual to see if one dominates the other. Then it removes the current non-dominated individuals one-by-one and establishes ranks as outlined above. This algorithm has a run-time that grows quadratically with the number of individuals to sort. In computer science we write this in the big O notation as \(O(n^2)\) where \(n\) is the number of individuals.
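To make the procedure concrete, here is a minimal Python sketch of this \(O(n^2)\) ranking for minimization problems (an illustration of the idea, not the PGAPack code):

```python
def dominates(a, b):
    """True if a dominates b (minimization): a is less than or equal to b
    in all objectives and strictly less in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def nondominated_rank(pop):
    """Assign dominance ranks: rank 0 for non-dominated individuals,
    then remove them, rank the rest 1, and so on."""
    rank = [None] * len(pop)
    remaining = set(range(len(pop)))
    r = 0
    while remaining:
        front = {i for i in remaining
                 if not any(dominates(pop[j], pop[i])
                            for j in remaining if j != i)}
        for i in front:
            rank[i] = r
        remaining -= front
        r += 1
    return rank
```

For example, with objective vectors (1,1), (2,2), (1,2), (2,1), (3,3) the individual (1,1) gets rank 0, (1,2) and (2,1) get rank 1, (2,2) rank 2 and (3,3) rank 3.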

In 2003 Jensen [Jen03] published an algorithm that can perform the dominance ranking of all individuals with effort \(O\left(n\cdot(\log n)^{m-1}\right)\), where \(m\) is the number of objectives (and \(n\) is the population size as before). Jensen's algorithm was updated 10 years later because he had formulated it in a way that could not handle individuals with the same evaluation in one objective – a limitation he claimed was easy to remedy, but which was not so easy for others [FGP13]. A year later a slight modification was made to formally prove that the algorithm's runtime complexity actually fits the formula above [BS14].

I've now implemented this algorithm and benchmarked the new against the old \(O(n^2)\) version. For the benchmark I'm using the DTLZ-1 test problem [DTLZ05], as Jensen did in his original paper [Jen03]. Similar to Jensen I've measured times up to a population size of 2000 individuals. I'm also getting most of the speedup for low numbers of objectives. In the following graphics the Y-axis is in seconds of CPU time while the X-axis is the population size. There are three graphics for 2, 10, and 20 objectives. The effort increases with each added objective; otherwise the graphics are quite similar. Note that contrary to Jensen I'm using a linear, not a log-log plot. All measurements contain the full time the algorithm needs, not just the non-dominated sorting; the times also include, e.g., the time for computing the evaluation functions. I've not measured this time separately.

The following graphics display the speedups I'm getting. Note that contrary to Jensen [Jen03] I'm not getting higher times for the new algorithm for large numbers of objectives and small numbers of individuals. This may be due to computers having become a little faster during the last 20 years. It is interesting that starting with 5 objectives, the speedup stays almost the same (relative to the \(O(n^2)\) algorithm). This is probably because both my implementation of the old algorithm and the new one stop comparing objectives once they have established that individuals do not dominate each other: this is the case whenever one objective of an individual is larger while another objective is smaller. Since after some optimization steps most of the individuals tend to be non-dominated, it seems that non-dominance can be established quickly without looking at all objectives.

[BS14]

Maxim Buzdalov and Anatoly Shalyto. A provably asymptotically fast version of the generalized Jensen algorithm for non-dominated sorting. In Thomas Bartz-Beielstein, Jürgen Branke, Bogdan Filipič, and Jim Smith, editors, Parallel Problem Solving from Nature – PPSN XIII, volume 8672 of Lecture Notes in Computer Science, pages 528–537. Ljubljana, Slovenia, September 2014.

[DJ14]

Kalyanmoy Deb and Himanshu Jain. An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, part I: Solving problems with box constraints. IEEE Transactions on Evolutionary Computation, 18(4):577–601, August 2014.

[DPAM02]

Kalyanmoy Deb, Amrit Pratap, Sameer Agarwal, and T. Meyarivan. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2):182–197, April 2002.

[DTLZ05]

Kalyanmoy Deb, Lothar Thiele, Marco Laumanns, and Eckart Zitzler. Scalable test problems for evolutionary multiobjective optimization. In Ajith Abraham, Lakhmi Jain, and Robert Goldberg, editors, Evolutionary Multiobjective Optimization, Theoretical Advances and Applications, Advanced Information and Knowledge Processing, pages 105–145. Springer, 2005.

[FGP13]

Félix-Antoine Fortin, Simon Grenier, and Marc Parizeau. Generalizing the improved run-time complexity algorithm for non-dominated sorting. In Christian Blum, editor, Genetic and Evolutionary Computation GECCO 2013, pages 615–622. Amsterdam, The Netherlands, July 2013.

[JD14]

Himanshu Jain and Kalyanmoy Deb. An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, part II: Handling constraints and extending to an adaptive approach. IEEE Transactions on Evolutionary Computation, 18(4):602–622, August 2014.

[Jen03]

Mikkel T. Jensen. Reducing the run-time complexity of multiobjective EAs: The NSGA-II and other algorithms. IEEE Transactions on Evolutionary Computation, 7(5):503–515, October 2003.

Dynamic DNS with Bind Stopped Working


In 2021 I set up a dynamic DNS server using the popular ISC bind DNS server and documented my configuration in a blog article [1].

Recently (around the second half of 2024) the DNS updates stopped working. In my configuration I was using the public key method for signing DNS updates, also known as SIG(0). This was long supported in bind and is still documented to work in the latest version.

All DNS update requests were suddenly denied with a REFUSED status. Even when turning on debugging with the -d 15 option to named I was not able to get any further information, not even when logging to standard output with -g.

It turns out that public key signatures using SIG(0) have been disabled due to a denial of service attack vector documented in CVE-2024-1975. DNS updates with shared TSIG keys still work. Unfortunately this seemed to be undocumented, and no appropriate log messages are produced. I've filed a bug report for bind.

As it turns out, I should have looked at the documentation of version 9.18, shipped with Debian stable aka bookworm at the time of this writing: it documents that SIG(0) has been removed due to CVE-2024-1975.

It looks like later versions of bind (e.g. version 9.20.6 and up) again support SIG(0), but with a quota mechanism to prevent a denial of service attack. For prior versions it seems that the "fix" involves completely disabling SIG(0).

Modeling a Wire Antenna with Insulation


[Update 2024-08-20: Correction of formula for L and equivalent radius]

[Update 2024-09-19: Correction of citation of article by Stearns, Supplement to Antenna Book [5]]

For my antenna modeling code pymininec (a re-implementation of the original Mininec code in Basic) I was researching how to model insulated wires. I had asked Roy Lewallen, W7EL, the author of EZNEC, what he is using in EZNEC and received a pointer to a paper by J. H. Richmond [1], which seems to be the first paper to propose an algorithm for modeling insulated wires in an antenna and actually implement it. The algorithm was implemented by Jerry McCormack in his master's thesis [2] and later incorporated into a contractor report [3] (the contractor report is almost identical to the thesis except for updated Fortran code, but has a different author). The original Fortran code is not only listed in the reports but is still made available by Ray L. Cross on his "Antenna Scatterers Analysis Program" (ASAP) page. The subroutine "DSHELL" implements the insulated wire algorithm. With some modifications I was able to run this code today. Many thanks go to Roy, W7EL, for sending me this information.

To test my implementation in pymininec against real measured data, I dug up a paper by David Lamensdorf [4] where the effects of insulation were actually measured. Since at the time the insulation could only be modelled as an infinite cylinder, the experiment used insulation that extends beyond the end of the wire until no more changes in the measured parameters were observed.

Experiments with the formulation by Richmond [1] indicated that the resulting impedance matrix is very ill-conditioned: For short wire segments the formulation yields very large absolute values that are added to the main diagonal of the impedance matrix and subtracted from the second diagonals (the matrix is symmetric and the second diagonal is the same for the lower and upper triangle). Since these large values are subtracted during the matrix inversion this leads to Catastrophic Cancellation and results that are dominated by cancellation effects.

So I investigated further and found a paper [5] and a presentation [6] by Steve Stearns, K6OIK, which give a distributed inductance and an equivalent-radius formula for modeling insulation. His presentation also has a great history of various approaches to modeling insulation with programs that originally do not support it (like the original NEC-2 and the original Mininec):

\begin{align*} a_e &= a \left(\frac{b}{a}\right)^{\left(1- \frac{1}{\varepsilon_r}\right)} \\ L &= \frac{\mu_0}{2\pi}\left(1-\frac{1}{\varepsilon_r} \right)\log\left(\frac{b}{a}\right) \\ \sigma_e &= \sigma\left(\frac{a}{b}\right)^{2\left(1-\frac{1}{\varepsilon_r}\right)} \\ \end{align*}

where \(a\) is the original radius of the wire, \(b\) is the radius of the wire including insulation, \(\sigma\) is the conductivity of the wire, \(\varepsilon_r\) is the relative dielectric constant of the insulation, \(a_e\) is the equivalent radius and \(\sigma_e\) is the equivalent conductivity that is used for removing the effect of the changed radius on the computation of the skin effect loading of the wire. The inductance \(L\) is the inductance per length of the insulated wire (or wire segment).
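These formulas translate directly into code. The following Python helper is my own sketch (function and variable names are mine, not from any of the cited programs):

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of free space in H/m

def insulated_wire_equivalents(a, b, eps_r, sigma):
    """Equivalent radius, distributed inductance per unit length, and
    equivalent conductivity for a wire of radius a with insulation of
    outer radius b and relative dielectric constant eps_r."""
    ex = 1 - 1 / eps_r
    a_e = a * (b / a) ** ex
    L = MU0 / (2 * math.pi) * ex * math.log(b / a)
    sigma_e = sigma * (a / b) ** (2 * ex)
    return a_e, L, sigma_e
```

For \(\varepsilon_r = 1\) (or \(b = a\)) all three values reduce to the bare-wire values, and \(\sqrt{\sigma_e}\,a_e = \sqrt{\sigma}\,a\) holds by construction, which is exactly the cancellation property of the equivalent conductivity.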

I then searched for a paper that would confirm the formulas by Steve Stearns and found an even older paper by Tai Tsun Wu [7] which has exactly the formula for the equivalent radius \(a_e\) and for the distributed inductance \(L\).

The formula for the equivalent conductivity is just used to cancel the effect of the equivalent radius on the computation of the skin effect load. In a previous blog post [8] I had given the formula for the skin effect load. In that formula, the conductivity \(\sigma\) appears as its reciprocal \(\rho\), the resistivity. The square root of \(\sigma\) is always used as a product with the radius. So we can undo the equivalent radius correction by multiplying \(\sigma\) with the reciprocal of the equivalent radius factor. We get

\begin{align*} \sqrt{\sigma}a &= \sqrt{\sigma_e}a\left(\frac{b}{a}\right)^{\left (1-\frac{1}{\varepsilon_r}\right)} \\ \frac{\sigma_e}{\sigma} &= \frac{1}{\left(\frac{b}{a}\right)^{2\left (1-\frac{1}{\varepsilon_r}\right)}} = \left(\frac{a}{b}\right)^{2\left(1-\frac{1}{\varepsilon_r}\right)} \\ \end{align*}

In pymininec I don't need to correct the equivalent radius for the skin effect computation, I can simply use the original radius there.

In the following three graphics I'm comparing various implementations of insulated wires with the measurements by Lamensdorf [4]. The wire has a diameter (not radius!) of 0.25in, the frequency is 600MHz. I'm using only the measurements with \(\varepsilon_r = 3.2\); the higher relative permittivities are unrealistic in practice and will deviate from the formulas due to the unrealistic experimental setup. I'm plotting the real (conductance) and imaginary (susceptance) parts of the admittance (reciprocal of the impedance). The three graphics are:

  • Without any insulation

  • Diameter of wire with insulation of 0.375in (insulation 0.125in)

  • Diameter of wire with insulation of 0.5in (insulation 0.25in)

The traces in the plots are:

  • ASAP with 2, 4, 8, and 16 segments for each half of a dipole; the traces are marked 'AS02', 'AS04', 'AS08' and 'AS16', respectively. I'm not showing all four traces in each plot.

  • 4NEC2 which, according to Stearns [6], implements a formula derived by Cebik [9], these traces are marked '4N2'

  • NEC-2 using Wu's [7] formula for equivalent radius and distributed inductance as well as Stearns' [6] formula for equivalent \(\sigma\) marked as 'NEC'

  • My pymininec implementation using Wu's [7] formula for equivalent radius and distributed inductance (but not the \(\sigma\) correction because I can use the original radius for the skin effect directly) marked as 'mini'

  • The measurements from Lamensdorf [4] are the blue and orange bubbles (for the real and imaginary parts of the impedance, respectively)

  • In the graphics with insulation I'm also showing the results of EZNEC, the traces are greyed out and can be turned on in the legend on the right side

The formulas by Cebik [9] and by Wu [7] are very similar; the traces '4N2' and 'NEC' are almost identical.

I'm greying out the EZNEC traces and some of the ASAP traces because they exhibit numerical problems due to the ill-conditioned impedance matrix resulting from Richmond's formulation. It is interesting to see that EZNEC also exhibits numerical problems for smaller segments. You can see the traces by clicking into the legend on the right side.

The segmentation in all examples is 21 segments for the wire. This means small segments for the lower end of the lengths computed. It triggers a small segment warning in EZNEC for the first three points (up to and including \(\frac{4.2}{32} = 0.06558125\) wavelength) but not for the clearly numerically invalid 6th value in the first graphics with insulation and the points exhibiting numeric instability in the second graphics with insulation.

Note that the measurements by Lamensdorf [4] were performed on monopole antennas while I'm simulating dipoles. The impedance of a monopole is half that of a dipole, so for admittance I'm multiplying by 2. All admittances are in milli-Siemens, called milli-mhos (mho being ohm spelled backwards) with the symbol ℧ at the time of the Lamensdorf [4] paper.

Plotting Antenna Pattern


I'm the author of an antenna plotting program called plot-antenna. It is written in Python and has graphics backends for both matplotlib and plotly. As of this writing the version available on pypi does not yet support different frequency resolutions for pattern and impedance plots. The matplotlib backend is better when generating graphics for use in printed documentation, while the plotly backend can generate interactive graphics for the web. The documentation contains some matplotlib examples, so I'm going into more detail here for the plotly backend.

I've recently added support for the "Antenna Scatterers Analysis Program" ASAP, an antenna simulation program written before the Fortran-77 standard, in 1974. It is still interesting today because it uses a formulation of the Method of Moments [1] [2] that is different from both NEC [3] and Mininec [4]. And it pioneered the simulation of wire insulation when an antenna uses insulated elements.

In the following I'm using a 3-element Yagi-Uda antenna from the example section of ASAP. The input file I'm using exports antenna patterns for 280 MHz to 305 MHz in 5 MHz steps. In addition it exports the antenna impedance in 1 MHz steps.

When plotting, plot-antenna can use different scaling methods for the antenna pattern. The default is ARRL scaling [5] which I'm using in the following example.

The 3D plot can switch the frequency in the legend on the right side, by clicking on a frequency the display switches to the antenna pattern for that frequency. In addition on mouse-over the gain values at different positions in the 3D-plot can be seen. It is possible to zoom with the mouse wheel and turn the plot by clicking and dragging.

Of course plot-antenna can also display the conventional 2D plots for azimuth and elevation:

With the 2D display, multiple traces can be enabled at once in the legend on the right side. This makes it easier to compare gain patterns than with the 3D plot, where only one pattern can be shown at a time. Similar to the 3D plot we can also see the gain values on mouse-over. It is possible to zoom (although this is not very useful in a polar plot) and to reset to the default view with the "Home" button in the menu.

Often it is useful to see the voltage standing-wave ratio and the antenna impedance. This is achieved with the VSWR plot. In the following plot I've turned on view of the impedance using the --swr-show-impedance option. The vertical green line indicates the position of minimum VSWR. Again we can see the plotted values on mouse-over. We can zoom in by selecting a rectangle with the mouse or by using the +/- buttons in the menu. The view can be reset to the default using the "Home" button in the menu.

Last but not least we can plot the antenna geometry. This is again a 3D plot where we can show the positions of the antenna elements on mouse-over. In the legend on the right we can turn off the visibility of the feed point (in orange). Other antennas may show a ground plane in this view or additional impedance loading on the antenna. The geometry view is very useful to check if the antenna has been modeled correctly (e.g. if the feed point is at the correct position).

The program plot-antenna currently supports output from ASAP, the original Mininec core (written in Basic), my Python version pymininec, and output of NEC-2. The format is auto-detected. In addition, with the companion program plot-eznec, the gain pattern exported from EZNEC can be displayed. As of this writing there is a not-yet released version that can also display VSWR plots using the data exported by EZNEC (to a file called LastZ.txt) when plotting VSWR.

Skin Effect Load Update


In my last post [1] I was wondering why my implementation of the skin effect load in pymininec did not match the results we get from NEC-2. It turns out I was wrong in assuming that the skin effect load is purely resistive: it has a large reactive component.

Note: This article contains a lot of math. You may want to skip to the end and check how the new implementation in pymininec behaves compared to my last article, which used only a purely resistive load from the skin effect resistivity.

The English Wikipedia article on the skin effect does have the correct formula:

\begin{equation*} \newcommand{\Int}{{\mathrm\scriptscriptstyle int}} \newcommand{\ber}{\mathop{\mathrm{ber}}\nolimits} \newcommand{\bei}{\mathop{\mathrm{bei}}\nolimits} \end{equation*}
\begin{equation*} Z_\Int = \frac{k\rho}{2\pi r}\frac{J_0 (kr)}{J_1 (kr)} \end{equation*}

where

\begin{equation*} k = \sqrt{\frac{-j\omega\mu}{\rho}} \end{equation*}

and \(r\) is the wire radius, \(J_v\) are the Bessel functions of the first kind of order \(v\). \(Z_\Int\) is the impedance per unit length of wire.
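This formula can be evaluated directly with complex-argument Bessel functions, e.g. using scipy. The following is a quick numerical sanity check of mine (not the pymininec code): in the high-frequency limit the result should approach the well-known \((1+j)\,\rho/(2\pi r\delta)\) with skin depth \(\delta = \sqrt{2\rho/\omega\mu}\):

```python
import cmath
import math

from scipy.special import jv  # Bessel function of the first kind, complex argument

MU0 = 4 * math.pi * 1e-7  # permeability of free space in H/m

def z_int(r, rho, f, mu=MU0):
    """Internal (skin effect) impedance per unit length of a round wire
    of radius r and resistivity rho at frequency f, after the Wikipedia
    formula Z = k rho / (2 pi r) * J0(k r) / J1(k r)."""
    omega = 2 * math.pi * f
    k = cmath.sqrt(-1j * omega * mu / rho)
    return k * rho / (2 * math.pi * r) * jv(0, k * r) / jv(1, k * r)

# 1 mm radius copper wire at 1 MHz: the skin depth is about 65 um, so we
# expect resistance and reactance per meter both near rho/(2 pi r delta),
# i.e. roughly 0.04 Ohm/m, with nearly equal real and imaginary parts.
z = z_int(1e-3, 1.68e-8, 1e6)
```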

NEC-2 uses a different formulation [2] (p.75):

\begin{equation*} Z_i = \frac{j\Delta_i}{a_i} \sqrt{\frac{\omega\mu}{2\pi\sigma}} \left[\frac{\ber (q) + j\bei (q)} {\ber^\prime(q)+ j\bei^\prime(q)}\right] \end{equation*}

with

\begin{equation*} q = (\omega\mu\sigma)^{1/2} a_i \end{equation*}

and \(a_i\) the wire radius, \(\Delta_i\) the length of the wire segment, \(\sigma\) the wire conductivity and \(\ber\) and \(\bei\) the Kelvin functions. In the following we will use \(r\) for the wire radius, not \(a_i\). Note that \(\ber^\prime\) is the derivative of \(\ber\), we can also write \(\ber\) as \(\ber_0\) (order 0) and \(\ber^\prime\) as \(\ber_1\) (order 1) and similar for \(\bei\).

In the following I'll be a little bit sloppy about the usage of the permeability of free space \(\mu_0\) vs. the permeability of the wire \(\mu\); for non-ferromagnetic wires these are really (almost) the same: \(\mu = \mu_r\mu_0\) with \(\mu_r\approx 1\) for non-ferromagnetic material.

The NEC-2 code uses a Fortran function ZINT which gets two parameters: the conductivity multiplied by the wavelength \(\lambda\) (parameter SIGL) and the radius divided by the wavelength \(\lambda\) (parameter ROLAM). Apart from an approximation of the Kelvin functions (the term computing the fraction of the Kelvin functions is BR1), ZINT computes:

X = SQRT (TPCMU * SIGL) * ROLAM
ZINT = FJ * SQRT (CMOTP / SIGL) * BR1 / ROLAM

with TPCMU = 2368.705, CMOTP = 60.0, and FJ being \(j\). The computed X is passed to the computation of the Kelvin functions, resulting in BR1 for the fraction term. Note that the computed ZINT does not include the segment length; it is equivalent to the Wikipedia definition per unit length. With the reverse-engineered magic numbers

CMOTP \(\approx \frac{\mu_0 c}{2\pi}\)

TPCMU \(\approx 2\pi c\mu_0\)

(\(c\) is the speed of light) we get for X from the code (which we call \(q\) now consistent with the formula from [2]):

\begin{align*} q &= \frac{r}{\lambda}\sqrt{2368.705\sigma\lambda} \\ &= r\sqrt{\frac{2368.705\sigma}{\lambda}} \\ &\approx r\sqrt{\frac{2\pi c\mu_0\sigma}{\lambda}} \\ &= r\sqrt{\frac{2\pi c\mu_0\sigma f}{c}} \\ &= r\sqrt{\frac{2\pi \mu_0\sigma \omega}{2\pi}} \\ &= r\sqrt{\omega\mu_0\sigma} \\ \end{align*}

which matches the definition of \(q\) in [2] (see above).
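The two identifications of the magic numbers are easy to check numerically (my own sketch; the small remaining deviation in TPCMU is presumably due to rounded constants in the original Fortran):

```python
import math

c = 299792458.0           # speed of light in m/s
mu0 = 4 * math.pi * 1e-7  # permeability of free space in H/m

tpcmu = 2 * math.pi * c * mu0    # compare with TPCMU = 2368.705
cmotp = mu0 * c / (2 * math.pi)  # compare with CMOTP = 60.0
```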

For \(Z_\Int\) of the Fortran implementation we get:

\begin{align*} Z_\Int &= \frac{j}{r} \sqrt{\frac{\mu_0 c}{2\pi\sigma\lambda}} \left[\frac{\ber_0 (q) + j\bei_0 (q)} {\ber_1(q)+ j\bei_1(q)}\right] \\ &= \frac{j}{r} \sqrt{\frac{\mu_0 c f}{2\pi\sigma c}} \left[\frac{\ber_0 (q) + j\bei_0 (q)} {\ber_1(q)+ j\bei_1(q)}\right] \\ &= \frac{j}{r} \sqrt{\frac{\mu_0 f}{2\pi\sigma}} \left[\frac{\ber_0 (q) + j\bei_0 (q)} {\ber_1(q)+ j\bei_1(q)}\right] \\ &= \frac{j}{r} \sqrt{\frac{\mu_0 \omega}{2\pi 2\pi\sigma}} \left[\frac{\ber_0 (q) + j\bei_0 (q)} {\ber_1(q)+ j\bei_1(q)}\right] \\ &= \frac{j}{2\pi r} \sqrt{\frac{\mu_0 \omega}{\sigma}} \left[\frac{\ber_0 (q) + j\bei_0 (q)} {\ber_1(q)+ j\bei_1(q)}\right] \\ \end{align*}

This does NOT match the formula for \(Z_i\) from [2] above. It looks like the formula from [2] (p. 75) should have the frequency \(f\) instead of \(\omega\) under the square root. The formula derived from the NEC-2 implementation, however, is consistent with the Wikipedia formula above.

The term before the fraction of Kelvin functions differs by a factor from the Wikipedia formula (I'm leaving out \(\Delta_i\), making it a per unit length formula like the one from Wikipedia), also note that \(\rho=1/\sigma\):

From code:

\begin{equation*} \frac{j}{2\pi r}\sqrt{\frac{\omega\mu}{\sigma}} \end{equation*}

From Wikipedia:

\begin{equation*} \frac{k\rho}{2\pi r} = \sqrt{\frac{-j\omega\mu}{\rho}}\frac{\rho}{2\pi r} = \frac{1}{2\pi r}\sqrt{-j\omega\mu\rho} = \frac{1}{2\pi r}\sqrt{\frac{-j\omega\mu}{\sigma}} \end{equation*}

This is a factor of

\begin{equation*} \frac{\sqrt{-j}}{j} = -\frac{1+j}{\sqrt{2}} \end{equation*}

because \(\sqrt{-j} = \frac{1-j}{\sqrt{2}}\)
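This bit of complex arithmetic is easy to verify, e.g. in Python:

```python
import cmath
import math

# principal square root of -j, equal to (1 - j) / sqrt(2)
s = cmath.sqrt(-1j)
assert abs(s - (1 - 1j) / math.sqrt(2)) < 1e-12

# the factor between the two leading terms: sqrt(-j) / j = -(1 + j) / sqrt(2)
assert abs(s / 1j - (-(1 + 1j) / math.sqrt(2))) < 1e-12
```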

For the Kelvin functions we have the following equivalence to Bessel functions [3] (10.61, Kelvin Functions):

\begin{equation*} \ber_v x + j \bei_v x = J_v \left(xe^{3\pi j/4}\right) = e^{v\pi j} J_v \left(xe^{-\pi j / 4}\right) \end{equation*}

With the special cases of order 0 and 1:

\begin{align*} \ber_0 x + j \bei_0 x &= e^{0} J_0 \left(xe^{-\pi j / 4}\right) \\ &= J_0 \left(\frac{1-j}{\sqrt{2}}x\right) \\ \ber_1 x + j \bei_1 x &= e^{j\pi} J_1 \left(xe^{-\pi j / 4}\right) \\ &= -J_1 \left(\frac{1-j}{\sqrt{2}}x\right) \\ \end{align*}
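The order-0 case can be checked numerically with scipy, which ships the order-0 Kelvin functions (a quick sketch of mine):

```python
import cmath
import math

from scipy.special import ber, bei, jv  # order-0 Kelvin and Bessel functions

x = 2.5  # arbitrary positive test value
w = x * cmath.exp(-1j * math.pi / 4)

# ber_0(x) + j bei_0(x) = J_0(x e^{-j pi / 4})
lhs = ber(x) + 1j * bei(x)
rhs = jv(0, w)
```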

The Wikipedia formula passes \(kr\) to the Bessel functions:

\begin{equation*} kr = r\sqrt{\frac{-j\omega\mu}{\rho}} = r\sqrt{-j\omega\mu\sigma} \end{equation*}

while the NEC-2 passes

\begin{equation*} q = r\sqrt{\omega\mu\sigma} \end{equation*}

which differ by a factor of \(\sqrt{-j}\). This means that we pass a factor of \(\sqrt{2}\) too large to the Bessel functions. In turn, due to the symmetry and the fact that the functions in the denominator are the derivative of the numerator, the output is too large by a factor of \(\sqrt{2}\), which compensates the factor in the first term.

After implementing the new formula in pymininec, the reactance part of the impedance now behaves similarly to NEC-2, and the results for the resistance are even closer than before.

Also, when plotting reactance against wire resistivity, the results are now as expected. The values for MMANA are still surprising; the reason might be that I was running MMANA under WINE on Linux. The resistance values computed by MMANA look OK, though.

Skin Effect Load and Impedance


I've recently implemented skin effect loading of wires in pymininec, my re-implementation of the Mininec antenna modeling code. The original Mininec code did not have this feature.

To attempt verification of my implementation I chose to follow an old publication by R. P. Haviland, W4MB [1], where he, among other things, investigates the effect of wire resistance on feed point impedance. He has two graphics where the real and the imaginary part of the feed point impedance are plotted against the length/diameter ratio of the wire of a dipole. He mentions Mininec in the text, but reading closely it is not apparent whether he really used Mininec for the two feed point impedance graphics or something else. I've tried to reproduce his graphs with both PyNEC, a Python wrapper for a C++ implementation of NEC-2, and my pymininec implementation of Mininec. Haviland's results show the same tendencies as the NEC results in the following but do not match closely.

Note that the X-axis of these graphs is logarithmic and the wire gets thicker to the right (the X-axis is length/diameter of the wire). In all cases we compare ideal wires without resistance to wires with copper resistivity.

Now the resistance part is not surprising: The real part of the antenna impedance gets higher when the wires have higher resistance (because they get thinner).

The interesting part is the imaginary part (the reactance) of the antenna. The NEC result is that the reactance gets higher with higher resistance of the wire (to the left) while my pymininec implementation has a slight decrease of the reactance.

To investigate this further I've decided to just change the wire resistivity without modifying the wire radius to show that the effect is independent of the wire radius. The radius is 66µm at a wavelength of 1m, \(\frac{l}{d}\approx 7576\).

In addition to PyNEC and pymininec I've also included results for MMANA-GAL and EZNEC. MMANA is originally based on Mininec. The EZNEC results closely match the PyNEC results (it's the same engine, one implemented in Fortran, one in C++), but MMANA keeps the reactance completely constant at \(42.63\Omega\) for all resistivities of the wire. Note that the most interesting wire resistivities are the first three big dots on the left: the first is an ideal wire (zero resistivity), followed by copper and aluminium.

I currently have no idea which result comes closest to physical reality.

Beat


Musicians know the effect from tuning instruments: if two instruments play not quite the same pitch, it sounds as if the tone has a periodic variation in volume. The period gets longer the better in tune the two instruments are, and the variation stops when both play the same pitch. We call this effect "beat".

Ham radio operators know the "opposite" of that effect from amplitude-modulating a radio frequency with a sine wave in the audio range: the result is two radio frequencies, one lower than the carrier by the frequency of the audio sine wave, one higher by that frequency.

Amplitude modulation is simply a multiplication: We multiply the radio wave with the audio. We have

$$m(t) = A(t) \cdot\cos(2\pi f_r t)$$

Where $t$ is the time, $m(t)$ is the modulated signal, $\cos(2\pi f_r t)$ is the radio frequency signal with frequency $f_r$ and $A(t)$ is the audio signal. For a simple audio sine wave we get:

$$m(t) = \cos(2\pi f_a t)\cdot\cos(2\pi f_r t)$$

with $f_a$ the audio frequency.

Now we want to show that both effects are the same thing. We do not need radio frequencies, we can do the same in the audio spectrum.

From Wikipedia we take one of the trigonometric Product sum identities:

$$\displaystyle\cos\theta \cos\phi = \frac{\cos(\theta-\phi) + \cos(\theta+\phi)}{2}$$

When we apply this to our modulation example, we get:

$$\displaystyle \begin{align} \cos(2\pi f_a t) \cos(2\pi f_r t) & = \frac{\cos(2\pi f_r t-2\pi f_a t) + \cos(2\pi f_r t+2\pi f_a t)}{2} \\ & = \frac{\cos(2\pi(f_r-f_a)t) + \cos(2\pi(f_r+f_a)t)}{2} \\ \end{align} $$

So multiplying two cosines with different frequencies results in the sum of two cosine functions, one with the frequency of the difference of the original frequencies, one with the sum.
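This can also be verified numerically, e.g. with a short NumPy check on a grid of sample points (a standalone check, independent of the plotting code below):

```python
import numpy as np

t = np.linspace(0, 1, 1000)
fa, fr = 3, 440  # audio (beat) and "radio" frequency in Hz
product = np.cos(2 * np.pi * fa * t) * np.cos(2 * np.pi * fr * t)
sum_form = (np.cos(2 * np.pi * (fr - fa) * t)
            + np.cos(2 * np.pi * (fr + fa) * t)) / 2
# Both forms agree to floating point accuracy
print(np.allclose(product, sum_form))  # True
```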

So no matter how we create the modulated signal, by multiplying two signals or by adding two signals, we get the same result.

In the following example we multiply a 3 Hz signal (the beat) with a 440 Hz signal (that's an 'A' in musical terms). We get a modulated signal that changes volume with the beat frequency, as expected.

In [1]:
import numpy as np
# Three seconds with 18000 samples
t = np.linspace (0, 3, 18000)
# Product of 3Hz and 440Hz tune
y = np.cos (2*np.pi*3*t) * np.cos (2*np.pi*440*t)
# Plot only 1/9 of the three seconds = 1/3s = one period of the 3Hz signal
limit = 2000
In [2]:
import plotly.graph_objs as go
import plotly.io as pio
from plotly_fix_fn import setup_plotly
setup_plotly ('content/2024/05/plotly', '11-en')
show_opt = dict (include_plotlyjs = 'directory')

fig = go.Figure ()
fig.add_trace (go.Scatter (x = t [:limit], y = y [:limit]))
fig.update_layout (title = "Modulated signal", xaxis_title = "Time (s)", yaxis_title = "Amplitude")
#fig_widget = go.FigureWidget (fig)
fig.show (** show_opt)

Now we do a Fourier transform of the result to see the frequency spectrum. We see that the result of the multiplication consists of two frequencies, one 3 Hz below the 440 Hz and one 3 Hz above.

In [3]:
from numpy.fft import rfft, fftfreq
fft   = rfft (y, norm = 'forward')
freqs = fftfreq (len (y), d = 3/18000)
fig = go.Figure ()
fig.add_trace (go.Scatter (x = freqs [:5000], y = np.abs (fft [:5000])))
fig.update_layout (title = "Frequency spectrum", xaxis_title = "Frequency (Hz)")
fig.update_xaxes (range = [430, 450])
fig.show (** show_opt)

We can arrive at the same result by just adding the two frequencies (and dividing by 2 to get the same amplitude):

In [4]:
y = 0.5 * (np.cos (2*np.pi*(440-3)*t) + np.cos (2*np.pi*(440+3)*t))
fig = go.Figure ()
fig.add_trace (go.Scatter (x = t [:limit], y = y [:limit]))
fig.update_layout (title = "Modulated signal", xaxis_title = "Time (s)", yaxis_title = "Amplitude")
fig.show (** show_opt)

Finally we verify that this produces the same audio spectrum as before by doing another Fourier analysis.

In [5]:
fft   = rfft (y, norm='forward')
freqs = fftfreq (len (y), d = 3/18000)
fig = go.Figure ()
fig.add_trace (go.Scatter (x = freqs [:5000], y = np.abs (fft [:5000])))
fig.update_layout (title = "Frequency spectrum", xaxis_title = "Frequency (Hz)")
fig.update_xaxes (range = [430, 450])
fig.show (** show_opt)

So what you hear when tuning instruments is not an illusion: adding the two frequencies is the same thing as modulating a tone with a beat frequency.

Chipselect Glitch with SPI NOR on IMX-28


For a customer I'm maintaining a Linux environment (including kernel) for a custom board that uses a module based on the IMX-28 ARM processor. For the user space I'm using ELBE, the Embedded Linux Build Environment that allows building a user space filesystem from a Debian Linux release. Migrating from an earlier Debian version to the current stable release code-named Bookworm we needed a kernel upgrade because systemd requires some features not present in earlier kernels.

After upgrading from a kernel in the 4.14 longterm support series to the 6.1 longterm support series, our SPI NOR flash memory was no longer working. The JFFS2 file system we're using produced errors indicating a problem writing to the NOR flash (some error messages said the file system expected bytes to be 0 that were 0xFF, which led to the conclusion that these bytes had not been written).

After verifying that the clock rate of the SPI bus when writing to the flash device was unchanged between kernel versions I looked at the signals of the SPI bus.

image of glitch in chip select

Glitch in Chip Select During Transfer

image of glitch in chip select higher temporal resolution

Glitch in Chip Select During Transfer: Time Zoom

From the oscilloscope pictures (blue is the chip select, yellow is the SPI clock) it can be seen that there is a short glitch in the chip select line. This has the effect of terminating the page program command to the NOR flash prematurely. From the zoomed-in version it can be seen that the chip select glitches in the middle of a bit during the transfer. The exact position changes slightly from transfer to transfer (it doesn't hit the same bit every time, but it does hit the same byte).

Further experimentation showed that the problem occurs only when the SPI transfer is initiated via direct memory access (DMA), not when the non-DMA version (programmed input/output, PIO) is used. The SPI driver optimizes transfers: only longer transfers are done via DMA, short transfers use PIO because the DMA overhead exceeds the gain. Looking into the DMA implementation of the IMX-28 in drivers/dma/mxs-dma.c, it turns out that there was a change in a flag passed to the DMA transfer; the old version has:

if (flags & DMA_CTRL_ACK)
        ccw->bits |= CCW_WAIT4END;

The new version has:

if (flags & MXS_DMA_CTRL_WAIT4END)
        ccw->bits |= CCW_WAIT4END;

The bit CCW_WAIT4END, according to the IMX-28 processor reference: "A value of one indicates that the channel waits for the end of command signal to be sent from the APBH device to the DMA before starting the next DMA command." This flag is set on the actual DMA transfer to the NOR flash. The IMX-28 is special in that it can also transfer register settings via DMA. Prior to the memory transfer, a register transfer is initiated that sets the IGNORE_CRC bit in the processor's SSP register. From the processor reference on the IGNORE_CRC bit: "In SPI/SSI modes: When set to 1, deassert the chip select (SSn) pin after the command is executed".

So when CCW_WAIT4END is not set, the chip select is deasserted too early.

The DMA transfer initiated from the SPI driver in drivers/spi/spi-mxs.c does not set the MXS_DMA_CTRL_WAIT4END flag, so the CCW_WAIT4END is not set on the transfer.

Now the fix is obvious: we simply need to set this flag on the DMA transfer initiated by the SPI driver in drivers/spi/spi-mxs.c.