<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>David Álvarez Rosa</title><link>https://david.alvarezrosa.com/</link><description>Personal site of David Álvarez Rosa</description><language>en</language><lastBuildDate>Thu, 25 Jun 2026 18:02:00 +0100</lastBuildDate><atom:link href="https://david.alvarezrosa.com/index.xml" rel="self" type="application/rss+xml"/><item><title>Tuning a Server for Benchmarking</title><link>https://david.alvarezrosa.com/posts/tuning-a-server-for-benchmarking/</link><pubDate>Thu, 25 Jun 2026 18:02:00 +0100</pubDate><guid>https://david.alvarezrosa.com/posts/tuning-a-server-for-benchmarking/</guid><description>&lt;p&gt;Optimizing code starts with measuring it, and a measurement is only
useful if it is repeatable: a 2% improvement is invisible under 5% of
noise. Yet on an untuned machine the same binary can easily run several
percent faster or slower between runs. In this post we take a tiny
benchmark and tune the machine step by step, re-measuring after every
change, until runs become deterministic.&lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;h2 id="a-noisy-baseline"&gt;
A noisy baseline
&lt;a class="anchor" href="#a-noisy-baseline"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Our running example sums an array of doubles, in short bursts. Real
services rarely hammer the CPU continuously: they handle a request, sit
idle, and wake up for the next one. Each timed iteration here runs a
burst of 256 sums after a 2 ms idle gap, with the gap excluded from the
measurement&lt;sup id="fnref:2"&gt;&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;BM_Sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;benchmark&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;State&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;alignas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;double&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;iota&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;begin&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nl"&gt;_&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PauseTiming&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// Idle between bursts, like a real service
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;this_thread&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;sleep_for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;chrono&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;milliseconds&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResumeTiming&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;accumulate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cbegin&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cend&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;benchmark&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;DoNotOptimize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;BENCHMARK&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BM_Sum&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Compile it in release mode with full optimizations, &lt;code&gt;-O3 -march=native -mtune=native -flto -ffast-math&lt;/code&gt;. Then run ten repetitions of 200 iterations each and
aggregate them&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ ./benchmark --benchmark_repetitions&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt; --benchmark_min_time&lt;span class="o"&gt;=&lt;/span&gt;200x
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;BM_Sum_mean &lt;span class="m"&gt;99575&lt;/span&gt; ns
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;BM_Sum_stddev &lt;span class="m"&gt;2704&lt;/span&gt; ns
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;BM_Sum_cv 2.72 %
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The interesting line is &lt;code&gt;cv&lt;/code&gt;, the coefficient of variation: standard
deviation divided by mean. Almost &lt;strong&gt;3%&lt;/strong&gt; of run-to-run noise&amp;mdash;any
optimization smaller than that is invisible. Let&amp;rsquo;s bring it down.&lt;/p&gt;
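&lt;p&gt;As a quick sanity check, the CV can be recomputed by hand from the two lines above it. A minimal sketch with &lt;code&gt;awk&lt;/code&gt;, plugging in the mean and standard deviation from this particular run&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ awk 'BEGIN { printf "%.2f %%\n", 100 * 2704 / 99575 }'
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;2.72 %
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;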
&lt;h2 id="know-your-hardware"&gt;
Know your hardware
&lt;a class="anchor" href="#know-your-hardware"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Before turning any knob, look at what you are tuning. &lt;code&gt;lstopo&lt;/code&gt; draws
the whole machine in one picture: caches, cores, SMT pairs, and the PCIe
devices hanging off them. Start with my laptop&lt;/p&gt;
&lt;figure&gt;
&lt;figcaption&gt;&lt;p&gt;&lt;span class="figure-number"&gt;Figure 1: &lt;/span&gt;&lt;strong&gt;My laptop (Intel Core Ultra 5 135U).&lt;/strong&gt; Three kinds of cores: two P-cores with two hardware threads each (dotted), eight E-cores in clusters of four sharing an L2, and two low-power E-cores (bottom left) sitting outside the L3 entirely.&lt;/p&gt;&lt;/figcaption&gt;
&lt;a href="https://david.alvarezrosa.com/images/lstopo-laptop.png"&gt;
&lt;picture&gt;
&lt;source
type="image/webp"
srcset="https://david.alvarezrosa.com/images/lstopo-laptop_hu_5f7f7ed74b20a9a8.webp 240w, https://david.alvarezrosa.com/images/lstopo-laptop_hu_1b6999ae1cbbfb90.webp 360w, https://david.alvarezrosa.com/images/lstopo-laptop_hu_108c0f48cdcd27ae.webp 420w, https://david.alvarezrosa.com/images/lstopo-laptop_hu_268ad41b5a7e3cc1.webp 480w, https://david.alvarezrosa.com/images/lstopo-laptop_hu_6489d9f79c3a6a53.webp 768w, https://david.alvarezrosa.com/images/lstopo-laptop_hu_601e39e6acf65d8b.webp 789w"
sizes="auto"&gt;
&lt;img
src="https://david.alvarezrosa.com/images/lstopo-laptop.png"
srcset="https://david.alvarezrosa.com/images/lstopo-laptop_hu_d0fd7bbb33c16fac.png 240w, https://david.alvarezrosa.com/images/lstopo-laptop_hu_57b9c29e5ea5ab96.png 360w, https://david.alvarezrosa.com/images/lstopo-laptop_hu_d477a9b0840f6879.png 420w, https://david.alvarezrosa.com/images/lstopo-laptop_hu_6813fff6f78bc56d.png 480w, https://david.alvarezrosa.com/images/lstopo-laptop_hu_220340a29c7d9030.png 768w, https://david.alvarezrosa.com/images/lstopo-laptop_hu_f236efbe6b4f94e0.png 789w"
sizes="auto"
width="789"
height="493"
alt="Lstopo laptop"
loading="lazy"
decoding="async"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/figure&gt;
&lt;p&gt;Here the choice of core changes what you measure: land on CPU 4 and you
get an E-core at lower clocks; on CPU 12 you lose the L3 too. Now
compare that against my homelab server&lt;/p&gt;
&lt;figure&gt;
&lt;figcaption&gt;&lt;p&gt;&lt;span class="figure-number"&gt;Figure 2: &lt;/span&gt;&lt;strong&gt;My homelab server (AMD Ryzen 7 PRO 8700GE).&lt;/strong&gt; Eight identical cores with identical caches; the NVMe drives and the NIC hang off PCIe on the right.&lt;/p&gt;&lt;/figcaption&gt;
&lt;a href="https://david.alvarezrosa.com/images/lstopo-homelab.png"&gt;
&lt;picture&gt;
&lt;source
type="image/webp"
srcset="https://david.alvarezrosa.com/images/lstopo-homelab_hu_85a3477db4ec85a3.webp 240w, https://david.alvarezrosa.com/images/lstopo-homelab_hu_327c39959d97da39.webp 360w, https://david.alvarezrosa.com/images/lstopo-homelab_hu_254e8b9f23cccb50.webp 420w, https://david.alvarezrosa.com/images/lstopo-homelab_hu_4224edcd26c1ba.webp 480w, https://david.alvarezrosa.com/images/lstopo-homelab_hu_5341a9a6a54006c6.webp 483w"
sizes="auto"&gt;
&lt;img
src="https://david.alvarezrosa.com/images/lstopo-homelab.png"
srcset="https://david.alvarezrosa.com/images/lstopo-homelab_hu_d60e81b4439dd8d4.png 240w, https://david.alvarezrosa.com/images/lstopo-homelab_hu_d46aeafb64f0af8f.png 360w, https://david.alvarezrosa.com/images/lstopo-homelab_hu_e4b3c67d0f79a033.png 420w, https://david.alvarezrosa.com/images/lstopo-homelab_hu_3c5114a99e77f577.png 480w, https://david.alvarezrosa.com/images/lstopo-homelab_hu_1c40d1a22e7205ee.png 483w"
sizes="auto"
width="483"
height="324"
alt="Lstopo homelab"
loading="lazy"
decoding="async"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/figure&gt;
&lt;p&gt;On the server every core is as good as any other: homogeneous machines
make better benchmarking boxes. The PCIe side matters once a benchmark
touches I/O: it shows which NVMe or NIC you are exercising and, on
multi-socket machines, which NUMA node it hangs off.&lt;/p&gt;
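&lt;p&gt;If &lt;code&gt;lstopo&lt;/code&gt; is not at hand, &lt;code&gt;lscpu&lt;/code&gt; from util-linux gives a rough text version of the same picture. This is only a sketch: the exact columns depend on the util-linux version and on the machine&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ lscpu --extended
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Each row is one logical CPU. Rows that share a core ID are SMT siblings, and maximum frequencies that differ between rows usually point to a hybrid part.&lt;/p&gt;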
&lt;h2 id="pin-to-a-core"&gt;
Pin to a core
&lt;a class="anchor" href="#pin-to-a-core"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;The scheduler is free to migrate the benchmark between cores, and every
migration throws away warm caches. On hybrid CPUs it&amp;rsquo;s worse:
performance and efficiency cores run the same code at very different
speeds, so results turn bimodal depending on where the process lands.
Pin the benchmark to a single core (on hybrid parts, a P-core)&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ taskset -c &lt;span class="m"&gt;2&lt;/span&gt; ./benchmark ...
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The mean falls to &lt;strong&gt;55.3 µs&lt;/strong&gt; and the CV more than halves, to &lt;strong&gt;1.06%&lt;/strong&gt;.
The win is bigger than migration costs alone would suggest: every burst
now wakes the same core, so that core&amp;rsquo;s clock never has time to sag
between bursts.&lt;sup id="fnref:3"&gt;&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
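&lt;p&gt;To confirm the pin actually takes effect, one option is to let a pinned process print its own affinity as the kernel reports it. A small sketch, reusing CPU 2 from the command above; the reported list should contain only that CPU&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ taskset -c &lt;span class="m"&gt;2&lt;/span&gt; grep Cpus_allowed_list /proc/self/status
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;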
&lt;h2 id="lock-the-cpu-frequency"&gt;
Lock the CPU frequency
&lt;a class="anchor" href="#lock-the-cpu-frequency"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;By default Linux scales the CPU frequency with load, so the benchmark
starts on a cold, slow clock and finishes on a hot, fast one. Switch
the frequency governor to &lt;code&gt;performance&lt;/code&gt; to keep clocks locked high&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo cpupower frequency-set --governor performance
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;and verify it took effect&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;performance
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Re-measuring gives a mean of &lt;strong&gt;54.9 µs&lt;/strong&gt; and a CV of &lt;strong&gt;0.79%&lt;/strong&gt;. The
improvement looks modest only because pinning already kept our core&amp;rsquo;s
clock warm: on its own, the performance governor takes the unpinned
baseline from 99.6 µs straight to 54.5 µs. Either way, no burst ever
wakes up on a cold clock again.&lt;/p&gt;
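&lt;p&gt;The check above only reads &lt;code&gt;cpu0&lt;/code&gt;. To look at every core at once, and at the clock the pinned core is actually running at, something along these lines should work on machines that expose the standard cpufreq sysfs files (CPU 2 is again the core we pinned to)&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ cat /sys/devices/system/cpu/cpu2/cpufreq/scaling_cur_freq
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The first command prints each file path followed by its governor; the second reports the current frequency in kHz.&lt;/p&gt;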
&lt;h2 id="disable-hyperthreading"&gt;
Disable hyperthreading
&lt;a class="anchor" href="#disable-hyperthreading"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Our pinned core still shares its execution units and L1/L2 caches with its SMT
sibling: anything the scheduler places there perturbs our measurement.
Disable SMT entirely&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ &lt;span class="nb"&gt;echo&lt;/span&gt; off &lt;span class="p"&gt;|&lt;/span&gt; sudo tee /sys/devices/system/cpu/smt/control
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The CV drops to &lt;strong&gt;0.26%&lt;/strong&gt;, three times better: the core now has its
execution units and caches all to itself.&lt;/p&gt;
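&lt;p&gt;To see which logical CPU is the sibling in question, and to confirm that SMT is really off afterwards, the topology files in sysfs can be read directly. A sketch for CPU 2, the core used throughout this post&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ cat /sys/devices/system/cpu/cpu2/topology/thread_siblings_list
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ cat /sys/devices/system/cpu/smt/active
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The first file lists the logical CPUs sharing a physical core with CPU 2; the second reads &lt;code&gt;0&lt;/code&gt; once SMT is disabled.&lt;/p&gt;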
&lt;h2 id="disable-turbo-boost"&gt;
Disable turbo boost
&lt;a class="anchor" href="#disable-turbo-boost"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Even with the performance governor, turbo frequencies vary with
temperature and power budget: the same run on a warm machine clocks
lower than on a cool one. Disable turbo for stable clocks&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; sudo tee /sys/devices/system/cpu/cpufreq/boost
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;On this machine nothing changes, since our short bursts never gave the
silicon time to boost anyway. On a machine where turbo does engage,
expect the mean to climb instead: you are giving up peak performance.
That trade is fine, since when optimizing we care about &lt;em&gt;relative&lt;/em&gt;
numbers, and those are now comparable across runs.&lt;sup id="fnref:4"&gt;&lt;a href="#fn:4" class="footnote-ref" role="doc-noteref"&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
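&lt;p&gt;The &lt;code&gt;boost&lt;/code&gt; file is not present with every frequency driver. On Intel machines running the &lt;code&gt;intel_pstate&lt;/code&gt; driver (an assumption; it does not apply to the server used here) the equivalent knob is usually &lt;code&gt;no_turbo&lt;/code&gt;, with inverted polarity: writing 1 turns turbo off&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;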
&lt;h2 id="summary"&gt;
Summary
&lt;a class="anchor" href="#summary"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Here is the whole journey in one table, each row adding one change on
top of all the previous ones. We went from almost &lt;strong&gt;3%&lt;/strong&gt; of noise down to
&lt;strong&gt;0.26%&lt;/strong&gt;, and got 1.8x faster along the way; differences of half a
percent are now real, measurable signal.&lt;sup id="fnref:5"&gt;&lt;a href="#fn:5" class="footnote-ref" role="doc-noteref"&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;Mean&lt;/th&gt;
&lt;th&gt;StdDev&lt;/th&gt;
&lt;th&gt;CV&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Untuned&lt;/td&gt;
&lt;td&gt;99.6 µs&lt;/td&gt;
&lt;td&gt;2.70 µs&lt;/td&gt;
&lt;td&gt;2.72%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;+ pinned to one core&lt;/td&gt;
&lt;td&gt;55.3 µs&lt;/td&gt;
&lt;td&gt;0.59 µs&lt;/td&gt;
&lt;td&gt;1.06%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;+ performance governor&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;54.9 µs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.43 µs&lt;/td&gt;
&lt;td&gt;0.79%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;+ hyperthreading off&lt;/td&gt;
&lt;td&gt;55.3 µs&lt;/td&gt;
&lt;td&gt;0.15 µs&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.26%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;+ turbo disabled&lt;/td&gt;
&lt;td&gt;55.5 µs&lt;/td&gt;
&lt;td&gt;0.14 µs&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.26%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;On busier machines there is a longer tail of knobs worth trying:
disabling address space layout randomization, the NMI watchdog, or
transparent huge pages. The &lt;a href="https://github.com/david-alvarez-rosa/CppPlayground/blob/main/scripts/bench-remote.sh"&gt;bench-remote.sh&lt;/a&gt; script applies them all. None
of it survives a reboot, which is exactly what you want: tune, measure,
and reboot back to a normal machine.&lt;/p&gt;
&lt;br /&gt;
&lt;p&gt;Long live reproducible benchmarks!&lt;/p&gt;
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;Note that tuning for
&lt;em&gt;benchmarking&lt;/em&gt; is not the same as tuning for &lt;em&gt;performance&lt;/em&gt;: a benchmark
wants the machine repeatable, even at the cost of some peak speed. A
production box, however, wants every last bit of speed.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;&lt;code&gt;PauseTiming&lt;/code&gt; / &lt;code&gt;ResumeTiming&lt;/code&gt; keep the sleep out of the
measured time, and &lt;code&gt;DoNotOptimize&lt;/code&gt; keeps the result alive past the
optimizer; without it the compiler deletes the entire loop.&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:3"&gt;
&lt;p&gt;Pinning puts the benchmark &lt;em&gt;onto&lt;/em&gt; the core but does
not keep &lt;em&gt;other&lt;/em&gt; tasks off it. On a busy box, go further and reserve
the core for the benchmark alone, either on the kernel command line
(&lt;code&gt;isolcpus=2 nohz_full=2 rcu_nocbs=2&lt;/code&gt;) or at runtime with a &lt;code&gt;cpuset&lt;/code&gt;
cgroup.&amp;#160;&lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:4"&gt;
&lt;p&gt;Low-latency
production tuning makes the &lt;em&gt;opposite&lt;/em&gt; call and keeps turbo on: there,
every nanosecond counts. The most latency-sensitive trading shops go
further and run overclocked servers, locked at a fixed all-core
frequency above stock&amp;mdash;speed &lt;em&gt;and&lt;/em&gt; stable clocks, bought with better
cooling.&amp;#160;&lt;a href="#fnref:4" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:5"&gt;
&lt;p&gt;Feel free to reproduce on
your machine using the &lt;a href="https://github.com/david-alvarez-rosa/CppPlayground/blob/main/scratch/benchmark.cpp"&gt;benchmark&lt;/a&gt; from my &lt;a href="https://github.com/david-alvarez-rosa/CppPlayground"&gt;CppPlayground&lt;/a&gt; repository.&amp;#160;&lt;a href="#fnref:5" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item><item><title>Self-Hosting on the Dark Web</title><link>https://david.alvarezrosa.com/posts/self-hosting-on-the-dark-web/</link><pubDate>Mon, 01 Jun 2026 10:55:00 +0100</pubDate><guid>https://david.alvarezrosa.com/posts/self-hosting-on-the-dark-web/</guid><description>&lt;p&gt;This site is now reachable over Tor as a hidden service, at a &lt;code&gt;.onion&lt;/code&gt;
address that resolves only inside the Tor network.&lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt; &lt;a href="https://www.torproject.org/"&gt;Tor&lt;/a&gt; relays and
encrypts your traffic through a circuit of relays drawn from thousands of volunteer-run
servers, so that no single party can link who you are to what you are
doing; a hidden service extends that anonymity to the server itself.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s built by the nonprofit &lt;a href="https://www.torproject.org/"&gt;Tor Project&lt;/a&gt;, which advances human rights and
freedoms through free software and open networks, so that anyone can use
the internet free from tracking, surveillance, and censorship. The
network only works because people use it, so consider &lt;a href="https://donate.torproject.org/"&gt;supporting them&lt;/a&gt; or
running a relay&amp;mdash;your contribution helps millions stay safe and private
online every day.&lt;/p&gt;
&lt;h2 id="the-hidden-service"&gt;
The hidden service
&lt;a class="anchor" href="#the-hidden-service"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Install Tor and point a hidden service at a local port. Edit
&lt;code&gt;/etc/tor/torrc&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cfg" data-lang="cfg"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;HiddenServiceDir /var/lib/tor/blog/&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;HiddenServicePort 80 127.0.0.1:8080&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The directory must be a dedicated, Tor-owned path&amp;mdash;not your web
root.&lt;sup id="fnref:2"&gt;&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref"&gt;2&lt;/a&gt;&lt;/sup&gt; Restart Tor and read the
address it generates&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo systemctl restart tor@default
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo cat /var/lib/tor/blog/hostname
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;dhevt6e4rtgbtr3jh53xrpwmgtilkah6nyjujocsspssrsexc7omxhid.onion
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="serving-the-site"&gt;
Serving the site
&lt;a class="anchor" href="#serving-the-site"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Tor forwards the onion&amp;rsquo;s port 80 to &lt;code&gt;127.0.0.1:8080&lt;/code&gt;, so the web server
just needs to listen there. Add an nginx server block for it&amp;mdash;no TLS,
no HTTP/2, no QUIC, since Tor speaks plain TCP and provides its own
encryption.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-nginx" data-lang="nginx"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;server&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kn"&gt;listen&lt;/span&gt; &lt;span class="n"&gt;127.0.0.1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;8080&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kn"&gt;server_name&lt;/span&gt; &lt;span class="s"&gt;dhevt6e4rtgbtr3jh53xrpwmgtilkah6nyjujocsspssrsexc7omxhid.onion&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kn"&gt;root&lt;/span&gt; &lt;span class="s"&gt;/srv/tor.david.alvarezrosa.com&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kn"&gt;index&lt;/span&gt; &lt;span class="s"&gt;index.html&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kn"&gt;error_page&lt;/span&gt; &lt;span class="mi"&gt;404&lt;/span&gt; &lt;span class="s"&gt;/404/index.html&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kn"&gt;location&lt;/span&gt; &lt;span class="s"&gt;/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kn"&gt;try_files&lt;/span&gt; &lt;span class="nv"&gt;$uri&lt;/span&gt; &lt;span class="nv"&gt;$uri/&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;404&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Reload nginx and the site is live on Tor.&lt;/p&gt;
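&lt;p&gt;A sketch of that last step plus a quick check from the command line. It assumes a systemd-managed nginx and a local Tor client listening on its default SOCKS port, 9050; adjust both if your setup differs&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo nginx -t
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo systemctl reload nginx
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ curl --socks5-hostname 127.0.0.1:9050 --head http://dhevt6e4rtgbtr3jh53xrpwmgtilkah6nyjujocsspssrsexc7omxhid.onion/
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;nginx -t&lt;/code&gt; validates the configuration before the reload, and &lt;code&gt;--socks5-hostname&lt;/code&gt; makes curl hand the &lt;code&gt;.onion&lt;/code&gt; name to Tor instead of trying to resolve it through DNS.&lt;/p&gt;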
&lt;h2 id="building-for-the-onion"&gt;
Building for the onion
&lt;a class="anchor" href="#building-for-the-onion"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;A static site bakes its base URL into absolute links, so a clearnet
build would point visitors back to the clearnet domain even when served
over Tor. The fix is to build a second copy with the onion as its base
URL&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ hugo --minify --baseURL&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;http://dhevt6e4rtgbtr3jh53xrpwmgtilkah6nyjujocsspssrsexc7omxhid.onion/&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The deploy pipeline does this automatically: every push builds the site
once per target&amp;mdash;clearnet and Tor&amp;mdash;and rsyncs each to its own web
root, so the two stay in sync without any manual work.&lt;sup id="fnref:3"&gt;&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;br /&gt;
&lt;p&gt;That&amp;rsquo;s it. Read this site over Tor at
&lt;code&gt;dhevt6e4rtgbtr3jh53xrpwmgtilkah6nyjujocsspssrsexc7omxhid.onion&lt;/code&gt;.&lt;/p&gt;
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;Open it in the
&lt;a href="https://www.torproject.org/"&gt;Tor Browser&lt;/a&gt;. There is no certificate authority, no DNS, and no exposed
IP&amp;mdash;the address is derived directly from a public key, and the
connection is end-to-end encrypted by Tor itself.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;Tor stores the service&amp;rsquo;s private key and &lt;code&gt;hostname&lt;/code&gt; file here
and insists on owning it (&lt;code&gt;chmod 700&lt;/code&gt;, user &lt;code&gt;debian-tor&lt;/code&gt;). Point it at
your site files and Tor refuses to start.&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:3"&gt;
&lt;p&gt;See &lt;a href="https://david.alvarezrosa.com/posts/first-steps-on-a-new-server/"&gt;First
Steps on a New Server&lt;/a&gt; for the underlying machine; the full configuration
lives in my &lt;a href="https://github.com/david-alvarez-rosa/homelab"&gt;homelab&lt;/a&gt; repository, and the &lt;a href="https://github.com/david-alvarez-rosa/personal-website"&gt;site&amp;rsquo;s own repository&lt;/a&gt; holds the
GitHub Actions workflow that builds and deploys the Tor copy.&amp;#160;&lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item><item><title>First Steps on a New Server</title><link>https://david.alvarezrosa.com/posts/first-steps-on-a-new-server/</link><pubDate>Sun, 17 May 2026 17:26:00 +0100</pubDate><guid>https://david.alvarezrosa.com/posts/first-steps-on-a-new-server/</guid><description>&lt;p&gt;Over the last decade I&amp;rsquo;ve been playing with dozens of servers from
multiple providers. These are the steps I&amp;rsquo;ve been perfecting to get up
to speed fast and feel right at home on a new machine. I wrote them down
here mostly as a personal reference, but hopefully they are useful to someone
else too.&lt;/p&gt;
&lt;h2 id="hardware-distro-and-dns"&gt;
Hardware, distro, and DNS
&lt;a class="anchor" href="#hardware-distro-and-dns"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Start from a clean Linux install with one large root partition plus a big
swap.&lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt; While I run Arch on my laptop, Debian tends to be a better
fit for servers because of its stability and long support window.&lt;/p&gt;
&lt;p&gt;Point your domain&lt;sup id="fnref:2"&gt;&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref"&gt;2&lt;/a&gt;&lt;/sup&gt; to your server&amp;rsquo;s IP at your DNS provider: an A record for IPv4
and an AAAA record for IPv6. Wait a few minutes, then verify both.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ dig alvarezrosa.com A +short
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;213.32.19.229
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ dig alvarezrosa.com AAAA +short
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;2001:41d0:305:2100::febc
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Hardware doesn&amp;rsquo;t matter: a VPS, a Raspberry Pi, or a dedicated box will
do.&lt;/p&gt;
&lt;h2 id="first-login"&gt;
First login
&lt;a class="anchor" href="#first-login"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Log in as root, change the password, and update.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ ssh root@alvarezrosa.com
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ passwd
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ apt update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt full-upgrade
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Create a non-root user with sudo privileges.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ useradd --create-home --groups sudo david
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ passwd david
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Log out, then reconnect as the new user.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ ssh david@alvarezrosa.com
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;From here on, stay on this account and use sudo when you need it.&lt;/p&gt;
&lt;h2 id="dotfiles"&gt;
Dotfiles
&lt;a class="anchor" href="#dotfiles"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;I like to set up dotfiles early. Debugging on an unfamiliar shell is
its own kind of miserable.&lt;sup id="fnref:3"&gt;&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ git init
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ git remote add origin https://github.com/david-alvarez-rosa/dotfiles.git
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ git fetch origin
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ git checkout -t origin/main
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ git submodule update --init --recursive
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ git config status.showUntrackedFiles no
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Switch to zsh and install starship.&lt;sup id="fnref:4"&gt;&lt;a href="#fn:4" class="footnote-ref" role="doc-noteref"&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo apt install zsh starship
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ chsh --shell &lt;span class="k"&gt;$(&lt;/span&gt;which zsh&lt;span class="k"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Log out and back in to confirm the shell loads correctly.&lt;/p&gt;
&lt;h2 id="ssh-keys"&gt;
SSH keys
&lt;a class="anchor" href="#ssh-keys"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Copy your public key to the server from your local machine.&lt;sup id="fnref:5"&gt;&lt;a href="#fn:5" class="footnote-ref" role="doc-noteref"&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ ssh-copy-id david@alvarezrosa.com
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Confirm you can get in without a password.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ ssh david@alvarezrosa.com
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If you need root access over SSH, install the key there too.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo install -d -m &lt;span class="m"&gt;700&lt;/span&gt; /root/.ssh
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo install -m &lt;span class="m"&gt;600&lt;/span&gt; ~/.ssh/authorized_keys /root/.ssh/authorized_keys
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Once that&amp;rsquo;s working, disable password auth at least for
root.&lt;sup id="fnref:6"&gt;&lt;a href="#fn:6" class="footnote-ref" role="doc-noteref"&gt;6&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
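&lt;p&gt;As a minimal sketch of that last step, assuming a Debian release whose
&lt;code&gt;/etc/ssh/sshd_config&lt;/code&gt; includes &lt;code&gt;/etc/ssh/sshd_config.d/*.conf&lt;/code&gt;, a small
drop-in file can hold the settings. The file name &lt;code&gt;10-hardening.conf&lt;/code&gt; is
only an illustration, and &lt;code&gt;PasswordAuthentication no&lt;/code&gt; goes beyond the
root-only minimum by turning passwords off for every account, so drop that
line if some users still need them.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ cat /etc/ssh/sshd_config.d/10-hardening.conf &lt;span class="c1"&gt;# Hypothetical file name&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;PermitRootLogin prohibit-password
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;PasswordAuthentication no
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo sshd -t &lt;span class="c1"&gt;# Validate the configuration before reloading&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo systemctl reload ssh
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Keep the current session open and test a fresh login from a second
terminal before closing it.&lt;/p&gt;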
&lt;h2 id="timezone-locale-and-hostname"&gt;
Timezone, locale, and hostname
&lt;a class="anchor" href="#timezone-locale-and-hostname"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Set the timezone and verify with &lt;code&gt;date&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ timedatectl list-timezones
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo timedatectl set-timezone Europe/Madrid
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ date
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then enable the &lt;code&gt;en_US.UTF-8&lt;/code&gt; locale and make it the default.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo vim /etc/locale.gen &lt;span class="c1"&gt;# Uncomment en_US.UTF-8&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo locale-gen
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo update-locale &lt;span class="nv"&gt;LANG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;en_US.UTF-8
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Set a sensible hostname and make sure &lt;code&gt;/etc/hosts&lt;/code&gt; matches.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo hostnamectl set-hostname homelab
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ cat /etc/hosts
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;127.0.0.1 localhost
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;::1 localhost ip6-localhost ip6-loopback
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;127.0.1.1 homelab
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="firewall"&gt;
Firewall
&lt;a class="anchor" href="#firewall"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Deny all inbound traffic and allow only the ports you need.&lt;sup id="fnref:7"&gt;&lt;a href="#fn:7" class="footnote-ref" role="doc-noteref"&gt;7&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo apt install ufw
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo ufw default deny incoming
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo ufw allow 22/tcp
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo ufw &lt;span class="nb"&gt;enable&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Add more rules only as services are exposed.&lt;/p&gt;
&lt;h2 id="automatic-security-updates"&gt;
Automatic security updates
&lt;a class="anchor" href="#automatic-security-updates"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Security patches shouldn&amp;rsquo;t depend on remembering to log in every few
days.&lt;sup id="fnref:8"&gt;&lt;a href="#fn:8" class="footnote-ref" role="doc-noteref"&gt;8&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo apt install unattended-upgrades apt-listchanges
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo dpkg-reconfigure --priority&lt;span class="o"&gt;=&lt;/span&gt;low unattended-upgrades
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;After that, security updates mostly take care of themselves.&lt;/p&gt;
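&lt;p&gt;To check the result, here is a sketch assuming the stock Debian package:
answering yes to the reconfigure prompt should leave a small APT
configuration file like the one below, and a dry run reports what would be
installed without changing anything.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ cat /etc/apt/apt.conf.d/20auto-upgrades
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;APT::Periodic::Update-Package-Lists "1";
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;APT::Periodic::Unattended-Upgrade "1";
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo unattended-upgrade --dry-run --debug &lt;span class="c1"&gt;# Simulate a run, install nothing&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;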
&lt;h2 id="intrusion-prevention"&gt;
Intrusion prevention
&lt;a class="anchor" href="#intrusion-prevention"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;fail2ban&lt;/code&gt; watches authentication logs and temporarily blocks IPs that
look like they&amp;rsquo;re brute-forcing your services.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo apt install fail2ban
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo systemctl &lt;span class="nb"&gt;enable&lt;/span&gt; --now fail2ban
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="web-server"&gt;
Web server
&lt;a class="anchor" href="#web-server"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Install a web server to verify everything works end to end.&lt;sup id="fnref:9"&gt;&lt;a href="#fn:9" class="footnote-ref" role="doc-noteref"&gt;9&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo apt install nginx
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo systemctl &lt;span class="nb"&gt;enable&lt;/span&gt; --now nginx
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo ufw allow 80/tcp
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Open your domain in a browser. You should see the default nginx page. Then
enable HTTPS with Let&amp;rsquo;s Encrypt.&lt;sup id="fnref:10"&gt;&lt;a href="#fn:10" class="footnote-ref" role="doc-noteref"&gt;10&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo ufw allow 443/tcp
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo apt install certbot python3-certbot-nginx
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo certbot
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Follow the prompts; Certbot rewrites the nginx config and sets up
renewal automatically. Confirm your domain loads over HTTPS.&lt;/p&gt;
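&lt;p&gt;If you prefer the terminal to a browser, the following sketch checks both
claims. It assumes &lt;code&gt;curl&lt;/code&gt; is installed and uses the example domain from
this post; the first command prints the HTTP status line served over TLS,
and the second simulates a renewal without touching the real certificate.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ curl -sI https://alvarezrosa.com &lt;span class="p"&gt;|&lt;/span&gt; head -n &lt;span class="m"&gt;1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo certbot renew --dry-run
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;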
&lt;br /&gt;
&lt;p&gt;That&amp;rsquo;s the baseline. From here, the machine is yours&amp;mdash;go build on it.&lt;/p&gt;
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;Predicting future partitioning needs is easy for a desktop,
but can be difficult for a server. One large root filesystem is easier
to manage.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;This post uses my domain alvarezrosa.com as an
example.&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:3"&gt;
&lt;p&gt;These commands treat the home directory
as a Git repository, which lets you track dotfiles without symlink
gymnastics. GitHub access can be configured shortly after this.&amp;#160;&lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:4"&gt;
&lt;p&gt;Oh My Zsh is a common shell
add-on, but it isn&amp;rsquo;t required for the server itself. starship is a fast
cross-shell prompt.&amp;#160;&lt;a href="#fnref:4" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:5"&gt;
&lt;p&gt;If you
don&amp;rsquo;t have a key on your local machine yet, generate one first with
&lt;code&gt;ssh-keygen&lt;/code&gt;.&amp;#160;&lt;a href="#fnref:5" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:6"&gt;
&lt;p&gt;Debian&amp;rsquo;s default is already &lt;code&gt;PermitRootLogin prohibit-password&lt;/code&gt;, which only allows key-based root logins.&amp;#160;&lt;a href="#fnref:6" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:7"&gt;
&lt;p&gt;Make
sure SSH is allowed before enabling the firewall, or you will lock
yourself out of the machine.&amp;#160;&lt;a href="#fnref:7" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:8"&gt;
&lt;p&gt;Logs for unattended updates live in
&lt;code&gt;/var/log/unattended-upgrades/&lt;/code&gt;.&amp;#160;&lt;a href="#fnref:8" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:9"&gt;
&lt;p&gt;I&amp;rsquo;ve
been using Apache for quite a few years, but nginx is more lightweight
and handles concurrent connections more efficiently.&amp;#160;&lt;a href="#fnref:9" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:10"&gt;
&lt;p&gt;Certbot obtains free TLS
certificates, updates the nginx configuration for you, and sets up
automatic renewal.&amp;#160;&lt;a href="#fnref:10" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item><item><title>Fundamental Theorem of Calculus</title><link>https://david.alvarezrosa.com/posts/fundamental-theorem-of-calculus/</link><pubDate>Wed, 22 Apr 2026 20:14:00 +0100</pubDate><guid>https://david.alvarezrosa.com/posts/fundamental-theorem-of-calculus/</guid><description>&lt;p&gt;Although the notion of area is intuitive, its mathematical treatment
requires a rigorous definition. This post introduces the Riemann
integral, and proves the fundamental theorem of calculus&amp;mdash;a beautiful
result that connects integrals and derivatives.&lt;/p&gt;
&lt;h2 id="riemann-integral"&gt;
Riemann integral
&lt;a class="anchor" href="#riemann-integral"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Given a bounded&lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt; function \(f\colon[a,b]\to\mathbb{R}\), we can approximate the
area under its graph by rectangles. Choose a partition of its domain&lt;/p&gt;
&lt;p&gt;\[
\mathcal{P}=\{x_0,x_1,\ldots,x_n\mid a=x_0&amp;lt;x_1&amp;lt;\cdots&amp;lt;x_n=b\}.
\]&lt;/p&gt;
&lt;p&gt;For each subinterval \([x_{k-1},x_k]\), define the width \(\Delta
x_k=x_k-x_{k-1}\), and let \(m_k\) and \(M_k\) denote the infimum and
supremum of \(f\) on that subinterval. The lower and upper sums are&lt;/p&gt;
&lt;p&gt;\[
L(f,\mathcal{P})=\sum_{k=1}^{n}m_k\Delta x_k,
\qquad
U(f,\mathcal{P})=\sum_{k=1}^{n}M_k\Delta x_k.
\]&lt;/p&gt;
&lt;p&gt;We define \(f\) to be Riemann integrable&lt;sup id="fnref:2"&gt;&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref"&gt;2&lt;/a&gt;&lt;/sup&gt; on
\([a,b]\) iff for every \(\varepsilon&amp;gt;0\) there exists a partition
\(\mathcal{P}\) such that
\(U(f,\mathcal{P})-L(f,\mathcal{P})&amp;lt;\varepsilon\), in which case&lt;/p&gt;
&lt;p&gt;\[
\int_a^b f
=\sup_{\mathcal{P}}L(f,\mathcal{P})
=\inf_{\mathcal{P}}U(f,\mathcal{P}).
\]&lt;/p&gt;
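&lt;p&gt;As a worked example, chosen here only for illustration, take \(f(x)=x\) on
\([0,1]\) with the uniform partition \(x_k=k/n\), so that \(\Delta x_k=1/n\),
\(m_k=(k-1)/n\), and \(M_k=k/n\). Then&lt;/p&gt;
&lt;p&gt;\[
L(f,\mathcal{P})=\sum_{k=1}^{n}\frac{k-1}{n^2}=\frac{n-1}{2n},
\qquad
U(f,\mathcal{P})=\sum_{k=1}^{n}\frac{k}{n^2}=\frac{n+1}{2n},
\]&lt;/p&gt;
&lt;p&gt;so \(U(f,\mathcal{P})-L(f,\mathcal{P})=1/n\), which is smaller than any given
\(\varepsilon&amp;gt;0\) once \(n&amp;gt;1/\varepsilon\). Hence \(f\) is Riemann integrable, and
because the integral lies between \((n-1)/(2n)\) and \((n+1)/(2n)\) for every
\(n\), we get \(\int_0^1 f=1/2\).&lt;/p&gt;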
&lt;h2 id="calculus-machinery"&gt;
Calculus machinery
&lt;a class="anchor" href="#calculus-machinery"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;The proof requires the mean value theorem, which in turn rests on
Rolle&amp;rsquo;s theorem and Fermat&amp;rsquo;s proposition.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Fermat&amp;rsquo;s Proposition.&lt;/strong&gt; Let \(I\subset\mathbb{R}\) be open and \(f\colon
I\to\mathbb{R}\) differentiable at \(a\in I\). If \(f\) has a local
extremum at \(a\), then \(f^{\prime}(a)=0\).&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Proof.&lt;/em&gt; Assume \(f\) has a local maximum&lt;sup id="fnref:3"&gt;&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref"&gt;3&lt;/a&gt;&lt;/sup&gt; at \(a\). Then there exists
\(\delta&amp;gt;0\) such that \(f(x)-f(a)\le 0\) for all
\(x\in(a-\delta,a+\delta)\). Therefore&lt;/p&gt;
&lt;p&gt;\[
\frac{f(x)-f(a)}{x-a}\ge 0 \quad (x&amp;lt;a),
\qquad
\frac{f(x)-f(a)}{x-a}\le 0 \quad (x&amp;gt;a).
\]&lt;/p&gt;
&lt;p&gt;Taking limits, \(f^{\prime}_-(a)\ge 0\) and \(f^{\prime}_+(a)\le 0\).
Since \(f\) is differentiable at \(a\),
\(f^{\prime}_-(a)=f^{\prime}_+(a)=f^{\prime}(a)\), hence
\(f^{\prime}(a)=0\). \(\square\)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Rolle&amp;rsquo;s Theorem.&lt;/strong&gt; If \(f\colon[a,b]\to\mathbb{R}\) is continuous on
\([a,b]\), differentiable on \((a,b)\), and \(f(a)=f(b)\), then there
exists \(\xi\in(a,b)\) such that \(f^{\prime}(\xi)=0\).&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Proof.&lt;/em&gt; By the extreme value theorem,&lt;sup id="fnref:4"&gt;&lt;a href="#fn:4" class="footnote-ref" role="doc-noteref"&gt;4&lt;/a&gt;&lt;/sup&gt; \(f\)
attains its minimum \(m\) and maximum \(M\) on \([a,b]\). If \(m=M\),
then \(f\) is constant and any \(\xi\in(a,b)\) works. Otherwise, since
\(f(a)=f(b)\), at least one extremum is attained at some
\(\xi\in(a,b)\); by Fermat, \(f^{\prime}(\xi)=0\). \(\square\)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mean Value Theorem.&lt;/strong&gt;&lt;sup id="fnref:5"&gt;&lt;a href="#fn:5" class="footnote-ref" role="doc-noteref"&gt;5&lt;/a&gt;&lt;/sup&gt; If
\(f\) is continuous on \([a,b]\) and differentiable on \((a,b)\), then
there exists \(\xi\in(a,b)\) such that&lt;/p&gt;
&lt;p&gt;\[
f^{\prime}(\xi)=\frac{f(b)-f(a)}{b-a}.
\]&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Proof.&lt;/em&gt; Define&lt;/p&gt;
&lt;p&gt;\[
g(x)=f(a)+\frac{f(b)-f(a)}{b-a}(x-a),
\qquad
h(x)=f(x)-g(x).
\]&lt;/p&gt;
&lt;p&gt;Then \(h\) is continuous on \([a,b]\), differentiable on \((a,b)\), and
\(h(a)=h(b)=0\). By Rolle&amp;rsquo;s theorem, there exists \(\xi\in(a,b)\) with
\(h^{\prime}(\xi)=0\), which gives&lt;/p&gt;
&lt;p&gt;\[
f^{\prime}(\xi)-\frac{f(b)-f(a)}{b-a}=0.\,\square
\]&lt;/p&gt;
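&lt;p&gt;For a concrete instance, chosen here only for illustration, take
\(f(x)=x^2\) on \([0,2]\). The secant slope is&lt;/p&gt;
&lt;p&gt;\[
\frac{f(2)-f(0)}{2-0}=\frac{4-0}{2}=2,
\]&lt;/p&gt;
&lt;p&gt;and \(f^{\prime}(\xi)=2\xi=2\) gives \(\xi=1\in(0,2)\). In the notation of the
proof, \(g(x)=2x\) and \(h(x)=x^2-2x\), which vanishes at both endpoints and
satisfies \(h^{\prime}(1)=0\).&lt;/p&gt;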
&lt;h2 id="fundamental-theorem-of-calculus"&gt;
Fundamental theorem of calculus
&lt;a class="anchor" href="#fundamental-theorem-of-calculus"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;We now have everything needed to prove the main result.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Fundamental Theorem of Calculus.&lt;/strong&gt;&lt;sup id="fnref:6"&gt;&lt;a href="#fn:6" class="footnote-ref" role="doc-noteref"&gt;6&lt;/a&gt;&lt;/sup&gt; Let
\(f\colon[a,b]\to\mathbb{R}\) be Riemann integrable, and let
\(F\colon[a,b]\to\mathbb{R}\) be continuous on \([a,b]\), differentiable
on \((a,b)\), and satisfy \(F^{\prime}(x)=f(x)\) for all \(x\in(a,b)\).
Then&lt;/p&gt;
&lt;p&gt;\[
\int_a^b f = F(b)-F(a).
\]&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Proof.&lt;/em&gt; Fix a partition \(\mathcal{P}=\{x_0,\ldots,x_n\}\). For
each \([x_{k-1},x_k]\), the mean value theorem applied to \(F\) gives
\(z_k\in(x_{k-1},x_k)\) such that&lt;/p&gt;
&lt;p&gt;\[
F(x_k)-F(x_{k-1})=f(z_k)\,\Delta x_k.
\]&lt;/p&gt;
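&lt;p&gt;Since \(m_k\le f(z_k)\le M_k\) and \(\Delta x_k&amp;gt;0\), multiplying by
\(\Delta x_k\) and summing over \(k\), where the middle sum telescopes, we obtain&lt;/p&gt;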
&lt;p&gt;\[
L(f,\mathcal{P})
\le
\sum_{k=1}^{n}\left(F(x_k)-F(x_{k-1})\right) = F(b)-F(a)
\le
U(f,\mathcal{P}).
\]&lt;/p&gt;
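&lt;p&gt;This holds for every partition, so
\(\sup_{\mathcal{P}}L(f,\mathcal{P})\le F(b)-F(a)\le\inf_{\mathcal{P}}U(f,\mathcal{P})\).
Since \(f\) is integrable, both bounds equal \(\int_a^b f\), and we get&lt;/p&gt;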
&lt;p&gt;\[
\int_a^b f=F(b)-F(a).\,\square
\]&lt;/p&gt;
&lt;p&gt;Thus computing an area reduces to evaluating an antiderivative at two
points.&lt;sup id="fnref:7"&gt;&lt;a href="#fn:7" class="footnote-ref" role="doc-noteref"&gt;7&lt;/a&gt;&lt;/sup&gt; This theorem is fundamental
because it unifies differentiation and integration, the two central
operations of calculus.&lt;/p&gt;
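&lt;p&gt;As a sanity check of the sandwich inequality used in the proof, with an
example chosen only for illustration, take \(f(x)=x^2\), \(F(x)=x^3/3\), and the
partition \(\mathcal{P}=\{0,\tfrac12,1\}\) of \([0,1]\). The mean value
theorem points satisfy&lt;/p&gt;
&lt;p&gt;\[
F(\tfrac12)-F(0)=\tfrac{1}{24}=z_1^2\cdot\tfrac12,
\qquad
F(1)-F(\tfrac12)=\tfrac{7}{24}=z_2^2\cdot\tfrac12,
\]&lt;/p&gt;
&lt;p&gt;so \(z_1^2=1/12\) and \(z_2^2=7/12\), which lie in \([0,\tfrac14]\) and
\([\tfrac14,1]\), the ranges of \(f\) on the two subintervals. The sums are&lt;/p&gt;
&lt;p&gt;\[
L(f,\mathcal{P})=0\cdot\tfrac12+\tfrac14\cdot\tfrac12=\tfrac18
\le
F(1)-F(0)=\tfrac13
\le
U(f,\mathcal{P})=\tfrac14\cdot\tfrac12+1\cdot\tfrac12=\tfrac58.
\]&lt;/p&gt;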
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;Note that continuity is not required here;
boundedness alone ensures the subinterval infima and suprema are
finite.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;Every continuous function
on \([a,b]\) is Riemann integrable; so is every monotone function. The
exact characterization is Lebesgue&amp;rsquo;s criterion: \(f\) is Riemann
integrable iff it is bounded and continuous almost everywhere.&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:3"&gt;
&lt;p&gt;The local minimum case is
identical, with all inequalities reversed.&amp;#160;&lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:4"&gt;
&lt;p&gt;Topological result: \([a,b]\)
is compact (Heine-Borel), the continuous image of a compact set is
compact, and compact subsets of \(\mathbb{R}\) are closed and bounded,
so they contain their \(\inf\) and \(\sup\), which are finite.&amp;#160;&lt;a href="#fnref:4" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:5"&gt;
&lt;p&gt;Geometrically: there is always a point where
the tangent line is parallel to the secant through the endpoints.&amp;#160;&lt;a href="#fnref:5" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:6"&gt;
&lt;p&gt;A broader formulation also
includes the statement that \(x\mapsto\int_a^x f(t)\,dt\) is an
antiderivative of \(f\) under suitable regularity assumptions.&amp;#160;&lt;a href="#fnref:6" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:7"&gt;
&lt;p&gt;For example, \(\int_0^1 x^2\,dx = F(1)-F(0) = 1/3\) with
\(F(x)=x^3/3\). No partitions needed.&amp;#160;&lt;a href="#fnref:7" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item><item><title>Optimizing a Lock-Free Ring Buffer</title><link>https://david.alvarezrosa.com/posts/optimizing-a-lock-free-ring-buffer/</link><pubDate>Tue, 24 Mar 2026 11:51:00 +0000</pubDate><guid>https://david.alvarezrosa.com/posts/optimizing-a-lock-free-ring-buffer/</guid><description>&lt;p&gt;A single-producer single-consumer (SPSC) queue is a great example of how
far constraints can take a design. In this post, you will learn how to
implement a ring buffer from scratch: start with the simplest design,
make it thread-safe, and then gradually remove overhead while preserving
FIFO behavior and predictable latency. This pattern is widely used to
share data between threads in the lowest-latency environments.&lt;/p&gt;
&lt;h2 id="what-is-a-ring-buffer"&gt;
What is a ring buffer?
&lt;a class="anchor" href="#what-is-a-ring-buffer"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt; &lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt;You might have run into
the term circular buffer, or perhaps cyclic queue. These are simply
other names for a &lt;em&gt;ring buffer:&lt;/em&gt; a queue where a producer generates data
and inserts it into the buffer, and a consumer later pulls it back out,
in first-in-first-out order.&lt;/p&gt;
&lt;p&gt;What makes a ring buffer distinctive is how it stores data and the
constraints it enforces. It has a fixed capacity; it neither expands
nor shrinks. As a result, when the buffer fills up, the producer must
either wait until space becomes available or overwrite entries that have
not been read yet, depending on what the application expects.&lt;/p&gt;
&lt;p&gt;The consumer&amp;rsquo;s job is straightforward: read items as they arrive. When
the ring buffer is empty, the consumer has to block, spin, or move on to
other work. Each successful read releases a slot the producer can
reuse. In the ideal case, the producer stays just a bit ahead, and the
system turns into a quiet game of &lt;em&gt;&amp;ldquo;catch me if you can,&amp;rdquo;&lt;/em&gt; with minimal
waiting on both sides.&lt;/p&gt;
&lt;h2 id="single-threaded-ring-buffer"&gt;
Single-threaded ring buffer
&lt;a class="anchor" href="#single-threaded-ring-buffer"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Let&amp;rsquo;s start with a single-threaded ring buffer, which is just an
array&lt;sup id="fnref:2"&gt;&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref"&gt;2&lt;/a&gt;&lt;/sup&gt; and two indices. We
always keep one slot unused to distinguish &amp;ldquo;full&amp;rdquo; from &amp;ldquo;empty&amp;rdquo;: the
buffer is empty when head equals tail, and full when advancing head would
make it equal to tail, so it holds at most &lt;code&gt;N - 1&lt;/code&gt; items.
Push writes to head and advances it; pop reads from tail and advances
it.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;typename&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RingBufferV1&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;head_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;tail_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We can now implement the &lt;code&gt;push&lt;/code&gt; (or write) operation&lt;sup id="fnref:3"&gt;&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;new_head&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;head_&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_head&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="na"&gt;[[unlikely]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c1"&gt;// Wrap-around
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;new_head&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_head&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;tail_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="na"&gt;[[unlikely]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c1"&gt;// Full
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;head_&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;head_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;new_head&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Next we implement the &lt;code&gt;pop&lt;/code&gt; (or read) operation&lt;sup id="fnref:4"&gt;&lt;a href="#fn:4" class="footnote-ref" role="doc-noteref"&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;head_&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;tail_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="na"&gt;[[unlikely]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c1"&gt;// Empty
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tail_&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;next_tail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tail_&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next_tail&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="na"&gt;[[unlikely]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c1"&gt;// Wrap-around
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;next_tail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;tail_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;next_tail&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="thread-safe-ring-buffer"&gt;
Thread-safe ring buffer
&lt;a class="anchor" href="#thread-safe-ring-buffer"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;You probably already noticed that the previous version is not
thread-safe. The easiest way to solve this is to add a &lt;code&gt;mutex&lt;/code&gt; around
push and pop.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;typename&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RingBufferV2&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;mutable&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;mutex&lt;/span&gt; &lt;span class="n"&gt;mutex_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;lock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;lock_guard&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;mutex&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;mutex_&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt; &lt;span class="c1"&gt;// Thread-safe
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;// ...
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;lock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;lock_guard&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;mutex&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;mutex_&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt; &lt;span class="c1"&gt;// Thread-safe
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;// ...
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;It&amp;rsquo;s correct and often fast enough: around &lt;strong&gt;12M ops/s&lt;/strong&gt;&lt;sup id="fnref:5"&gt;&lt;a href="#fn:5" class="footnote-ref" role="doc-noteref"&gt;5&lt;/a&gt;&lt;/sup&gt; on
consumer hardware. However, it pays for mutual exclusion even though
the producer and consumer never write the same index. The ownership is
asymmetric: the producer is the only writer of head, and the consumer is
the only writer of tail. That asymmetry is the lever to remove locks.&lt;/p&gt;
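&lt;p&gt;For readers who want to run the locked version, the elided bodies can be filled in along the following lines. This is a minimal sketch rather than the exact class from this post: the name &lt;code&gt;LockedRing&lt;/code&gt; and the helper &lt;code&gt;fill_until_full&lt;/code&gt; are hypothetical, and &lt;code&gt;noexcept&lt;/code&gt; is left off here because locking a &lt;code&gt;std::mutex&lt;/code&gt; may throw.&lt;/p&gt;

```cpp
#include <array>
#include <cstddef>
#include <mutex>

// Hypothetical name; mirrors the mutex-protected buffer described above.
template <typename T, std::size_t N>
class LockedRing {
  std::array<T, N> buffer_{};
  std::size_t head_{0};  // next slot to write (producer side)
  std::size_t tail_{0};  // next slot to read (consumer side)
  mutable std::mutex mutex_;

public:
  auto push(const T& value) -> bool {
    const std::lock_guard<std::mutex> lock{mutex_};
    auto next_head = head_ + 1;
    if (next_head == buffer_.size()) {  // wrap-around
      next_head = 0;
    }
    if (next_head == tail_) {  // full: one slot always stays unused
      return false;
    }
    buffer_[head_] = value;
    head_ = next_head;
    return true;
  }

  auto pop(T& value) -> bool {
    const std::lock_guard<std::mutex> lock{mutex_};
    if (head_ == tail_) {  // empty
      return false;
    }
    value = buffer_[tail_];
    auto next_tail = tail_ + 1;
    if (next_tail == buffer_.size()) {  // wrap-around
      next_tail = 0;
    }
    tail_ = next_tail;
    return true;
  }
};

// Pushes 0, 1, 2, ... until the buffer reports full; returns how many fit.
template <std::size_t N>
auto fill_until_full(LockedRing<int, N>& ring) -> std::size_t {
  std::size_t pushed = 0;
  while (ring.push(static_cast<int>(pushed))) {
    ++pushed;
  }
  return pushed;
}
```

&lt;p&gt;With &lt;code&gt;N = 4&lt;/code&gt;, &lt;code&gt;fill_until_full&lt;/code&gt; returns 3 rather than 4: one slot always stays empty so that a full buffer can be told apart from an empty one.&lt;/p&gt;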
&lt;h2 id="lock-free-ring-buffer"&gt;
Lock-free ring buffer
&lt;a class="anchor" href="#lock-free-ring-buffer"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;We can remove the locks by using atomics instead&lt;sup id="fnref:6"&gt;&lt;a href="#fn:6" class="footnote-ref" role="doc-noteref"&gt;6&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;typename&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RingBufferV3&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;alignas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;hardware_destructive_interference_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;atomic_size_t&lt;/span&gt; &lt;span class="n"&gt;head_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;alignas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;hardware_destructive_interference_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;atomic_size_t&lt;/span&gt; &lt;span class="n"&gt;tail_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The push implementation becomes&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;head&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;head_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;next_head&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;head&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next_head&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="na"&gt;[[unlikely]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;next_head&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next_head&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;tail_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="na"&gt;[[unlikely]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;head_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next_head&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;And the pop implementation&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;tail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tail_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tail&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;head_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="na"&gt;[[unlikely]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tail&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;next_tail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tail&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next_tail&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="na"&gt;[[unlikely]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;next_tail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;tail_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next_tail&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Simply removing the locks yields &lt;strong&gt;35M ops/s&lt;/strong&gt;, nearly three times the
throughput of the locked version! You have probably noticed that every
load and store still uses the default &lt;code&gt;std::memory_order_seq_cst&lt;/code&gt;, the
strictest ordering and usually the most expensive. Each index has a single
writer, so that thread can read its own index with a relaxed load; the other
thread&amp;rsquo;s index needs only an acquire load, paired with the release store
that publishes it. Let&amp;rsquo;s manually tune the
memory order&lt;sup id="fnref:7"&gt;&lt;a href="#fn:7" class="footnote-ref" role="doc-noteref"&gt;7&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;head&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;head_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;memory_order_relaxed&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;next_head&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;head&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next_head&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="na"&gt;[[unlikely]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;next_head&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next_head&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;tail_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;memory_order_acquire&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="na"&gt;[[unlikely]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;head_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next_head&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;memory_order_release&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;tail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tail_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;memory_order_relaxed&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tail&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;head_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;memory_order_acquire&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="na"&gt;[[unlikely]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tail&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;next_tail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tail&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next_tail&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="na"&gt;[[unlikely]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;next_tail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;tail_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next_tail&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;memory_order_release&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Rerunning the benchmark now gives an astonishing &lt;strong&gt;108M ops/s&lt;/strong&gt;&amp;mdash;3x the
previous version and 9x the original locked version. Worth the effort,
right?&lt;/p&gt;
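&lt;p&gt;One rough way to check that the relaxed orderings still hand every value over intact, and to get a ballpark throughput figure on your own machine, is a two-thread transfer like the sketch below. It is not the benchmark behind the numbers quoted in this post: &lt;code&gt;SpscRing&lt;/code&gt;, &lt;code&gt;RunResult&lt;/code&gt; and &lt;code&gt;run_spsc&lt;/code&gt; are hypothetical names, the cache-line size is assumed to be 64 bytes, and both sides yield while waiting, so expect different absolute results.&lt;/p&gt;

```cpp
#include <array>
#include <atomic>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <thread>

// Assumption: 64 bytes is a typical cache-line size; the listings above use
// std::hardware_destructive_interference_size instead.
constexpr std::size_t kCacheLine = 64;

// Hypothetical name; same acquire/release scheme as the tuned listing above.
template <typename T, std::size_t N>
class SpscRing {
  std::array<T, N> buffer_{};
  alignas(kCacheLine) std::atomic_size_t head_{0};
  alignas(kCacheLine) std::atomic_size_t tail_{0};

public:
  auto push(const T& value) noexcept -> bool {
    const auto head = head_.load(std::memory_order_relaxed);
    auto next_head = head + 1;
    if (next_head == buffer_.size()) next_head = 0;
    if (next_head == tail_.load(std::memory_order_acquire)) return false;
    buffer_[head] = value;
    head_.store(next_head, std::memory_order_release);  // publish the slot
    return true;
  }

  auto pop(T& value) noexcept -> bool {
    const auto tail = tail_.load(std::memory_order_relaxed);
    if (tail == head_.load(std::memory_order_acquire)) return false;
    value = buffer_[tail];
    auto next_tail = tail + 1;
    if (next_tail == buffer_.size()) next_tail = 0;
    tail_.store(next_tail, std::memory_order_release);  // free the slot
    return true;
  }
};

struct RunResult {
  bool in_order;       // every value arrived exactly once, in FIFO order
  double ops_per_sec;  // items transferred per wall-clock second
};

// Moves the values 0..count-1 from a producer thread to the calling thread.
auto run_spsc(std::uint64_t count) -> RunResult {
  SpscRing<std::uint64_t, 1024> ring;
  const auto start = std::chrono::steady_clock::now();

  std::thread producer([&ring, count] {
    for (std::uint64_t i = 0; i < count; ++i) {
      while (!ring.push(i)) {
        std::this_thread::yield();  // buffer full: let the consumer run
      }
    }
  });

  bool in_order = true;
  for (std::uint64_t expected = 0; expected < count; ++expected) {
    std::uint64_t value = 0;
    while (!ring.pop(value)) {
      std::this_thread::yield();  // buffer empty: let the producer run
    }
    in_order = in_order && (value == expected);
  }
  producer.join();

  const std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;
  return {in_order, static_cast<double>(count) / elapsed.count()};
}
```

&lt;p&gt;Calling &lt;code&gt;run_spsc&lt;/code&gt; with a few million items from &lt;code&gt;main&lt;/code&gt; and printing &lt;code&gt;ops_per_sec&lt;/code&gt; gives a number to compare across builds, and &lt;code&gt;in_order&lt;/code&gt; should always be true. Compile with optimizations and thread support enabled (for example &lt;code&gt;-O2 -pthread&lt;/code&gt; on GCC or Clang).&lt;/p&gt;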
&lt;h2 id="further-optimization"&gt;
Further optimization
&lt;a class="anchor" href="#further-optimization"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;We already have a fast ring buffer, but we can push it further. The
main slowdown comes from the reader and writer constantly touching each
other&amp;rsquo;s indexes. That makes the CPU bounce cache lines&lt;sup id="fnref:8"&gt;&lt;a href="#fn:8" class="footnote-ref" role="doc-noteref"&gt;8&lt;/a&gt;&lt;/sup&gt; between cores, which is
expensive.&lt;/p&gt;
&lt;p&gt;To reduce this, the reader can keep a local cached copy&lt;sup id="fnref:9"&gt;&lt;a href="#fn:9" class="footnote-ref" role="doc-noteref"&gt;9&lt;/a&gt;&lt;/sup&gt; of the write index
(&lt;code&gt;head_cached_&lt;/code&gt;), and the writer a local cached copy of the read index
(&lt;code&gt;tail_cached_&lt;/code&gt;). Neither side then has to reload the other&amp;rsquo;s atomic index on
every operation: it does so only when its cached copy makes the buffer look
full (for the writer) or empty (for the reader).&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;typename&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RingBufferV5&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;buffer_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;alignas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;hardware_destructive_interference_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;atomic_size_t&lt;/span&gt; &lt;span class="n"&gt;head_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;alignas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;hardware_destructive_interference_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;head_cached_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;alignas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;hardware_destructive_interference_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;atomic_size_t&lt;/span&gt; &lt;span class="n"&gt;tail_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;alignas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;hardware_destructive_interference_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;tail_cached_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The push operation now consults the cached tail
&lt;code&gt;tail_cached_&lt;/code&gt; first. Only if the buffer looks full does it reload
&lt;code&gt;tail_&lt;/code&gt; into the cache and check again&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next_head&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;tail_cached_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="na"&gt;[[unlikely]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;tail_cached_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tail_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;memory_order_acquire&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next_head&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;tail_cached_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
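&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The pop operation changes in the same way: it consults the cached head
&lt;code&gt;head_cached_&lt;/code&gt; first and reloads &lt;code&gt;head_&lt;/code&gt; only if the buffer looks empty&lt;/p&gt;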
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tail&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;head_cached_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="na"&gt;[[unlikely]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;head_cached_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;head_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;memory_order_acquire&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tail&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;head_cached_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Throughput is now &lt;strong&gt;305M ops/s&lt;/strong&gt;&amp;mdash;nearly 3x faster than the manually
tuned lock-free version and 25x faster than the original locking
approach.&lt;/p&gt;
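&lt;p&gt;To see how these fragments fit together, here is a minimal sketch of the
whole class. The &lt;code&gt;push&lt;/code&gt; and &lt;code&gt;pop&lt;/code&gt; signatures and the wrap-around
logic are assumptions made for illustration and may differ from the linked
benchmark&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;#include &amp;lt;array&amp;gt;
#include &amp;lt;atomic&amp;gt;
#include &amp;lt;cstddef&amp;gt;
#include &amp;lt;new&amp;gt;

// Sketch only: the push/pop signatures and the wrap-around are assumed.
template &amp;lt;typename T, std::size_t N&amp;gt;
class RingBufferV5 {
  std::array&amp;lt;T, N&amp;gt; buffer_;
  alignas(std::hardware_destructive_interference_size) std::atomic_size_t head_{0};
  alignas(std::hardware_destructive_interference_size) std::size_t head_cached_{0};
  alignas(std::hardware_destructive_interference_size) std::atomic_size_t tail_{0};
  alignas(std::hardware_destructive_interference_size) std::size_t tail_cached_{0};

public:
  // Called by the producer thread only.
  auto push(const T&amp;amp; value) -&amp;gt; bool {
    const auto head = head_.load(std::memory_order_relaxed);
    auto next_head = head + 1;
    if (next_head == N) {
      next_head = 0;
    }
    if (next_head == tail_cached_) [[unlikely]] {
      // Looks full: refresh the cache from the consumer&amp;#39;s index.
      tail_cached_ = tail_.load(std::memory_order_acquire);
      if (next_head == tail_cached_) {
        return false;  // Still full after the refresh.
      }
    }
    buffer_[head] = value;
    head_.store(next_head, std::memory_order_release);
    return true;
  }

  // Called by the consumer thread only.
  auto pop(T&amp;amp; value) -&amp;gt; bool {
    const auto tail = tail_.load(std::memory_order_relaxed);
    if (tail == head_cached_) [[unlikely]] {
      // Looks empty: refresh the cache from the producer&amp;#39;s index.
      head_cached_ = head_.load(std::memory_order_acquire);
      if (tail == head_cached_) {
        return false;  // Still empty after the refresh.
      }
    }
    value = buffer_[tail];
    auto next_tail = tail + 1;
    if (next_tail == N) {
      next_tail = 0;
    }
    tail_.store(next_tail, std::memory_order_release);
    return true;
  }
};
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In this sketch each cached index is touched by a single thread
(&lt;code&gt;tail_cached_&lt;/code&gt; by the producer, &lt;code&gt;head_cached_&lt;/code&gt; by the consumer),
which is why a plain &lt;code&gt;std::size_t&lt;/code&gt; is enough for them.&lt;/p&gt;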
&lt;h2 id="summary"&gt;
Summary
&lt;a class="anchor" href="#summary"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;If you want to reproduce these results, run the included &lt;a href="https://github.com/david-alvarez-rosa/CppPlayground/blob/main/dsa/ring_buffer.cpp"&gt;benchmark&lt;/a&gt;
compiled with the &lt;code&gt;-O3&lt;/code&gt; optimization level.&lt;sup id="fnref:10"&gt;&lt;a href="#fn:10" class="footnote-ref" role="doc-noteref"&gt;10&lt;/a&gt;&lt;/sup&gt; The benchmark pins
the producer and consumer threads to dedicated CPU cores to minimize
scheduling noise.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;Throughput&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;Not thread-safe&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;12M ops/s&lt;/td&gt;
&lt;td&gt;Mutex / lock&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;35M ops/s&lt;/td&gt;
&lt;td&gt;Lock-free (atomics)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;108M ops/s&lt;/td&gt;
&lt;td&gt;Lock-free (atomics) + memory order&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;305M ops/s&lt;/td&gt;
&lt;td&gt;Lock-free (atomics) + memory order + index cache&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Long live lock-free and wait-free data structures!&lt;/p&gt;
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;
&lt;a href="https://david.alvarezrosa.com/images/ringbuffer.jpg"&gt;
&lt;picture&gt;
&lt;source
type="image/webp"
srcset="https://david.alvarezrosa.com/images/ringbuffer_hu_d9760c71b602a164.webp 240w, https://david.alvarezrosa.com/images/ringbuffer_hu_54e094ede0a5bd9b.webp 360w, https://david.alvarezrosa.com/images/ringbuffer_hu_18cfb3a3266da49d.webp 420w, https://david.alvarezrosa.com/images/ringbuffer_hu_74cb1c40a06f80cb.webp 480w, https://david.alvarezrosa.com/images/ringbuffer_hu_3888d06599824ada.webp 768w, https://david.alvarezrosa.com/images/ringbuffer_hu_bf6ac6011ccf9f75.webp 1024w"
sizes="(max-width: 860px) 100vw, 21rem"&gt;
&lt;img
src="https://david.alvarezrosa.com/images/ringbuffer.jpg"
srcset="https://david.alvarezrosa.com/images/ringbuffer_hu_3a972a3e59f5f97b.jpg 240w, https://david.alvarezrosa.com/images/ringbuffer_hu_69b420ee6ed8d23f.jpg 360w, https://david.alvarezrosa.com/images/ringbuffer_hu_1754f47c8611601c.jpg 420w, https://david.alvarezrosa.com/images/ringbuffer_hu_2c797d93b3d094e.jpg 480w, https://david.alvarezrosa.com/images/ringbuffer_hu_24f5f6ed33d36391.jpg 768w, https://david.alvarezrosa.com/images/ringbuffer_hu_ea45e03f995d3c8b.jpg 1024w"
sizes="(max-width: 860px) 100vw, 21rem"
width="1024"
height="836"
alt="Ringbuffer"
loading="eager"
fetchpriority="high"
decoding="async"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;strong&gt;Ring buffer with 32 slots.&lt;/strong&gt; The
producer has filled 15 of them, indicated by blue. The consumer is
behind the producer, reading data from the slots, freeing them as it
does so. A free slot is indicated by orange.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;By using &lt;code&gt;std::array&lt;/code&gt; we force clients to fix the
buffer size at compile time. It&amp;rsquo;s also common to use a
&lt;code&gt;std::vector&lt;/code&gt; instead to remove that restriction. A further optimization is to
constrain the capacity to a power of two, allowing wrap-around via bit
masking &lt;code&gt;head &amp;amp; (N - 1)&lt;/code&gt; instead of a branch.&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:3"&gt;
&lt;p&gt;Note how one
item is left unused to indicate that the queue is full: when advancing &lt;code&gt;head_&lt;/code&gt;
by one slot would make it equal to &lt;code&gt;tail_&lt;/code&gt;, the queue is full.&amp;#160;&lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:4"&gt;
&lt;p&gt;Again note that
&lt;code&gt;head_ == tail_&lt;/code&gt; indicates that the queue is empty.&amp;#160;&lt;a href="#fnref:4" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:5"&gt;
&lt;p&gt;Compiled with
&lt;code&gt;clang&lt;/code&gt; at the &lt;code&gt;-O3&lt;/code&gt; optimization level, with
&lt;code&gt;-march=native -ffast-math&lt;/code&gt;. Consumer and producer threads are pinned
to dedicated cores (Intel Core Ultra 5 135U). See &lt;a href="https://github.com/david-alvarez-rosa/CppPlayground/blob/main/dsa/ring_buffer.cpp"&gt;benchmark&lt;/a&gt;.&amp;#160;&lt;a href="#fnref:5" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:6"&gt;
&lt;p&gt;Note that we are
aligning the atomics with &lt;code&gt;alignas&lt;/code&gt; to ensure they fall in different
cache lines (commonly 64 bytes). This prevents false sharing and so
makes better use of the CPU cache.&amp;#160;&lt;a href="#fnref:6" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:7"&gt;
&lt;p&gt;&lt;code&gt;release&lt;/code&gt; ensures prior writes are visible to threads
that &lt;code&gt;acquire&lt;/code&gt; the same variable. Both sides use &lt;code&gt;relaxed&lt;/code&gt; for their
own index since no other thread writes to it.&amp;#160;&lt;a href="#fnref:7" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:8"&gt;
&lt;p&gt;It&amp;rsquo;s useful
to observe the number of cache misses with &lt;code&gt;perf stat -e cache-misses&lt;/code&gt;;
they are greatly reduced in this approach.&amp;#160;&lt;a href="#fnref:8" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:9"&gt;
&lt;p&gt;This
advanced optimization was inspired by &lt;a href="https://rigtorp.se"&gt;Erik Rigtorp&lt;/a&gt;. For an earlier
publication, see: P. P. C. Lee, T. Bu, and G. Chandranmenon, &amp;ldquo;A
lock-free, cache-efficient multi-core synchronization mechanism for
line-rate network traffic monitoring,&amp;rdquo; &lt;em&gt;2010 IEEE International
Symposium on Parallel &amp;amp; Distributed Processing
(IPDPS)&lt;/em&gt;&amp;mdash;&lt;a href="https://doi.org/10.1109/IPDPS.2010.5470368"&gt;doi:10.1109/IPDPS.2010.5470368&lt;/a&gt; (&lt;a href="https://www.cse.cuhk.edu.hk/~pclee/www/pubs/ipdps10.pdf"&gt;PDF&lt;/a&gt;).&amp;#160;&lt;a href="#fnref:9" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:10"&gt;
&lt;p&gt;Alongside &lt;code&gt;-O3&lt;/code&gt;,
the benchmark was compiled with &lt;code&gt;-march=native&lt;/code&gt; and &lt;code&gt;-ffast-math&lt;/code&gt;,
though these flags shouldn&amp;rsquo;t make a difference here.&amp;#160;&lt;a href="#fnref:10" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item><item><title>Deriving Type Erasure</title><link>https://david.alvarezrosa.com/posts/deriving-type-erasure/</link><pubDate>Tue, 10 Mar 2026 09:46:00 +0000</pubDate><guid>https://david.alvarezrosa.com/posts/deriving-type-erasure/</guid><description>&lt;p&gt;Ever looked at &lt;code&gt;std::any&lt;/code&gt; and wondered what&amp;rsquo;s going on behind the
scenes? Beneath the intimidating interface is a classic technique
called type erasure: concrete types hidden behind a small, uniform
wrapper.&lt;/p&gt;
&lt;p&gt;Starting from familiar tools like virtual functions and templates, we&amp;rsquo;ll
build a minimal &lt;code&gt;std::any&lt;/code&gt;. By the end, you&amp;rsquo;ll have a clear
understanding of how type erasure works under the hood.&lt;/p&gt;
&lt;h2 id="polymorphism-with-interfaces"&gt;
Polymorphism with interfaces
&lt;a class="anchor" href="#polymorphism-with-interfaces"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;The typical way to achieve polymorphism is to define an interface
consisting of pure-virtual methods you want to be able to call. Then,
for each implementation that you want to use polymorphically, you create
a subclass that inherits from the base class and implement those
methods.&lt;/p&gt;
&lt;p&gt;As an example, let&amp;rsquo;s implement shape classes that have an &lt;code&gt;area()&lt;/code&gt;
method. We start with an interface&lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt;
class&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Shape&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="n"&gt;Shape&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;area&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;And add a couple of concrete implementations for &lt;code&gt;Square&lt;/code&gt; and &lt;code&gt;Circle&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Square&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Shape&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;side_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;explicit&lt;/span&gt; &lt;span class="n"&gt;Square&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;side&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;side_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;side&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;area&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;side_&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;side_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Circle&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Shape&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;radius_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;explicit&lt;/span&gt; &lt;span class="n"&gt;Circle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;radius&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;radius_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;radius&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;area&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;numbers&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;pi&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;radius_&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;radius_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now, we can use these implementations generically, by coding against the
interface&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;printArea&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Shape&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;Area is {:.2f}&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Simple enough, right?&lt;/p&gt;
&lt;h2 id="polymorphism-with-templates"&gt;
Polymorphism with templates
&lt;a class="anchor" href="#polymorphism-with-templates"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Inheritance is a good solution to problems that require polymorphism,
but sometimes the concrete types you want to handle polymorphically
cannot share a common base class.&lt;sup id="fnref:2"&gt;&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref"&gt;2&lt;/a&gt;&lt;/sup&gt; In that case, if the types provide the same
interface, you can use a template to get polymorphism instead&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;printArea&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;Area is {:.2f}&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;You can use this function with &lt;code&gt;Square&lt;/code&gt;, &lt;code&gt;Circle&lt;/code&gt;, or any type that
provides a zero-argument &lt;code&gt;area()&lt;/code&gt; returning &lt;code&gt;double&lt;/code&gt;. Templates work
because the compiler generates a version of the function for each
concrete type you use, and the call is valid as long as that generated
code would compile&lt;sup id="fnref:3"&gt;&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref"&gt;3&lt;/a&gt;&lt;/sup&gt; for the given type.&lt;/p&gt;
&lt;p&gt;Unfortunately, template-based polymorphism has two main downsides.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;First,&lt;/strong&gt; templates do not give you one shared runtime base type like
&lt;code&gt;Shape&lt;/code&gt;. Each instantiation is a distinct type, so there is no common
type for a homogeneous container; you cannot store a mix of &lt;code&gt;Square&lt;/code&gt; and
&lt;code&gt;Circle&lt;/code&gt; in one array and handle them uniformly the way you can with
pointers to a common base class&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="o"&gt;???&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;square&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;circle&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;strong&gt;second&lt;/strong&gt; drawback&lt;sup id="fnref:4"&gt;&lt;a href="#fn:4" class="footnote-ref" role="doc-noteref"&gt;4&lt;/a&gt;&lt;/sup&gt; is a little more subtle. Anybody
who uses the template-based &lt;code&gt;printArea(const auto&amp;amp;)&lt;/code&gt; function must either
explicitly specify the concrete type, or be a template itself, to pass
along the template type of &lt;code&gt;printArea()&lt;/code&gt;.&lt;/p&gt;
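&lt;p&gt;A small sketch of this second drawback follows. The helper
&lt;code&gt;printAreaTwice&lt;/code&gt; and the stand-alone &lt;code&gt;Square&lt;/code&gt; are hypothetical,
introduced only for illustration. Because the helper forwards to the
template-based &lt;code&gt;printArea&lt;/code&gt; and cannot name one shape type, it has to be
a template as well&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;#include &amp;lt;print&amp;gt;

// Stand-in shape with no base class (assumed for this example).
struct Square {
  int side;
  auto area() const noexcept -&amp;gt; double { return side * side; }
};

auto printArea(const auto&amp;amp; shape) -&amp;gt; void {
  std::println(&amp;#34;Area is {:.2f}&amp;#34;, shape.area());
}

// Hypothetical helper: it calls printArea, so it must stay generic too.
auto printAreaTwice(const auto&amp;amp; shape) -&amp;gt; void {
  printArea(shape);
  printArea(shape);
}

auto main() -&amp;gt; int {
  printAreaTwice(Square{2});  // The shape type is deduced as Square.
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Every further layer built on top of &lt;code&gt;printAreaTwice&lt;/code&gt; faces the same
choice, so the template parameter tends to spread through the call chain.&lt;/p&gt;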
&lt;h2 id="deriving-std-any"&gt;
Deriving std::any
&lt;a class="anchor" href="#deriving-std-any"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Imagine &lt;code&gt;Square&lt;/code&gt; and &lt;code&gt;Circle&lt;/code&gt; are fixed types with no shared base class,
and you cannot change them to inherit from one. But you still want to
handle them through a single common interface.&lt;/p&gt;
&lt;p&gt;One way to do that is to introduce wrappers. Define your own &lt;code&gt;Shape&lt;/code&gt;
interface, then create wrapper classes that inherit from &lt;code&gt;Shape&lt;/code&gt; and
contain a &lt;code&gt;Square&lt;/code&gt; or &lt;code&gt;Circle&lt;/code&gt;; each wrapper implements the virtual
methods by simply forwarding calls to the wrapped object&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SquareWrapper&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Shape&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;Square&lt;/span&gt; &lt;span class="n"&gt;square_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;explicit&lt;/span&gt; &lt;span class="n"&gt;SquareWrapper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Square&lt;/span&gt; &lt;span class="n"&gt;square&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;square_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;move&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;square&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;area&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;square_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CircleWrapper&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Shape&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;Circle&lt;/span&gt; &lt;span class="n"&gt;circle_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;explicit&lt;/span&gt; &lt;span class="n"&gt;CircleWrapper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Circle&lt;/span&gt; &lt;span class="n"&gt;circle&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;circle_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;move&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;circle&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;area&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;circle_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now we can work directly with instances of &lt;code&gt;Shape&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;printAreas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;unique_ptr&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Shape&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nl"&gt;shape&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;Area is {:.2f}&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;unique_ptr&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Shape&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;emplace_back&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;make_unique&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;SquareWrapper&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Square&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;}));&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;emplace_back&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;make_unique&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;CircleWrapper&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Circle&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}));&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;printAreas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shapes&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This approach works, but it has an obvious downside: you need a separate
wrapper type (like &lt;code&gt;CircleWrapper&lt;/code&gt;) for every concrete type you want to
adapt (like &lt;code&gt;Circle&lt;/code&gt;), which quickly turns into a pile of
boilerplate. Luckily, templates can offload much of that work to the
compiler by generating the needed code for each type automatically&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;typename&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ShapeWrapper&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Shape&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="n"&gt;shape_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;explicit&lt;/span&gt; &lt;span class="n"&gt;ShapeWrapper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;shape_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;move&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;area&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;shape_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;What we built above is the basis of the &amp;ldquo;type erasure&amp;rdquo; idiom. All
that&amp;rsquo;s left is to hide all this machinery behind another class, so that
callers don&amp;rsquo;t have to deal with our custom interfaces and
templates&lt;sup id="fnref:5"&gt;&lt;a href="#fn:5" class="footnote-ref" role="doc-noteref"&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AnyShape&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Shape&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c1"&gt;// The interface
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="n"&gt;Shape&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;area&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;typename&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ShapeWrapper&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Shape&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c1"&gt;// The wrappers
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="n"&gt;shape_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;explicit&lt;/span&gt; &lt;span class="n"&gt;ShapeWrapper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;shape_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;move&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;area&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;shape_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;unique_ptr&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Shape&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;shape_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;typename&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;explicit&lt;/span&gt; &lt;span class="n"&gt;AnyShape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;shape_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;make_unique&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;ShapeWrapper&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;decay_t&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;forward&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;))}&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;area&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;shape_&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;It works the same as before, but the wrapper logic is hidden from the
consumer&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;printAreas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AnyShape&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nl"&gt;shape&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;Area is {:.2f}&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AnyShape&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;emplace_back&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Square&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;emplace_back&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Circle&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;printAreas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shapes&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="generic-std-any"&gt;
Generic std::any
&lt;a class="anchor" href="#generic-std-any"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;The roles played by &lt;code&gt;Shape&lt;/code&gt; and &lt;code&gt;ShapeWrapper&lt;/code&gt; have conventional names in the
type-erasure idiom: the former is the &lt;em&gt;concept&lt;/em&gt;&lt;sup id="fnref:6"&gt;&lt;a href="#fn:6" class="footnote-ref" role="doc-noteref"&gt;6&lt;/a&gt;&lt;/sup&gt; (the interface we program against), and the
latter is the &lt;em&gt;model&lt;/em&gt; (a templated wrapper that implements the interface
and forwards to a concrete type).&lt;/p&gt;
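&lt;p&gt;Let&amp;rsquo;s rewrite our original type erasure example using this
terminology. Nothing changes except a few names: the types become &lt;code&gt;Any&lt;/code&gt;,
&lt;code&gt;Concept&lt;/code&gt;, and &lt;code&gt;Model&lt;/code&gt;, and &lt;code&gt;area&lt;/code&gt; becomes a generic &lt;code&gt;f&lt;/code&gt;&lt;/p&gt;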
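&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;memory&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;type_traits&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;utility&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;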
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Any&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Concept&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="n"&gt;Concept&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;typename&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Model&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Concept&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="n"&gt;obj_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;explicit&lt;/span&gt; &lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;obj_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;move&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;obj_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;unique_ptr&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Concept&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;obj_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;typename&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;explicit&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;obj_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;make_unique&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;decay_t&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;forward&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;))}&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;obj_&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;That&amp;rsquo;s it! The class &lt;code&gt;Any&lt;/code&gt; is a simplified take on the idiom behind
&lt;code&gt;std::any&lt;/code&gt;,&lt;sup id="fnref:7"&gt;&lt;a href="#fn:7" class="footnote-ref" role="doc-noteref"&gt;7&lt;/a&gt;&lt;/sup&gt; and the same technique is used elsewhere in the standard library
(namely, in &lt;code&gt;std::function&lt;/code&gt;). But that&amp;rsquo;s for another post.&lt;/p&gt;
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;Remember that interfaces that
are intended to be used through a &lt;code&gt;Base&amp;amp;&lt;/code&gt; or &lt;code&gt;Base*&lt;/code&gt; should have a virtual
destructor: without one, destroying a derived object through a base pointer (as &lt;code&gt;std::unique_ptr&amp;lt;Base&amp;gt;&lt;/code&gt; does) is undefined behavior &lt;a href="https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c127-a-class-with-a-virtual-function-should-have-a-virtual-or-protected-destructor"&gt;(C.127)&lt;/a&gt;.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;In some cases, you may not have
control of the concrete types (e.g. standard library types like &lt;code&gt;std::string&lt;/code&gt;),
or it may not even be possible for the concrete type to inherit
(e.g. builtins like &lt;code&gt;int&lt;/code&gt;).&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:3"&gt;
&lt;p&gt;If you tried to pass in a type that doesn&amp;rsquo;t
conform to the &amp;lsquo;interface&amp;rsquo; (say, &lt;code&gt;std::string&lt;/code&gt;), compilation would fail
at the point where the template is instantiated and calls the method, with an error saying that
&lt;code&gt;std::string&lt;/code&gt; has no member named &lt;code&gt;area&lt;/code&gt;.&amp;#160;&lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:4"&gt;
&lt;p&gt;Since you&amp;rsquo;re employing polymorphism in the
first place, most callers will likely fall into the second group, and
will need to be templates themselves too so they can pass the type
through. That can quickly spread templates across the codebase, making
it harder to read and structure, increasing compile times, and producing
larger binaries with slower startup.&amp;#160;&lt;a href="#fnref:4" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:5"&gt;
&lt;p&gt;This implementation always heap-allocates. Production
&lt;code&gt;std::any&lt;/code&gt; implementations often use small buffer optimization (SBO)
techniques to store small objects inline and avoid allocation.&amp;#160;&lt;a href="#fnref:5" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:6"&gt;
&lt;p&gt;The type erasure concept is an
OO-style interface (an abstract base class dispatched through a vtable). It&amp;rsquo;s unrelated to C++20 &lt;code&gt;concept&lt;/code&gt;s,
which are compile-time predicates on template arguments.&amp;#160;&lt;a href="#fnref:6" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:7"&gt;
&lt;p&gt;For a Rust version, see Waifod&amp;rsquo;s post &lt;a href="https://waifod.dev/blog/polymorphism-type-erasure/"&gt;Polymorphism in
C++ and Rust: Type Erasure&lt;/a&gt;.&amp;#160;&lt;a href="#fnref:7" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item><item><title>Devirtualization and Static Polymorphism</title><link>https://david.alvarezrosa.com/posts/devirtualization-and-static-polymorphism/</link><pubDate>Wed, 25 Feb 2026 09:45:00 +0000</pubDate><guid>https://david.alvarezrosa.com/posts/devirtualization-and-static-polymorphism/</guid><description>&lt;p&gt;Ever wondered why your &amp;ldquo;clean&amp;rdquo; polymorphic design underperforms in
benchmarks? Virtual dispatch enables polymorphism, but it comes with
hidden overhead: pointer indirection, larger object layouts, and fewer
inlining opportunities.&lt;/p&gt;
&lt;p&gt;Compilers do their best to &lt;em&gt;devirtualize&lt;/em&gt; these calls, but it isn&amp;rsquo;t
always possible. On latency-sensitive paths, it&amp;rsquo;s beneficial to
manually replace dynamic dispatch with &lt;em&gt;static polymorphism&lt;/em&gt;, so calls
are resolved at compile time and the abstraction has effectively zero
runtime cost.&lt;/p&gt;
&lt;h2 id="virtual-dispatch"&gt;
Virtual dispatch
&lt;a class="anchor" href="#virtual-dispatch"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Runtime polymorphism occurs when a base interface exposes a virtual
method that derived classes override. Calls made through a &lt;code&gt;Base&amp;amp;&lt;/code&gt; are
then dispatched to the appropriate override at runtime. Under the hood,
a virtual table (&lt;code&gt;vtable&lt;/code&gt;) is created &lt;em&gt;for each class&lt;/em&gt;, and a pointer
(&lt;code&gt;vptr&lt;/code&gt;) to the &lt;code&gt;vtable&lt;/code&gt; is added &lt;em&gt;to each instance&lt;/em&gt;.&lt;/p&gt;
&lt;figure&gt;
&lt;figcaption&gt;&lt;p&gt;&lt;span class="figure-number"&gt;Figure 1: &lt;/span&gt;&lt;strong&gt;Virtual dispatch diagram.&lt;/strong&gt; The method &lt;code&gt;foo&lt;/code&gt; is declared virtual in &lt;code&gt;Base&lt;/code&gt; and overridden in &lt;code&gt;Derived&lt;/code&gt;. Both classes get a &lt;code&gt;vtable&lt;/code&gt;, and each object gets a &lt;code&gt;vptr&lt;/code&gt; pointing to the corresponding &lt;code&gt;vtable&lt;/code&gt;.&lt;/p&gt;&lt;/figcaption&gt;
&lt;a href="https://david.alvarezrosa.com/images/diagram.png"&gt;
&lt;picture&gt;
&lt;source
type="image/webp"
srcset="https://david.alvarezrosa.com/images/diagram_hu_f1e84088d8783e72.webp 240w, https://david.alvarezrosa.com/images/diagram_hu_899951f6b9f325b5.webp 360w, https://david.alvarezrosa.com/images/diagram_hu_2b78aec3394843b9.webp 420w, https://david.alvarezrosa.com/images/diagram_hu_93e9332bfb10c67b.webp 480w, https://david.alvarezrosa.com/images/diagram_hu_a57a2b02d23b1399.webp 669w"
sizes="auto"&gt;
&lt;img
src="https://david.alvarezrosa.com/images/diagram.png"
srcset="https://david.alvarezrosa.com/images/diagram_hu_91d4e15182fa0d09.png 240w, https://david.alvarezrosa.com/images/diagram_hu_f6a8443852f8427a.png 360w, https://david.alvarezrosa.com/images/diagram_hu_86b53c0c65b99cff.png 420w, https://david.alvarezrosa.com/images/diagram_hu_34342dfafdf28308.png 480w, https://david.alvarezrosa.com/images/diagram_hu_325a9457c9d56c19.png 669w"
sizes="auto"
width="669"
height="207"
alt="Diagram"
loading="lazy"
decoding="async"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/figure&gt;
&lt;p&gt;On a virtual call, the generated code loads the &lt;code&gt;vptr&lt;/code&gt;, selects the right slot
in the &lt;code&gt;vtable&lt;/code&gt;, and performs an indirect call through that function
pointer. This has two costs. The extra &lt;code&gt;vptr&lt;/code&gt; increases object size, and
the call target is only known at runtime. As a result, the compiler
generally cannot inline the callee, the indirect branch may be
mispredicted, and the extra loads reduce cache efficiency.&lt;/p&gt;
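&lt;p&gt;To make the mechanism concrete, here is a hand-written approximation of
what the compiler generates. It is only a sketch: the names &lt;code&gt;VTable&lt;/code&gt;,
&lt;code&gt;Object&lt;/code&gt;, and &lt;code&gt;call_foo&lt;/code&gt; are made up for illustration, and real ABIs
typically store additional entries (such as type information) in the table.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;// Hand-written approximation of virtual dispatch (illustrative names).
struct Object;

struct VTable {
  int (*foo)(Object*);  // one function-pointer slot per virtual method
};

struct Object {
  const VTable* vptr;  // the hidden pointer every polymorphic object carries
};

auto base_foo(Object*) -&amp;gt; int { return 1; }
auto derived_foo(Object*) -&amp;gt; int { return 2; }

// One table per "class", shared by all of its instances.
const VTable base_vtable{&amp;amp;base_foo};
const VTable derived_vtable{&amp;amp;derived_foo};

auto call_foo(Object* object) -&amp;gt; int {
  // Load the vptr, select the slot, and call indirectly through it.
  return object-&amp;gt;vptr-&amp;gt;foo(object);
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Calling &lt;code&gt;call_foo&lt;/code&gt; on an &lt;code&gt;Object&lt;/code&gt; whose &lt;code&gt;vptr&lt;/code&gt; points at
&lt;code&gt;derived_vtable&lt;/code&gt; reaches &lt;code&gt;derived_foo&lt;/code&gt;. The call site itself never
names the target, which is why it is hard to inline.&lt;/p&gt;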
&lt;p&gt;The best way to observe this phenomenon is by inspecting the
assembly&lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt; code emitted
by the compiler for a minimal example&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Base&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Base&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;77&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;For a non-virtual member function &lt;code&gt;foo&lt;/code&gt; like in the example above, the
free function &lt;code&gt;bar&lt;/code&gt; issues a direct call&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-asm" data-lang="asm"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nf"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;&lt;span class="p"&gt;*):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;sub&lt;/span&gt; &lt;span class="no"&gt;rsp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;call&lt;/span&gt; &lt;span class="no"&gt;Base&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="no"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;// Direct call
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;add&lt;/span&gt; &lt;span class="no"&gt;rsp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;add&lt;/span&gt; &lt;span class="no"&gt;eax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;77&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;ret&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;However, declaring &lt;code&gt;foo&lt;/code&gt; as &lt;code&gt;virtual&lt;/code&gt; changes &lt;code&gt;bar&lt;/code&gt;&amp;rsquo;s assembly into an
indirect, vtable-based call&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-asm" data-lang="asm"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nf"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;&lt;span class="p"&gt;*):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;sub&lt;/span&gt; &lt;span class="no"&gt;rsp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;mov&lt;/span&gt; &lt;span class="no"&gt;rax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;QWORD&lt;/span&gt; &lt;span class="no"&gt;PTR&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="no"&gt;rdi&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;// vptr (pointer to vtable)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;call&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="no"&gt;QWORD&lt;/span&gt; &lt;span class="no"&gt;PTR&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="no"&gt;rax&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="c1"&gt;// Virtual call
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;add&lt;/span&gt; &lt;span class="no"&gt;rsp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;add&lt;/span&gt; &lt;span class="no"&gt;eax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;77&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;ret&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="devirtualization"&gt;
Devirtualization
&lt;a class="anchor" href="#devirtualization"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Sometimes the compiler can statically deduce which override a virtual
call will hit. In those cases, it &lt;em&gt;devirtualizes&lt;/em&gt; the call and emits a
direct call instead (skipping the &lt;code&gt;vtable&lt;/code&gt;). For example,
devirtualization is straightforward&lt;sup id="fnref:2"&gt;&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref"&gt;2&lt;/a&gt;&lt;/sup&gt; when the runtime type is clearly fixed&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;Base&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;Derived&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Base&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;77&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;Derived&lt;/span&gt; &lt;span class="n"&gt;derived&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;derived&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// compiler knows this is Derived::foo
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The compiler is able to devirtualize even through a base pointer, as
long as it can track the allocation and prove there is only one possible
concrete type. The problem is that with traditional compilation, object
files are created per translation unit (TU)&amp;mdash;compiled and optimized in
isolation. The linker simply stitches those objects together, so
cross-TU optimizations are inherently limited. That&amp;rsquo;s where compiler
flags are useful.&lt;/p&gt;
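&lt;p&gt;Before turning to the flags, here is a minimal sketch of the single-TU
case described above: the call goes through a &lt;code&gt;Base*&lt;/code&gt;, but the object it
points to is created in the same function, so the compiler can prove its
dynamic type. The function name &lt;code&gt;baz&lt;/code&gt; is made up for illustration, and
whether the call is actually devirtualized depends on the compiler and
optimization level.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;struct Base {
  virtual auto foo() -&amp;gt; int = 0;
};

struct Derived : Base {
  auto foo() -&amp;gt; int override { return 77; }
};

auto baz() -&amp;gt; int {
  Derived derived;
  Base* base = &amp;amp;derived;  // static type Base*, dynamic type provably Derived
  return base-&amp;gt;foo();     // candidate for a direct (or inlined) call
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;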
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;-fwhole-program&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;tells the compiler &amp;ldquo;this translation unit is the
entire program.&amp;rdquo; If no class derives from &lt;code&gt;Base&lt;/code&gt; in this TU, the
compiler is free to assume nothing ever does, and can devirtualize
calls on &lt;code&gt;Base&lt;/code&gt;.&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;-flto&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;link-time optimization. Keeps an intermediate
representation in the object files and optimizes across all of them at
link time, effectively treating multiple source files as a single TU.&lt;/dd&gt;
&lt;/dl&gt;
&lt;p&gt;On the language side, &lt;code&gt;final&lt;/code&gt; is a lightweight way to give the compiler
the same guarantee for specific methods&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Base&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Derived&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Base&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// still overridable in subclasses
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="k"&gt;final&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// no further overrides allowed
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Derived&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;derived&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;derived&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;derived&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Here, &lt;code&gt;foo()&lt;/code&gt; can still be overridden, so &lt;code&gt;derived-&amp;gt;foo()&lt;/code&gt; remains a
virtual call. However, &lt;code&gt;bar()&lt;/code&gt; is marked as &lt;code&gt;final&lt;/code&gt;, so the compiler
emits a direct call even though it&amp;rsquo;s declared &lt;code&gt;virtual&lt;/code&gt; in the base&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-asm" data-lang="asm"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;Derived&lt;/span&gt;&lt;span class="p"&gt;*):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;push&lt;/span&gt; &lt;span class="no"&gt;rbx&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;sub&lt;/span&gt; &lt;span class="no"&gt;rsp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;mov&lt;/span&gt; &lt;span class="no"&gt;rax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;QWORD&lt;/span&gt; &lt;span class="no"&gt;PTR&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="no"&gt;rdi&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;mov&lt;/span&gt; &lt;span class="no"&gt;QWORD&lt;/span&gt; &lt;span class="no"&gt;PTR&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="no"&gt;rsp&lt;/span&gt;&lt;span class="err"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="no"&gt;rdi&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;call&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="no"&gt;QWORD&lt;/span&gt; &lt;span class="no"&gt;PTR&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="no"&gt;rax&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="c1"&gt;// Virtual call
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;mov&lt;/span&gt; &lt;span class="no"&gt;rdi&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;QWORD&lt;/span&gt; &lt;span class="no"&gt;PTR&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="no"&gt;rsp&lt;/span&gt;&lt;span class="err"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;mov&lt;/span&gt; &lt;span class="no"&gt;ebx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;eax&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;call&lt;/span&gt; &lt;span class="no"&gt;Derived&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="no"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;// Direct call
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;add&lt;/span&gt; &lt;span class="no"&gt;rsp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;add&lt;/span&gt; &lt;span class="no"&gt;eax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;ebx&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;pop&lt;/span&gt; &lt;span class="no"&gt;rbx&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;ret&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="static-polymorphism"&gt;
Static polymorphism
&lt;a class="anchor" href="#static-polymorphism"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;When the compiler can&amp;rsquo;t devirtualize, one option is to use static
polymorphism instead. The canonical tool for this is the Curiously
Recurring Template Pattern&lt;sup id="fnref:3"&gt;&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref"&gt;3&lt;/a&gt;&lt;/sup&gt; (CRTP). With
CRTP, the base class is templated on the derived class, and invokes
methods on it via &lt;code&gt;static_cast&lt;/code&gt;&amp;mdash;no virtual keyword involved&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;typename&lt;/span&gt; &lt;span class="n"&gt;Derived&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Base&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;77&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="k"&gt;static_cast&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Derived&lt;/span&gt;&lt;span class="o"&gt;*&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Derived&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Base&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Derived&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;88&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;Derived&lt;/span&gt; &lt;span class="n"&gt;derived&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;derived&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;With &lt;code&gt;-O3&lt;/code&gt;, the compiler inlines everything and
constant-folds the result: no &lt;code&gt;vtable&lt;/code&gt;, no &lt;code&gt;vptr&lt;/code&gt;, no indirection.
The call is optimized away entirely.&lt;sup id="fnref:4"&gt;&lt;a href="#fn:4" class="footnote-ref" role="doc-noteref"&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-asm" data-lang="asm"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;mov&lt;/span&gt; &lt;span class="no"&gt;eax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;165&lt;/span&gt; &lt;span class="c1"&gt;// 77 + 88
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nf"&gt;ret&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Deducing this.&lt;/strong&gt; C++23&amp;rsquo;s &lt;em&gt;deducing this&lt;/em&gt; keeps the same static-dispatch
model but makes it easier to write. Instead of templating the entire
class (and writing &lt;code&gt;Base&amp;lt;Derived&amp;gt;&lt;/code&gt; everywhere), you template only the
member function that needs access to the derived type, and let the
compiler deduce the type of &lt;code&gt;self&lt;/code&gt; from the object the function is called on&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Base&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;77&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Derived&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Base&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;88&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This yields identical optimized code: when called on a &lt;code&gt;Derived&lt;/code&gt;
object, &lt;code&gt;foo&lt;/code&gt; is instantiated with &lt;code&gt;self&lt;/code&gt; deduced as &lt;code&gt;Derived&amp;amp;&lt;/code&gt;, and
the call to &lt;code&gt;bar&lt;/code&gt; is resolved statically and inlined.&lt;/p&gt;
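&lt;p&gt;For completeness, here is a sketch of a call site, assuming a compiler
with C++23 explicit object parameter support (for example &lt;code&gt;-std=c++23&lt;/code&gt; on
a recent GCC or Clang). The &lt;code&gt;test&lt;/code&gt; function mirrors the CRTP example and
is an addition for illustration.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;class Base {
public:
  auto foo(this auto&amp;amp;&amp;amp; self) -&amp;gt; int { return 77 + self.bar(); }
};

class Derived : public Base {
public:
  auto bar() -&amp;gt; int { return 88; }
};

auto test() -&amp;gt; int {
  Derived derived;
  // self is deduced from the static type of `derived`, so bar() is
  // resolved at compile time.
  return derived.foo();
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Deduction uses the static type of the object expression. Calling
&lt;code&gt;foo&lt;/code&gt; through a &lt;code&gt;Base&amp;amp;&lt;/code&gt; would deduce &lt;code&gt;self&lt;/code&gt; as &lt;code&gt;Base&amp;amp;&lt;/code&gt;, and
this snippet would then fail to compile because &lt;code&gt;Base&lt;/code&gt; has no &lt;code&gt;bar&lt;/code&gt;.&lt;/p&gt;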
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;Assembly generated with &lt;code&gt;gcc&lt;/code&gt; at &lt;code&gt;-O3&lt;/code&gt; on x86-64. Similar
results were observed with &lt;code&gt;clang&lt;/code&gt; on the same platform.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;The compiler emits a direct call
to &lt;code&gt;Derived::foo&lt;/code&gt; (or inlines it), because &lt;code&gt;derived&lt;/code&gt; cannot have any
other dynamic type.&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:3"&gt;
&lt;p&gt;The curiously recurring template pattern
is an idiom where a class X derives from a class template instantiated
with X itself as a template argument. More generally, this is known as
F-bound polymorphism, a form of F-bounded quantification.&amp;#160;&lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:4"&gt;
&lt;p&gt;The trade-off is that each &lt;code&gt;Base&amp;lt;Derived&amp;gt;&lt;/code&gt;
instantiation is a distinct, unrelated type, so there&amp;rsquo;s no common
runtime base to upcast to. Any shared functionality that operates
across different derived types must itself be templated.&amp;#160;&lt;a href="#fnref:4" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item><item><title>David Álvarez Rosa</title><link>https://david.alvarezrosa.com/about/</link><pubDate>Sat, 24 Jan 2026 12:31:00 +0000</pubDate><guid>https://david.alvarezrosa.com/about/</guid><description>&lt;style&gt;
.byline { display: none; }
sup { display: none; }
.side img { max-width: 225px; box-shadow: none; margin-top: -26px;}
@media (max-width: 860px) { .side img { margin-top: 0!important; } }
main p:first-of-type::first-letter { float: revert; font-size: revert; font-family: revert; padding: revert; }
&lt;/style&gt;
&lt;p&gt; &lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt;Mathematician and engineer based in sunny Dublin, passionate about
low-latency, high-performance systems.&lt;/p&gt;
&lt;p&gt;Currently working in algorithmic trading at Susquehanna. Previously
designed and built systems at Amazon serving 10M+ monthly active
customers, developed semantic caching for LLMs at Sopra Steria, and
conducted quantitative cybersecurity risk analysis at Deloitte.&lt;/p&gt;
&lt;p&gt;Holds a BSc in Mathematics, a BEng in Industrial Engineering, and an MSc
in Artificial Intelligence.&lt;/p&gt;
&lt;h2 id="experience"&gt;
Experience
&lt;a class="anchor" href="#experience"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Software Engineer @ Susquehanna&lt;/em&gt;
&lt;sup id="fnref:2"&gt;&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;br /&gt;
High-frequency options trading. Low-latency market data and trading
signals. Mentor and interviewer.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Software Engineer II @ Amazon&lt;/em&gt;
&lt;sup id="fnref:3"&gt;&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;br /&gt;
Designed systems for 10M+ monthly active customers. Contributed to 100+
internal repos. Won org hackathon. Promoted in 18 months (top 5%).
Mentored 3. On-call. Interviewer.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Machine Learning Engineer @ Sopra Steria&lt;/em&gt;
&lt;sup id="fnref:4"&gt;&lt;a href="#fn:4" class="footnote-ref" role="doc-noteref"&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;br /&gt;
Researched, designed, and built a semantic cache for LLMs.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Risk Analyst @ Deloitte&lt;/em&gt;
&lt;sup id="fnref:5"&gt;&lt;a href="#fn:5" class="footnote-ref" role="doc-noteref"&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;br /&gt;
Quantitative analysis of technological and cybersecurity risks for
top-tier banking companies.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Visiting Researcher @ Vector Institute&lt;/em&gt;
&lt;sup id="fnref:6"&gt;&lt;a href="#fn:6" class="footnote-ref" role="doc-noteref"&gt;6&lt;/a&gt;&lt;/sup&gt;&lt;br /&gt;
Research thesis on multimodal learning (recomprehension.com).&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Machine Learning Engineer @ BCN eMotorsport&lt;/em&gt;
&lt;sup id="fnref:7"&gt;&lt;a href="#fn:7" class="footnote-ref" role="doc-noteref"&gt;7&lt;/a&gt;&lt;/sup&gt;&lt;br /&gt;
Perception at Driverless UPC. Served as LiDAR lead and collaborated on
computer vision for a fully autonomous car.&lt;/p&gt;
&lt;h2 id="education"&gt;
Education
&lt;a class="anchor" href="#education"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;MSc in Artificial Intelligence&lt;/em&gt;
&lt;sup id="fnref:8"&gt;&lt;a href="#fn:8" class="footnote-ref" role="doc-noteref"&gt;8&lt;/a&gt;&lt;/sup&gt;&lt;br /&gt;
Official study program focused on AI research that grants access to PhD programs.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;MSc in Mathematics&lt;/em&gt;&lt;br /&gt;
Part-time student for the love of math. Dropped out to join Amazon.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Research Thesis&lt;/em&gt;
&lt;sup id="fnref:9"&gt;&lt;a href="#fn:9" class="footnote-ref" role="doc-noteref"&gt;9&lt;/a&gt;&lt;/sup&gt;&lt;br /&gt;
Research thesis on multimodal learning at the University of Toronto.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;BSc in Mathematics&lt;/em&gt;
&lt;sup id="fnref:10"&gt;&lt;a href="#fn:10" class="footnote-ref" role="doc-noteref"&gt;10&lt;/a&gt;&lt;/sup&gt;&lt;br /&gt;
Rigorous and proof-oriented degree with a robust mathematical base.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;BEng in Industrial Engineering&lt;/em&gt;
&lt;sup id="fnref:11"&gt;&lt;a href="#fn:11" class="footnote-ref" role="doc-noteref"&gt;11&lt;/a&gt;&lt;/sup&gt;&lt;br /&gt;
Multidisciplinary and integrative vision of industrial engineering.&lt;/p&gt;
&lt;h2 id="licenses-and-certifications"&gt;
Licenses &amp;amp; certifications
&lt;a class="anchor" href="#licenses-and-certifications"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Certificate in Advanced English (C1) &amp;mdash; Cambridge University&lt;br /&gt;
Machine Learning &amp;mdash; Stanford University&lt;br /&gt;
Deep Learning &amp;mdash; deeplearning.ai&lt;br /&gt;
Blockchain &amp;amp; Financial Technology &amp;mdash; Hong Kong University&lt;br /&gt;
Nova Talent Member &amp;mdash; Nova&lt;/p&gt;
&lt;h2 id="volunteering"&gt;
Volunteering
&lt;a class="anchor" href="#volunteering"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Mathematics Tutor&lt;/em&gt;&lt;br /&gt;
Academic training for the Mathematical Olympiads.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Volunteer @ Banco de Alimentos&lt;/em&gt;&lt;br /&gt;
Food collection for people in need.&lt;/p&gt;
&lt;h2 id="honors-and-awards"&gt;
Honors &amp;amp; awards
&lt;a class="anchor" href="#honors-and-awards"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Mathematical Olympiad&lt;/em&gt;&lt;br /&gt;
Silver in local (Pamplona), honors in national (Barcelona).&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Physics Olympiad&lt;/em&gt;&lt;br /&gt;
Gold in local (Pamplona), silver in national (Seville).&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Mobility Scholarship &amp;mdash; Cellex (CFIS)&lt;/em&gt;
&lt;sup id="fnref:12"&gt;&lt;a href="#fn:12" class="footnote-ref" role="doc-noteref"&gt;12&lt;/a&gt;&lt;/sup&gt;&lt;br /&gt;
Scholarship to carry out my research thesis in Toronto (€6k).&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Tuition and Housing Scholarship &amp;mdash; Cellex (CFIS)&lt;/em&gt;&lt;br /&gt;
University tuition and housing (€19k).&lt;/p&gt;
&lt;p&gt;&lt;em&gt;General Scholarship &amp;mdash; Government of Spain&lt;/em&gt;&lt;br /&gt;
Full university tuition plus an annual stipend (€11k).&lt;/p&gt;
&lt;h2 id="languages"&gt;
Languages
&lt;a class="anchor" href="#languages"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;English &amp;mdash; Proficient&lt;br /&gt;
Spanish &amp;mdash; Native&lt;br /&gt;
Catalan &amp;mdash; Intermediate&lt;/p&gt;
&lt;h2 id="contact"&gt;
Contact
&lt;a class="anchor" href="#contact"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;You can reach me at &lt;a href="mailto:david@alvarezrosa.com"&gt;david@alvarezrosa.com&lt;/a&gt; (preferred) or +34 647 13
39 30.&lt;/p&gt;
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;
&lt;a href="https://david.alvarezrosa.com/images/portrait.png"&gt;
&lt;picture&gt;
&lt;source
type="image/webp"
srcset="https://david.alvarezrosa.com/images/portrait_hu_319e06f3c2d9aab2.webp 240w, https://david.alvarezrosa.com/images/portrait_hu_4e057814c04d88db.webp 360w, https://david.alvarezrosa.com/images/portrait_hu_d5de9f0c77799ac7.webp 420w, https://david.alvarezrosa.com/images/portrait_hu_d81cb5cb28601bee.webp 480w, https://david.alvarezrosa.com/images/portrait_hu_c72fd7c3b5ee7b71.webp 768w, https://david.alvarezrosa.com/images/portrait_hu_8bf0b1db22877231.webp 769w"
sizes="(max-width: 860px) 100vw, 21rem"&gt;
&lt;img
src="https://david.alvarezrosa.com/images/portrait.png"
srcset="https://david.alvarezrosa.com/images/portrait_hu_59126c26a9daad14.png 240w, https://david.alvarezrosa.com/images/portrait_hu_cdd353e6b8f0a4c1.png 360w, https://david.alvarezrosa.com/images/portrait_hu_3c0c6311c66f7d5e.png 420w, https://david.alvarezrosa.com/images/portrait_hu_2b3b314dae9fc5d1.png 480w, https://david.alvarezrosa.com/images/portrait_hu_86c8494b121ec559.png 768w, https://david.alvarezrosa.com/images/portrait_hu_7724dbb6cb35ffc8.png 769w"
sizes="(max-width: 860px) 100vw, 21rem"
width="769"
height="930"
alt="Portrait"
loading="eager"
fetchpriority="high"
decoding="async"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;strong&gt;That&amp;rsquo;s me!&lt;/strong&gt; March
2022.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;Jul 2024&amp;ndash;Present&lt;br /&gt;
Dublin, Ireland&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:3"&gt;
&lt;p&gt;Mar 2022&amp;ndash;Aug 2024&lt;br /&gt;
Madrid, Spain&amp;#160;&lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:4"&gt;
&lt;p&gt;Apr 2024&amp;ndash;Jul 2024&lt;br /&gt;
Remote&amp;#160;&lt;a href="#fnref:4" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:5"&gt;
&lt;p&gt;Sep 2021&amp;ndash;Mar 2022&lt;br /&gt;
Madrid, Spain&amp;#160;&lt;a href="#fnref:5" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:6"&gt;
&lt;p&gt;Sep 2020&amp;ndash;Jun 2021&lt;br /&gt;
Toronto, Canada&amp;#160;&lt;a href="#fnref:6" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:7"&gt;
&lt;p&gt;Sep 2019&amp;ndash;Feb 2020&lt;br /&gt;
Barcelona, Spain&amp;#160;&lt;a href="#fnref:7" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:8"&gt;
&lt;p&gt;GPA 9.00/10&lt;br /&gt;
Honors in 6 subjects&amp;#160;&lt;a href="#fnref:8" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:9"&gt;
&lt;p&gt;GPA 10/10 (A+)&amp;#160;&lt;a href="#fnref:9" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:10"&gt;
&lt;p&gt;GPA 8.12/10 (top 10%)&lt;br /&gt;
Honors in 9 subjects&amp;#160;&lt;a href="#fnref:10" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:11"&gt;
&lt;p&gt;GPA 8.03/10 (top 2%)&lt;br /&gt;
Honors in 14 subjects&amp;#160;&lt;a href="#fnref:11" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:12"&gt;
&lt;p&gt;Canceled due to COVID-19&amp;#160;&lt;a href="#fnref:12" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item><item><title>About this Site</title><link>https://david.alvarezrosa.com/posts/about-this-site/</link><pubDate>Fri, 11 May 2018 21:18:00 +0100</pubDate><guid>https://david.alvarezrosa.com/posts/about-this-site/</guid><description>&lt;p&gt;After nearly a decade, I&amp;rsquo;ve rebuilt my personal site from scratch&amp;mdash;a
blog on software, self-hosting, and lessons learned along the way. This
first post is backdated to when I originally launched a site on this
domain, back in 2018.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What?&lt;/strong&gt;&amp;mdash;A personal blog. Posts aim to be concise, practical, and
rigorous.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Why?&lt;/strong&gt;&amp;mdash;To give back to the &lt;em&gt;libre&lt;/em&gt; software community. To keep a
record of my learning journey. To deepen my own understanding by
writing things down.&lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Who?&lt;/strong&gt;&amp;mdash;See the About page.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Where?&lt;/strong&gt;&amp;mdash;Self-hosted at my mother&amp;rsquo;s house in northern Spain (ha!),
exposed to the Internet through a WireGuard tunnel to a VPS, and served
via a global CDN.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;When?&lt;/strong&gt;&amp;mdash;Once or twice a month.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;How?&lt;/strong&gt;&amp;mdash;Written in Emacs Org-mode, exported to Hugo with a custom
theme.&lt;sup id="fnref:2"&gt;&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;If you spot an error or have suggestions, I welcome feedback&amp;mdash;reach out
via the contact link on the About page.&lt;/p&gt;
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;As a saying often attributed to Einstein goes, &lt;em&gt;&amp;ldquo;If you can&amp;rsquo;t explain it
simply, you don&amp;rsquo;t understand it well enough.&amp;rdquo;&lt;/em&gt; Feynman followed the same
principle, testing his understanding by teaching from first principles.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;Inspired by the work of Edward Tufte and Matthew Butterick.&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item><item><title>Building a Mouse Jiggler</title><link>https://david.alvarezrosa.com/posts/building-a-mouse-jiggler/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://david.alvarezrosa.com/posts/building-a-mouse-jiggler/</guid><description>&lt;p&gt;A microcontroller that pretends to be a keyboard and mouse is one of the
most useful weekend projects I&amp;rsquo;ve put together. It can keep a machine
awake during a long compile, replay a tedious sequence of shortcuts at
the press of a button, or type out whatever you want&amp;mdash;all from a board
that costs a few euros.&lt;/p&gt;
&lt;p&gt;I built mine on an RP2040&lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt;&amp;mdash;Raspberry Pi&amp;rsquo;s first in-house silicon, whose
native USB controller makes HID emulation possible without any extra
hardware. Source and prebuilt firmware are on &lt;a href="https://github.com/david-alvarez-rosa/FakeKeyboardMouse"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;figure&gt;
&lt;a href="https://david.alvarezrosa.com/images/mouse-demo.gif"&gt;&lt;img src="https://david.alvarezrosa.com/images/mouse-demo.gif" alt="Mouse demo" loading="lazy" decoding="async"&gt;&lt;/a&gt;
&lt;/figure&gt;
&lt;p&gt;A single press of the &lt;code&gt;BOOTSEL&lt;/code&gt; button starts the automation; another
press stops it.&lt;/p&gt;
&lt;h2 id="hardware"&gt;
Hardware
&lt;a class="anchor" href="#hardware"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Any RP2040 board works. I picked up a cheap MINI USB RP2040 Development
Board Module from AliExpress&amp;mdash;dual core, 4 MB flash, just under four euros
shipped.&lt;/p&gt;
&lt;figure&gt;
&lt;figcaption&gt;&lt;p&gt;&lt;span class="figure-number"&gt;Figure 1: &lt;/span&gt;&lt;strong&gt;The order.&lt;/strong&gt; €3.86 a board with free shipping&amp;mdash;I grabbed seven so a brick or two along the way wouldn&amp;rsquo;t end the project.&lt;/p&gt;&lt;/figcaption&gt;
&lt;a href="https://david.alvarezrosa.com/images/invoice.png"&gt;
&lt;picture&gt;
&lt;source
type="image/webp"
srcset="https://david.alvarezrosa.com/images/invoice_hu_1173c08ec2fc351.webp 240w, https://david.alvarezrosa.com/images/invoice_hu_f15f12c117e94547.webp 360w, https://david.alvarezrosa.com/images/invoice_hu_75668569038aa73a.webp 420w, https://david.alvarezrosa.com/images/invoice_hu_fd778df41c768017.webp 480w, https://david.alvarezrosa.com/images/invoice_hu_a018e0a918cb835e.webp 560w"
sizes="auto"&gt;
&lt;img
src="https://david.alvarezrosa.com/images/invoice.png"
srcset="https://david.alvarezrosa.com/images/invoice_hu_db8d38854c2e2358.png 240w, https://david.alvarezrosa.com/images/invoice_hu_78953e78f6eb5eec.png 360w, https://david.alvarezrosa.com/images/invoice_hu_5c2988246c6cc8ff.png 420w, https://david.alvarezrosa.com/images/invoice_hu_4dead0e6e7e576b6.png 480w, https://david.alvarezrosa.com/images/invoice_hu_bbc36d4189fa97cb.png 560w"
sizes="auto"
width="560"
height="284"
alt="Invoice"
loading="lazy"
decoding="async"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/figure&gt;
&lt;p&gt;The official Raspberry Pi Pico is the better-documented option, but
these no-name clones are pin-compatible and the same firmware runs on
either.&lt;/p&gt;
&lt;h2 id="flashing-the-firmware"&gt;
Flashing the firmware
&lt;a class="anchor" href="#flashing-the-firmware"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;While holding the &lt;code&gt;BOOTSEL&lt;/code&gt; button, plug the board into your computer.
It enumerates as a USB mass storage device. Identify it with &lt;code&gt;lsblk&lt;/code&gt;,
create a mount point, and mount it.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ lsblk
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo mkdir -p /mnt/micro
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo mount /dev/sda1 /mnt/micro &lt;span class="c1"&gt;# use the partition that lsblk reported&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Grab the latest &lt;code&gt;.uf2&lt;/code&gt; from the &lt;a href="https://github.com/david-alvarez-rosa/FakeKeyboardMouse/releases"&gt;releases&lt;/a&gt; page. Three binaries are
published: &lt;code&gt;fake_keyboard&lt;/code&gt; for keyboard only, &lt;code&gt;fake_mouse&lt;/code&gt; for mouse
only, and &lt;code&gt;fake_keyboard_mouse&lt;/code&gt; for both. Copy whichever one you want.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo cp fake_keyboard.uf2 /mnt/micro
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The board reboots automatically and re-enumerates as a keyboard, a
mouse, or both. Press &lt;code&gt;BOOTSEL&lt;/code&gt; to start the automation, press it again
to stop.&lt;/p&gt;
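&lt;p&gt;If nothing is mounted at the target directory, the copy can end up on the local disk and the board never reboots. As a minimal sketch, assuming a Linux system with &lt;code&gt;mountpoint&lt;/code&gt; from util-linux available, the copy can be wrapped in a few sanity checks. The helper name &lt;code&gt;flash_uf2&lt;/code&gt; and its optional third argument are hypothetical and not part of the repository.&lt;/p&gt;

```sh
#!/bin/sh
# Hypothetical helper, not part of the FakeKeyboardMouse repository.
# Usage: flash_uf2 FIRMWARE.uf2 MOUNT_DIR [check_mount]
# check_mount defaults to "yes"; pass "no" to skip the mount-point test.
flash_uf2() {
    firmware=$1
    dest=$2
    check_mount=${3:-yes}

    # Only UF2 images are expected here.
    case "$firmware" in
        *.uf2) ;;
        *) echo "error: $firmware is not a .uf2 file"; return 1 ;;
    esac

    if [ ! -f "$firmware" ]; then
        echo "error: $firmware does not exist"
        return 1
    fi

    if [ ! -d "$dest" ]; then
        echo "error: $dest is not a directory"
        return 1
    fi

    # Refuse to copy when nothing is mounted at the destination.
    if [ "$check_mount" = "yes" ]; then
        if ! mountpoint -q "$dest"; then
            echo "error: nothing is mounted at $dest"
            return 1
        fi
    fi

    if ! cp "$firmware" "$dest/"; then
        return 1
    fi
    # Flush buffered writes so the whole image reaches the board.
    sync
}
```

&lt;p&gt;With the function loaded in the current shell, &lt;code&gt;flash_uf2 fake_keyboard.uf2 /mnt/micro&lt;/code&gt; replaces the plain copy and needs the same privileges as that copy would.&lt;/p&gt;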
&lt;h2 id="building-from-source"&gt;
Building from source
&lt;a class="anchor" href="#building-from-source"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;If you&amp;rsquo;d rather build it yourself, install the ARM toolchain.&lt;sup id="fnref:2"&gt;&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo pacman -S arm-none-eabi-gcc arm-none-eabi-newlib &lt;span class="c1"&gt;# Arch&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo apt-get install gcc-arm-none-eabi libnewlib-arm-none-eabi &lt;span class="c1"&gt;# Debian&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Fetch the Pico SDK and &lt;code&gt;picotool&lt;/code&gt; submodules.&lt;sup id="fnref:3"&gt;&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ git submodule update --init --recursive
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Build with CMake, pointing &lt;code&gt;PICO_SDK_PATH&lt;/code&gt; at the SDK submodule.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ &lt;span class="nv"&gt;PICO_SDK_PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;./lib/pico-sdk cmake -B build -G Ninja
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ cmake --build build
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The build emits three &lt;code&gt;.uf2&lt;/code&gt; files under &lt;code&gt;build&lt;/code&gt;, matching the binaries
on the releases page. Flash one as described above.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo cp ./build/fake_keyboard.uf2 /mnt/micro
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="hello-world"&gt;
Hello, world
&lt;a class="anchor" href="#hello-world"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;The fastest way to confirm the board, toolchain, and SDK are all wired
up correctly is to flash a minimal program and watch it print over USB
serial.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;stdio.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;#34;pico/stdlib.h&amp;#34;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;stdio_init_all&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;Hello, world!&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;sleep_ms&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;A matching &lt;code&gt;CMakeLists.txt&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cmake" data-lang="cmake"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;cmake_minimum_required&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;VERSION&lt;/span&gt; &lt;span class="s"&gt;3.13&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;include&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;./lib/pico-sdk/pico_sdk_init.cmake&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;project&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;HelloWorld&lt;/span&gt; &lt;span class="s"&gt;C&lt;/span&gt; &lt;span class="s"&gt;CXX&lt;/span&gt; &lt;span class="s"&gt;ASM&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;CMAKE_EXPORT_COMPILE_COMMANDS&lt;/span&gt; &lt;span class="s"&gt;ON&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;pico_sdk_init&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;add_executable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;hello_world&lt;/span&gt; &lt;span class="s"&gt;main.cpp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;pico_enable_stdio_usb&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;hello_world&lt;/span&gt; &lt;span class="s"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;pico_enable_stdio_uart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;hello_world&lt;/span&gt; &lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;target_link_libraries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;hello_world&lt;/span&gt; &lt;span class="s"&gt;pico_stdlib&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;pico_add_extra_outputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;hello_world&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Build and flash the same way as before.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ &lt;span class="nv"&gt;PICO_SDK_PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;./lib/pico-sdk cmake -B build -G Ninja
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ cmake --build build
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo cp ./build/hello_world.uf2 /mnt/micro
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Once it reboots, attach to the serial TTY&lt;sup id="fnref:4"&gt;&lt;a href="#fn:4" class="footnote-ref" role="doc-noteref"&gt;4&lt;/a&gt;&lt;/sup&gt; and watch
the greeting roll in.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ screen /dev/ttyACM0
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Hello, world!
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Hello, world!
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Hello, world!
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If that works, you have everything you need to iterate on your own HID
firmware.&lt;/p&gt;
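&lt;p&gt;When the flash and attach steps are scripted, the board needs a moment to reboot and re-enumerate, so &lt;code&gt;/dev/ttyACM0&lt;/code&gt; may not exist yet when &lt;code&gt;screen&lt;/code&gt; starts. A minimal sketch of a wait loop for a POSIX shell follows. The helper name &lt;code&gt;wait_for_path&lt;/code&gt; and the 10-second default are hypothetical choices, not something the repository provides.&lt;/p&gt;

```sh
#!/bin/sh
# Hypothetical helper: wait until TARGET exists, polling once per second.
# Usage: wait_for_path TARGET [TIMEOUT_SECONDS]
wait_for_path() {
    target=$1
    timeout=${2:-10}
    elapsed=0

    while [ ! -e "$target" ]; do
        # Give up once the timeout has been reached.
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "error: $target did not appear within ${timeout}s"
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}
```

&lt;p&gt;Calling &lt;code&gt;wait_for_path /dev/ttyACM0&lt;/code&gt; before &lt;code&gt;screen /dev/ttyACM0&lt;/code&gt; returns as soon as the device node shows up, or fails after the timeout.&lt;/p&gt;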
&lt;h2 id="specifications"&gt;
Specifications
&lt;a class="anchor" href="#specifications"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;The only specs the manufacturer shipped with the board are these
photos&amp;mdash;no datasheet, no pinout diagram, nothing.&lt;sup id="fnref:5"&gt;&lt;a href="#fn:5" class="footnote-ref" role="doc-noteref"&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;
&lt;a href="https://david.alvarezrosa.com/images/spec-1.jpg"&gt;
&lt;picture&gt;
&lt;source
type="image/webp"
srcset="https://david.alvarezrosa.com/images/spec-1_hu_ace676d0fdd65565.webp 240w, https://david.alvarezrosa.com/images/spec-1_hu_fb456404adab0837.webp 360w, https://david.alvarezrosa.com/images/spec-1_hu_80ba7cd02dd3578e.webp 420w, https://david.alvarezrosa.com/images/spec-1_hu_35ae8ff95a4a15a5.webp 480w, https://david.alvarezrosa.com/images/spec-1_hu_dec0186e93010c9d.webp 768w, https://david.alvarezrosa.com/images/spec-1_hu_46241ad3971f26fd.webp 1000w"
sizes="(max-width: 860px) 100vw, 21rem"&gt;
&lt;img
src="https://david.alvarezrosa.com/images/spec-1.jpg"
srcset="https://david.alvarezrosa.com/images/spec-1_hu_a9a506ff2527efc8.jpg 240w, https://david.alvarezrosa.com/images/spec-1_hu_b4fba0d93d532807.jpg 360w, https://david.alvarezrosa.com/images/spec-1_hu_734545726961f427.jpg 420w, https://david.alvarezrosa.com/images/spec-1_hu_82c6a6e65592577f.jpg 480w, https://david.alvarezrosa.com/images/spec-1_hu_dc6fe68347263483.jpg 768w, https://david.alvarezrosa.com/images/spec-1_hu_6450c7ec968d9795.jpg 1000w"
sizes="(max-width: 860px) 100vw, 21rem"
width="1000"
height="1000"
alt="Spec 1"
loading="eager"
fetchpriority="high"
decoding="async"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/th&gt;
&lt;th&gt;
&lt;a href="https://david.alvarezrosa.com/images/spec-2.jpg"&gt;
&lt;picture&gt;
&lt;source
type="image/webp"
srcset="https://david.alvarezrosa.com/images/spec-2_hu_f7e2cd33388d4652.webp 240w, https://david.alvarezrosa.com/images/spec-2_hu_d15e1ead0508ca99.webp 360w, https://david.alvarezrosa.com/images/spec-2_hu_8e31f5caad333b38.webp 420w, https://david.alvarezrosa.com/images/spec-2_hu_2a62ccb897887c0.webp 480w, https://david.alvarezrosa.com/images/spec-2_hu_1d96d64a0a20c681.webp 768w, https://david.alvarezrosa.com/images/spec-2_hu_ea9b64cbfbb27a36.webp 1000w"
sizes="auto"&gt;
&lt;img
src="https://david.alvarezrosa.com/images/spec-2.jpg"
srcset="https://david.alvarezrosa.com/images/spec-2_hu_e5dea1b990c2413b.jpg 240w, https://david.alvarezrosa.com/images/spec-2_hu_427d6ff2bf1e092.jpg 360w, https://david.alvarezrosa.com/images/spec-2_hu_dfc285778e56033a.jpg 420w, https://david.alvarezrosa.com/images/spec-2_hu_4cbad71be1b88aac.jpg 480w, https://david.alvarezrosa.com/images/spec-2_hu_645388c4656bb2fd.jpg 768w, https://david.alvarezrosa.com/images/spec-2_hu_971900ddd6afb0f3.jpg 1000w"
sizes="auto"
width="1000"
height="1000"
alt="Spec 2"
loading="lazy"
decoding="async"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/th&gt;
&lt;th&gt;
&lt;a href="https://david.alvarezrosa.com/images/spec-3.jpg"&gt;
&lt;picture&gt;
&lt;source
type="image/webp"
srcset="https://david.alvarezrosa.com/images/spec-3_hu_2021c8d23a4967b1.webp 240w, https://david.alvarezrosa.com/images/spec-3_hu_288be69116431c14.webp 360w, https://david.alvarezrosa.com/images/spec-3_hu_5472cde341dc9535.webp 420w, https://david.alvarezrosa.com/images/spec-3_hu_d8c4bf6a313490f5.webp 480w, https://david.alvarezrosa.com/images/spec-3_hu_7ca5490d91e8311d.webp 768w, https://david.alvarezrosa.com/images/spec-3_hu_f5e9e0d2d5e8efb2.webp 1000w"
sizes="auto"&gt;
&lt;img
src="https://david.alvarezrosa.com/images/spec-3.jpg"
srcset="https://david.alvarezrosa.com/images/spec-3_hu_e049382bab876264.jpg 240w, https://david.alvarezrosa.com/images/spec-3_hu_4ed328239d79f46c.jpg 360w, https://david.alvarezrosa.com/images/spec-3_hu_31612ab4c7429273.jpg 420w, https://david.alvarezrosa.com/images/spec-3_hu_f64fbe4aca94f719.jpg 480w, https://david.alvarezrosa.com/images/spec-3_hu_5a684e56b4e7856e.jpg 768w, https://david.alvarezrosa.com/images/spec-3_hu_4138b5c79f75e6cc.jpg 1000w"
sizes="auto"
width="1000"
height="1000"
alt="Spec 3"
loading="lazy"
decoding="async"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;a href="https://david.alvarezrosa.com/images/spec-4.jpg"&gt;
&lt;picture&gt;
&lt;source
type="image/webp"
srcset="https://david.alvarezrosa.com/images/spec-4_hu_2bd8267d0ab4c687.webp 240w, https://david.alvarezrosa.com/images/spec-4_hu_e631816862754947.webp 360w, https://david.alvarezrosa.com/images/spec-4_hu_791299137893c599.webp 420w, https://david.alvarezrosa.com/images/spec-4_hu_6cf67341a595e614.webp 480w, https://david.alvarezrosa.com/images/spec-4_hu_7080a69a1f57a338.webp 768w, https://david.alvarezrosa.com/images/spec-4_hu_67dc7b02a2bf7e0b.webp 1000w"
sizes="auto"&gt;
&lt;img
src="https://david.alvarezrosa.com/images/spec-4.jpg"
srcset="https://david.alvarezrosa.com/images/spec-4_hu_23fb01a7a38f530c.jpg 240w, https://david.alvarezrosa.com/images/spec-4_hu_363e42eae8b8f040.jpg 360w, https://david.alvarezrosa.com/images/spec-4_hu_8b716f501ab0fa7.jpg 420w, https://david.alvarezrosa.com/images/spec-4_hu_575fbae2ccbd8fdc.jpg 480w, https://david.alvarezrosa.com/images/spec-4_hu_4a55d4aece3f7bd3.jpg 768w, https://david.alvarezrosa.com/images/spec-4_hu_2fb1cefc7f4b0dec.jpg 1000w"
sizes="auto"
width="1000"
height="1000"
alt="Spec 4"
loading="lazy"
decoding="async"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://david.alvarezrosa.com/images/spec-5.jpg"&gt;
&lt;picture&gt;
&lt;source
type="image/webp"
srcset="https://david.alvarezrosa.com/images/spec-5_hu_c18a2195bbc8db8d.webp 240w, https://david.alvarezrosa.com/images/spec-5_hu_f251ca3633e9559.webp 360w, https://david.alvarezrosa.com/images/spec-5_hu_1442809bc40a4177.webp 420w, https://david.alvarezrosa.com/images/spec-5_hu_b3382400a9cc6229.webp 480w, https://david.alvarezrosa.com/images/spec-5_hu_4ab94fdbeb42a1aa.webp 768w, https://david.alvarezrosa.com/images/spec-5_hu_27f61d34eac10aeb.webp 1000w"
sizes="auto"&gt;
&lt;img
src="https://david.alvarezrosa.com/images/spec-5.jpg"
srcset="https://david.alvarezrosa.com/images/spec-5_hu_2da6a021b667ca67.jpg 240w, https://david.alvarezrosa.com/images/spec-5_hu_9420f4cf2d48e6a.jpg 360w, https://david.alvarezrosa.com/images/spec-5_hu_f929e8d938fd9e8e.jpg 420w, https://david.alvarezrosa.com/images/spec-5_hu_f4f88792bcdd9586.jpg 480w, https://david.alvarezrosa.com/images/spec-5_hu_bb49d3ac1e5c1177.jpg 768w, https://david.alvarezrosa.com/images/spec-5_hu_4098e0770bbc1f3.jpg 1000w"
sizes="auto"
width="1000"
height="1000"
alt="Spec 5"
loading="lazy"
decoding="async"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://david.alvarezrosa.com/images/spec-6.jpg"&gt;
&lt;picture&gt;
&lt;source
type="image/webp"
srcset="https://david.alvarezrosa.com/images/spec-6_hu_f8271f2f35fe4388.webp 240w, https://david.alvarezrosa.com/images/spec-6_hu_b4bc82cd8a88b09.webp 360w, https://david.alvarezrosa.com/images/spec-6_hu_5ae75c6845bbd070.webp 420w, https://david.alvarezrosa.com/images/spec-6_hu_c8e4ba005b9ebc88.webp 480w, https://david.alvarezrosa.com/images/spec-6_hu_41465945289864e9.webp 768w, https://david.alvarezrosa.com/images/spec-6_hu_ad2ae086052cf416.webp 1000w"
sizes="auto"&gt;
&lt;img
src="https://david.alvarezrosa.com/images/spec-6.jpg"
srcset="https://david.alvarezrosa.com/images/spec-6_hu_d468fc1a74f53461.jpg 240w, https://david.alvarezrosa.com/images/spec-6_hu_b4048257382ba25a.jpg 360w, https://david.alvarezrosa.com/images/spec-6_hu_14d2153816cf0612.jpg 420w, https://david.alvarezrosa.com/images/spec-6_hu_dda6f576f2443ba1.jpg 480w, https://david.alvarezrosa.com/images/spec-6_hu_2f1c3432af415e90.jpg 768w, https://david.alvarezrosa.com/images/spec-6_hu_eef5d8ef646a836b.jpg 1000w"
sizes="auto"
width="1000"
height="1000"
alt="Spec 6"
loading="lazy"
decoding="async"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;figure&gt;
&lt;a href="https://david.alvarezrosa.com/images/spec-7.jpg"&gt;
&lt;picture&gt;
&lt;source
type="image/webp"
srcset="https://david.alvarezrosa.com/images/spec-7_hu_8fb5d933b45f593.webp 240w, https://david.alvarezrosa.com/images/spec-7_hu_c682ed6dd15cbc33.webp 360w, https://david.alvarezrosa.com/images/spec-7_hu_8f979d4b342eb60b.webp 420w, https://david.alvarezrosa.com/images/spec-7_hu_3c39427a5829db7f.webp 480w, https://david.alvarezrosa.com/images/spec-7_hu_ab59c36b3fbba0ef.webp 768w, https://david.alvarezrosa.com/images/spec-7_hu_b7f0b4e76124b903.webp 1000w"
sizes="auto"&gt;
&lt;img
src="https://david.alvarezrosa.com/images/spec-7.jpg"
srcset="https://david.alvarezrosa.com/images/spec-7_hu_eecdb22388633458.jpg 240w, https://david.alvarezrosa.com/images/spec-7_hu_be92270014a2d817.jpg 360w, https://david.alvarezrosa.com/images/spec-7_hu_e5fbb72854fb8220.jpg 420w, https://david.alvarezrosa.com/images/spec-7_hu_b1b82c6cf7d57639.jpg 480w, https://david.alvarezrosa.com/images/spec-7_hu_247f47d5dd393f59.jpg 768w, https://david.alvarezrosa.com/images/spec-7_hu_53e4239727a38be5.jpg 1000w"
sizes="auto"
width="1000"
height="1000"
alt="Spec 7"
loading="lazy"
decoding="async"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/figure&gt;
&lt;br /&gt;
&lt;p&gt;That&amp;rsquo;s all the moving parts. The repository ships a few example
scripts to get you started&amp;mdash;swap them out, recompile, and the board
will type&lt;sup id="fnref:6"&gt;&lt;a href="#fn:6" class="footnote-ref" role="doc-noteref"&gt;6&lt;/a&gt;&lt;/sup&gt; or wiggle in whatever pattern you like.&lt;/p&gt;
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;
&lt;a href="https://david.alvarezrosa.com/images/spec-2.jpg"&gt;
&lt;picture&gt;
&lt;source
type="image/webp"
srcset="https://david.alvarezrosa.com/images/spec-2_hu_f7e2cd33388d4652.webp 240w, https://david.alvarezrosa.com/images/spec-2_hu_d15e1ead0508ca99.webp 360w, https://david.alvarezrosa.com/images/spec-2_hu_8e31f5caad333b38.webp 420w, https://david.alvarezrosa.com/images/spec-2_hu_2a62ccb897887c0.webp 480w, https://david.alvarezrosa.com/images/spec-2_hu_1d96d64a0a20c681.webp 768w, https://david.alvarezrosa.com/images/spec-2_hu_ea9b64cbfbb27a36.webp 1000w"
sizes="auto"&gt;
&lt;img
src="https://david.alvarezrosa.com/images/spec-2.jpg"
srcset="https://david.alvarezrosa.com/images/spec-2_hu_e5dea1b990c2413b.jpg 240w, https://david.alvarezrosa.com/images/spec-2_hu_427d6ff2bf1e092.jpg 360w, https://david.alvarezrosa.com/images/spec-2_hu_dfc285778e56033a.jpg 420w, https://david.alvarezrosa.com/images/spec-2_hu_4cbad71be1b88aac.jpg 480w, https://david.alvarezrosa.com/images/spec-2_hu_645388c4656bb2fd.jpg 768w, https://david.alvarezrosa.com/images/spec-2_hu_971900ddd6afb0f3.jpg 1000w"
sizes="auto"
width="1000"
height="1000"
alt="Spec 2"
loading="lazy"
decoding="async"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;strong&gt;The board.&lt;/strong&gt;
Roughly the size of a thumbnail&amp;mdash;the USB-A plug is etched into the PCB
instead of soldered on.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;&lt;code&gt;arm-none-eabi-gcc&lt;/code&gt; is the cross-compiler targeting ARM
microcontrollers; &lt;code&gt;arm-none-eabi-newlib&lt;/code&gt; provides a slim C standard
library suited for embedded targets.&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:3"&gt;
&lt;p&gt;The Pico SDK ships the
C/C++ libraries for RP2040 development; &lt;code&gt;picotool&lt;/code&gt; is a command-line
utility for inspecting boards and uploading firmware.&amp;#160;&lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:4"&gt;
&lt;p&gt;The board exposes USB CDC
as &lt;code&gt;/dev/ttyACM0&lt;/code&gt; on Linux. Exit &lt;code&gt;screen&lt;/code&gt; with &lt;code&gt;Ctrl-a k&lt;/code&gt;.&amp;#160;&lt;a href="#fnref:4" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:5"&gt;
&lt;p&gt;Welcome to
no-name AliExpress electronics. The RP2040 itself is well documented,
so in practice the official datasheet is what you&amp;rsquo;ll lean on.&amp;#160;&lt;a href="#fnref:5" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:6"&gt;
&lt;p&gt;
&lt;a href="https://david.alvarezrosa.com/images/keyboard-demo.gif"&gt;&lt;img src="https://david.alvarezrosa.com/images/keyboard-demo.gif" alt="Keyboard demo" loading="lazy" decoding="async"&gt;&lt;/a&gt;
&lt;strong&gt;The keyboard variant in
action.&lt;/strong&gt;&amp;#160;&lt;a href="#fnref:6" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item><item><title>Exploring CPU Caches</title><link>https://david.alvarezrosa.com/posts/exploring-cpu-caches/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://david.alvarezrosa.com/posts/exploring-cpu-caches/</guid><description>&lt;p&gt;Pending.&lt;/p&gt;</description></item><item><title>Incremental clang-tidy tool</title><link>https://david.alvarezrosa.com/posts/incremental-clang-tidy-tool/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://david.alvarezrosa.com/posts/incremental-clang-tidy-tool/</guid><description>&lt;p&gt;How to incrementally run clang-tidy or cppcheck&lt;/p&gt;</description></item><item><title>Laissez Faire, Laissez Mourir</title><link>https://david.alvarezrosa.com/posts/laissez-faire-laissez-mourir/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://david.alvarezrosa.com/posts/laissez-faire-laissez-mourir/</guid><description>&lt;p&gt;There is an arrangement in which whoever cannot pay to survive is
told, &amp;ldquo;Die, then.&amp;rdquo; We live in it. It has been on my mind as the
world prepares to mint its first trillionaire. What follows is
numbers. They are worse than you think. Elon Musk is worth $834
billion&amp;mdash;a fifth of what the bottom half of America, 66 million
households, owns combined&lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt;&amp;mdash;and his approved pay package
runs to one trillion. This is what a trillion looks like.&lt;/p&gt;
&lt;figure class="trillion fullwidth"&gt;
&lt;div class="trillion-scroll"&gt;
&lt;svg width="94090" height="432" viewBox="0 0 94090 432" role="img"
aria-label="A horizontal halftone field of one million specks,
each speck one million dollars, together one trillion dollars.
The field scrolls to the right for ninety-four thousand
pixels before it ends."&gt;
&lt;defs&gt;
&lt;pattern id="speck" width="6" height="6" patternUnits="userSpaceOnUse"&gt;
&lt;circle cx="3" cy="3" r="2" fill="#111"/&gt;
&lt;/pattern&gt;
&lt;/defs&gt;
&lt;text class="halo" x="2" y="16"&gt;&amp;#9660; each speck: one million dollars&amp;#8201;&amp;#8212;&amp;#8201;one comfortable retirement&lt;/text&gt;
&lt;text class="halo" x="2" y="36"&gt;(the median family&amp;#8217;s everything, $192,900: a fifth of a speck)&lt;/text&gt;
&lt;rect x="0" y="48" width="93750" height="384" fill="url(#speck)"/&gt;
&lt;text class="halo" x="937.5" y="86" text-anchor="middle"&gt;$10 billion&lt;/text&gt;
&lt;text class="halo" x="1875" y="86" text-anchor="middle"&gt;$20 billion&lt;/text&gt;
&lt;text class="halo" x="2812.5" y="86" text-anchor="middle"&gt;$30 billion&lt;/text&gt;
&lt;text class="halo" x="3750" y="86" text-anchor="middle"&gt;$40 billion&lt;/text&gt;
&lt;text class="halo" x="4687.5" y="86" text-anchor="middle"&gt;$50 billion&lt;/text&gt;
&lt;text class="halo" x="5625" y="86" text-anchor="middle"&gt;$60 billion&lt;/text&gt;
&lt;text class="halo" x="6562.5" y="86" text-anchor="middle"&gt;$70 billion&lt;/text&gt;
&lt;text class="halo" x="7500" y="86" text-anchor="middle"&gt;$80 billion&lt;/text&gt;
&lt;text class="halo" x="8437.5" y="86" text-anchor="middle"&gt;$90 billion&lt;/text&gt;
&lt;text class="halo big" x="9375" y="86" text-anchor="middle"&gt;one hundred billion&lt;/text&gt;
&lt;text class="halo" x="10312.5" y="86" text-anchor="middle"&gt;$110 billion&lt;/text&gt;
&lt;text class="halo" x="11250" y="86" text-anchor="middle"&gt;$120 billion&lt;/text&gt;
&lt;text class="halo" x="12187.5" y="86" text-anchor="middle"&gt;$130 billion&lt;/text&gt;
&lt;text class="halo" x="13125" y="86" text-anchor="middle"&gt;$140 billion&lt;/text&gt;
&lt;text class="halo" x="14062.5" y="86" text-anchor="middle"&gt;$150 billion&lt;/text&gt;
&lt;text class="halo" x="15000" y="86" text-anchor="middle"&gt;$160 billion&lt;/text&gt;
&lt;text class="halo" x="15937.5" y="86" text-anchor="middle"&gt;$170 billion&lt;/text&gt;
&lt;text class="halo" x="16875" y="86" text-anchor="middle"&gt;$180 billion&lt;/text&gt;
&lt;text class="halo" x="17812.5" y="86" text-anchor="middle"&gt;$190 billion&lt;/text&gt;
&lt;text class="halo big" x="18750" y="86" text-anchor="middle"&gt;two hundred billion&lt;/text&gt;
&lt;text class="halo" x="19687.5" y="86" text-anchor="middle"&gt;$210 billion&lt;/text&gt;
&lt;text class="halo" x="20625" y="86" text-anchor="middle"&gt;$220 billion&lt;/text&gt;
&lt;text class="halo" x="21562.5" y="86" text-anchor="middle"&gt;$230 billion&lt;/text&gt;
&lt;text class="halo" x="22500" y="86" text-anchor="middle"&gt;$240 billion&lt;/text&gt;
&lt;text class="halo" x="23437.5" y="86" text-anchor="middle"&gt;$250 billion&lt;/text&gt;
&lt;text class="halo" x="24375" y="86" text-anchor="middle"&gt;$260 billion&lt;/text&gt;
&lt;text class="halo" x="25312.5" y="86" text-anchor="middle"&gt;$270 billion&lt;/text&gt;
&lt;text class="halo" x="26250" y="86" text-anchor="middle"&gt;$280 billion&lt;/text&gt;
&lt;text class="halo" x="27187.5" y="86" text-anchor="middle"&gt;$290 billion&lt;/text&gt;
&lt;text class="halo big" x="28125" y="86" text-anchor="middle"&gt;three hundred billion&lt;/text&gt;
&lt;text class="halo" x="29062.5" y="86" text-anchor="middle"&gt;$310 billion&lt;/text&gt;
&lt;text class="halo" x="30000" y="86" text-anchor="middle"&gt;$320 billion&lt;/text&gt;
&lt;text class="halo" x="30937.5" y="86" text-anchor="middle"&gt;$330 billion&lt;/text&gt;
&lt;text class="halo" x="31875" y="86" text-anchor="middle"&gt;$340 billion&lt;/text&gt;
&lt;text class="halo" x="32812.5" y="86" text-anchor="middle"&gt;$350 billion&lt;/text&gt;
&lt;text class="halo" x="33750" y="86" text-anchor="middle"&gt;$360 billion&lt;/text&gt;
&lt;text class="halo" x="34687.5" y="86" text-anchor="middle"&gt;$370 billion&lt;/text&gt;
&lt;text class="halo" x="35625" y="86" text-anchor="middle"&gt;$380 billion&lt;/text&gt;
&lt;text class="halo" x="36562.5" y="86" text-anchor="middle"&gt;$390 billion&lt;/text&gt;
&lt;text class="halo big" x="37500" y="86" text-anchor="middle"&gt;four hundred billion&lt;/text&gt;
&lt;text class="halo" x="38437.5" y="86" text-anchor="middle"&gt;$410 billion&lt;/text&gt;
&lt;text class="halo" x="39375" y="86" text-anchor="middle"&gt;$420 billion&lt;/text&gt;
&lt;text class="halo" x="40312.5" y="86" text-anchor="middle"&gt;$430 billion&lt;/text&gt;
&lt;text class="halo" x="41250" y="86" text-anchor="middle"&gt;$440 billion&lt;/text&gt;
&lt;text class="halo" x="42187.5" y="86" text-anchor="middle"&gt;$450 billion&lt;/text&gt;
&lt;text class="halo" x="43125" y="86" text-anchor="middle"&gt;$460 billion&lt;/text&gt;
&lt;text class="halo" x="44062.5" y="86" text-anchor="middle"&gt;$470 billion&lt;/text&gt;
&lt;text class="halo" x="45000" y="86" text-anchor="middle"&gt;$480 billion&lt;/text&gt;
&lt;text class="halo" x="45937.5" y="86" text-anchor="middle"&gt;$490 billion&lt;/text&gt;
&lt;text class="halo big" x="46875" y="86" text-anchor="middle"&gt;five hundred billion&lt;/text&gt;
&lt;text class="halo" x="47812.5" y="86" text-anchor="middle"&gt;$510 billion&lt;/text&gt;
&lt;text class="halo" x="48750" y="86" text-anchor="middle"&gt;$520 billion&lt;/text&gt;
&lt;text class="halo" x="49687.5" y="86" text-anchor="middle"&gt;$530 billion&lt;/text&gt;
&lt;text class="halo" x="50625" y="86" text-anchor="middle"&gt;$540 billion&lt;/text&gt;
&lt;text class="halo" x="51562.5" y="86" text-anchor="middle"&gt;$550 billion&lt;/text&gt;
&lt;text class="halo" x="52500" y="86" text-anchor="middle"&gt;$560 billion&lt;/text&gt;
&lt;text class="halo" x="53437.5" y="86" text-anchor="middle"&gt;$570 billion&lt;/text&gt;
&lt;text class="halo" x="54375" y="86" text-anchor="middle"&gt;$580 billion&lt;/text&gt;
&lt;text class="halo" x="55312.5" y="86" text-anchor="middle"&gt;$590 billion&lt;/text&gt;
&lt;text class="halo big" x="56250" y="86" text-anchor="middle"&gt;six hundred billion&lt;/text&gt;
&lt;text class="halo" x="57187.5" y="86" text-anchor="middle"&gt;$610 billion&lt;/text&gt;
&lt;text class="halo" x="58125" y="86" text-anchor="middle"&gt;$620 billion&lt;/text&gt;
&lt;text class="halo" x="59062.5" y="86" text-anchor="middle"&gt;$630 billion&lt;/text&gt;
&lt;text class="halo" x="60000" y="86" text-anchor="middle"&gt;$640 billion&lt;/text&gt;
&lt;text class="halo" x="60937.5" y="86" text-anchor="middle"&gt;$650 billion&lt;/text&gt;
&lt;text class="halo" x="61875" y="86" text-anchor="middle"&gt;$660 billion&lt;/text&gt;
&lt;text class="halo" x="62812.5" y="86" text-anchor="middle"&gt;$670 billion&lt;/text&gt;
&lt;text class="halo" x="63750" y="86" text-anchor="middle"&gt;$680 billion&lt;/text&gt;
&lt;text class="halo" x="64687.5" y="86" text-anchor="middle"&gt;$690 billion&lt;/text&gt;
&lt;text class="halo big" x="65625" y="86" text-anchor="middle"&gt;seven hundred billion&lt;/text&gt;
&lt;text class="halo" x="66562.5" y="86" text-anchor="middle"&gt;$710 billion&lt;/text&gt;
&lt;text class="halo" x="67500" y="86" text-anchor="middle"&gt;$720 billion&lt;/text&gt;
&lt;text class="halo" x="68437.5" y="86" text-anchor="middle"&gt;$730 billion&lt;/text&gt;
&lt;text class="halo" x="69375" y="86" text-anchor="middle"&gt;$740 billion&lt;/text&gt;
&lt;text class="halo" x="70312.5" y="86" text-anchor="middle"&gt;$750 billion&lt;/text&gt;
&lt;text class="halo" x="71250" y="86" text-anchor="middle"&gt;$760 billion&lt;/text&gt;
&lt;text class="halo" x="72187.5" y="86" text-anchor="middle"&gt;$770 billion&lt;/text&gt;
&lt;text class="halo" x="73125" y="86" text-anchor="middle"&gt;$780 billion&lt;/text&gt;
&lt;text class="halo" x="74062.5" y="86" text-anchor="middle"&gt;$790 billion&lt;/text&gt;
&lt;text class="halo big" x="75000" y="86" text-anchor="middle"&gt;eight hundred billion&lt;/text&gt;
&lt;text class="halo" x="75937.5" y="86" text-anchor="middle"&gt;$810 billion&lt;/text&gt;
&lt;text class="halo" x="76875" y="86" text-anchor="middle"&gt;$820 billion&lt;/text&gt;
&lt;text class="halo" x="77812.5" y="86" text-anchor="middle"&gt;$830 billion&lt;/text&gt;
&lt;text class="halo" x="78750" y="86" text-anchor="middle"&gt;$840 billion&lt;/text&gt;
&lt;text class="halo" x="79687.5" y="86" text-anchor="middle"&gt;$850 billion&lt;/text&gt;
&lt;text class="halo" x="80625" y="86" text-anchor="middle"&gt;$860 billion&lt;/text&gt;
&lt;text class="halo" x="81562.5" y="86" text-anchor="middle"&gt;$870 billion&lt;/text&gt;
&lt;text class="halo" x="82500" y="86" text-anchor="middle"&gt;$880 billion&lt;/text&gt;
&lt;text class="halo" x="83437.5" y="86" text-anchor="middle"&gt;$890 billion&lt;/text&gt;
&lt;text class="halo big" x="84375" y="86" text-anchor="middle"&gt;nine hundred billion&lt;/text&gt;
&lt;text class="halo" x="85312.5" y="86" text-anchor="middle"&gt;$910 billion&lt;/text&gt;
&lt;text class="halo" x="86250" y="86" text-anchor="middle"&gt;$920 billion&lt;/text&gt;
&lt;text class="halo" x="87187.5" y="86" text-anchor="middle"&gt;$930 billion&lt;/text&gt;
&lt;text class="halo" x="88125" y="86" text-anchor="middle"&gt;$940 billion&lt;/text&gt;
&lt;text class="halo" x="89062.5" y="86" text-anchor="middle"&gt;$950 billion&lt;/text&gt;
&lt;text class="halo" x="90000" y="86" text-anchor="middle"&gt;$960 billion&lt;/text&gt;
&lt;text class="halo" x="90937.5" y="86" text-anchor="middle"&gt;$970 billion&lt;/text&gt;
&lt;text class="halo" x="91875" y="86" text-anchor="middle"&gt;$980 billion&lt;/text&gt;
&lt;text class="halo" x="92812.5" y="86" text-anchor="middle"&gt;$990 billion&lt;/text&gt;
&lt;text class="halo" x="300" y="416"&gt;&amp;#9758; scroll&lt;/text&gt;
&lt;line x1="93750.5" y1="48" x2="93750.5" y2="432" stroke="#111" stroke-width="1"/&gt;
&lt;text class="mk-lbl" x="93764" y="246"&gt;one trillion dollars&lt;/text&gt;
&lt;/svg&gt;
&lt;/div&gt;
&lt;figcaption&gt;&lt;p&gt;&lt;strong&gt;One trillion dollars.&lt;/strong&gt; One
million specks, one million dollars each&amp;#8212;64 specks tall, 15,625
specks wide. A speck is a comfortable retirement; the median family
owns a fifth of one. The plate runs ninety-four thousand pixels to
the right&amp;#8212;some twenty-five metres of paper&amp;#8212;and it does end.
Counted at one dollar per second: a million dollars takes 11&amp;#189;
days; a billion, 32 years; the full plate, 31,700 years.&lt;/p&gt;&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;style&gt;
.trillion-scroll {
overflow-x: auto;
}
figure.trillion svg .halo {
font-style: italic; font-size: 1.1rem; fill: #444;
stroke: #fcfcfc; stroke-width: 9; paint-order: stroke;
}
figure.trillion svg .halo.big {
font-size: 1.4rem; fill: #222;
}
figure.trillion svg .mk-lbl {
fill: #111;
font-family: "Alegreya SC", "Alegreya SC Fallback", serif;
font-size: 1.3rem;
}
&lt;/style&gt;
&lt;p&gt;Capital compounds; labour does not. At a conservative 5% annual return, $834
billion yields a median household income ($83,730)&lt;sup id="fnref:2"&gt;&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref"&gt;2&lt;/a&gt;&lt;/sup&gt; every 63 seconds&amp;mdash;a forty-year working life every 42 minutes.
Once returns exceed any possible spending, a fortune is no longer
earned; it accrues.&lt;/p&gt;
&lt;p&gt;Taxes bind mostly those who work. The 25 richest Americans added
$401 billion to their net worth over 2014&amp;ndash;2018 and paid $13.6
billion in income tax&amp;mdash;a true rate of 3.4%: Musk 3.27% (zero in
2018), Bezos 0.98%, Buffett 0.10%.&lt;sup id="fnref:3"&gt;&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref"&gt;3&lt;/a&gt;&lt;/sup&gt; A wage earner pays 15.3% in payroll tax from the first dollar&amp;mdash;7.65%
withheld, 7.65% more via the employer, borne by the worker&amp;mdash;four and
a half times the billionaires&amp;rsquo; rate, before income tax even begins.&lt;/p&gt;
&lt;p&gt;Jeff Yass, co-founder of Susquehanna (net worth $65 billion),&lt;sup id="fnref:4"&gt;&lt;a href="#fn:4" class="footnote-ref" role="doc-noteref"&gt;4&lt;/a&gt;&lt;/sup&gt; paid the 20% long-term capital gains rate on trading income ordinarily
taxed near 40%&amp;mdash;roughly $1 billion saved, disputed in
court.&lt;sup id="fnref:5"&gt;&lt;a href="#fn:5" class="footnote-ref" role="doc-noteref"&gt;5&lt;/a&gt;&lt;/sup&gt; Susquehanna also owns some 15% of ByteDance, TikTok&amp;rsquo;s
parent;&lt;sup id="fnref:6"&gt;&lt;a href="#fn:6" class="footnote-ref" role="doc-noteref"&gt;6&lt;/a&gt;&lt;/sup&gt; Yass put $100 million into the
2024 election cycle&amp;mdash;$16 million linked to anti-Muslim and pro-Israel
groups&lt;sup id="fnref:7"&gt;&lt;a href="#fn:7" class="footnote-ref" role="doc-noteref"&gt;7&lt;/a&gt;&lt;/sup&gt;&amp;mdash;and days after
Yass and Trump met in March 2024, Trump reversed his support for the TikTok ban;
the divest-or-ban law, passed anyway, went unenforced. $100 million
shielding a $21 billion stake: 210 to 1.&lt;/p&gt;
&lt;p&gt;None of this breaks the law, because the law taxes realization, not
accrual: never sell; borrow against the shares&amp;mdash;loan proceeds are
not income; die, and the cost basis resets, erasing the gain for the
heirs. Buy, borrow, die. Not a loophole but the design: taxing
those gains at death would raise $536 billion over a decade in the US
alone.&lt;sup id="fnref:8"&gt;&lt;a href="#fn:8" class="footnote-ref" role="doc-noteref"&gt;8&lt;/a&gt;&lt;/sup&gt; Tens of millions then pass untaxed to heirs whose
classmates&amp;rsquo; median family owns $192,900&amp;mdash;total.&lt;sup id="fnref:9"&gt;&lt;a href="#fn:9" class="footnote-ref" role="doc-noteref"&gt;9&lt;/a&gt;&lt;/sup&gt; That is not equality of
opportunity.&lt;/p&gt;
&lt;p&gt;Neither is it democracy. Globally, 1.6% of adults own 48.1% of all
wealth ($226 trillion); the poorest 1.55 billion share under 1%; the
2,891 billionaires alone hold $15.6 trillion&amp;mdash;and wealth buys the
legislature&amp;rsquo;s attention.&lt;sup id="fnref:10"&gt;&lt;a href="#fn:10" class="footnote-ref" role="doc-noteref"&gt;10&lt;/a&gt;&lt;/sup&gt; Laissez faire for them; laissez
mourir for the rest.&lt;/p&gt;
&lt;p&gt;Tax the rich. Tax net worth. Tax inheritance. Suckers!&lt;/p&gt;
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;Federal Reserve, &lt;a href="https://www.federalreserve.gov/releases/z1/dataviz/dfa/distribute/table/"&gt;Distributional Financial Accounts&lt;/a&gt;:
the bottom 50% of US households held $4.1 trillion&amp;mdash;2.5% of
household net worth&amp;mdash;at end-2024.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;U.S. Census Bureau,
&lt;a href="https://www.census.gov/library/publications/2025/demo/p60-286.html"&gt;&lt;em&gt;Income in the United States: 2024&lt;/em&gt;&lt;/a&gt;, report P60-286, September
2025.&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:3"&gt;
&lt;p&gt;ProPublica,
&lt;a href="https://www.propublica.org/article/the-secret-irs-files-trove-of-never-before-seen-records-reveal-how-the-wealthiest-avoid-income-tax"&gt;&lt;em&gt;The Secret IRS Files&lt;/em&gt;&lt;/a&gt;, June 8, 2021. The &amp;ldquo;true tax rate&amp;rdquo; compares
federal income tax paid with growth in net worth over the same
period.&amp;#160;&lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:4"&gt;
&lt;p&gt;&lt;a href="https://www.forbes.com/profile/jeff-yass/"&gt;Forbes profile&lt;/a&gt;,
December 2025.&amp;#160;&lt;a href="#fnref:4" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:5"&gt;
&lt;p&gt;ProPublica, &lt;a href="https://www.propublica.org/article/how-susquehanna-yass-avoided-billion-taxes"&gt;&lt;em&gt;How Susquehanna&amp;rsquo;s Jeff Yass Avoided $1 Billion
in Taxes&lt;/em&gt;&lt;/a&gt;, 2022. The Tax Court dispute was filed in 2020 and remains
pending.&amp;#160;&lt;a href="#fnref:5" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:6"&gt;
&lt;p&gt;&lt;a href="https://www.inquirer.com/politics/pennsylvania/tiktok-ban-jeff-yass-congress-house-20240313.html"&gt;&amp;ldquo;A ban on TikTok would be a blow to local billionaire
investor and GOP megadonor Jeff Yass,&amp;rdquo;&lt;/a&gt; &lt;em&gt;The Philadelphia Inquirer&lt;/em&gt;,
March 13, 2024; &lt;a href="https://abcnews.go.com/Politics/trumps-tiktok-ban-reversal-after-meeting-megadonor-stake/story?id=108013785"&gt;ABC News&lt;/a&gt;, March 2024.&amp;#160;&lt;a href="#fnref:6" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:7"&gt;
&lt;p&gt;&lt;a href="https://www.opensecrets.org/outside-spending/donor_detail/2024?id=U0000004245&amp;amp;name=Yass%2C+Jeffrey+S"&gt;OpenSecrets, donor detail, 2024 cycle&lt;/a&gt;; E. Clifton,
&lt;a href="https://www.theguardian.com/us-news/2024/apr/24/jeff-yass-anti-muslim-pro-israel-donations"&gt;&amp;ldquo;Billionaire Jeff Yass linked to $16m in donations to anti-Muslim and
pro-Israel groups,&amp;rdquo;&lt;/a&gt; &lt;em&gt;The Guardian&lt;/em&gt;, April 24, 2024.&amp;#160;&lt;a href="#fnref:7" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:8"&gt;
&lt;p&gt;Congressional Budget Office, Budget Option:
&lt;a href="https://www.cbo.gov/budget-options/60943"&gt;&lt;em&gt;Change the Tax Treatment of Capital Gains from Sales of Inherited Assets&lt;/em&gt;&lt;/a&gt;,
December 2024.&amp;#160;&lt;a href="#fnref:8" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:9"&gt;
&lt;p&gt;Federal Reserve Board,
&lt;a href="https://www.federalreserve.gov/publications/october-2023-changes-in-us-family-finances-from-2019-to-2022.htm"&gt;&lt;em&gt;Changes in U.S. Family Finances from 2019 to 2022&lt;/em&gt;&lt;/a&gt;, Survey of
Consumer Finances, October 2023.&amp;#160;&lt;a href="#fnref:9" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:10"&gt;
&lt;p&gt;UBS, &lt;a href="https://www.ubs.com/global/en/wealthmanagement/insights/global-wealth-report.html"&gt;&lt;em&gt;Global Wealth Report 2025&lt;/em&gt;&lt;/a&gt;,
June 2025; figures as of end-2024.&amp;#160;&lt;a href="#fnref:10" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item><item><title>One Hundred Thousand Views</title><link>https://david.alvarezrosa.com/posts/one-hundred-thousand-views/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://david.alvarezrosa.com/posts/one-hundred-thousand-views/</guid><description>&lt;p&gt;This site just passed one hundred thousand page views.&lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt; The number itself is vanity, but each count is a person who read
something I wrote, often to the end. So, before anything else: thank
you.&lt;/p&gt;
&lt;p&gt;Two things in the breakdown are worth passing on.&lt;/p&gt;
&lt;p&gt;Almost nobody arrives from Google.&lt;sup id="fnref:2"&gt;&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref"&gt;2&lt;/a&gt;&lt;/sup&gt; Nine in ten of
you came from Reddit or Hacker News: the front page, not the search
index. So the traffic spikes and never accrues&amp;mdash;a post lands, it&amp;rsquo;s read
widely for a day, and between spikes the line lies flat.&lt;sup id="fnref:3"&gt;&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref"&gt;3&lt;/a&gt;&lt;/sup&gt; I don&amp;rsquo;t own this audience so much as rent it, one post
at a time.&lt;/p&gt;
&lt;p&gt;And the posts that travel are not the posts you read.&lt;sup id="fnref:4"&gt;&lt;a href="#fn:4" class="footnote-ref" role="doc-noteref"&gt;4&lt;/a&gt;&lt;/sup&gt; The self-hosting posts win the lottery and are read
exactly once, the mark of a drive-by from an aggregator. The deep
ones&amp;mdash;a lock-free ring buffer, the fundamental theorem of calculus,
devirtualization&amp;mdash;earn seven to nine views per reader and keep earning them
long after the spike has died.&lt;sup id="fnref:5"&gt;&lt;a href="#fn:5" class="footnote-ref" role="doc-noteref"&gt;5&lt;/a&gt;&lt;/sup&gt; You come
back to the hard posts. Virality evaporates; depth compounds.&lt;/p&gt;
&lt;p&gt;So, again: thank you for reading. If you&amp;rsquo;d like the next post to find
you rather than leaving it to the front page, &lt;a href="https://david.alvarezrosa.com/#subscribe"&gt;subscribe to the newsletter&lt;/a&gt;
or follow along by &lt;a href="https://david.alvarezrosa.com/index.xml"&gt;RSS&lt;/a&gt;.&lt;sup id="fnref:6"&gt;&lt;a href="#fn:6" class="footnote-ref" role="doc-noteref"&gt;6&lt;/a&gt;&lt;/sup&gt; Either way, you become the kind of reader
a spike can&amp;rsquo;t take back.&lt;/p&gt;
&lt;br /&gt;
&lt;p&gt;On to the next hundred thousand.&lt;/p&gt;
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;Counted by my
self-hosted &lt;a href="https://umami.is/"&gt;Umami&lt;/a&gt; instance, which doesn&amp;rsquo;t see anyone who blocks
it&amp;mdash;so the real total is higher. The average visit lasts five and a
half minutes, and two thirds of readers leave from the page they landed
on.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;Every search engine
combined&amp;mdash;Google, Bing, DuckDuckGo, Kagi&amp;mdash;accounts for under three
percent of arrivals. Google by itself barely registers.&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:3"&gt;
&lt;p&gt;Hacker News
sends about a third as many readers as Reddit, but each one reads five times
as much: eleven views per reader against two. Reddit drove roughly eight
thousand views, Hacker News fourteen and a half thousand. The most
devoted source of all is &lt;code&gt;isocpp.org&lt;/code&gt;, whose readers averaged more than
four visits each.&amp;#160;&lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:4"&gt;
&lt;p&gt;Views per
reader tell the story: the lock-free ring buffer earns 8.7, the
fundamental theorem of calculus 7.5, devirtualization 7.2, type erasure
3.3&amp;mdash;against 1.6 and 1.4 for the two self-hosting posts. The
ring-buffer post has the most views on the whole site despite a fraction
of the self-hosting posts&amp;rsquo; reach.&amp;#160;&lt;a href="#fnref:4" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:5"&gt;
&lt;p&gt;The referrer log is full of small
delights: the C++ community carried the systems posts on its own
(&lt;code&gt;isocpp.org&lt;/code&gt;, libhunt, Meeting C++); a whole ecosystem of Hacker News
clients and RSS readers I&amp;rsquo;d never heard of; Chinese aggregators like
Zhihu and Weibo; and, for the first time, AI assistants citing
pages&amp;mdash;Claude, ChatGPT, Gemini, Copilot. My favourites are the entries
that shouldn&amp;rsquo;t exist at all: hits from a phone&amp;rsquo;s offline cache and from a
local &lt;code&gt;/Users/.../index.html&lt;/code&gt; path&amp;mdash;someone who kept a copy.&amp;#160;&lt;a href="#fnref:5" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:6"&gt;
&lt;p&gt;Both are free and there is no catch. The
newsletter is plain text&amp;mdash;no tracking, no HTML, just the post&amp;mdash;and
unsubscribing takes one click.&amp;#160;&lt;a href="#fnref:6" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item><item><title>Optimizing a Spin-Lock</title><link>https://david.alvarezrosa.com/posts/optimizing-a-spin-lock/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://david.alvarezrosa.com/posts/optimizing-a-spin-lock/</guid><description>&lt;p&gt;A spin-lock is a mutex that never sleeps: instead of yielding to the
scheduler when the lock is taken, the thread stays on the CPU and keeps
retrying&amp;mdash;&lt;em&gt;spinning&lt;/em&gt;&amp;mdash;until it succeeds, avoiding syscalls and context
switches for critical sections of a few nanoseconds. In this post we&amp;rsquo;ll
write the basic version, see why it is slow, and fix it step by step
until it beats &lt;code&gt;std::mutex&lt;/code&gt; by 3.4x under contention.&lt;/p&gt;
&lt;h2 id="the-benchmark"&gt;
The benchmark
&lt;a class="anchor" href="#the-benchmark"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Each thread increments a shared counter under the lock&amp;mdash;250k
increments in total, split evenly across threads. The total work is
&lt;em&gt;fixed:&lt;/em&gt; with a perfect lock the time stays flat as threads are added,
and any growth is pure synchronization overhead. Threads are pinned to
their own cores&lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;typename&lt;/span&gt; &lt;span class="n"&gt;SpinLock&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;BM_SpinLock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;benchmark&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;State&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;num_threads&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;spin_lock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SpinLock&lt;/span&gt;&lt;span class="p"&gt;{};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;uint64_t&lt;/span&gt;&lt;span class="p"&gt;{};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;threads&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kr"&gt;thread&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reserve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_threads&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nl"&gt;_&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0U&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;num_threads&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;emplace_back&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;pinThread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0U&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;250&amp;#39;000U&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;num_threads&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;spin_lock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;spin_lock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unlock&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="kr"&gt;thread&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kr"&gt;thread&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;benchmark&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;DoNotOptimize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clear&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Every version below exposes &lt;code&gt;lock&lt;/code&gt; and &lt;code&gt;unlock&lt;/code&gt;. The baseline,
&lt;code&gt;SpinLockV0&lt;/code&gt;, simply wraps &lt;code&gt;std::mutex&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;BM_SpinLock&amp;lt;SpinLockV0&amp;gt;/1/real_time 1.62 ms
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;BM_SpinLock&amp;lt;SpinLockV0&amp;gt;/2/real_time 5.48 ms
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;BM_SpinLock&amp;lt;SpinLockV0&amp;gt;/4/real_time 6.13 ms
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Going from one thread to two more than triples the time, but from two
to four it barely moves: a contended &lt;code&gt;std::mutex&lt;/code&gt; parks waiters in the
kernel and wakes them one at a time, so the damage does not compound.
The number to beat: &lt;strong&gt;6.13 ms&lt;/strong&gt; at four threads.&lt;/p&gt;
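&lt;p&gt;&lt;code&gt;SpinLockV0&lt;/code&gt; is not listed here. As a rough sketch, assuming it does nothing more than forward to a &lt;code&gt;std::mutex&lt;/code&gt; member, it could look like the following; the member name &lt;code&gt;mutex_&lt;/code&gt; and the helper &lt;code&gt;incrementLocked&lt;/code&gt; are hypothetical. Since every version exposes &lt;code&gt;lock&lt;/code&gt; and &lt;code&gt;unlock&lt;/code&gt;, each one should also work with &lt;code&gt;std::lock_guard&lt;/code&gt;, which the helper uses to release the lock at scope exit.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;#include &amp;lt;cstdint&amp;gt;
#include &amp;lt;mutex&amp;gt;

// Hypothetical baseline: the same interface as the spin-locks, backed by std::mutex.
class SpinLockV0 {
  std::mutex mutex_;  // assumed member name, not taken from the post

public:
  auto lock() -&amp;gt; void { mutex_.lock(); }
  auto unlock() -&amp;gt; void { mutex_.unlock(); }
};

// Hypothetical helper: lock()/unlock() make a type BasicLockable, which is all
// std::lock_guard needs to acquire on construction and release on destruction.
template &amp;lt;typename Lock&amp;gt;
auto incrementLocked(Lock&amp;amp; lock, std::uint64_t&amp;amp; val) -&amp;gt; void {
  std::lock_guard&amp;lt;Lock&amp;gt; guard{lock};
  ++val;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;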
&lt;h2 id="a-basic-spin-lock"&gt;
A basic spin-lock
&lt;a class="anchor" href="#a-basic-spin-lock"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;The simplest correct spin-lock is an atomic bool and an exchange
loop&lt;sup id="fnref:2"&gt;&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SpinLockV1&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;atomic_bool&lt;/span&gt; &lt;span class="n"&gt;locked_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;locked_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exchange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;unlock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;locked_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;BM_SpinLock&amp;lt;SpinLockV1&amp;gt;/1/real_time 1.20 ms
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;BM_SpinLock&amp;lt;SpinLockV1&amp;gt;/2/real_time 10.3 ms
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;BM_SpinLock&amp;lt;SpinLockV1&amp;gt;/4/real_time 14.5 ms
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Uncontended, the spin-lock already wins: &lt;strong&gt;1.20 ms&lt;/strong&gt; against the mutex&amp;rsquo;s
1.62 ms&amp;mdash;locking is now a single atomic instruction with no library
machinery around it. Under contention it is a disaster: 1.9x slower
than the mutex at two threads, 2.4x at four&amp;mdash;while burning CPU for the
entire wait.&lt;/p&gt;
&lt;p&gt;The problem is cache coherence. To write to a cache line, a core must
first own it exclusively, invalidating every other core&amp;rsquo;s copy. An
atomic exchange is a write &lt;em&gt;even when it fails&lt;/em&gt; and merely swaps &lt;code&gt;true&lt;/code&gt;
over &lt;code&gt;true&lt;/code&gt;. So every waiting thread constantly steals the line away
from everyone else&amp;mdash;including from the lock holder, who needs that same
line back just to &lt;em&gt;release&lt;/em&gt; the lock. The line ping-pongs between
cores, and the one useful increment drowns in coherence traffic.&lt;/p&gt;
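&lt;p&gt;To make the &lt;em&gt;even when it fails&lt;/em&gt; point concrete, here is a small sketch of a single acquisition attempt; the helper name &lt;code&gt;tryAcquire&lt;/code&gt; is hypothetical. &lt;code&gt;exchange(true)&lt;/code&gt; always performs the store and returns the previous value, so the attempt succeeded only if that value was &lt;code&gt;false&lt;/code&gt;. A failed attempt leaves the flag unchanged, yet it is still a write as far as the cache is concerned.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;#include &amp;lt;atomic&amp;gt;

// Hypothetical helper, equivalent to one iteration of the exchange loop.
// exchange(true) stores unconditionally and returns the old value:
//   old == false: the lock was free and is now ours
//   old == true:  someone else holds it; we only overwrote true with true
inline auto tryAcquire(std::atomic_bool&amp;amp; locked) noexcept -&amp;gt; bool {
  return !locked.exchange(true);
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;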
&lt;p&gt;&lt;code&gt;perf stat -d&lt;/code&gt; counts hardware events; compare one thread against four&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ perf stat -d ./benchmark --benchmark_filter&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;V1&amp;gt;/1&amp;#39;&lt;/span&gt; --benchmark_min_time&lt;span class="o"&gt;=&lt;/span&gt;500x
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; 128,088,290 branches
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; 1,896,594 branch-misses &lt;span class="c1"&gt;# 1.48% of all branches&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; 645,478,415 L1-dcache-loads
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; 1,687,482 L1-dcache-load-misses &lt;span class="c1"&gt;# 0.26% of all accesses&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ perf stat -d ./benchmark --benchmark_filter&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;V1&amp;gt;/4&amp;#39;&lt;/span&gt; --benchmark_min_time&lt;span class="o"&gt;=&lt;/span&gt;500x
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; 674,575,244 branches
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; 105,479,892 branch-misses &lt;span class="c1"&gt;# 15.64% of all branches&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; 3,262,419,243 L1-dcache-loads
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; 293,873,584 L1-dcache-load-misses &lt;span class="c1"&gt;# 9.05% of all accesses&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;A 35x jump in cache miss rate on the same nine bytes of data, and one
branch in six mispredicted: whether the exchange succeeds is decided by
the other cores, and the predictor cannot learn it. &lt;code&gt;perf c2c&lt;/code&gt;&amp;mdash;perf&amp;rsquo;s tool for cache-line contention&amp;mdash;ranks cache lines by how
often a core found them dirty in &lt;em&gt;another&lt;/em&gt; core&amp;rsquo;s cache (a &lt;em&gt;HITM&lt;/em&gt; event)&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ perf c2c record -a -- ./benchmark --benchmark_filter&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;V1&amp;gt;/4&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ perf c2c report --stdio
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="o"&gt;=================================================&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; Shared Data Cache Line &lt;span class="nv"&gt;Table&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="o"&gt;=================================================&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Index Address Hitm
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="m"&gt;0&lt;/span&gt; 0x7fffffffb440 91.49%
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="m"&gt;1&lt;/span&gt; 0xffff8c025e923700 2.13%
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;A single cache line&amp;mdash;the one holding the lock and the
counter&amp;mdash;accounts for 91% of all HITM events. Spinning is not free in
watts either; the CPU&amp;rsquo;s own energy meter&lt;sup id="fnref:3"&gt;&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref"&gt;3&lt;/a&gt;&lt;/sup&gt; puts a number on the wait&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ perf stat -a -e power/energy-pkg/ ./benchmark --benchmark_filter&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;V1&amp;gt;/4&amp;#39;&lt;/span&gt; --benchmark_min_time&lt;span class="o"&gt;=&lt;/span&gt;200x
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; 24.43 Joules power/energy-pkg/
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The mutex does the same 200 iterations for &lt;strong&gt;14.9 J&lt;/strong&gt;.&lt;/p&gt;
&lt;h2 id="active-backoff"&gt;
Active backoff
&lt;a class="anchor" href="#active-backoff"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;The fix is to stop writing while waiting: attempt the exchange once and,
if it fails, spin on a plain load. Read-only copies of the line can
live in every core&amp;rsquo;s L1, so waiting generates no coherence traffic until the line is written again. But
when the holder releases, every waiter sees the lock free at the same
instant, and the whole herd stampedes for the exchange&amp;mdash;one wins, the
rest pay the coherence storm anyway. To thin the herd, space out the
reads with a small delay loop&lt;sup id="fnref:4"&gt;&lt;a href="#fn:4" class="footnote-ref" role="doc-noteref"&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SpinLockV2&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;atomic_bool&lt;/span&gt; &lt;span class="n"&gt;locked_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;locked_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exchange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;locked_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;unlock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;locked_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;At two threads the time drops to &lt;strong&gt;4.82 ms&lt;/strong&gt;&amp;mdash;2.1x faster than the basic
version and already ahead of the mutex. At four threads, &lt;strong&gt;12.0 ms&lt;/strong&gt;,
most of the gain evaporates: the herd is bigger, and a fixed delay no
longer keeps threads apart. The counters expose a hidden cost, too:
the cache misses barely improve (&lt;strong&gt;8.6%&lt;/strong&gt; against 9.1%), and the energy
gets worse, &lt;strong&gt;29.5 J&lt;/strong&gt; against 24.4 J&amp;mdash;the delay loop is busy-work
executed at full speed.&lt;/p&gt;
&lt;h2 id="passive-backoff"&gt;
Passive backoff
&lt;a class="anchor" href="#passive-backoff"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;The delay loop burns the whole wait executing useless increments. x86
has an instruction for exactly this: &lt;code&gt;pause&lt;/code&gt;, exposed as the
&lt;code&gt;_mm_pause()&lt;/code&gt; intrinsic&lt;sup id="fnref:5"&gt;&lt;a href="#fn:5" class="footnote-ref" role="doc-noteref"&gt;5&lt;/a&gt;&lt;/sup&gt;, which inserts a short delay
while signalling to the core that it is in a spin-wait loop&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SpinLockV3&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;atomic_bool&lt;/span&gt; &lt;span class="n"&gt;locked_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;locked_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exchange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;_mm_pause&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;locked_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;unlock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;locked_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The four &lt;code&gt;pause&lt;/code&gt; calls roughly match the delay of the 150-iteration
loop, and the timings land in the same place: &lt;strong&gt;5.02 ms&lt;/strong&gt; at two threads
and &lt;strong&gt;10.9 ms&lt;/strong&gt; at four. The difference is in what the timings don&amp;rsquo;t
show: &lt;code&gt;perf&lt;/code&gt; counts &lt;strong&gt;24x fewer instructions&lt;/strong&gt; for the same work&amp;mdash;the
core now spends the wait deliberately doing nothing&amp;mdash;and the energy
drops from active backoff&amp;rsquo;s 29.5 J to &lt;strong&gt;17.9 J&lt;/strong&gt;.&lt;/p&gt;
&lt;h2 id="exponential-backoff"&gt;
Exponential backoff
&lt;a class="anchor" href="#exponential-backoff"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Pausing made the wait cheap, but the timings barely moved&amp;mdash;and with a
&lt;em&gt;constant&lt;/em&gt; delay they cannot: all waiters poll at the same rate, so
every release still wakes the whole herd, and the collisions remain.
For the waiters to arrive at different times, their delays must differ:
let each thread double its delay every time it finds the lock still
taken, up to a cap&lt;sup id="fnref:6"&gt;&lt;a href="#fn:6" class="footnote-ref" role="doc-noteref"&gt;6&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cpp" data-lang="cpp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SpinLockV4&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;atomic_bool&lt;/span&gt; &lt;span class="n"&gt;locked_&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;backoff_iters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;locked_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exchange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;backoff_iters&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;_mm_pause&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;backoff_iters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;backoff_iters&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;locked_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;unlock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;noexcept&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;locked_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The timings collapse: &lt;strong&gt;1.62 ms&lt;/strong&gt; at two threads and &lt;strong&gt;1.80 ms&lt;/strong&gt; at
four&amp;mdash;within 1.5x of the single-threaded time, approaching the flat
line of a perfect lock. That is 3.4x faster than &lt;code&gt;std::mutex&lt;/code&gt; and 8x
faster than the basic spin-lock.&lt;/p&gt;
&lt;p&gt;The counters explain the win: at four threads the cache miss rate falls
to &lt;strong&gt;1.6%&lt;/strong&gt;, branch mispredictions to &lt;strong&gt;2.9%&lt;/strong&gt;, and the energy to &lt;strong&gt;3.8 J&lt;/strong&gt;, a
quarter of the mutex&amp;rsquo;s 14.9 J&amp;mdash;all near their single-threaded levels.
That is the signature of threads running one at a time: backoff
approximately &lt;em&gt;serializes&lt;/em&gt; them. Waiters sit in pause loops of up to
1024 iterations, so the releasing thread usually re-acquires the lock
immediately&amp;mdash;lock and counter still warm in its L1&amp;mdash;and races through
its share of increments while everyone else stays out of the way.
Serialization is optimal here, since the increments cannot run in
parallel anyway.&lt;sup id="fnref:7"&gt;&lt;a href="#fn:7" class="footnote-ref" role="doc-noteref"&gt;7&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
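&lt;p&gt;To make the doubling rule concrete, here is a minimal shell sketch that prints the delay schedule a single waiter would follow, assuming the same bounds as &lt;code&gt;SpinLockV4&lt;/code&gt; (start at 4 pauses, cap at 1024). The function name &lt;code&gt;backoff_schedule&lt;/code&gt; is hypothetical, and the script only does the arithmetic: it does not model contention or timing.&lt;/p&gt;

```sh
#!/bin/sh
# Illustration only: reproduce the arithmetic of the doubling rule,
# backoff = min(backoff * 2, cap), starting from 4 pauses.
backoff_schedule() {
  waits=$1      # number of consecutive failed checks to simulate
  backoff=4     # initial pause count (assumed, as in SpinLockV4)
  cap=1024      # upper bound (assumed, as in SpinLockV4)
  total=0
  i=1
  while [ "$i" -le "$waits" ]; do
    total=$(( total + backoff ))
    echo "wait $i: $backoff pauses (cumulative $total)"
    backoff=$(( backoff * 2 ))
    if [ "$backoff" -gt "$cap" ]; then
      backoff=$cap
    fi
    i=$(( i + 1 ))
  done
}

backoff_schedule 10
```

&lt;p&gt;Under these assumptions the cap is reached on the ninth wait, after 1020 pauses in total, so a waiter that keeps losing the race settles into 1024-pause waits quickly.&lt;/p&gt;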
&lt;h2 id="summary"&gt;
Summary
&lt;a class="anchor" href="#summary"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;If you want to reproduce these results, the &lt;a href="https://github.com/david-alvarez-rosa/CppPlayground/blob/main/dsa/spin_lock.cpp"&gt;benchmark&lt;/a&gt; lives in my
&lt;a href="https://github.com/david-alvarez-rosa/CppPlayground"&gt;CppPlayground&lt;/a&gt; repository. Each cell reads time / L1d cache miss rate,
with the best time and the lowest miss rate in each column in bold.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;1 thread&lt;/th&gt;
&lt;th&gt;2 threads&lt;/th&gt;
&lt;th&gt;4 threads&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.62 ms / &lt;strong&gt;0.03%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5.48 ms / 1.03%&lt;/td&gt;
&lt;td&gt;6.13 ms / 1.60%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.20 ms&lt;/strong&gt; / 0.26%&lt;/td&gt;
&lt;td&gt;10.3 ms / 3.78%&lt;/td&gt;
&lt;td&gt;14.5 ms / 9.05%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1.25 ms / 0.25%&lt;/td&gt;
&lt;td&gt;4.82 ms / 3.08%&lt;/td&gt;
&lt;td&gt;12.0 ms / 8.61%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;1.25 ms / 0.27%&lt;/td&gt;
&lt;td&gt;5.02 ms / 2.60%&lt;/td&gt;
&lt;td&gt;10.9 ms / 6.14%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;1.25 ms / 0.28%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.62 ms&lt;/strong&gt; / &lt;strong&gt;0.67%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.80 ms&lt;/strong&gt; / &lt;strong&gt;1.59%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The journey is the interesting part: the basic spin-lock lost to
&lt;code&gt;std::mutex&lt;/code&gt; by 2.4x, and three small fixes&amp;mdash;each derived from a
measurement&amp;mdash;turned that into a 3.4x win. In real code,
&lt;code&gt;std::mutex&lt;/code&gt; remains the right default; reach for a spin-lock when the
critical section is tiny, the threads have dedicated cores, and you have
measured the difference.&lt;/p&gt;
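&lt;p&gt;The ratios quoted above can be rechecked from the four-thread column of the table. The sketch below assumes that version 0 is &lt;code&gt;std::mutex&lt;/code&gt;, version 1 the basic spin-lock, and version 4 the exponential-backoff lock, which is the reading consistent with the quoted numbers. The helper name &lt;code&gt;ratio&lt;/code&gt; is hypothetical.&lt;/p&gt;

```sh
#!/bin/sh
# Illustration only: recompute the speedups quoted in the text from the
# 4-thread column of the table. The mapping of version numbers to locks
# (0 = std::mutex, 1 = basic spin-lock, 4 = exponential backoff) is an
# assumption inferred from the quoted ratios.
ratio() {
  # Print slower/faster, rounded to one decimal place.
  awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

ratio 14.5 6.13   # basic spin-lock vs std::mutex
ratio 6.13 1.80   # std::mutex vs exponential backoff
ratio 14.5 1.80   # basic spin-lock vs exponential backoff
```

&lt;p&gt;The three lines correspond, in order, to the 2.4x loss, the 3.4x win, and the roughly 8x gap over the basic spin-lock.&lt;/p&gt;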
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;With &lt;code&gt;pthread_setaffinity_np&lt;/code&gt;, on a machine tuned
for benchmarking (AMD Ryzen 7 PRO 8700GE, 8 cores at 3.65 GHz):
performance governor, hyperthreading and turbo boost disabled.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;&lt;code&gt;exchange&lt;/code&gt; atomically writes &lt;code&gt;true&lt;/code&gt; and returns the previous
value: &lt;code&gt;false&lt;/code&gt; means the lock was free and is now ours; &lt;code&gt;true&lt;/code&gt; means
someone else holds it, and we retry.&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:3"&gt;
&lt;p&gt;The RAPL counters, which
perf exposes as &lt;code&gt;power/energy-pkg/&lt;/code&gt;; reading them requires system-wide
mode (&lt;code&gt;-a&lt;/code&gt;) and root.&amp;#160;&lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:4"&gt;
&lt;p&gt;&lt;code&gt;volatile&lt;/code&gt; keeps the compiler from
deleting the empty loop. The 150 iterations are a tunable parameter,
experimentally determined.&amp;#160;&lt;a href="#fnref:4" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:5"&gt;
&lt;p&gt;From &lt;code&gt;&amp;lt;immintrin.h&amp;gt;&lt;/code&gt;; ARM&amp;rsquo;s closest
equivalent is the &lt;code&gt;yield&lt;/code&gt; instruction.&amp;#160;&lt;a href="#fnref:5" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:6"&gt;
&lt;p&gt;Without the cap, threads would end up pausing
long after the lock has become free. Both bounds, 4 and 1024, are
tunable.&amp;#160;&lt;a href="#fnref:6" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:7"&gt;
&lt;p&gt;It is also maximally &lt;em&gt;unfair:&lt;/em&gt; nothing stops one
thread from re-acquiring the lock indefinitely while another starves in
its backoff loop. Real implementations bound the unfairness, or enforce
FIFO order outright with a ticket lock&amp;mdash;at a cost in throughput.&amp;#160;&lt;a href="#fnref:7" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item><item><title>Reflecting on One Hundred Thousand Reads</title><link>https://david.alvarezrosa.com/posts/reflecting-on-one-hundred-thousand-reads/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://david.alvarezrosa.com/posts/reflecting-on-one-hundred-thousand-reads/</guid><description>&lt;p&gt;This site just passed one hundred thousand reads. It began as a
notebook kept in public, read by almost no one, and the idea hasn&amp;rsquo;t
changed: one well-made post a month, not quick writeups&amp;mdash;only deep
dives into things that genuinely interest me. So to everyone who has
read one: thank you; that is what matters, the rest is just numbers.&lt;/p&gt;
&lt;p&gt;The numbers are uneven. Visitors tend to stay&amp;mdash;five and a half minutes
on average&amp;mdash;but where they come from is the surprise: Reddit and Hacker
News send nearly 90% of readers, all search engines combined under 5%.&lt;/p&gt;
&lt;figure class="sources"&gt;
&lt;figcaption&gt;&lt;p&gt;&lt;strong&gt;Two sites, most of it.&lt;/strong&gt; Reddit and Hacker News send nearly nine in ten readers; every search engine put together sends under one in twenty.&lt;/p&gt;&lt;/figcaption&gt;
&lt;div class="chart"&gt;
&lt;div class="plot" role="img" aria-label="Bar chart of traffic sources. Reddit 69 percent, Hacker News 20 percent, search engines under 5 percent, and others 6 percent."&gt;
&lt;div class="row"&gt;&lt;span class="name"&gt;reddit&lt;/span&gt;&lt;span class="track"&gt;&lt;span class="bar" style="width:69%"&gt;&lt;/span&gt;&lt;span class="val" data-egg&gt;69%&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="row"&gt;&lt;span class="name"&gt;hacker news&lt;/span&gt;&lt;span class="track"&gt;&lt;span class="bar" style="width:20%"&gt;&lt;/span&gt;&lt;span class="val"&gt;20%&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="row"&gt;&lt;span class="name"&gt;search engines&lt;/span&gt;&lt;span class="track"&gt;&lt;span class="bar" style="width:4%"&gt;&lt;/span&gt;&lt;span class="val"&gt;&amp;lt;5%&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="row"&gt;&lt;span class="name"&gt;others&lt;/span&gt;&lt;span class="track"&gt;&lt;span class="bar" style="width:6%"&gt;&lt;/span&gt;&lt;span class="val"&gt;6%&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/figure&gt;
&lt;style&gt;
figure.sources { color: var(--text); margin: 0 0 1.4rem; }
figure.sources .chart {
border: 2px solid var(--text); padding: 0.5rem 0.8rem; box-sizing: border-box;
}
figure.sources .plot {
max-width: 34rem; margin: 0 auto;
display: grid; grid-template-columns: max-content 1fr; align-items: stretch;
}
figure.sources .row { display: contents; }
figure.sources .name {
font-family: var(--font-sc); font-size: 1.25rem; white-space: nowrap;
display: flex; align-items: center; justify-content: flex-end;
padding-right: 0.8rem;
}
figure.sources .track {
display: flex; align-items: center;
border-left: 2px solid var(--text); padding: 0.5rem 0;
}
figure.sources .bar {
flex: none; height: 1.4rem; box-sizing: border-box;
border: 1.5px solid var(--text);
background: repeating-linear-gradient(45deg, var(--text) 0, var(--text) 1.6px, transparent 1.6px, transparent 7px);
}
figure.sources .val {
font-family: var(--font-body); font-size: 1.25rem;
margin-left: 0.6rem; white-space: nowrap;
}
figure.sources .val[data-egg] { position: relative; }
figure.sources .val[data-egg]::after {
content: "if you know, you know";
position: absolute; top: 100%; right: 0; margin-top: 0.1rem;
white-space: nowrap; font-style: italic; font-size: 1rem; color: var(--accent);
opacity: 0; transition: opacity 0.25s ease; pointer-events: none;
}
figure.sources .val[data-egg]:hover::after { opacity: 1; }
@media (min-width: 861px) {
figure.sources figcaption { padding-top: 0; margin-top: -0.3rem; }
}
&lt;/style&gt;
&lt;p&gt;And that is fragile. This traffic spikes, then fades&amp;mdash;a post hits a
front page, pulls a few thousand reads in a day, then goes quiet; the
three most-read owe their 40,000-odd reads to a few such days, not
steady interest.&lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt; Search is the
opposite: tiny now, but it grows over time, so the deeper posts are
being set up to show up there&amp;mdash;and each new one is syndicated more
widely.&lt;/p&gt;
&lt;p&gt;Thank you, again, for reading.&lt;/p&gt;
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;&lt;a href="https://david.alvarezrosa.com/posts/optimizing-a-lock-free-ring-buffer/"&gt;Optimizing a Lock-Free Ring Buffer&lt;/a&gt; leads with
16,635 reads, followed by the &lt;a href="https://david.alvarezrosa.com/posts/fundamental-theorem-of-calculus/"&gt;Fundamental Theorem of Calculus&lt;/a&gt; (12,506)
and &lt;a href="https://david.alvarezrosa.com/posts/devirtualization-and-static-polymorphism/"&gt;Devirtualization and Static Polymorphism&lt;/a&gt; (11,848).&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item><item><title>Self-Hosting an Email Server</title><link>https://david.alvarezrosa.com/posts/self-hosting-an-email-server/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://david.alvarezrosa.com/posts/self-hosting-an-email-server/</guid><description>&lt;p&gt;Self-hosting email gives you full control over your communications&amp;mdash;no
ads, no scanning, no one can lock you out. It&amp;rsquo;s easier than most people
think, and this guide covers everything I do when setting up a new mail
server.&lt;/p&gt;
&lt;p&gt;You&amp;rsquo;ll need a server with a clean Linux install&lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt; and a domain name pointing to your server&amp;rsquo;s IP.&lt;/p&gt;
&lt;h2 id="dns-records"&gt;
DNS records
&lt;a class="anchor" href="#dns-records"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Create DNS records for your mail server: A and AAAA records for
&lt;code&gt;mail.alvarezrosa.com&lt;/code&gt;, plus an MX record pointing to it.&lt;sup id="fnref:2"&gt;&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref"&gt;2&lt;/a&gt;&lt;/sup&gt; Verify propagation&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ dig mail.alvarezrosa.com A +short
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;213.32.19.229
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ dig mail.alvarezrosa.com AAAA +short
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;2001:41d0:305:2100::febc
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ dig alvarezrosa.com MX +short
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="m"&gt;10&lt;/span&gt; mail.alvarezrosa.com.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Update your server&amp;rsquo;s hostname to match the mail FQDN. Edit &lt;code&gt;/etc/hosts&lt;/code&gt;
so that &lt;code&gt;hostname -f&lt;/code&gt; returns the fully qualified domain name&lt;sup id="fnref:3"&gt;&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;127.0.1.1 mail.alvarezrosa.com homelab
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="receiving-mail"&gt;
Receiving mail
&lt;a class="anchor" href="#receiving-mail"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Install Postfix and open port 25. During installation, select &amp;ldquo;Internet
Site&amp;rdquo; and enter &lt;code&gt;mail.alvarezrosa.com&lt;/code&gt; as the system mail name.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo apt install postfix
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo ufw allow 25/tcp
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Configure Postfix to accept mail for your domain. Edit
&lt;code&gt;/etc/postfix/main.cf&lt;/code&gt; and add your domain to &lt;code&gt;mydestination&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cfg" data-lang="cfg"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;mydestination&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;$myhostname, mail.alvarezrosa.com, localhost.alvarezrosa.com, localhost, alvarezrosa.com&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Restart Postfix and verify it&amp;rsquo;s listening&lt;sup id="fnref:4"&gt;&lt;a href="#fn:4" class="footnote-ref" role="doc-noteref"&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo systemctl restart postfix
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo ss -tlnp &lt;span class="p"&gt;|&lt;/span&gt; grep :25
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;LISTEN &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt; 0.0.0.0:25 0.0.0.0:* users:&lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;master&amp;#34;&lt;/span&gt;,pid&lt;span class="o"&gt;=&lt;/span&gt;13124,fd&lt;span class="o"&gt;=&lt;/span&gt;13&lt;span class="o"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;LISTEN &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;::&lt;span class="o"&gt;]&lt;/span&gt;:25 &lt;span class="o"&gt;[&lt;/span&gt;::&lt;span class="o"&gt;]&lt;/span&gt;:* users:&lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;master&amp;#34;&lt;/span&gt;,pid&lt;span class="o"&gt;=&lt;/span&gt;13124,fd&lt;span class="o"&gt;=&lt;/span&gt;14&lt;span class="o"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Send a test email to &lt;code&gt;david@alvarezrosa.com&lt;/code&gt; from Gmail, then check it
arrived&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo apt install mailutils
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ mail
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt;&amp;#34;/var/mail/david&amp;#34;&lt;/span&gt;: &lt;span class="m"&gt;1&lt;/span&gt; message &lt;span class="m"&gt;1&lt;/span&gt; new
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&amp;gt;N &lt;span class="m"&gt;1&lt;/span&gt; David Álvarez Ros Wed Feb &lt;span class="m"&gt;4&lt;/span&gt; 19:39 80/4544 Hello from GMail
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Your server can now receive mail from anywhere in the world.&lt;/p&gt;
&lt;h2 id="sending-mail"&gt;
Sending mail
&lt;a class="anchor" href="#sending-mail"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Configure Postfix to use your domain in outgoing messages. Edit
&lt;code&gt;/etc/postfix/main.cf&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cfg" data-lang="cfg"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;myhostname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;mail.alvarezrosa.com&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;mydomain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;alvarezrosa.com&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;myorigin&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;$mydomain&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Restart and send a test message&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo systemctl restart postfix
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;Test from my mail server&amp;#34;&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; mail -s &lt;span class="s2"&gt;&amp;#34;Hello&amp;#34;&lt;/span&gt; recipient@example.com
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If you send to a major provider
like Gmail before setting up authentication, your message will likely
land in spam or be silently dropped. That&amp;rsquo;s expected&amp;mdash;we&amp;rsquo;ll fix it in
the authentication section.&lt;/p&gt;
&lt;h2 id="client-access"&gt;
Client access
&lt;a class="anchor" href="#client-access"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;At this point you can send and receive mail, but only from the server&amp;rsquo;s
command line. To use a real email client, you need IMAP for reading and
authenticated SMTP for sending.&lt;/p&gt;
&lt;p&gt;Install Dovecot to expose mailboxes via IMAP.&lt;sup id="fnref:5"&gt;&lt;a href="#fn:5" class="footnote-ref" role="doc-noteref"&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo apt install dovecot-core dovecot-imapd
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Obtain a TLS certificate for the mail subdomain&lt;sup id="fnref:6"&gt;&lt;a href="#fn:6" class="footnote-ref" role="doc-noteref"&gt;6&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo certbot certonly -d mail.alvarezrosa.com
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Configure Dovecot TLS. Edit &lt;code&gt;/etc/dovecot/conf.d/10-ssl.conf&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cfg" data-lang="cfg"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;ssl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;required&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;ssl_server_cert_file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;/etc/letsencrypt/live/mail.alvarezrosa.com/fullchain.pem&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;ssl_server_key_file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;/etc/letsencrypt/live/mail.alvarezrosa.com/privkey.pem&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Configure Postfix TLS. Add to &lt;code&gt;/etc/postfix/main.cf&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cfg" data-lang="cfg"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;smtpd_tls_key_file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;/etc/letsencrypt/live/mail.alvarezrosa.com/privkey.pem&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;smtpd_tls_cert_file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;/etc/letsencrypt/live/mail.alvarezrosa.com/fullchain.pem&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;smtpd_tls_security_level&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;encrypt&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Enable authenticated SMTP submission. Edit &lt;code&gt;/etc/postfix/master.cf&lt;/code&gt; and
uncomment the submissions section&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cfg" data-lang="cfg"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;submissions inet n - y - - smtpd&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;-o syslog_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;postfix/submissions
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s"&gt; -o smtpd_tls_wrappermode=yes
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s"&gt; -o smtpd_sasl_auth_enable=yes
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s"&gt; -o smtpd_recipient_restrictions=permit_sasl_authenticated,reject
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s"&gt; -o milter_macro_daemon_name=ORIGINATING&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Configure Postfix to use Dovecot for SASL authentication. Add to
&lt;code&gt;/etc/postfix/main.cf&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cfg" data-lang="cfg"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;smtpd_sasl_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;dovecot&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;smtpd_sasl_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;private/auth&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;smtpd_sasl_auth_enable&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;yes&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Connect Dovecot authentication to Postfix. This lets Postfix
authenticate users against Dovecot. Edit
&lt;code&gt;/etc/dovecot/conf.d/10-master.conf&lt;/code&gt; and configure the auth service&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cfg" data-lang="cfg"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;service auth {&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;unix_listener /var/spool/postfix/private/auth {&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;mode&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;0660
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s"&gt; }&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Open ports 465 (SMTPS) and 993 (IMAPS)&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo ufw allow 465/tcp
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo ufw allow 993/tcp
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Restart services&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo systemctl restart dovecot postfix
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Configure your email client: IMAP server &lt;code&gt;mail.alvarezrosa.com&lt;/code&gt; port
993, SMTP server &lt;code&gt;mail.alvarezrosa.com&lt;/code&gt; port 465, both with SSL/TLS.
Use your Linux username and password as credentials.&lt;sup id="fnref:7"&gt;&lt;a href="#fn:7" class="footnote-ref" role="doc-noteref"&gt;7&lt;/a&gt;&lt;/sup&gt; Verify you can send and receive.&lt;/p&gt;
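&lt;p&gt;Before reaching for a mail client, a quick handshake check from another
machine can confirm that both TLS ports answer. This is only a sketch: it
assumes &lt;code&gt;openssl&lt;/code&gt; is installed on the machine you test from, and the
exact greeting depends on your Postfix and Dovecot versions&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ openssl s_client -connect mail.alvarezrosa.com:465 -quiet  # SMTP submission over implicit TLS
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ openssl s_client -connect mail.alvarezrosa.com:993 -quiet  # IMAPS
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Each command should complete the TLS handshake and then show the server
greeting. A certificate error or a refused connection points back at the
certificate or firewall steps.&lt;/p&gt;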
&lt;p&gt;You now have a fully functional email server&amp;mdash;you can read and compose
mail from any client. The hard part is done.&lt;/p&gt;
&lt;h2 id="authentication"&gt;
Authentication
&lt;a class="anchor" href="#authentication"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Your server works, but mail will land in spam without proper
authentication. Modern email requires four mechanisms: rDNS, SPF, DKIM,
and DMARC. Each one builds trust with receiving servers, proving you
are who you claim to be and that your messages haven&amp;rsquo;t been forged or
tampered with.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;rDNS.&lt;/strong&gt; Reverse DNS (a PTR record) maps your IP back to your
hostname, showing that the IP&amp;rsquo;s owner vouches for that name. Most mail servers reject messages from
IPs without proper rDNS. Configure it through your VPS provider&amp;rsquo;s
control panel&amp;mdash;map your IPs to &lt;code&gt;mail.alvarezrosa.com&lt;/code&gt; and verify&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ dig +short -x 213.32.19.229
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;mail.alvarezrosa.com.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;SPF.&lt;/strong&gt; SPF (Sender Policy Framework) specifies which servers can send
mail for your domain, preventing spammers from forging your address.
Create a DNS TXT record on your root domain: &lt;code&gt;v=spf1 mx -all&lt;/code&gt; tells receivers that only
servers listed in your MX records may send, and that mail from any other host should fail the check.&lt;sup id="fnref:8"&gt;&lt;a href="#fn:8" class="footnote-ref" role="doc-noteref"&gt;8&lt;/a&gt;&lt;/sup&gt;
Verify the record&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ dig +short TXT alvarezrosa.com
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt;&amp;#34;v=spf1 mx -all&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;DKIM.&lt;/strong&gt; DKIM (DomainKeys Identified Mail) adds a cryptographic
signature to outgoing mail. Receivers verify it against a public key in
your DNS, proving the message came from your server and wasn&amp;rsquo;t altered
in transit. Install OpenDKIM to sign outgoing messages.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo apt install opendkim opendkim-tools
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo mkdir -p /etc/opendkim/keys/alvarezrosa.com
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo opendkim-genkey -b &lt;span class="m"&gt;2048&lt;/span&gt; -D /etc/opendkim/keys/alvarezrosa.com -d alvarezrosa.com -s mail
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo chown -R opendkim:opendkim /etc/opendkim/keys
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo chmod &lt;span class="m"&gt;600&lt;/span&gt; /etc/opendkim/keys/alvarezrosa.com/mail.private
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The generated &lt;code&gt;mail.txt&lt;/code&gt; file contains your public key. Create a DNS TXT record at
&lt;code&gt;mail._domainkey.alvarezrosa.com&lt;/code&gt; with its contents&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo cat /etc/opendkim/keys/alvarezrosa.com/mail.txt
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Verify the record is published&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ dig +short TXT mail._domainkey.alvarezrosa.com
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt;&amp;#34;v=DKIM1; h=sha256; k=rsa; p=MIIBIjANBgkqhkiG9w0B...&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Configure OpenDKIM. Edit &lt;code&gt;/etc/opendkim.conf&lt;/code&gt;&lt;sup id="fnref:9"&gt;&lt;a href="#fn:9" class="footnote-ref" role="doc-noteref"&gt;9&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cfg" data-lang="cfg"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;Mode sv&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;Domain alvarezrosa.com&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;Selector mail&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;KeyFile /etc/opendkim/keys/alvarezrosa.com/mail.private&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;Socket inet:12301@localhost&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;UserID opendkim&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;PidFile /run/opendkim/opendkim.pid&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Hook Postfix to OpenDKIM. Add to &lt;code&gt;/etc/postfix/main.cf&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cfg" data-lang="cfg"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;milter_default_action&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;accept&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;milter_protocol&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;6&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;smtpd_milters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;inet:localhost:12301&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;non_smtpd_milters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;inet:localhost:12301&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Enable OpenDKIM and restart Postfix&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo systemctl &lt;span class="nb"&gt;enable&lt;/span&gt; --now opendkim
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo systemctl restart postfix
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;DMARC.&lt;/strong&gt; DMARC ties SPF and DKIM together, telling receivers what to do
when checks fail. Create a DNS TXT record at &lt;code&gt;_dmarc.alvarezrosa.com&lt;/code&gt;.
The &lt;code&gt;p&lt;/code&gt; parameter sets the policy: start with &lt;code&gt;p=none&lt;/code&gt; to monitor
without affecting delivery, then switch to &lt;code&gt;p=reject&lt;/code&gt; (the final record shown below) once the
reports confirm that SPF and DKIM pass. The &lt;code&gt;rua&lt;/code&gt; parameter tells receivers where to send aggregate reports.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;v=DMARC1; p=reject; rua=mailto:david@alvarezrosa.com
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Verify the record&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ dig +short TXT _dmarc.alvarezrosa.com
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt;&amp;#34;v=DMARC1; p=reject; rua=mailto:david@alvarezrosa.com&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="verification"&gt;
Verification
&lt;a class="anchor" href="#verification"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Send a test email to Gmail and check the message headers&amp;mdash;SPF, DKIM,
and DMARC should all show &lt;code&gt;pass&lt;/code&gt;. Use &lt;a href="https://www.mail-tester.com"&gt;mail-tester.com&lt;/a&gt; for a
comprehensive deliverability check (aim for 10/10) and &lt;a href="https://mxtoolbox.com/SuperTool.aspx"&gt;MX Toolbox&lt;/a&gt; to
verify your DNS records.&lt;/p&gt;
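&lt;p&gt;The DKIM key can also be checked locally, before any mail leaves the
server. A minimal sketch using &lt;code&gt;opendkim-testkey&lt;/code&gt; (part of the
&lt;code&gt;opendkim-tools&lt;/code&gt; package installed earlier), with the domain, selector
and key path used in this post&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo opendkim-testkey -d alvarezrosa.com -s mail -k /etc/opendkim/keys/alvarezrosa.com/mail.private -vvv
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;It compares the private key on disk with the public key published in DNS.
The verbose output typically ends by reporting the key as OK. It may also warn
that the key is not secure, which only means the zone is not signed with
DNSSEC.&lt;/p&gt;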
&lt;p&gt;Congratulations&amp;mdash;your email server is now fully operational, with
proper authentication that major providers will trust. Messages should
land in inboxes, not spam folders.&lt;/p&gt;
&lt;p&gt;Note that new mail servers often face deliverability issues due to IP
reputation. If your mail lands in spam initially, keep sending
legitimate emails and request delisting from any blacklists your IP
appears on. Building a positive reputation can take weeks.&lt;/p&gt;
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;I use Debian for
servers. For initial server setup, see my &lt;a href="https://david.alvarezrosa.com/posts/first-steps-on-a-new-server/"&gt;First Steps on a New Server&lt;/a&gt;
post.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;MX records
tell other mail servers where to deliver mail. The number 10 is the
priority&amp;mdash;lower numbers are tried first if you have multiple mail
servers.&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:3"&gt;
&lt;p&gt;Postfix
uses the FQDN to identify itself in SMTP conversations. The short
hostname can remain &lt;code&gt;homelab&lt;/code&gt;, but &lt;code&gt;hostname -f&lt;/code&gt; must return
&lt;code&gt;mail.alvarezrosa.com&lt;/code&gt;.&amp;#160;&lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:4"&gt;
&lt;p&gt;You can also test the
connection with &lt;code&gt;telnet mail.alvarezrosa.com 25&lt;/code&gt;&amp;mdash;you should see a
Postfix greeting.&amp;#160;&lt;a href="#fnref:4" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:5"&gt;
&lt;p&gt;IMAP keeps messages on
the server and syncs across devices. I prefer it over POP3, which
downloads messages and typically deletes them from the server.&amp;#160;&lt;a href="#fnref:5" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:6"&gt;
&lt;p&gt;Certbot obtains free
certificates from Let&amp;rsquo;s Encrypt and auto-renews them. Email clients
require TLS for secure connections.&amp;#160;&lt;a href="#fnref:6" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:7"&gt;
&lt;p&gt;I use mu4e in
Emacs. For testing, Thunderbird works well and auto-detects most
settings.&amp;#160;&lt;a href="#fnref:7" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:8"&gt;
&lt;p&gt;Use
&lt;code&gt;~all&lt;/code&gt; (soft fail) instead of &lt;code&gt;-all&lt;/code&gt; (hard fail) during testing.&amp;#160;&lt;a href="#fnref:8" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:9"&gt;
&lt;p&gt;Mode &lt;code&gt;sv&lt;/code&gt; tells
OpenDKIM to sign outgoing mail and verify incoming signatures.&amp;#160;&lt;a href="#fnref:9" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item><item><title>Self-Hosting Behind CGNAT</title><link>https://david.alvarezrosa.com/posts/self-hosting-behind-cgnat/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://david.alvarezrosa.com/posts/self-hosting-behind-cgnat/</guid><description>&lt;p&gt;This site is self-hosted on a server that cannot accept a single
inbound connection: the ISP puts it behind CGNAT, so there is no public
IP to forward ports on. The fix is a &lt;em&gt;bridge&lt;/em&gt;&amp;mdash;a bastion with a
real public address&amp;mdash;and a WireGuard tunnel dialed out from the
homelab: clients connect to the bridge, and the bridge forwards
everything back through the tunnel.&lt;/p&gt;
&lt;h2 id="the-plan"&gt;
The plan
&lt;a class="anchor" href="#the-plan"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;With carrier-grade NAT, the ISP shares one public IPv4 address across
many customers: your router&amp;rsquo;s WAN address is itself
private&lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt;&amp;mdash;a second NAT, outside your home,
that you don&amp;rsquo;t control. The classic recipe of port forwarding plus
dynamic DNS&lt;sup id="fnref:2"&gt;&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref"&gt;2&lt;/a&gt;&lt;/sup&gt; dies here: forwarding only gets you through the first NAT,
and the address DDNS would publish is shared with hundreds of
strangers. Inbound connections are simply impossible.&lt;/p&gt;
&lt;p&gt;But outbound connections still work fine&amp;mdash;so the homelab dials &lt;em&gt;out&lt;/em&gt;,
opening a WireGuard tunnel to the bridge and keeping it alive. The
bridge keeps only WireGuard itself and its own SSH, and forwards every
other inbound connection through the tunnel. From the outside, the
bridge &lt;em&gt;is&lt;/em&gt; the homelab&amp;mdash;DNS just points at it.&lt;sup id="fnref:3"&gt;&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref"&gt;3&lt;/a&gt;&lt;/sup&gt; Inside the tunnel, the bridge is &lt;code&gt;10.0.0.1&lt;/code&gt; and the
homelab &lt;code&gt;10.0.0.2&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; client admin
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; | |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; | |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;+---------------------------------------------------------+
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;| public Internet |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;+---------------------------------------------------------+
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; | | |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; | | |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;+---------------------+ +--------------+ |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;| bridge 10.0.0.1 | | Cloudflare | | homelab&amp;#39;s own
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;| DNAT * -&amp;gt; homelab | +--------------+ | outbound traffic
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;+---------------------+ ^^ |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; ^^ || |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; || WireGuard || cloudflared |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; || (all ports) || (backup SSH) |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; vv vv |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;+---------------------------------------------------------+
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;| homelab 10.0.0.2 (behind CGNAT) |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;+---------------------------------------------------------+
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; | ^
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; | self-check via bridge---or reboot |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; +-------------------------------------+
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Both tunnels are dialed out from the homelab&amp;mdash;only outbound works
behind CGNAT. The admin SSHes in through the bridge like any other
client (port 22 is forwarded with the rest); the homelab&amp;rsquo;s own
outbound traffic never crosses it, only replies to forwarded
connections do. The Cloudflare backup tunnel and the watchdog loop
are covered at the end.&lt;/p&gt;
&lt;h2 id="the-tunnel"&gt;
The tunnel
&lt;a class="anchor" href="#the-tunnel"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;On both machines, install WireGuard and generate a keypair; exchange
the public keys&amp;mdash;the private ones never leave their machine.&lt;/p&gt;
&lt;p&gt;The bridge&amp;rsquo;s entire setup lives in &lt;code&gt;/etc/wireguard/wg0.conf&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cfg" data-lang="cfg"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;[Interface]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;Address&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;10.0.0.1/24&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;PrivateKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;&amp;lt;bridge-private-key&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;ListenPort&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;51820&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;PostUp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;...&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;PostDown&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;...&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;[Peer]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;PublicKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;&amp;lt;homelab-public-key&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;AllowedIPs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;10.0.0.2/32&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;where &lt;code&gt;PostUp&lt;/code&gt; enables IP forwarding and installs the five iptables
rules that do the forwarding, tying their lifetime to the
tunnel&amp;rsquo;s&lt;sup id="fnref:4"&gt;&lt;a href="#fn:4" class="footnote-ref" role="doc-noteref"&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;sysctl -w net.ipv4.ip_forward&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; net.ipv4.conf.ens3.route_localnet&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;iptables -t nat -A PREROUTING -i ens3 -p udp --dport &lt;span class="m"&gt;51820&lt;/span&gt; -j RETURN
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;iptables -t nat -A PREROUTING -i ens3 -p tcp --dport &lt;span class="m"&gt;2222&lt;/span&gt; -j RETURN
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;iptables -t nat -A PREROUTING -i ens3 -j DNAT --to-destination 10.0.0.2
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;iptables -A FORWARD -i wg0 -o ens3 -s 10.0.0.2 -j ACCEPT
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;iptables -A FORWARD -i ens3 -o wg0 -d 10.0.0.2 -j ACCEPT
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Two &lt;code&gt;RETURN&lt;/code&gt; rules keep WireGuard (&lt;code&gt;51820/udp&lt;/code&gt;) and the bridge&amp;rsquo;s own
SSH (&lt;code&gt;2222/tcp&lt;/code&gt;) local; the catch-all &lt;code&gt;DNAT&lt;/code&gt; rewrites everything
else&amp;mdash;port 22 included&amp;mdash;to the homelab&amp;rsquo;s tunnel address; and two
&lt;code&gt;FORWARD&lt;/code&gt; accepts let that traffic flow both ways.&lt;sup id="fnref:5"&gt;&lt;a href="#fn:5" class="footnote-ref" role="doc-noteref"&gt;5&lt;/a&gt;&lt;/sup&gt;
Everything happens inside the kernel: netfilter does
the rewriting, and no userspace process ever touches a packet.&lt;/p&gt;
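&lt;p&gt;Once the tunnel is up, the rules can be inspected on the bridge. A small
sketch, assuming the interface names used above (&lt;code&gt;ens3&lt;/code&gt; and
&lt;code&gt;wg0&lt;/code&gt;) and no other firewall tooling rewriting the same chains&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sysctl net.ipv4.ip_forward                # should report 1
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo iptables -t nat -S PREROUTING        # the two RETURN rules, then the DNAT
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo iptables -S FORWARD                  # the two ACCEPT rules
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo iptables -t nat -L PREROUTING -n -v  # same rules, with packet counters
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Rule order matters in &lt;code&gt;PREROUTING&lt;/code&gt;: the &lt;code&gt;RETURN&lt;/code&gt; rules must
appear before the catch-all &lt;code&gt;DNAT&lt;/code&gt;. With &lt;code&gt;-v&lt;/code&gt;, the counters on the
&lt;code&gt;DNAT&lt;/code&gt; rule should grow as outside clients connect.&lt;/p&gt;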
&lt;p&gt;The bridge&amp;rsquo;s own sshd listens on 2222 precisely so that port 22 can be
forwarded with everything else: &lt;code&gt;ssh alvarezrosa.com&lt;/code&gt; lands on the
homelab, &lt;code&gt;ssh -p 2222&lt;/code&gt; on the bridge.&lt;sup id="fnref:6"&gt;&lt;a href="#fn:6" class="footnote-ref" role="doc-noteref"&gt;6&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;The homelab side dials out and answers. Its &lt;code&gt;/etc/wireguard/wg0.conf&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cfg" data-lang="cfg"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;[Interface]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;Address&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;10.0.0.2/24&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;PrivateKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;&amp;lt;homelab-private-key&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;Table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;off&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;PostUp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;...&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;PostDown&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;...&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;[Peer]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;PublicKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;&amp;lt;bridge-public-key&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;Endpoint&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;213.32.19.229:51820&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;AllowedIPs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;0.0.0.0/0&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="na"&gt;PersistentKeepalive&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;25&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;where &lt;code&gt;PostUp&lt;/code&gt; sets up policy routing.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;ip route add default dev wg0 table &lt;span class="m"&gt;200&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;ip rule add from 10.0.0.2 table &lt;span class="m"&gt;200&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;iptables -t mangle -A PREROUTING -i wg0 -m conntrack --ctstate NEW -j CONNMARK --set-mark 0x1
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;iptables -t mangle -A PREROUTING -m conntrack --ctdir REPLY -j CONNMARK --restore-mark
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;ip rule add fwmark 0x1 lookup &lt;span class="m"&gt;200&lt;/span&gt; pref &lt;span class="m"&gt;1000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Each line earns its place: &lt;code&gt;AllowedIPs = 0.0.0.0/0&lt;/code&gt; accepts forwarded
clients from anywhere on the Internet; &lt;code&gt;Table = off&lt;/code&gt; stops &lt;code&gt;wg-quick&lt;/code&gt;
from hijacking &lt;em&gt;all&lt;/em&gt; of the homelab&amp;rsquo;s traffic through the
bridge&lt;sup id="fnref:7"&gt;&lt;a href="#fn:7" class="footnote-ref" role="doc-noteref"&gt;7&lt;/a&gt;&lt;/sup&gt;; the policy routing sends only replies of
forwarded connections&amp;mdash;packets &lt;em&gt;from&lt;/em&gt; &lt;code&gt;10.0.0.2&lt;/code&gt;&amp;mdash;back through the
tunnel; and &lt;code&gt;PersistentKeepalive&lt;/code&gt; keeps the CGNAT&amp;rsquo;s idle UDP mapping
alive, so the bridge can always reach in.&lt;/p&gt;
&lt;p&gt;This source-based rule is enough for services that listen on the
homelab itself, but it quietly breaks the moment a forwarded service
lives behind a second NAT&amp;mdash;a Docker container, most commonly. Such a
service sits on its own address (say &lt;code&gt;172.19.0.4&lt;/code&gt;), reached by a
further DNAT; and the reply&amp;rsquo;s source is rewritten back to &lt;code&gt;10.0.0.2&lt;/code&gt;
only in &lt;code&gt;POSTROUTING&lt;/code&gt;, &lt;em&gt;after&lt;/em&gt; the routing decision is made. At
routing time the packet still reads &lt;code&gt;from 172.19.0.4&lt;/code&gt;, misses the
rule, falls through to the main table, and leaks out the WAN
interface&amp;mdash;straight into the CGNAT, where it dies. The cure is to
route by the connection rather than by an address that isn&amp;rsquo;t settled
yet: that is the job of the last three &lt;code&gt;PostUp&lt;/code&gt; lines above. The first marks every new
connection arriving on &lt;code&gt;wg0&lt;/code&gt;; the second restores that mark onto the
packets travelling the other way, in &lt;code&gt;PREROUTING&lt;/code&gt;, &lt;em&gt;before&lt;/em&gt; the
routing decision; and the &lt;code&gt;fwmark&lt;/code&gt; rule sends whatever carries the
mark through the tunnel. Routing by mark instead of by source keeps
the real client IP intact&amp;mdash;a &lt;code&gt;MASQUERADE&lt;/code&gt; on the homelab would have
been shorter, but it would rewrite that address away.&lt;/p&gt;
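&lt;p&gt;The policy routing can be checked on the homelab without sending any real
traffic. A sketch, assuming table &lt;code&gt;200&lt;/code&gt; and mark &lt;code&gt;0x1&lt;/code&gt; as above;
&lt;code&gt;1.1.1.1&lt;/code&gt; is just an arbitrary public address picked for the lookup&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ ip rule show                           # expect the source rule and the fwmark rule
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ ip route show table 200                # expect a default route via wg0
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo iptables -t mangle -S PREROUTING  # expect the two CONNMARK rules
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ ip route get 1.1.1.1                   # unmarked lookup: main table, WAN interface
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ ip route get 1.1.1.1 mark 0x1          # marked lookup: should resolve to wg0
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The last two commands ask the kernel which route it would choose for the
same destination with and without the mark, which is the distinction the
&lt;code&gt;fwmark&lt;/code&gt; rule relies on.&lt;/p&gt;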
&lt;p&gt;Enable the tunnel on both machines with
&lt;code&gt;sudo systemctl enable --now wg-quick@wg0&lt;/code&gt; and verify the handshake
with &lt;code&gt;sudo wg&lt;/code&gt;. Then the real test: from outside, any connection to
the bridge&amp;rsquo;s public IP should land on the homelab. Point your DNS
records at the bridge and the homelab is, for all practical purposes,
on the public Internet.&lt;/p&gt;
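&lt;p&gt;For a slightly more targeted check than the full &lt;code&gt;sudo wg&lt;/code&gt; output,
the handshake time and the tunnel itself can be probed from the homelab. A
sketch, assuming the interface name and addresses used above&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-sh" data-lang="sh"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ sudo wg show wg0 latest-handshakes  # peer key and Unix time of the last handshake
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ ping -c 3 10.0.0.1                  # reach the bridge through the tunnel
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;A timestamp of &lt;code&gt;0&lt;/code&gt; means no handshake has completed yet, which usually
points at a wrong key, a wrong endpoint, or a blocked UDP port.&lt;/p&gt;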
&lt;p&gt;The detour adds latency: this bridge sits in France, the homelab in
northern Spain, and the tunnel adds ~37 ms of RTT to every
connection.&lt;sup id="fnref:8"&gt;&lt;a href="#fn:8" class="footnote-ref" role="doc-noteref"&gt;8&lt;/a&gt;&lt;/sup&gt; Not a problem in practice: with heavy optimization
and a CDN absorbing most requests, this site&amp;mdash;served through this
very tunnel&amp;mdash;is among the fastest on the web.&lt;/p&gt;
&lt;h2 id="plan-for-failure"&gt;
Plan for failure
&lt;a class="anchor" href="#plan-for-failure"&gt;&amp;sect;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Two single points of failure, and a plan for each.&lt;/p&gt;
&lt;p&gt;If the &lt;em&gt;bridge&lt;/em&gt; dies, the tunnel dies with it&amp;mdash;so keep a way into the
homelab that bypasses it entirely. I run a &lt;a href="https://developers.cloudflare.com/cloudflare-one/"&gt;Cloudflare Tunnel&lt;/a&gt;:
&lt;code&gt;cloudflared&lt;/code&gt; uses the same dial-out trick, outbound-only on both ends,
so it also works behind CGNAT.&lt;sup id="fnref:9"&gt;&lt;a href="#fn:9" class="footnote-ref" role="doc-noteref"&gt;9&lt;/a&gt;&lt;/sup&gt; It
exposes the homelab&amp;rsquo;s sshd at a hostname of its own, and a
&lt;code&gt;ProxyCommand&lt;/code&gt; in the client&amp;rsquo;s &lt;code&gt;~/.ssh/config&lt;/code&gt; connects through
it&amp;mdash;whatever happens to the bridge, &lt;code&gt;ssh homelab2&lt;/code&gt; still gets in.&lt;/p&gt;
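&lt;p&gt;A client-side entry might look like the sketch below. The hostname and user
are placeholders, not values from this setup, and it assumes
&lt;code&gt;cloudflared&lt;/code&gt; is also installed on the client&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-cfg" data-lang="cfg"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;# ~/.ssh/config on the client (placeholder values)
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Host homelab2
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    HostName &amp;lt;tunnel-hostname&amp;gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    User &amp;lt;user&amp;gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    ProxyCommand cloudflared access ssh --hostname %h
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Here &lt;code&gt;%h&lt;/code&gt; expands to the &lt;code&gt;HostName&lt;/code&gt; value, so the SSH session is
carried over the Cloudflare connection instead of a direct TCP connection to
port 22.&lt;/p&gt;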
&lt;p&gt;If the &lt;em&gt;homelab&lt;/em&gt; dies, no tunnel will save you&amp;mdash;the machine to reboot
is the one you can&amp;rsquo;t reach. So it watches itself with a root cron
job&amp;mdash;&lt;code&gt;0 5 * * * ssh ssh.alvarezrosa.com || reboot&lt;/code&gt;&amp;mdash;that SSHes to
its own public hostname, out through CGNAT to the bridge and back in
through the tunnel, the whole chain end to end, and reboots if that
fails.&lt;sup id="fnref:10"&gt;&lt;a href="#fn:10" class="footnote-ref" role="doc-noteref"&gt;10&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
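&lt;p&gt;One possible hardening of that entry, as a sketch rather than the exact
line in use here. It assumes root on the homelab authenticates with a key;
the 15-second timeout is an arbitrary choice:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;# root crontab on the homelab (sketch; timeout value is an assumption)
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;0 5 * * * ssh -o BatchMode=yes -o ConnectTimeout=15 ssh.alvarezrosa.com true || reboot
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;BatchMode=yes&lt;/code&gt; makes &lt;code&gt;ssh&lt;/code&gt; fail instead of waiting on a
password or passphrase prompt that cron can never answer,
&lt;code&gt;ConnectTimeout&lt;/code&gt; bounds how long a dead path can stall the check, and
the trailing &lt;code&gt;true&lt;/code&gt; gives the session a command to run so a successful
login exits with status 0 right away.&lt;/p&gt;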
&lt;br /&gt;
&lt;p&gt;That&amp;rsquo;s the whole trick: one cheap bridge, one tunnel, five iptables
rules&amp;mdash;and a server behind CGNAT serves the public Internet, this very
page included.&lt;/p&gt;
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;Usually in &lt;code&gt;100.64.0.0/10&lt;/code&gt;, the shared address space
reserved for CGNAT by RFC 6598.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;Shaky even without CGNAT: a rotating address is bad
for anything reputation-sensitive like mail, and every rotation means
downtime while resolvers keep serving the old IP until the TTL
expires.&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:3"&gt;
&lt;p&gt;The bridge only
pushes packets, so the smallest of servers will do. See &lt;a href="https://david.alvarezrosa.com/posts/first-steps-on-a-new-server/"&gt;First Steps on
a New Server&lt;/a&gt; for the basic setup.&amp;#160;&lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:4"&gt;
&lt;p&gt;&lt;code&gt;PostDown&lt;/code&gt; mirrors &lt;code&gt;PostUp&lt;/code&gt;, undoing every command.
&lt;code&gt;ens3&lt;/code&gt; is the bridge&amp;rsquo;s public interface&amp;mdash;find yours with &lt;code&gt;ip a&lt;/code&gt; and
adjust.&amp;#160;&lt;a href="#fnref:4" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:5"&gt;
&lt;p&gt;Note there is no
&lt;code&gt;MASQUERADE&lt;/code&gt;: conntrack reverses the DNAT on the way out, and the
homelab routes its replies back through the tunnel. Forwarded services
see the &lt;em&gt;real&lt;/em&gt; client IP&amp;mdash;something proxies and third-party tunnels
can&amp;rsquo;t offer. This holds for services that terminate on the homelab
itself; a service behind a &lt;em&gt;second&lt;/em&gt; NAT hop&amp;mdash;anything in a Docker
container, say&amp;mdash;needs one more touch on the homelab, noted below.&amp;#160;&lt;a href="#fnref:5" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:6"&gt;
&lt;p&gt;Add &lt;code&gt;Port 2222&lt;/code&gt; to the
bridge&amp;rsquo;s &lt;code&gt;/etc/ssh/sshd_config&lt;/code&gt; &lt;em&gt;before&lt;/em&gt; bringing the tunnel up&amp;mdash;the
moment the DNAT rule takes effect, port 22 belongs to the homelab.&amp;#160;&lt;a href="#fnref:6" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:7"&gt;
&lt;p&gt;Without it, &lt;code&gt;wg-quick&lt;/code&gt; would install a default route
matching &lt;code&gt;AllowedIPs&lt;/code&gt;.&amp;#160;&lt;a href="#fnref:7" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:8"&gt;
&lt;p&gt;Amusingly, pinging the bridge&amp;rsquo;s &lt;em&gt;public&lt;/em&gt; IP from the
homelab reports ~74 ms&amp;mdash;exactly double. The echo request is DNAT&amp;rsquo;d
back through the tunnel to the homelab itself, so every packet crosses
the tunnel twice.&amp;#160;&lt;a href="#fnref:8" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:9"&gt;
&lt;p&gt;Tailscale fills the same role.&amp;#160;&lt;a href="#fnref:9" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:10"&gt;
&lt;p&gt;A bridge outage also trips this check and reboots a
perfectly healthy homelab&amp;mdash;an acceptable false positive, since a
reboot is harmless.&amp;#160;&lt;a href="#fnref:10" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item><item><title>Translation Look Aside Buffer</title><link>https://david.alvarezrosa.com/posts/translation-look-aside-buffer/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://david.alvarezrosa.com/posts/translation-look-aside-buffer/</guid><description>&lt;p&gt;Explain what the TLB is, using maybe hrt blog?&lt;/p&gt;</description></item></channel></rss>