Amateur Hour

What I've Been Reading (Oct 2022)

Or a short log of things I read last month.
Nov 13, 2022. Filed under Musings
Tags: reading

Talks

Articles

Papers

A Series on Perennial Grains

After somehow running across the Land Institute, I somehow ended up falling down a strange rabbit hole investigating perennial grains. Apparently, all the grains we commonly rely on (e.g. wheat, maize, rice, potatoes) are annual plants which need to be replanted every year. This replanting is both expensive (requiring huge amounts of fertilizer and pesticide) and environmentally destructive (e.g. creating huge amounts of soil runoff, to the extent that 1/3rd of US agricultural lands have entirely lost their topsoil!). The Land Institute’s hope is that perennial grains can avoid these problems while being even more productive as well (since they don’t have to keep regrowing the same structures each year).

I read the following four papers about this overall vision:

I think the basic TLDR is that everyone agrees that the problems with modern agriculture are very real, but that (1) making perennial grains economically viable is a serious challenge (apparently, the naive approach leads to only 10-20% of the yield of annual grains), and (2) whether this is the best way to solve the problem. Smaje’s the only one who thinks that perennial grains might be impossible, but personally I didn’t find their argument very compelling.

I also tried to dig into what modern agricultural breeding research actually looks like and took a look at two papers:

Overall, I thought this was really cool rabbit hole, and a good forcing function to review a bunch of basic botany and agriculture knowledge.

Metastable failures in distributed systems

https://dl.acm.org/doi/10.1145/3458336.3465286

This was a really neat paper – it defines a new category of “metastable” failures, which are failures that do not self-heal (e.g. there’s some feedback loop that sustains the failure over time). A retry storm is a good example of this: if a service is overloaded such that requests start to fail (and thus be retried by the client), those retrying requests can keep the service overloaded (and thus causing even more retrys later).

Although the paper (unfortunately) didn’t go into much detail about how to avoid metastable failures, just noticing the general features of this type of problem and providing a clear mental model was nice.

DBOS: a DBMS-oriented operating system

https://vldb.org/pvldb/vol15/p21-skiadopoulos.pdf

This was a nice vision paper about using distributed databases as a layer for defining distributed operating systems (instead of the million ad-hoc interfaces from things like Kubernetes, Slurm, Zookeper, etc).

I’ve been complaining for years that operating systems should expose a more database-like API for querying the internals (e.g. instead of having to parse /proc files) so I’m naturally on-board with this vision, but I wouldn’t say the paper did enough implementation work to convince me that this would actually work on real-world workloads. Hopefully in a future paper!

XRP: In-Kernel Storage Functions with eBPF

https://www.usenix.org/conference/osdi22/presentation/zhong

This paper details XRP, which uses eBPF to directly run logic in the Linux storage layer (similar to XDP and networking). The basic idea makes a ton of sense (modern SSDs are fast and syscall overhead can be pretty substantial), although I don’t have enough experience building high-performance database systems to say much about the actual API presented (or the technical limitations).

TBH, I’m a little surprised no one built this earlier (XDP is like 6 years old now!), but I’m glad it’s finally happening.

Finding missed optimizations through the lens of dead code elimination

https://ethz.ch/content/dam/ethz/special-interest/infk/ast-dam/documents/Theodoridis-ASPLOS22-DCE-Paper.pdf

This was a one of two really clever testing papers exploiting dead code optimization. The basic idea is that by inserting “markers” (i.e. special inline assembly instructions) into code, you can tell whether a compiler optimized out that entire basic block or not (e.g. because it was dead code). Then, by comparing two different compilers, you can find identify bugs (e.g. either one compiler is missing optimizations or one compiler is incorrectly optimizing out code). By randomly generating C code using C-Smith and repeating many many times, you get yourself an automated bug finder.

A nice, clever idea to identify bugs.

Compiler validation via equivalence modulo inputs

https://web.cs.ucdavis.edu/~su/publications/emi.pdf

This is a 2nd approach to finding compiler bugs – the basic idea is to:

  1. Generate random C programs (e.g. using C-Smith)
  2. Execute those programs on random inputs while collecting code coverage
  3. Identify dead-code of the program for those inputs
  4. Arbitrarily mutate that dead-code (in this paper, they just delete dead code) and then re-execute the new program on the same input.

As long as your mutations avoid introducing undefined behavior, any mismatch in steps 2 and 4 are indicative of a bug.

I think this was a fairly clever idea, but found the paper itself to be weirdly obfuscated and unclear. They spent a lot of time with formal definitions that IMO really weren’t necessary and didn’t contribute much.

Using Lightweight Formal Methods to Validate a Key-Value Storage Node in Amazon S3

https://dl.acm.org/doi/10.1145/3477132.3483540

A nice experience report about how Amazon tests S3 about how they used property based testing + fault injection to better test S3. It was fun to read, but nothing really stood out to me as particularly novel (a lot of it just seemed like straightforward property based testing against a known oracle). I think the most interesting thing was the degree to which they used “default” random test case generators and minimization heuristics – the fact that those worked out so well is another good evidence that they’re pretty well-tuned at this point.

I did find the title slightly amusing – as someone who does a ton of property based tests at work, I have never described it as a “lightweight formal method”, but maybe I should start!