# Copilot: a DSL for Monitoring Embedded Systems

In case you missed all the excitement on the Galois blog, what follows is a re-post.

# Introducing Copilot

Can you write a list in Haskell? Then you can write embedded C code using Copilot. Here’s a Copilot program that computes the Fibonacci sequence (over Word64s) and tests for even numbers:


```haskell
fib :: Streams
fib = do
  "fib" .= [0,1] ++ var "fib" + (drop 1 $ varW64 "fib")
  "t"   .= even (var "fib")
  where
    even :: Spec Word64 -> Spec Bool
    even w = w `mod` const 2 == const 0
```

Copilot contains an interpreter and a compiler, and it uses a model checker to check the correctness of your program. The compiler generates constant-time and constant-space C code via Tom Hawkins’ Atom language (thanks, Tom!). Copilot was developed specifically for writing embedded software monitors for more complex embedded systems, but it can be used to develop a variety of functional-style embedded code.

Executing

```
> compile fib "fib" baseOpts
```

generates fib.c and fib.h (with a main() for simulation—other options change that). We can then run

```
> interpret fib 100 baseOpts
```

to check that the Copilot program does what we expect. Finally, if we have CBMC installed, we can run

```
> verify "fib.c"
```

to prove a bunch of memory-safety properties of the generated program.

Galois has open-sourced Copilot (BSD3 license). More information is available on the Copilot homepage. Of course, it’s available from Hackage, too.

# Flight of the Navigator

Copilot took its maiden flight in August 2010 in Smithfield, Virginia. NASA rents a private airfield for test flights like this, but you have to get past the intimidating sign posted upon entering the airfield. However, once you arrive, there’s a beautiful view of the James River.

We were flying on an RC aircraft that NASA Langley uses to conduct a variety of Integrated Vehicle Health Management (IVHM) experiments. (It coincidentally had Galois colors!) Our experiments for Copilot were to determine its effectiveness at detecting faults in embedded guidance, navigation, and control software. The test-bed we flew was a partially fault-tolerant pitot tube (air pressure) sensor. Our pitot tube sat at the edge of the wing. Pitot tubes are used on commercial aircraft, and they’re a big deal: a number of aircraft accidents and mishaps have been due, in part, to pitot tube failures.

Our experiment consisted of a beautiful hardware stack, crafted by Sebastian Niller of the Technische Universität Ilmenau. Sebastian also led the programming for the stack. The stack consisted of four STM32 ARM Cortex M3 microprocessors. In addition, there was an SD card for writing flight data, and power management. The stack just fit into the hull of the aircraft. Sebastian installed our stack in front of another stack used by NASA on the same flights.

The microprocessors were arranged to provide Byzantine fault tolerance on the sensor values. One microprocessor acted as the general, receiving inputs from the pitot tube and distributing those values to the other microprocessors. The other microprocessors would exchange their values and perform a fault-tolerant vote on them. Granted, the fault tolerance was for demonstration purposes only: all the microprocessors ran off the same clock, and the sensor wasn’t replicated (we’re currently working on a fully fault-tolerant system). During the flight tests, we injected faults (in software) by having intermittently incorrect sensor values distributed to various nodes.

The pitot sensor system (including the fault-tolerance code) is a hard real-time system, meaning events have to happen at predefined deadlines. We wrote it in a combination of Tom Hawkins’ Atom, a Haskell DSL that generates C, and C directly. Integrated with the pitot sensor system are Copilot-generated monitors.
The monitors detected

- unexpected sensor values (e.g., the delta change is too extreme),
- the correctness of the voting algorithm (we used Boyer-Moore majority voting, which returns the majority only if one exists; our monitor checked whether a majority indeed exists; see the sketch below), and
- whether the majority votes agreed.
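To make the second check concrete, here is a plain-Haskell sketch of the idea (my own illustration, not the flight code or a Copilot specification): Boyer-Moore finds a candidate in one pass, and a second pass confirms whether that candidate really is a majority.

```haskell
-- Plain-Haskell sketch of Boyer-Moore majority voting plus the
-- "does a majority actually exist?" check the monitor performed.
majority :: Eq a => [a] -> Maybe a
majority xs =
  case candidate of
    Just c | count c > length xs `div` 2 -> Just c  -- a true majority exists
    _                                    -> Nothing -- no majority
  where
    count c   = length (filter (== c) xs)
    candidate = go Nothing (0 :: Int) xs
    -- One pass: keep a candidate and a counter; a matching element
    -- increments the counter, a differing one decrements it, and a zero
    -- counter adopts the next element as the new candidate.
    go cand _ []       = cand
    go _    0 (y : ys) = go (Just y) 1 ys
    go cand n (y : ys)
      | Just y == cand = go cand (n + 1) ys
      | otherwise      = go cand (n - 1) ys
```

For example, majority [3,3,4,3] is Just 3, while majority [1,2,3,4] is Nothing.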
The monitors integrated with the sensor system without disrupting its real-time behavior.

We gathered data on six flights. In between flights, we’d get the data from the SD card.

We took some time to pose with the aircraft. The Copilot team, from left to right, is Alwyn Goodloe, National Institute of Aerospace; Lee Pike, Galois, Inc.; Robin Morisset, École Normale Supérieure; and Sebastian Niller, Technische Universität Ilmenau. Robin and Sebastian are Visiting Scholars at the NIA for the project. Thanks for all the hard work!

There were a bunch of folks involved in the flight test that day, and we got a group photo with everyone. We are very thankful that the researchers at NASA were gracious enough to give us their time and resources to fly our experiments. Thank you!

Finally, here are two short videos. The first is of our aircraft’s takeoff during one of the flights. Interestingly, it has an electric engine to reduce the engine vibration’s effects on experiments.

http://player.vimeo.com/video/15198286

The second is of AirStar, which we weren’t involved in, but which also flew the same day. AirStar is a scaled-down jet (yes, jet) aircraft that was really loud and really fast. I’m posting its takeoff, since it’s just so cool. That thing was a rocket!

http://player.vimeo.com/video/15204969

# More Details

Copilot and the flight test are part of a NASA-sponsored project (NASA press release) led by Lee Pike at Galois. It’s a three-year project, and we’re currently in the second year.

# Even More Details

Besides the language and flight test, we’ve written a few papers:

- Lee Pike, Alwyn Goodloe, Robin Morisset, and Sebastian Niller. Copilot: A Hard Real-Time Runtime Monitor. To appear in the proceedings of the 1st Intl. Conference on Runtime Verification (RV’2010), 2010. Springer. This paper describes the Copilot language.
- Byzantine faults are fascinating. Here’s a 2-page paper that shows one reason why.
- At the beginning of our work, we tried to survey prior results in the field and discuss the constraints of the problem. This report is a bit lengthy (almost 50 pages), but it’s a gentle introduction to our problem space.
- Yes, QuickCheck can be used to test low-level protocols.
- A short paper motivating the need for runtime monitoring of critical embedded systems.

# You’re Still Interested?

We’re always looking for collaborators and users, and we may need one or two visiting scholars interested in embedded systems and Haskell next summer. If any of these interest you, drop Lee Pike a note (hint: if you read any of the papers or download Copilot, you can find my email).

# Shocking Tell-All Interview on Software Assurance

I was recently interviewed by Flight International magazine, one of the oldest aviation news magazines. Their reporter, Stephen Trimble, was writing on the Air Force Chief Scientist’s recent report stating that new software verification and validation techniques are desperately needed. Here’s an online copy of the article.

# Copilot: A Hard Real-Time Runtime Monitor

I’m the principal investigator on a NASA-sponsored research project investigating new approaches for monitoring the correctness of safety-critical guidance, navigation, and control software at run-time.

We just got a paper accepted at the Runtime Verification Conference on some of our recent work developing a language for writing monitors. The language, Copilot, is a domain-specific language (DSL) embedded in Haskell that uses the powerful Atom DSL as a back-end. Perhaps the best tag-line for Copilot is, “Know how to write Haskell lists? Good; then you’re ready to write embedded software.” Stay tuned for a software release and updates on a flight test of our software on a NASA test UAV… In the meantime, check out the paper!

# Twinkle Twinkle Little Haskell

Update Sept 28, 2010: the Makefile mentioned below worked fine, except for something having to do with timing. I was too lazy to track the problem down, but fortunately, I found a scons script (using the scons build system) that I modified to run on Mac OS X, and it works perfectly. The original script is here—thanks Homin Lee! The post has been modified appropriately.

Update Oct 1, 2010: Homin Lee has updated the script to work on Mac OS X, so you can just grab the original script now.

It’s been a few months (almost a year!) since John Van Enk showed us how to twinkle (blink) an LED on his Arduino microcontroller using Atom/Haskell. Since that time, Atom (a Haskell embedded domain-specific language for generating constant time/space C programs) has undergone multiple revisions, and the standard Arduino tool-chain has been updated, so I thought it’d be worthwhile to “re-solve” the problem with something more streamlined that should work today for all your Haskell -> Arduino programming needs. With the changes to Atom, we can blink an LED with just a couple lines of core logic (as you’d expect given the simplicity of the problem).

For this post, I’m using

If you’ve played with the Arduino, you’ve noticed how nice the integrated IDE/tool-chain is. Ok, the editor leaves everything to be desired, but otherwise, things just work. The language is basically C with a few macros and Atmel AVR-specific libraries (the family to which Arduino hardware belongs). However, if you venture off the beaten path at all—say, trying to compile your own C program outside the IDE—things get messy quickly. Fortunately, with the scons script, things are a piece of cake. What we’ll do is write a Haskell program, AtomLED.hs, and use that to generate AtomLED.c. From that, the scons script will take care of the rest.

## The Core Logic

Here’s the core logic we use for blinking the LED from Atom:

```haskell
ph :: Int
ph = 40000 -- Sufficiently large number of ticks (the Duemilanove is 16MHz).

blink :: Atom ()
blink = do
  on <- bool "on" True -- Declare a Boolean variable on, initialized to True.

  -- At period ph and phase 0, do ...
  period ph $ phase 0 $ atom "blinkOn" $ do
    call "avr_blink"       -- Call the avr_blink() C function (defined below).
    on <== not_ (value on) -- Flip the Boolean.

  period ph $ phase (quot ph 8) $ atom "blinkOff" $ do
    call "avr_blink"
    on <== not_ (value on)
```

And that’s it!  The blink function has two rules, “blinkOn” and “blinkOff”.  Both rules execute every 40,000 ticks.  (A “tick” in our case is just a local variable that’s incremented, but it could be run off the hardware clock.  Nevertheless, we still know we’re getting nearly constant-time behavior due to the code Atom generates.)  The first rule is at phase 0, so it executes at ticks 0, 40,000, 80,000, etc., while the second is at phase 40,000/8 = 5,000, so it executes at ticks 5,000, 45,000, 85,000, etc.  In each rule, after calling the avr_blink() C function (which we’ll define below), we flip the Boolean upon which avr_blink() depends.  Thus, the LED is on 1/8 of the time and off 7/8 of the time.  (If we wanted the LED to be on the same amount of time as it is off, we could have done the whole thing with one rule; see the sketch below.)
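Here is what that single-rule variant might look like (my own sketch, not from the original post; it reuses ph and the same Atom constructs as above): one rule at period ph toggles the Boolean every ph ticks, giving a 50% duty cycle at half the blink frequency.

```haskell
-- A sketch of the one-rule, equal-on/equal-off variant (not from the post).
blink50 :: Atom ()
blink50 = do
  on <- bool "on" True
  period ph $ atom "blinkToggle" $ do
    call "avr_blink"        -- Update the LED from the current Boolean.
    on <== not_ (value on)  -- Toggle: on for ph ticks, off for ph ticks.
```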

## The Details

Really the only other thing we need to do is add a bit of C code around the core logic.  Here’s the listing for the C code stuck at the beginning, written as strings:
```haskell
[ (varInit Int16 "ledPin" "13") -- We're using pin 13 on the Arduino.
, "void avr_blink(void);"
]
```
and here’s some code for afterward:
 [ "void setup() {" , " // initialize the digital pin as an output:" , " pinMode(ledPin, OUTPUT);" , "}" , "" , "// set the LED on or off" , "void avr_blink() { digitalWrite(ledPin, state.AtomLED.on); }" , "" , "void loop() {" , " " ++ atomName ++ "();" , "}" ] 

The IDE tool-chain expects a setup() and a loop() function to be defined; it then generates a main() from which both are called: setup() runs once, and loop() is called forever after.

To blink the LED, we call digitalWrite() from avr_blink(). digitalWrite() is provided by the Arduino language.  (In John’s post, he manipulated the port registers directly, which is faster and doesn’t rely on the Arduino libraries, but it’s also lower-level and less portable between Atmel controllers.)  Atom-defined variables are stored in a struct, so state.AtomLED.on references the Atom Boolean variable defined earlier.
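To show how these pieces might hang together in AtomLED.hs, here is a minimal sketch of the compile step. This is my reconstruction, not the post’s exact code: it assumes the Atom 1.0-style compile, defaults, and cCode interface (the Config field supplying pre- and post-code), assumes the two string lists above are bound to preCode and postCode, and assumes atomName is "AtomLED" (which is why the state struct field is state.AtomLED.on). Field names may differ in the Atom version the post actually used.

```haskell
-- Hypothetical wiring for AtomLED.hs (assumes blink, preCode, and postCode
-- from the listings above are in scope, and an Atom 1.0-style Config).
module Main where

import Language.Atom

atomName :: String
atomName = "AtomLED"  -- Assumed name; matches the function called in loop()
                      -- and the state-struct prefix used by avr_blink().

main :: IO ()
main = do
  _ <- compile atomName cfg blink   -- Emits AtomLED.c / AtomLED.h.
  return ()
  where
    -- cCode supplies the C code spliced in before and after the generated code.
    cfg = defaults { cCode = \_ _ _ -> (unlines preCode, unlines postCode) }
```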

## Make it Work!

Now just drop the scons script into the directory (the directory must have the same name as the Haskell file, dropping the extension), and run
```
> runhaskell AtomLED.hs
> scons
> scons upload
```

This should work for any Atom-generated program you want to run on your Arduino (modulo deviations from the configuration I mentioned initially). Also, note the conventions and parameters to set in the scons script.

Post if you have any problems, and I might be able to help. Also, I’d love to package the boilerplate up into a “backend” for Atom, but if you have time, please beat me to it.  Thanks.

Code:

# New Group: Functional Programming for Embedded Systems

Are you interested in how functional programming can be leveraged to make embedded-systems programming easier and more reliable?  You are not alone.  For example, check out what’s been happening in just the past couple of years.

Now Tom Hawkins (designer of Atom) has started a Google group, fp-embedded, to discuss these issues.  Please join and post your projects & questions!

# An Apologia for Formal Methods

In the January 2010 issue of IEEE Computer, David Parnas published an article, “Really Rethinking ‘Formal Methods’” (sorry, you’ll need an IEEE subscription or purchase the article to access it), with the following abstract:

We must question the assumptions underlying the well-known current formal software development methods to see why they have not been widely adopted and what should be changed.

I found some of the opinions therein to be antiquated, so I wrote a letter to the editor (free content!), which appears in the March 2010 edition.  IEEE also published a response from David Parnas, which you can also access at the letter link above.

I’ll refrain from visiting this debate here, but please have a look at the letters, enjoy the controversy, and do not hesitate to leave a comment!

# 10 to the -9

$10^{-9}$, or one-in-a-billion, is the famed number given for the maximum probability of a catastrophic failure, per hour of operation, in life-critical systems like commercial aircraft.  The number is part of the folklore of the safety-critical systems literature; where does it come from?

First, it’s worth noting just how small that number is.  As pointed out by Driscoll et al. in the paper, Byzantine Fault Tolerance, from Theory to Reality, the probability of winning the U.K. lottery is 1 in tens of millions, and the probability of being struck by lightning (in the U.S.) is $1.6 \times 10^{-6},$ more than 1,000 times more likely than $10^{-9}.$

So where did $10^{-9}$ come from?  A nice explanation comes from a recent paper by John Rushby:

If we consider the example of an airplane type with 100 members, each flying $3000$ hours per year over an operational life of 33 years, then we have a total exposure of about $10^7$ flight hours. If hazard analysis reveals ten potentially catastrophic failures in each of ten subsystems, then the “budget” for each, if none are expected to occur in the life of the fleet, is a failure probability of about $10^{-9}$ per hour [1, page 37]. This serves to explain the well-known $10^{-9}$ requirement, which is stated as follows: “when using quantitative analyses. . . numerical probabilities. . . on the order of $10^{-9}$ per flight-hour. . . based on a flight of mean duration for the airplane type may be used. . . as aids to engineering judgment. . . to. . . help determine compliance” (with the requirement for extremely improbable failure conditions) [2, paragraph 10.b].

[1] E. Lloyd and W. Tye, Systematic Safety: Safety Assessment of Aircraft Systems. London, England: Civil Aviation Authority, 1982, reprinted 1992.

[2] System Design and Analysis, Federal Aviation Administration, Jun. 21, 1988, advisory Circular 25.1309-1A.

(By the way, it’s worth reading the rest of the paper—it’s the first attempt I know of to formally connect the notions of (software) formal verification and reliability.)
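To make the arithmetic in the quote explicit (my own gloss, not from either paper): the fleet exposure is

$\displaystyle 100 \times 3000 \times 33 \approx 10^7$

flight hours, and hazard analysis yields $10 \times 10 = 100$ potentially catastrophic failure conditions.  A budget of $10^{-9}$ per condition per flight hour then gives an expected number of catastrophic failures over the life of the fleet of about $10^7 \times 100 \times 10^{-9} = 1,$ i.e., on the order of “none expected.”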

So there is a probabilistic argument being made, but let’s spell it out in a little more detail.  If there are 10 potential failures in each of 10 subsystems, then there are $10 \times 10 = 100$ potential failures.  Thus, there are $2^{100}$ possible configurations of failure/non-failure in the subsystems.  Only one of these configurations is acceptable—the one in which there are no faults.

If the probability of a failure is $x,$ then the probability of non-failure is $1 - x.$  So if the probability of each potential failure is $10^{-9},$ and the failures are independent, then the probability of being in the one non-failure configuration is

$\displaystyle(1 - 10^{-9})^{100}$

We want that probability of non-failure to be greater than the required probability of non-failure, given the total number of flight hours.  Thus,

$\displaystyle (1 - 10^{-9})^{100} > 1 - 10^{-7}$

which indeed holds:

$\displaystyle (1 - 10^{-9})^{100} - (1 - 10^{-7})$

is around $4.95 \times 10^{-15}.$
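To see where that number comes from (a worked step I’m adding for clarity), expand the left-hand side with the binomial theorem:

$\displaystyle (1 - 10^{-9})^{100} = 1 - 100 \cdot 10^{-9} + \binom{100}{2} \cdot 10^{-18} - \cdots = 1 - 10^{-7} + 4.95 \times 10^{-15} - \cdots$

so subtracting $1 - 10^{-7}$ leaves approximately $4.95 \times 10^{-15}$ (the omitted terms are of order $10^{-22}$ and smaller).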

Can we generalize the inequality?  The hint for how to do so is that the number of potential failures ($100$) is no more than the overall failure rate divided by the per-failure rate:

$\displaystyle \frac{10^{-7}}{10^{-9}}$

This suggests the general form is something like

Subsystem reliability inequality: $\displaystyle (1 - C^{-n})^{C^{n-m}} \geq 1 - C^{-m}$

where $C,$ $n,$ and $m$ are real numbers, $C \geq 1,$ $n \geq 0,$ and $n \geq m.$
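As a quick numeric sanity check (my own addition, not part of the original argument), we can evaluate both sides of the inequality in Haskell for a few instances:

```haskell
-- Evaluate the subsystem reliability inequality numerically (Double precision).
subsysOK :: Double -> Double -> Double -> Bool
subsysOK c n m = (1 - c ** (-n)) ** (c ** (n - m)) >= 1 - c ** (-m)

main :: IO ()
main = mapM_ print
  [ subsysOK 10 9 7        -- the 10^-9 / 10^-7 case from above
  , subsysOK (exp 1) 9 7   -- an instance with C = e, as in the proof below
  , subsysOK 2 40 10       -- another arbitrary instance
  ]
```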

Let’s prove the inequality holds.  Joe Hurd figured out the proof, sketched below (but I take responsibility for any mistakes in its presentation).  For convenience, we’ll prove the inequality holds specifically when $C = e,$ but the proof can be generalized.

First, if $n = 0,$ then $m \leq 0,$ so the right-hand side is at most zero and the inequality holds immediately; the same argument covers any $m \leq 0.$  So assume $m > 0.$  Since the two sides are equal when $n = m,$ it suffices to show that

$\displaystyle (1 - e^{-n})^{e^{n-m}}$

is monotonically non-decreasing with respect to $n.$  We do so by showing that the derivative of its logarithm is greater than or equal to zero for all $n > 0.$  That derivative is

$\displaystyle \frac{d}{dn} \; e^{n-m}\ln(1-e^{-n}) = e^{n-m}\ln(1-e^{-n})+\frac{e^{-m}}{1-e^{-n}}$

We show

$\displaystyle e^{n-m}\ln(1-e^{-n})+\frac{e^{-m}}{1-e^{-n}} \geq 0$

iff

$\displaystyle e^{-m}\left(e^{n}\ln(1-e^{-n}) + \frac{1}{1-e^{-n}}\right) \geq 0$

and since $e^{-m} \geq 0,$

$\displaystyle e^{n}\ln(1-e^{-n}) + \frac{1}{1-e^{-n}} \geq 0$

iff

$\displaystyle e^{n}\ln(1-e^{-n}) \geq - \frac{1}{1-e^{-n}}$

Let $x = e^{-n},$ so the range of $x$ is $0 < x < 1.$  Substituting, and multiplying both sides by $x > 0,$ the inequality becomes
$\displaystyle\ln(1-x) \geq - \frac{x}{1-x}$

Now we show that in the range of $x$, the left-hand side is bounded below by the right-hand side of the inequality.
$\displaystyle \lim_{x \to 0} \; \ln(1-x) = 0$

and
$\displaystyle \lim_{x \to 0} \; - \frac{x}{1-x} = 0$

Now taking their derivatives
$\displaystyle \frac{d}{dx} \; \ln(1-x) = \frac{1}{x-1}$

and
$\displaystyle \frac{d}{dx} \; - \frac{x}{1-x} = - \frac{1}{(x-1)^2}$

Because $\displaystyle \frac{1}{x-1} \geq - \frac{1}{(x-1)^2}$ over the range of $x$ (multiplying both sides by $(x-1)^2 > 0,$ this is just $x - 1 \geq -1,$ which holds since $x > 0$), the derivative of the left-hand side is at least that of the right-hand side.  Since both sides agree in the limit at zero, the left-hand side is bounded below by the right-hand side over the range of $x,$ and our proof holds.

The purpose of this post was to clarify the folklore of ultra-reliable systems.  The subsystem reliability inequality presented allows for easy generalization to other reliable systems.

Thanks again for the help, Joe!