Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Somewhat off topic and very speculative, but I'm curious how feasible it would be to propagate safety proofs through compilation - not just formulaic memory safety rules but hopefully also arbitrary behavioral proofs - all the way down from a source language to machine code, so that essential properties could be formally verified without needing to either trust a compiler or use a provably correct one, in the latter case with corresponding difficulty of modification and low optimization level. The compiler would still have to be modified to do the propagation, and a machine code model and verifier constructed, but theoretically this would be easier than proving the whole thing works correctly.

I guess Typed Assembly Language works along these lines:

http://www.cs.cornell.edu/talc/

but I haven't read up on the papers, and it seems outdated.

My imaginary end goal (not that I'd be able to do anything remotely as ambitious myself, but I still like to think about it) is an operating system where all code is run in kernel mode after being checked for safety - like Singularity OS but without trusting a compiler.

Perhaps that trust doesn't actually matter very much, since the compiler is unlikely to contain too many exploitable bugs (AFAIK most Java vulnerabilities are not related to the JIT, for instance), and there are plenty of other places in such an operating system bugs could hide anyway. But it's inelegant to require all code to go through a single compiler. For example, it would be cooler if the assembly verifier were not baked into the system, but simply a program proven to correctly check whether some code is safe in the machine code model; if you (any user) could prove a JIT never generates unsafe code, you could submit the JIT in place of the verifier, and run wild with it without going through any slow compilation or verification processes.



I think you're not thinking far enough. Why not have a compiler that can take your specification and just write your program for you? You'll probably need to write some code yourself, which will then interface with the written-from-formal-proof code a-la quark.

My intuition is that a specification that can be checked and is good enough to guarantee that your program is 100% correct should be enough to compile a full program from, possibly with some hand-written lower-level code for guidance so it doesn't fall in pathological cases like "the empty program satisfies these constraints and is easiest to generate, so here".


> Why not have a compiler that can take your specification and just write your program for you?

One difficulty with this is that programs (by definition) are 'computationally relevant' whereas proofs are not. In other words, as long as we have a proof of X it doesn't make a difference which proof we happen to have. On the other hand, different functions of type X can have a big impact on a program.

For starters, there are properties which are difficult to express using types. For example, we only have rudimentary ways to encode space and time usage (eg. 'cost semantics'). Without this, when we ask for a sorting algorithm we may get back bubblesort, since it's a perfectly acceptable implementation of a sorting algorithm.

Also, our types will have to become incredibly precise. Rather than just encoding the properties we care about (eg. security guarantees), we need to include lots of uninteresting properties to guide the computer to what we want (compare this to guiding a genetic algorithm via a fitness function, or getting a genie to grant you a wish in exactly the way you want). At this point, you're basically writing your program in a very indirect way; you may be better off just writing one or two lines 'manually' instead of trying to steer the automated process.


What exactly is doing the verification? Another program, no? In which case why would you trust that program any more than the compiler that generated the machine code? At some point you must trust something to be correct. Whether that's the compiler or something else doesn't really change anything.


The idea is to minimize the trusted computing base---a compiler and runtime system for a high level language is a big and complicated system, so it is very hard to get it right, but a proof checker can be small so it is easier to be confident that it is correct. (This idea is also known as the "de Bruijn criterion").


Ideally, we should minimize the code in any given path to assurance, but maximize the number of independent paths to assurance (modulo resource constraints).


There's some academic work in this area under the name of "proof-carrying code": http://en.m.wikipedia.org/wiki/Proof-carrying_code


On the topic of trusting compilers:

http://www.dwheeler.com/trusting-trust/


you might be interested in this work: http://research.microsoft.com/pubs/176601/js-star.pdf




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: