Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

"most reasonably large C programs contain assembly corresponding to unsafe pointers, unions, etc."

I'm not sure what you're saying here with the "corresponding to" - obviously you can write "unsafe pointers", unions, etc, right in C, and I've not really ever encountered inline assembly for the purposes of subverting the type system. I'll respond to more of your content when I'm confident I've understood you (which I don't mean as a brush off...).



My response was poorly worded, but what I meant is that compilers for other languages with higher level type systems (which make more guarantees) cannot emit the same assembly as a C compiler, not that assembly is needed to subvert C's type system.


Why not? Can you show me an example of what you mean?

EDIT: I think the difference here is between 'types' and 'tags'. Types are at compile time. Tags are at runtime.


All right, let's see if I can express what I'm talking about properly. I don't have much background in theory, but what I'm trying to say is pretty trivial, so...

The article talks about translating from a typed lambda calculus to the untyped variant. Real compilers typically translate from the original typed AST gradually through multiple IRs to assembly, which is more complicated since they're changing the form of the program rather than just dropping type annotations, but is generally analogous. What I care about is whether there are programs with "terms" - instruction sequences - which correspond to useful programs which cannot be "typed" in a particular language, i.e. they cannot be produced by a compiler following any kind of straightforward procedure to lower from that language. Of course you can always come up with an equivalent program as long as the language is Turing complete; in particular, just as untyped lambda calculus can also be considered unityped, you can do something like Emscripten where you mechanically translate low level IR into a tiny subset of a higher level language, and maybe the compiler will optimize that subset. But that's not that interesting, because the general case is changing the program to a completely different one, and the special case removes all of the benefits of the type system. Additionally, you can have a language like Rust that lets the programmer poke holes in the type system, which is useful in practice but makes the type system no longer guarantee what it says it guarantees.

So, of course, with these limitations, not only are there correct programs which typical, non-dependent type systems cannot express, but, judging by the high performance of C (which has a type system, but one that provides barely any guarantees and thus has barely any limitations) in practice, these programs are quite common and useful in practice. Dependent type systems, along with their Curry-Howard mirror of theorem provers for low-level languages, can almost always make up for this (though by Gödel's incompleteness theorem there is some correct program somewhere that they can't prove correct), but they're a relative pain to program in and for that reason are uncommon in practice. Or to put it another way, you have a choice between delegating verification of complex properties about programs (the types which are there whether you want them to be or not) to computers, which have rather poor insight, or to humans, which are notoriously unreliable. The ultimate type system is a general AI you can explain the situation to in English.


> Additionally, you can have a language like Rust that lets the programmer poke holes in the type system, which is useful in practice but makes the type system no longer guarantee what it says it guarantees.

FWIW, Rust has a pile of behaviours that are undefined behaviour[1]: that is, the compiler will optimise/reason assuming they never happen. These are impossible to hit in safe code, but are possible with `unsafe`, it's up to the programmer to manually maintain them. That is, the type system does guarantee certain things, and it's up to the programmer to not break it inside `unsafe`.

[1]: http://doc.rust-lang.org/master/rust.html#behavior-considere...


If you want your high-level language to be as fast as C, it needs provably safe manual memory management (a.k.a. linear types) and provably safe unchecked array access (a.k.a. dependent types). Most high-level languages don't have these features, because in their current state they are really hard for programmers to use. See ATS for an example.


My parent claimed 'cannot,' not 'most languages don't have the features to be able to.'


There is some existing work on typed assembly languages (having a compiler emit assembly language that can be verified to be correct), although so far this is more of a research thing.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: