October 15, 2011

Very high level programming

James Iry has written an excellent post on why calling C “portable assembly” is wrong.  The part that got me thinking is the one about undefined behaviour.  He says:
The C standard says a bunch of stuff leads to undefined behavior. Real compilers take advantage of that to optimize away code, sometimes with surprising results.

What assembly has undefined behavior on that scale? mov al, [3483h] ;(Intel syntax) is going to try to copy the byte at 3483h into the al register, period.
Essentially, compilers understand the intent of our statements and can rewrite the program for us if they can prove that the rewrite does not change the program’s behaviour.  Because compilers understand a lot of “low level” operations, they are smart enough to rewrite those operations for us.

If you’re a programmer, you will have noticed that a formidable amount of coding time is spent on handling corner cases.  Most of this corner case complexity exists because our tools (compilers, runtimes, etc.) understand so little of the program’s intent.  Let’s take an example code snippet:
List<Integer> list = new ArrayList<Integer>();
for (int i = 0; i < 10; ++i) {
  list.add(300 * 2);
}
Compilers of today can rewrite this code as:
List<Integer> list = new ArrayList<Integer>();
int __tmp = 300 * 2;
for (int i = 0; i < 10; ++i) {
  list.add(__tmp);
}
This is a very simple optimization that saves CPU time.  This is possible only because our compilers know that a loop repeats the execution of its body over and over again, and that the value of 300 * 2 cannot change over time.

All the languages and other tools we have today are only aware of low level constructs, or code blocks.  They don’t understand, for instance, the connection between two functions in a program.  What I think the future holds for us are tools that understand very high level semantics: tools that can detect if you have two conflicting features in your product.

Another example.  If a Blogger blog uses a classic template, that blog should not have a Layouts tab in the UI, because the Layouts tab is only for manipulating a Layouts template.  Currently this idea is beyond the scope of tools, so we implement it manually: there’s a block of code that determines whether the Layouts tab can be shown for a specific blog.
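To make that concrete, here is a minimal sketch of what such a hand-written check might look like.  Every name in it (LayoutsTabCheck, TemplateType, canShowLayoutsTab) is made up for illustration; it is not Blogger’s actual code.

```java
// Hypothetical sketch of a hand-written UI rule; not Blogger's real code.
final class LayoutsTabCheck {
    // Assumed for illustration: a blog uses exactly one of two template kinds.
    enum TemplateType { CLASSIC, LAYOUTS }

    /** The manual rule: only blogs on a Layouts template get the Layouts tab. */
    static boolean canShowLayoutsTab(TemplateType template) {
        return template == TemplateType.LAYOUTS;
    }

    public static void main(String[] args) {
        System.out.println(canShowLayoutsTab(TemplateType.CLASSIC)); // false
        System.out.println(canShowLayoutsTab(TemplateType.LAYOUTS)); // true
    }
}
```

Nothing outside this method knows the rule exists.  Any other code path that builds a link to the Layouts tab has to remember to call it.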

If a code change accidentally makes the Layouts tab available to classic template blogs, no tool can find it today.  The only way we can detect such an error is by having tests.  Awesome as they are, tests are just band-aids.  Some day there will be tools that know that classic templates and the Layouts tab are in conflict.  Those tools will flag an error if we link to the Layouts tab of a classic template blog.  (In other words, a few more levels of indirection are due.)
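As a rough illustration of the direction, the conflict could be stated once as data and checked mechanically against every link the product renders.  This is only a toy sketch of the idea, not an existing tool; ConflictChecker, Feature, Link and findConflicts are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy sketch: conflicts are declared as data and checked by a generic pass.
final class ConflictChecker {
    enum TemplateType { CLASSIC, LAYOUTS }
    enum Feature { LAYOUTS_TAB, SETTINGS_TAB }

    /** A link that some code path would render for a blog with this template. */
    static final class Link {
        final TemplateType template;
        final Feature target;

        Link(TemplateType template, Feature target) {
            this.template = template;
            this.target = target;
        }
    }

    // The conflict is written down once instead of being buried in UI code.
    static final Map<TemplateType, Set<Feature>> CONFLICTS =
            new EnumMap<>(TemplateType.class);
    static {
        CONFLICTS.put(TemplateType.CLASSIC, EnumSet.of(Feature.LAYOUTS_TAB));
    }

    /** Returns one message per link that points at a conflicting feature. */
    static List<String> findConflicts(List<Link> links) {
        List<String> errors = new ArrayList<>();
        for (Link link : links) {
            Set<Feature> banned =
                    CONFLICTS.getOrDefault(link.template, EnumSet.noneOf(Feature.class));
            if (banned.contains(link.target)) {
                errors.add(link.template + " conflicts with " + link.target);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<Link> links = new ArrayList<>();
        links.add(new Link(TemplateType.LAYOUTS, Feature.LAYOUTS_TAB));
        links.add(new Link(TemplateType.CLASSIC, Feature.LAYOUTS_TAB)); // the accidental change
        for (String error : findConflicts(links)) {
            System.out.println(error); // prints "CLASSIC conflicts with LAYOUTS_TAB"
        }
    }
}
```

A real tool would have to discover the links by analysing the program rather than being handed a list, and that analysis is exactly the missing piece.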
