Monday, November 7, 2011

Conventional Syntax is Critical in Programming


Languages, as everyone knows, come with rules that govern when and where you can say something. Programming languages are no different. The term for such rules is syntax.

Now what amazes me is the diversity of syntax that exists among computer languages and how baroque so many of them appear.

For example, Lisp uses something called Polish notation that--in English speak--puts the verb first and wraps the whole thing in parens. Here's an example from Scheme, a dialect of Lisp, that displays the sum of two numbers:
(display (+ 20 30))
See what happened? I summed 20 and 30 and then printed the result to the screen with the display function.
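
To make the verb-first idea concrete, here's a rough Python sketch of how a prefix expression gets evaluated. It's a toy of my own, not how Scheme is actually implemented: the nested tuples stand in for the parens, and the OPS table and eval_prefix name are made up for illustration.

```python
import operator

# Hypothetical table mapping prefix "verbs" to Python functions.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_prefix(expr):
    """Evaluate a nested tuple such as ("+", 20, 30), verb first."""
    if not isinstance(expr, tuple):
        return expr  # a bare number evaluates to itself
    verb, *args = expr
    # Evaluate the inner expressions first, then apply the verb left to right.
    values = [eval_prefix(arg) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = OPS[verb](result, value)
    return result


# Rough equivalent of (display (+ 20 30))
print(eval_prefix(("+", 20, 30)))
```

The verb always comes first, so the evaluator reads the operator and then works through whatever follows it, which is why nesting needs the parens.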

Let's compare that to reverse Polish notation, which turns it all around. Here's an example from the programming language Forth that once again sums 20 and 30 and then prints the result to the screen.
20 30 + . 
Boom! How does that look? Here's what happened. Forth uses a stack to hold the values its words work on (Forth calls functions "words"). So let's trace out what I did. First, I put 20 on the stack. Then 30. Then comes the + word, which pops both numbers off the stack, sums them as plus always does, and pushes the result, 50, back on. Finally, the . word pops the top value off the stack and prints it.
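
Here's a rough Python sketch of that trace. It's my own toy and not how a real Forth is built; the run_rpn name is made up, and it only understands integers, + and the . word. Numbers get pushed, + pops two values and pushes their sum, and . pops the top value and prints it.

```python
def run_rpn(source):
    """Run a tiny Forth-like program; return (final stack, printed values)."""
    stack = []
    output = []
    for token in source.split():
        if token == "+":
            # Pop the two most recent values and push their sum.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif token == ".":
            # Pop the top of the stack and record it as printed output.
            output.append(str(stack.pop()))
        else:
            # Anything else is assumed to be an integer literal.
            stack.append(int(token))
    return stack, output


stack, output = run_rpn("20 30 + .")
print(" ".join(output))
```

Notice there's no nesting and no parens anywhere. The order of the tokens does all the work, which is exactly what makes it compact and exactly what makes it need explaining.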

If you want to see a larger example of Forth, read Dadgum's blog on What It's Like to Program Forth.

Here's the example in Python:
print(20 + 30)
Notice anything? I do. The Lisp and Forth examples require explanation. They look tricky. They are Puzzle Languages.

In contrast, the Python code is pretty straightforward. It tracks the way most of us in the West have been reading, writing, and speaking all our lives. This is important.

Programming languages are tools. They should make problem solving easier. When they don't, they need to be replaced by tools that do. I'm going to give such replacements a name: better tools.

I know that's going to be hard for some programmers to accept, but it's true. Here's why: Programming languages that track the syntax of spoken languages enable programmers to leverage their human language experience. This almost certainly accelerates comprehension when reading code and decreases the time needed to learn a given language.

Python has really been climbing the popularity charts over the past decade, and I would argue a large part of its success as a programming language is due to how simple it is to learn and remember.

And don't ignore that "remember" bit. It's in bold for a reason. Just learning a language isn't enough; you also must remember the language when you need it. Ten years ago, Perl was the language everybody used for scripting, but not anymore. Perl's syntax is tricky and hard to remember, which is probably why other languages like Python, Ruby, and PHP have been stealing away so many programmers who would likely have counted themselves as Perl programmers back around 2000. For that matter, many of them were Perl programmers back then. But not anymore.

I'm arguing that straightforward syntax is one of the most important features a programming language can have and that lacking such syntax can result in the language declining in popularity or never really getting adopted at all.

This is actually coming up in my own experience. One of my favorite languages right now is Erlang. It's pretty fast, really powerful, and has a ton of features that make programming easy. But its syntax is a mess. Just a mess. (Pop quiz: what does the less-than-or-equal operator look like in Erlang?) That worries me. It's a problem. It means that given a choice between Erlang and another language with more conventional syntax, Erlang might fail to gain mind share--even if it's technically superior--and that's the death sentence for programming languages. Not right away, of course. Languages don't die suddenly. It's more like a creeping death where new programmers don't bother to learn the language and old programmers slowly migrate away.

But it's death all the same.
