Saturday, June 27, 2009

The Python Platform

Update: I believe that the example for this essay on Python as an incompatible platform is incorrect. The bug report that I submitted was updated 3 months later to indicate that I'd read the documentation wrong. I think the core point here is still valid. There's a lot of "not invented here" applied to the UNIX and Linux conventions in Python, but I chose a bad example, and for that I should apologize to the Python community. I like Python. I like programming in Python. I don't want to make it sound like I'm dismissing the language, here, just a particular trope in the community.

When Java came out, I remember the promise that it would be the write-once-run-anywhere language. It was supposed to free programmers from the need to tie their code to a platform, and instead they could simply write it. This never really happened. Instead, what we got was the Java (or more accurately the JVM) platform. It wasn't really a great platform as these things go, and for a short time that confused me. I wasn't sure why the smart people at Sun would be unable to create a decent platform on which to write generic code.

Then it came to me... It was Windows. You see, Sun had a pretty decent little operating system called Solaris (né SunOS), but Java was supposed to work everywhere, so at a minimum, it would have to accommodate the world's most popular desktop platform at the time (and still, though it has less market share now): Microsoft Windows. Windows has its own ideas about how a system should manage users, permissions, networking and a host of other things that programs want to interact with, so Java couldn't allow the same code to run everywhere while exposing the powerful semantics of the Solaris operating system. More broadly, it couldn't expose those core Solaris semantics that came from its Unix heritage, embodied in the POSIX standards. These standards are what make the standard libraries of C, C++, Perl and many other languages so powerful, and because of that power they are also widely portable. So useful are these standards that they have molded operating system after operating system, all based to some extent on Unix. Today this includes Linux, MacOS and a plethora of lesser-known systems such as HP-UX and AIX, all of which have important niches in various industries.

So, in the late 90s a new language started to gain popularity: Python. It didn't fall for Java's trap entirely. It was mostly in league with the POSIX way of thinking. Process management, file IO and many other aspects of the language were all very reminiscent of POSIX. However, Python suffered from a new problem: it was the anti-Perl. Perl, you see, is a programming language that became very popular in the early 90s, and Guido van Rossum, Python's original author, made his feelings about Perl fairly clear early on. He wasn't fond of it, and Python was going to avoid its mistakes.

While correcting the perceived mistakes of another language might seem a noble goal, it has several pitfalls which must be avoided. One of the most obvious of these is avoiding something only because the original language embraces it. Python has had a rocky relationship with POSIX for just this reason. You see, Perl is a deeply POSIX-based and, even more specifically, Unix-based language. Python, as I said, is mostly a POSIX-friendly language as well, but there's a silent mistrust within the community of the platform that Python's nemesis language so readily embraces, and this has led to a number of almost-entirely-POSIX-friendly choices which, when seen as a whole, yield the Python Platform.

This platform is not entirely POSIX-compatible, which means that users of Python and Python programs on both POSIX and Windows systems must adapt to it, in the same way (but to a lesser extent) that they must adapt to Java's platform.

OK, so that's the generalities, but what about specifics? One must look no further for a simple example than the Python standard library's command-line processing module, optparse. This module has a simple documentation bug, but that bug illuminates the Python Platform in stark detail. Here's an excerpt:
"... the traditional Unix syntax is a hyphen ("-") followed by a single letter [...] Some other option syntaxes that the world has seen include: a hyphen followed by a few letters, e.g. "-pf" [...] These option syntaxes are not supported by optparse, and they never will be. This is deliberate: the first three are non-standard on any environment[...]"

This sounds reasonable. After all, why support oddball features? Well, it turns out they're not so oddball. The POSIX standard says that the compliant program "accepts any of the following as equivalent: 'cmd -ao arg path path', 'cmd -a -o arg path path' ..." Notice that traditional Unix and POSIX programs such as the "ls" command will always accept these concatenated arguments. So why would Python tell us that this is non-standard on any environment? That goes back to the mistrust that Python has for POSIX. There's no compliance testing for the Python library's POSIX support because Python isn't a POSIX language. It's a Python language.
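For anyone who wants to check the behavior rather than the documentation, here is a minimal sketch that feeds both forms of the POSIX example to optparse. The options -a (a plain flag) and -o (an option taking an argument) simply mirror the POSIX wording, and the helper names build_parser and parse are made up for illustration. In line with the update at the top of this post, optparse treats the two forms as equivalent.

```python
from optparse import OptionParser


def build_parser():
    """Hypothetical parser mirroring the POSIX example 'cmd -ao arg path path'."""
    parser = OptionParser()
    # -a is a plain flag; -o takes one argument (both names are illustrative).
    parser.add_option("-a", action="store_true", dest="a", default=False)
    parser.add_option("-o", action="store", dest="o")
    return parser


def parse(argv):
    """Return (a, o, positional_args) for the given argument list."""
    opts, args = build_parser().parse_args(argv)
    return opts.a, opts.o, args


# The two spellings that POSIX says a compliant utility treats as equivalent.
clustered = parse(["-ao", "arg", "path", "path"])
separate = parse(["-a", "-o", "arg", "path", "path"])

print(clustered)
print(clustered == separate)
```

The quoted passage from the optparse documentation is talking about a different syntax: a single hyphen followed by a multi-letter option name, which is not the same thing as several single-letter options merged into one argument.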

I decided to take the cautious approach here. I didn't want to set anyone off, so when I submitted the following alternate wording to the Python folks:
"optparse has chosen to implement a subset of the GNU coding standard's command line interface guidelines, allowing for both long and short options, but not the POSIX-style concatenation of short options."

... I kept it simple and factual. I didn't attempt to suggest a rationale, and I made it clear that I wasn't asking for Python's behavior to change, just the documentation. This bug report sat for three months and then was silently lowered in priority, even though it required nothing more than a cut-and-paste documentation change.

This is but one example of where Python chooses to go its own way, eschewing the wisdom of a platform that has suited the likes of Sun, Apple, HP, IBM and countless FOSS developers for decades now. I like Python. I think it's a great language for certain types of tasks. I'm also a fan of its nemesis, Perl, along with a host of other languages I've used over the years, but time and time again, I see the Python community embrace a culture of exclusion and "not invented here." For Python to truly reach the potential that I'm sure it has, it will have to be able to embrace the tools that have worked well and only discard features which have been carefully considered and understood.

There's hope, though, that this will be the case. Guido's distaste for Perl and some other Unix tools like it may have fueled this fire, but it turns out he's a very reasonable person for the most part. In a recent blog post, he says, "It's no wonder that users are switching to the web as the platform for everything that used to live on the desktop -- with all its flaws (which I will discuss another time), web development still feels like a breeze compared to Windows development." I take this as a sign that he understands the power of a platform which works with consistency at all levels, and that he will continue to improve the Python Platform so that it builds synergy with the POSIX platform, and doesn't fight against it as if it were an opponent to be conquered.