Posted by eitan
on April 22, 2005 at 8:53 AM PDT
somewhat incoherent, though perhaps interesting musings on languages and apis
i've recently been thinking about two related topics:
- on programming languages vs programming apis
- on the relationship between human languages and programming languages
the sources of these thoughts have been coming from two separate corners:
1. discussions we sometimes have at nfjs symposia on the ever-popular question "which language should i program in" or "which language is best;"
2. my incidental knowledge of french, hebrew, and english and their bearing on my thoughts about software designs
everyone has an opinion on the first issue of 'which language is best.' my personal opinion is that different languages suit different people. that is, there is no such thing as one language that fits all people (or purposes, or both). i believe our liking of a programming language is somewhat related to our character and our intellectual capabilities. most _really_ smart people i know love languages like LISP, and in general languages that are extremely terse. my explanation (to myself) for this is that the terseness is not a problem for really smart people. once you learn a language it becomes second nature. then what really matters is how terse it is. perl, for example, is very terse: you have really smart people who absolutely love perl, and some not so smart people (like me) who like it less.
but what does smart mean anyway? my wife tells me i'm smart and that makes me happy. but i know that i'm not the kind of person who will solve a difficult problem in real-time. i also know that in real-time conversations i'm a bore. where i do ok is when i take my own time to think about something. i'll resurface and by then i will have thought about the problem, brooded over it, understood it a little better, and so my proposed solution is usually a little better than what others propose. at least that seems to be my own subjective analysis of myself. so in the previous paragraph, by "smart" i mean people who can think quickly in real-time. in that respect i'm somewhat of a disappointment. :-)
i'm not so interested in the question "which language is best" anyway. that was just a trigger that led me to thinking about programming languages vs programming apis.
to me, one language can be replaced with another. the essence is elsewhere: in the api. here's an example: pdflib (see http://www.pdflib.org/ ). pdflib is a library for working with pdf documents. here are a few facts about pdflib:
1. The PDFlib core is written in the ANSI C language.
2. PDFlib supports language bindings for all common programming environments:
that is, pdflib clients can be written in:
c, php, java, .NET (.WHAT?), Perl, Python, Cobol, and more!
the striking thing is that the way you interface to the api in perl compared to java is essentially the same. pdflib is indifferent to the language used by its clients.
i think the same is true for other types of apis. take interfacing to a database as an example. nearly every language has a jdbc-like api for interfacing to a db. you submit sql queries, get back resultsets, iterate over the resultset, extract field values, etc. whether you do this in perl or in java, the code is pretty much the same (albeit the perl version is more terse or more obfuscated, depending on how you look at it). so what defines the way the program is going to be written is the database api, and not so much the language.
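to make the "submit a query, iterate over the resultset" shape concrete, here's a minimal sketch in python using the standard sqlite3 module. the table, its contents, and the function name fetch_languages are made up for illustration; the point is only that the steps (connect, execute sql, loop over rows, pull out fields) look much like what you'd write against jdbc or perl's dbi.

```python
import sqlite3


def fetch_languages(conn):
    """Run a query and turn each row of the result set into a string."""
    cursor = conn.cursor()
    cursor.execute("SELECT name, year FROM languages ORDER BY year")
    # The cursor is iterable: each row comes back as a tuple of field values.
    return [f"{name} ({year})" for name, year in cursor]


# Hypothetical in-memory database, created only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE languages (name TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO languages VALUES (?, ?)",
    [("lisp", 1958), ("perl", 1987), ("java", 1995)],
)

rows = fetch_languages(conn)
conn.close()
```

swap the connection line for another driver and the rest of the code barely changes, which is the sense in which the api, more than the language, shapes the program.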
so my point is this: the api is a language. it's a domain-specific language. how you design an api has direct impact on whether it feels natural (e.g. jdom) or not (e.g. w3c dom). and to some extent, that seems to be independent of [a] the programming "language" you use to write the api or [b] the programming language you use to write the client.
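a small sketch of what "natural or not" can look like, under some assumptions: python's xml.dom.minidom follows the w3c dom model, and i'm using python's xml.etree.ElementTree as a stand-in for a jdom-style api (it is not jdom, just a similarly direct design). the xml snippet and the function names are invented for the example. both functions do the same job in the same language; only the api differs.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical document used only for this comparison.
XML = "<post><title>languages and apis</title><author>eitan</author></post>"


def title_via_w3c_dom(xml_text):
    """W3C DOM style: find the element, then reach into its child text node."""
    doc = minidom.parseString(xml_text)
    title_element = doc.getElementsByTagName("title")[0]
    # In the DOM model the text is a separate node hanging off the element.
    return title_element.firstChild.nodeValue


def title_via_elementtree(xml_text):
    """ElementTree style: ask the parent for the text of a named child."""
    root = ET.fromstring(xml_text)
    return root.findtext("title")


dom_title = title_via_w3c_dom(XML)
tree_title = title_via_elementtree(XML)
```

the two results are identical; what changes is how many concepts (documents, node lists, text nodes) the client has to hold in their head to get there.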
so it turns out that we're developing languages every day in our programming work but we may not really be conscious of this fact (maybe we should call ourselves computer linguists instead of computer programmers).
let me turn now to the other thoughts in my head: the thoughts about languages. i'm most fluent today in english. this is no surprise since i've been living in the usa for the last 20+ years. but i'm in a somewhat unique position. i look at code i've written a number of years ago and think to myself: "did i write this code? it's terrible! it shouldn't be designed this way." i take this as a sign that i've become a better software designer (or maybe just more pompous) in the last few years. finding the right design is much more natural to me today than it was, say, 5-6 years ago. the principal design tools i believe one needs are: refactoring + design catalogs.
my current understanding of software design is shedding light on the design of human languages. i look at english. english is _the_ international common language. it's spoken almost everywhere. i believe english's design is not so ideal, not so great. other languages have better designs. i believe that java is to programming languages what english is to human languages. i don't see a rush for english speakers to find another language to express themselves in because of "inadequacies of expression" that exist in the english language (inadequacies that may not exist in other languages).
i look at hebrew specifically. what really strikes me about hebrew is that it looks like a language that was designed, not one that emerged; almost a "cleanroom" design. but the same goes for evolution: this world looks like it was designed (in the hitchhiker's guide you actually get to meet the designer) although it really evolved. so maybe the reason hebrew looks like a better-designed language is that it's older than english and has had more time to evolve (this reasoning is flawed though because hebrew was a dead language for 2,000 years and was revived only in the last century; the elements of the hebrew language i describe below were present in the language as far back as 1200 BC during the exodus).
if only i could describe to you the design of the hebrew language. it's beautiful. there is a notion of a "root" which most of the time is composed of three consonants. i don't even know how to describe it. you can say that almost the entire vocabulary of the hebrew language is a set of roots. then from each root come various derivatives. there is a consistent way to "transform" a root for conjugation, to create a verb form, noun form, to imply possession, and much more. like everything else, an illustration is more revealing than a description. take for example the root "YLD" which means "child" or is the concept that is related to the word "child." yes, a Y is a consonant in the hebrew alphabet. that's because a consonant in hebrew is defined as "that thing that goes in between two vowels." take for example the word "maya" -- the "y" goes in between two vowels. by the way, since the latin alphabet is a derivative of the hebrew alphabet (or maybe they have a common ancestor), the Y ("yud") corresponds to the letter "J" in latin, which i believe in german is pronounced "Y" but in english for some reason has morphed into the current "Jay" sound. so jonathan for example was never called jonathan, but yonatan. but i digress.
so there is this model in hebrew that all writing is basically an alternating pattern of consonants and vowels. as in "pomade" - 3x(consonant followed by vowel). to handle the exceptional case of two consonants back to back, there is the notion of a null vowel: it's called the "shva." and there is a null consonant too: that's the letter "aleph" (which somehow morphed into the letter "a" in latin). it turns out that hebrew doesn't even have vowels: it's all consonants. i mean they're there but they've become implicit in the language. there are no letters for vowels. people have discovered they can read and write faster if they just omit the vowels altogether. but again i digress.
ok, so back to YLD. "yeled" means boy. "yalda" means girl.
any word that is somewhat related to the concept of childhood is derived from that root. take birthday: that's two words: birth and day. the "birth" in birthday is "holedet." ok, the y has disappeared but take my word for it: the 'l' and 'd' come from the root yld. "toldot" means "the telling of the generations of" or "recounting the genealogy of" which occurs frequently in the bible (as in "these are the generations of jacob (israel)."). "toldot" again derives from yld. "yaldut" means childhood. "lehivaled" means "to be born." "yelid" means "native." so in english we have ten words that have no phonetic relationship to one another: boy, girl, birth, genealogy, child, etc. (only a relationship in meaning). in hebrew they're all related both in meaning and phonetically. i realize that latin and greek have that concept as well. "gene" in genealogy is essentially a root. and there are many words in english that derive from the "gene" root such as "generation", "genes", "gender", etc. that's why people recommend learning latin and greek: because so many words in the english language derive from latin and greek roots.
now that i think of it, i think hebrew could use a javadoc-like tool that will tell you all words that derive from a specified root. kind of like a "descendants" cross-reference in the java almanac (or ashkelon). maybe there's already one out there, who knows.
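here's a toy sketch of what such a "descendants" lookup might look like. the lexicon is just the handful of yld words mentioned above, entered by hand with simplified glosses; a real tool would need a real dictionary, and the names LEXICON, words_by_root and derived_from are all hypothetical.

```python
from collections import defaultdict

# word -> (root, gloss); entries are the examples from this post only.
LEXICON = {
    "yeled": ("YLD", "boy"),
    "yalda": ("YLD", "girl"),
    "holedet": ("YLD", "birth"),
    "toldot": ("YLD", "generations"),
    "yaldut": ("YLD", "childhood"),
    "lehivaled": ("YLD", "to be born"),
    "yelid": ("YLD", "native"),
}


def words_by_root(lexicon):
    """Invert the lexicon into root -> alphabetical list of derived words."""
    index = defaultdict(list)
    for word, (root, _gloss) in sorted(lexicon.items()):
        index[root].append(word)
    return dict(index)


def derived_from(root, lexicon=LEXICON):
    """Return every known word derived from the given root (case-insensitive)."""
    return words_by_root(lexicon).get(root.upper(), [])


yld_words = derived_from("yld")
```

it's the same idea as a class hierarchy browser: pick a parent, list the descendants.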
what's even more important is that the rules for deriving words from their root do not change from one root to another. so if you learn the rules once, you can now generate almost the entire hebrew language from its set of roots. it's like the root being a machine and the user interface for that machine being the set of rules you apply to generate sentences. you learn how to operate one machine, and now you know them all. that reminds me of naked objects: naked objects does that for user interfaces: it generates a _standard_ ui for any set of types you desire. if you know how to work with one type of object, you've just learned the user interface for all of them. (that's why i believe that hebrew is an easy language to learn: it's less redundant: it's well refactored.) that's powerful stuff. the type can be parameterized. just like in hebrew the root can be parameterized. that's what the java reflection api gives us: programming model constructs are elevated to types, which can, among other things, be used as parameters. the type can be treated in a standard way. that's also the reason for the command pattern: elevating a method to a type.
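the "root as parameter" idea can be sketched in a few lines. this is a deliberately simplified model, not real hebrew morphology: the patterns below are my own reverse-engineering of four of the yld words above, written with digits 1-3 standing for the root's three consonants, and the names PATTERNS and apply_pattern are made up.

```python
# Each pattern is a template: digits 1-3 are slots for the root's consonants,
# everything else (vowels, suffixes) is copied as-is. Keys are the glosses
# of the YLD words the patterns were taken from.
PATTERNS = {
    "boy": "1e2e3",
    "girl": "1a23a",
    "childhood": "1a23ut",
    "native": "1e2i3",
}


def apply_pattern(root, pattern):
    """Fill a pattern's numbered slots with the consonants of a 3-letter root."""
    if len(root) != 3:
        raise ValueError("this toy model only handles three-consonant roots")
    consonants = root.lower()
    return "".join(
        consonants[int(ch) - 1] if ch.isdigit() else ch for ch in pattern
    )


# The rules are written once; the root is just an argument.
yld_forms = {gloss: apply_pattern("YLD", p) for gloss, p in PATTERNS.items()}
```

feed the first three patterns a different root, say "MLK", and you get melek, malka, malkut, which is close to the actual words for king, queen and kingdom (usually transliterated melekh, malka, malkhut). real hebrew adds sound changes, and not every root takes every pattern, so treat this as an illustration of the design and not as a word generator.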
so back to java. java is a fairly new language. i think we're slowly witnessing the evolution of a programming language. i think java has been evolving remarkably well. we wanted assertions, we got them. we wanted complementary apis that were not in j2se, we got them from apache and sourceforge and all kinds of places. we wanted a simpler way of doing iteration: we got it in java 5. we wanted parameterized types / generics: we got them. we also got annotations. we'll see which will survive and which will not. they might remain in the dna.. i mean in j2se. but they might not be used as much (e.g.: the preferences api). we are also experimenting with other languages that are byte-code compatible with java, groovy being an example.
i know there are a few things in j2se that haven't exactly evolved very much. the issue of ownership can also sometimes get in the way of evolution. maybe evolution deals with that by being slow. people die and then no one is left to fight for ownership .. of ideas, at least.
well, if you made it this far the only thing i can say is: thanks!