Thursday, July 14, 2022

Essay 3: C is misunderstood

C is quirky, flawed, and an enormous success. -Dennis Ritchie


Imagine, if you will, that you are living in the 1970s, when computers are still in their infancy. You have a simple machine with a simple instruction set, and you want to program it. What will you do?


1. Rewire the circuit

2. Enter hex numbers, backwards

3. Enter decimal codes

4. Enter mnemonic assembly

5. Use FORTRAN

6. Use Forth

7. Use BASIC


Most programmers at the time used assembly language. Higher-level constructs were considered inefficient, fit only for unskilled clerks. The assumption behind high-level languages was that if you can say it in English, then you can say it in computerese. However, you do need to say it in restricted English. In other words, pseudocode.


No one codes in pseudocode, except perhaps Donald Knuth with his Literate Programming. However, my point is simple: translating pseudocode from English into a high-level computer language is so easy, anybody can do it!


Forth, a stack-based language, is unique in that it's simple to code, yet capable of powerful functionality. You don't see much Forth nowadays, but the spirit lives on in PostScript, the stack-based page description language that is the ancestor of the Portable Document Format (PDF). Another simple language is LISP (List Processing) and its dialect cousin Scheme. You may see it in the Emacs scripting engine.
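
To make "stack based" concrete, here is a minimal sketch in C of a postfix evaluator in the spirit of Forth. It is not Forth itself: the function name rpn_eval, the fixed stack size, and the four-operator vocabulary are my own assumptions for illustration. Numbers are pushed on a stack, and each operator pops two values and pushes the result.

```c
#include <ctype.h>
#include <stdlib.h>

#define STACK_MAX 64  /* arbitrary limit, chosen for this sketch */

/* Evaluate a postfix expression such as "2 3 + 4 *".
 * Returns 0 on success and stores the result in *out.
 * Returns -1 on stack underflow or overflow, division by zero,
 * an unknown word, or when the stack does not end with one value. */
int rpn_eval(const char *src, long *out)
{
    long stack[STACK_MAX];
    int depth = 0;

    while (*src != '\0') {
        if (isspace((unsigned char)*src)) {
            src++;
            continue;
        }
        if (isdigit((unsigned char)*src)) {
            char *end;
            if (depth == STACK_MAX)
                return -1;
            /* A number is simply pushed. */
            stack[depth++] = strtol(src, &end, 10);
            src = end;
            continue;
        }
        /* Anything else must be a one-character operator:
         * pop two operands, push one result. */
        if (depth < 2)
            return -1;
        long b = stack[--depth];
        long a = stack[--depth];
        switch (*src++) {
        case '+': stack[depth++] = a + b; break;
        case '-': stack[depth++] = a - b; break;
        case '*': stack[depth++] = a * b; break;
        case '/':
            if (b == 0)
                return -1;
            stack[depth++] = a / b;
            break;
        default:
            return -1;
        }
    }
    if (depth != 1)
        return -1;
    *out = stack[0];
    return 0;
}
```

Calling rpn_eval("2 3 + 4 *", &result) leaves 20 in result. The whole "parser" is one loop and one switch, which is a large part of why a language of this kind fits in so little memory.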


Of course, there are BASIC and LOGO, as well. Those are mostly limited to educational institutions. The thing about them is that they're easy to code. A minimal BASIC, famously, fits in less than 8K. LOGO and a more sophisticated BASIC need a bit more memory, 8K or so. I have personally written a LISP-based interpreter in as little as 300 lines. Other students did it in about 600 lines. But I digress. The point is that creating a programming language was by necessity an exercise in small-scale expenditure. Computers of the day just weren't capable of handling a large memory footprint.


So, then, the question that needed to be answered at the time was "What is the most efficient way to program a computer that has the ease of a high-level language, yet is capable of machine-language execution speed?" And that was the goal of the project. It began with BCPL; then a simplified version of it, B, was conceived, which was in turn improved into C.


C was never conceived as a high-level language. It amuses me to no end that people nowadays call C a high-level language. Maybe C++, although that is debatable, too. You see, C++ was originally implemented as a front end that translated C++ into C. C++ compilers now generate ASM directly, so that they can take advantage of the language constructs to optimize better. However, at the time, C++ to C to ASM to ML was quite common. The reason was "ease of implementation."


If you have the hardware, you have Machine Language (ML). In order to ease the programming process, Assembly language (ASM) is used. A C compiler compiles to ASM, which is then assembled and linked into ML. It's trivial to write an assembler. It's not difficult to write a Tiny BASIC compiler either, although, having done so, I can testify that Tiny BASIC has rather limited utility.


So, a more powerful language was desired, and the priority was something that is easy to write and compile. Hence, C keeps the language itself small and leans heavily on standard libraries. About the only library that needs custom work is stdio, which is done in machine language. Most other libraries are actually written in C.


If you look at the libraries, most of them are really simple and easy to write. Once you have the core C compiler going, you can implement the rest really, really easily. Hence the nearly ubiquitous presence of C compilers. A C compiler is relatively quick and easy to write, at least in principle, before the desire for a more powerful language comes into being.
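
To show how little code such library routines need, here is a sketch of three string functions written in plain C. The my_ prefix is my own naming, used only to avoid clashing with the real string.h functions. These are illustrative versions, not the code of any particular C library.

```c
#include <stddef.h>

/* Count the characters before the terminating '\0'. */
size_t my_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Copy src, including its terminating '\0', into dst.
 * The caller must make sure dst is large enough. */
char *my_strcpy(char *dst, const char *src)
{
    char *d = dst;
    while ((*d++ = *src++) != '\0')
        ;
    return dst;
}

/* Return a negative, zero, or positive value when a sorts
 * before, equal to, or after b. */
int my_strcmp(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (unsigned char)*a - (unsigned char)*b;
}
```

Each function is a single loop over pointers. None of them needs anything from the machine beyond what the core compiler already provides, which is why the bulk of the library can be written once the compiler works.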


I'm not the one to discuss the merits of various C language features, but fortunately, Brian Kernighan is still alive and well, and I believe he is the go-to person to consult about such things. My point is: the C language is small and easily implementable across most systems, and therefore available on most computers.


There is a C compiler called Tiny C Compiler (TCC), which I actually use every day. I don't use the GNU Compiler Collection (GCC), which is what professional coders use. That's because it's really overkill for my use. I only use the smallest set of C language features, anyway. Why would I want to use GCC? GCC's compilation time is rather long by comparison!


And yet, simple as it is, C does feature a rather powerful set of capabilities. It bridges assembly language and high-level languages. C was actually classified as a "middle-level" language, yet it is quite powerful. There is actually a design called C--, which is an even simpler language, but that isn't popular. In short, C occupies the ideal "Goldilocks" zone of complexity and features.


That is my take on the C language. It's not the language that is the end of all languages; rather, it is the language that is the beginning of all languages. A whole lot of later-generation languages can trace their lineage to C. I already mentioned C++, but basically C -> C++ -> Java -> C# -> Rust is one such lineage. Another would be C -> Perl -> Raku. There are plenty more.


Should you learn C? I would advise it, but not if you're a beginner. For beginners, there are more accessible languages such as BASIC, which is designed with beginners in mind. About the only difficulty is the lack of a unified dialect. The most popular BASIC dialect seems to be Microsoft QuickBasic, and QB64 seems to be a good implementation of it. The questions you should ask are: Are you willing to learn Assembly programming? Do you want to code in as many languages as possible? Do you want to write programs on various systems and hardware? If yes, then learning C is a great way to get started!


As Dennis Ritchie himself pointed out: C is quirky and flawed, but it's also tremendously successful. Yes, the standard library isn't the greatest in the world, but it's not supposed to be. It's supposed to be simple and easily implementable. If you need more, then you should code your own libraries. C makes that easy.


As a note: the Tiny BASIC specification features a FOR-LOOP. I'd say that's a mistake. A WHILE-LOOP is easily implementable, as is a REPEAT-LOOP. If you add *continue* and *break*, then you can even use an infinite loop! GOTO is also useful, albeit easily misused. The answer is simple: Structured GOTO.
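
Here is a sketch in C of the loop forms mentioned above. The function names are made up for this example. For "Structured GOTO" I am assuming one common reading: a forward-only jump to a single label, used to leave nested loops. Other readings are possible.

```c
/* Sum the integers 1..n with a FOR loop. */
int sum_for(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}

/* The same loop as a WHILE loop: the FOR loop adds nothing
 * that a WHILE loop cannot express. */
int sum_while(int n)
{
    int total = 0;
    int i = 1;
    while (i <= n) {
        total += i;
        i++;
    }
    return total;
}

/* Sum only the odd integers in 1..n, using an infinite loop
 * controlled entirely by break and continue. */
int sum_odd_infinite(int n)
{
    int total = 0;
    int i = 0;
    for (;;) {
        i++;
        if (i > n)
            break;      /* the only way out of the loop */
        if (i % 2 == 0)
            continue;   /* skip even numbers */
        total += i;
    }
    return total;
}

/* Return the index of the first negative value in a rows x cols
 * grid stored row by row, or -1 if there is none. The goto only
 * jumps forward, to one label, to leave both loops at once. */
int find_negative(const int *grid, int rows, int cols)
{
    int found = -1;
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            if (grid[r * cols + c] < 0) {
                found = r * cols + c;
                goto done;
            }
        }
    }
done:
    return found;
}
```

sum_for(10) and sum_while(10) both return 55, and sum_odd_infinite(10) returns 25. An interpreter that implements WHILE, break, and continue gets the FOR loop's behavior without any extra machinery.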

