A Brief Introduction to C
Written for the 8-bit Software User Group by Stephen Mumford
After buying a BBC or Master, it is inevitable that sooner or later,
one will begin to dabble in BASIC, writing small programs or pottering
about with larger magazine listings. For some, the process stops there.
However, for a large number of other users, this learning process will
continue, and they will eventually find, without too much effort, that
they have become totally proficient in BASIC, and are slowly realising the
limitations of speed and versatility of the language. [Then again, you
could learn C before learning BASIC; then you would never come across any
limitations at all! Many people believe that learning BASIC first hinders
you from learning better languages later, as it teaches you to think in a
non-structured way.] The question that arises is "Where do I go from
here?". There are a number of options, including assembly language,
PASCAL, LISP and C. Whatever is chosen will prove to be a big step; there
is no easy route to writing fast, efficient and well-written pieces of
code. Because of this, it seems reasonable that the user should choose the
next step carefully, weighing up the pros and cons of each.
The fastest language mentioned above is assembly language, and this
will always be the case no matter what machine is being programmed. It is
the fastest because the assembly source code is written by hand and
assembled directly into machine code. As there are fewer steps involved,
the code produced will always tend to be neater and quicker than that
produced by any other method. However, the language has its drawbacks - as
the assembler instructions will be computer-specific, this means that code
produced on one make of machine will be completely incompatible with any
other make, unless an emulator is being used (which then removes the speed
advantage). It is also (as many frustrated programmers will testify) very
hard to write, and incredibly hard to write well.
The other options are PASCAL, LISP and C. These are very similar in
nature; textual source code is written in a readable form similar to
BASIC, and this is then fed into a compiler which will think silently for
a while and (hopefully) come up with a working version of the program,
written in machine-specific machine code. However, the source code is
portable as it is written in a machine-independent language (PASCAL, LISP
or C) and this can be copied to another machine as a text file and
compiled there, in effect producing the same program on two different
machines. The differences between the languages are slight; it is up to
the user to pick one that is best represented on his or her machine. At
the moment, at least, the language 'C' is perhaps the most popular, and it
is this language that I shall be discussing in this article. A more
pressing reason is that I own 'C'; I do not own PASCAL or LISP. This
narrows my choice somewhat!
[A brief note on the allegedly "slight" difference between the three
languages. I would agree that PASCAL and C, and indeed the older FORTRAN,
are very similar in the way in which they are written, and the features
they support; for example, PASCAL bears far more resemblance to C than
either does to BASIC. C generally has more features, and can be used more
flexibly, because it is more recent. However, from the (limited)
description of LISP in the Master Reference Guide ("an artificial
intelligence problem-solving language"), it can be assumed that it is
rather different from any of the other high-level languages. A few weeks
ago I spoke to a French research student, who writes in C on Archimedes,
mini-computers and mainframes in order to solve (very complex)
mathematical problems. He suggested that LISP was so highly structured
that it was "virtually impossible to make mistakes in it" (his English was
better than my French). This would seem to indicate a vast difference
between LISP and any other language - C is supposed to be very structured
if written properly, but you still make mistakes all over the place (well
I do anyway). However, he also commented that he did not write in LISP
because it was far too slow for his needs. Again, this indicates that LISP
must be very different in character from C to be significantly slower when
both are compiled languages (I am fairly sure LISP is compiled?).
Anyway I hope to have my own C, LISP, PASCAL and FORTRAN-77 compilers
fairly soon, so perhaps I will be able to add more detail after having
programmed in all of them! Meanwhile, back to the introduction to C -
sorry Stephen!...]
Before I get further involved, I had better come clean with you. I own
and use an Archimedes, and this is the machine on which I run my ANSI C
compiler. It may seem odd, therefore, that I am writing an article for the
8 bit user group. There are two strong reasons for this; one is that, as I
mentioned above, C is a machine-independent language, so anything written
using ANSI C on a BBC or Master will compile on an Archimedes, and the
reverse is true. [Although Stephen's C programs to sort 50,000 16-bit
integers tend to run into fairly obvious problems on my Master!] The other
reason is named Daniel Shimmin. [I'm sure Stephen isn't implying that he
has written the article only because I have pestered him to do so
continually for three weeks, but rather that 8-Bit Software is the only
disk-based user group of its sort, so if I wasn't around there wouldn't be
anywhere to publish the article!]
C was created in 1972 by Dennis Ritchie, and was based, perhaps not
surprisingly, on a previous programming language called 'B'. The language
was shaped and defined further when Ritchie was joined by Brian Kernighan
and they wrote a book called "The C Programming Language" in 1978. This
was taken as 'the Bible' for C programmers everywhere, and this was the
first step to producing a language that was standardised across all
machines. The next was taken when a committee was formed in 1983 to
completely define an ANSI standard for C, and this is the standard that is
being followed today. The proof of the popularity of C can be seen in the
UNIX computer operating system, of which 85% was written in C. I have even
heard of C compilers being written in C, although this sounds far too much
like recursion for my liking! [The Small C Compiler of Dr. A.J. Travis is
one example of a C compiler written in C.]
If you are familiar with BBC BASIC and the use of procedures, you will
already understand one of the fundamental features that other computer
users find harder to understand. Although C has a GOTO command, it doesn't
have a GOSUB and the unwritten rule is that one should use procedures
[i.e. instead of GOTOs as well as GOSUBs; although you should really avoid
GOTOs in BASIC as well]. These are actually more similar to functions in
BASIC, and that is what they are called; they can only return one value
directly. However, this limitation can be worked around easily enough:
global variables can be used, but the neater and more flexible way is to
use pointers. Pointers are variables that hold the memory addresses
of other variables, and in this way, you can alter the source variables
directly by changing the value stored at the location pointed to by the
pointer. There is one disadvantage; they are rather hard to get to grips
with, and I shall not attempt to cover them just yet. [Does this imply a
sequel to this article is on the way?]
To enter a 'C' program, you will need to find a text editor or word
processor that is capable of producing ASCII text without any unexpected
control codes. I believe EDIT, built into the Master, is suitable. (Any
ideas for the BBC, Daniel?) [The instructions for the Small-C compiler
state that it will accept text files written in VIEW, which presumably
means most other standard WP packages (WORDWISE etc., but not EDWORD!)
will produce suitable textfiles. So there. You may have to remove control
codes from non-EDIT files before feeding them to compilers on machines
other than the BBC, though.]
Perhaps the easiest command to understand and utilise is the C
equivalent of REM; this means you can fill your programs with long strings
of explanatory text or 'hello' messages. To create a REM statement, you
start it with the characters /* and you finish with the reverse */. An
example is given below.
/* This is a REM statement */
The advantage of these over the BASIC version is that they can span any
number of lines.
To create a program that will actually compile (although it won't do
much), the function main() needs to be included. A sample program is shown
below; don't bother compiling it, just study the layout of the function.
/* A short example program */
main()
{
}
The text "main()" defines the function called "main" in the same way
that "DEF PROCmain" would; no parameters are passed to it, hence the
brackets () are empty. This function is important because it is the one
that a C program will always execute first, no matter where it is. It's
basically a marker telling the computer to "Start Here". The other
important feature of the function is the pair of 'curly brackets', or
braces, {}. These surround any statements executed by the function, and
can be compared to the "DEF PROC" and "ENDPROC" of BASIC.
The next step is to put something inside the braces, and that has been
done with the next program.
/* A program which does something */
#include <stdio.h>
main()
{
    printf("An example of the 'printf' function\n");
}
With this program, there are a few more details to note. The line
'#include <stdio.h>' tells the compiler to read in the header file
<stdio.h>, which declares a set of library functions so that they can be
linked in with the object code. 'stdio' stands for "STanDard In/Out" and
it supplies routines for printing and inputting data, amongst other
things. Believe it or not, the C language has no print function built in;
printing is done by calling a function from this separate library. The
function being used is called 'printf' and takes a number of parameters,
depending on what you want it to do. To print a line of text, you follow
the method above:
printf("An example of the 'printf' function\n");
The text is placed in quotes and is the first (and only) parameter;
this is surrounded by brackets and prefixed with 'printf'. The '\n' at the
end of the text represents a linefeed, or 'newline', as the 'printf'
function doesn't add one by itself. A semicolon marks the end of the
statement, because in C there are no lines or line numbers and the layout
is 'freeform'. All statements that 'do' something (printing something,
calling another function, defining a variable) should end with a
semicolon.
As this is only supposed to be a brief introduction to C, I will not
give more detailed explanation. Instead, I will give you the titles of
some books to search for if you are interested.
Bibliography
The C Programming Language (2nd Ed.) - Brian Kernighan & Dennis Ritchie
Learning to Program in C - N. Kantaris
The Big Red Book of C - Kevin Sullivan
Editor's notes:
Stephen also supplied a C version of "Quicksort", which can apparently
sort 50,000 numbers in 0.98 seconds on an A5000. After struggling for a
while to get this to compile in Small C on the Master, I realised that it
was impossible due to the limited number of functions supported. However,
I haven't given up yet, and may well write the necessary functions myself,
so that I might be able to include a stand-alone demo program to sort 500
numbers in issue 22 (perhaps comparing it with a similar sorting routine
in BASIC).
Anyone attempting to compile true ANSI C from a more expensive
machine/compiler on the Small C Compiler may like to note that (1) it does
not support "prototyping" in any form (i.e. replace "void main()" or "int
main()" with "main()"), and (2) it doesn't appear able to cope with "int
data [500+1]" or "int data [MAXDATA+1]"; I used "int data [501]" instead.