INTRODUCTION
This document sets out to describe the C programming language, as defined by
the ANSI standard, in order to enable a computer programmer with no previous
knowledge of the C programming language to program in C.
It is assumed that the reader has access to a C compiler, and to the
documentation which accompanies it regarding library functions.
History:
The C programming language was invented by Dennis Ritchie for use on a DEC
PDP-11 during the 1970s. Since then it has become one of the most widely used
and respected programming languages available for computers. There are two
reasons for this success. Firstly, the C programming language is portable
between different computers, so a program may be developed on an IBM PC and
recompiled on a DEC VAX and still work without any changes to the code.
Secondly, C provides the programmer full access to the computer's operating
system and memory. In practice this often means that the C programmer has
complete freedom to make a complete mess of the operating system! But it is a
level of power and freedom otherwise offered only by Assembler language
programming.
C is a medium level language:
The powerful facilities offered by C to allow manipulation of direct memory
addresses and data, even down to the bit level, along with C's structured
approach to programming cause C to be classified as a "medium level"
programming language. It possesses fewer ready made facilities than a high
level language, such as BASIC, but a higher level of structure than low level
Assembler.
Key words:
The original C language, as described in "The C Programming Language" by
Kernighan and Ritchie, provided 27 key words. To those 27 the ANSI standards
committee on C has added 5 more. This confusingly results in two standards
for the C language. However, the ANSI standard is quickly taking over from the
old K & R standard.
The 32 C key words are:
auto        double      int         struct
break       else        long        switch
case        enum        register    typedef
char        extern      return      union
const       float       short       unsigned
continue    for         signed      void
default     goto        sizeof      volatile
do          if          static      while
Some C compilers offer additional key words specific to the hardware
environment that they operate on. You should be aware of your own C compiler's
additional key words, but they have no place in portable code.
Structure:
C programs are written in a structured manner. A collection of code blocks is
created which call each other to comprise the complete program. As a
structured language C provides various looping and testing commands such as:
do-while, for, while, if
and jumps, whilst provided for, are rarely used.
A C code block is contained within a pair of curly braces "{ }", and may be a
complete procedure, in C terminology called a "function", or a subset of code
within a function. For example the following is a code block. The statements
within the curly braces are only executed upon satisfaction of the condition
that "x < 10":
if (x < 10){
    a = 1;
    b = 0;
}
Whilst this is a complete function code block containing a sub code block in
the form of a do-while loop:
int GET_X()
{
    int x;
    do{
        printf("\nEnter a number between 0 and 10 ");
        scanf("%d",&x);
    }while(x < 0 || x > 10);
    return(x);
}
Notice how every statement is terminated by a semicolon, unless that
statement marks the start of a code block, in which case it is followed by a
curly brace. C is a case sensitive but free flow language. Spaces, tabs and
line breaks between the parts of a statement are ignored, and hence the
semicolon is required to mark the end of each statement.
Because of this free flow structure the following statements are recognised
as the same by the C compiler:
x = 0;
x =0;
x=0;
The general form of a C program is as follows:
compiler preprocessor statements
global data declarations
return-type main(parameter list)
{
    statements
}
return-type f1(parameter list)
{
    statements
}
return-type f2(parameter list)
{
    statements
}
.
.
.
return-type fn(parameter list)
{
    statements
}
Comments:
C allows comments to be included in the program. A comment is any text
enclosed within "/*" and "*/", and may extend over more than one line. Thus
the following is a comment:
/* This is a legitimate C comment line */
Libraries:
C programs are compiled and combined with library functions provided with the
C compiler. These libraries consist largely of standard functions, the
behaviour of which is defined in the ANSI standard of the C language, but
which are implemented by the individual C compiler manufacturers and so are
machine dependent. Thus, the standard library function "printf()" provides the
same facilities on a DEC VAX as on an IBM PC, although the actual machine
language code in the library is quite different for each. The C programmer,
however, does not need to know about the internals of the libraries, only that
each library function will behave in the same way on any computer.
DATA TYPES
There are four basic types of data in the C language: character, integer,
floating point, and valueless, which are referred to by the C key words "char",
"int", "float" and "void" respectively.
To the basic data types may be added the type modifiers signed, unsigned,
long and short to produce further data types. By default data types are
assumed signed, and the signed modifier is rarely used, except to override a
compiler switch which makes a data type (usually "char") unsigned by default.
The size of each data type varies from one hardware platform to another, but
the minimal range of values which can be held is described in the ANSI
standard as follows:
TYPE                  SIZE (bits)   RANGE
char                   8            -127 to 127
unsigned char          8            0 to 255
int                   16            -32767 to 32767
unsigned int          16            0 to 65535
long int              32            -2147483647 to 2147483647
unsigned long int     32            0 to 4294967295
float                 32            Six digit precision
double                64            Ten digit precision
long double           80            Ten digit precision
Declaring a variable:
All variables in a C program must be declared before they can be used. The
general form of a variable declaration is:
type name;
So, for example to declare a variable "x", of data type "int" so that it may
store a value in the range -32767 to 32767, you use the statement:
int x;
Character strings may be declared, which are in fact arrays of characters.
They are declared as follows:
char name[number_of_elements];
So, to declare a string called 'name' with room for thirty characters (twenty
nine characters of text plus the null character which terminates every C
string) you would use the declaration:
char name[30];
Arrays of other data types may also be declared in one, two or more dimensions
in the same way. For example to declare a two dimensional array of integers:
int x[10][10];
The elements of this array are then accessed as follows, where each index
'n' may range from 0 to 9:
x[0][0]
x[0][1]
x[n][n]
There are three levels of access to variables: local, module and global.
A variable declared within a code block is only known to the statements within
that code block. A variable declared outside of any function code blocks but
prefixed with the storage modifier "static" is known only to the statements
within that source module. A variable declared outside of any functions and
not prefixed with the static storage type modifier may be accessed by any
statement within any source module of the program.
For example:
int error;
static int a;
main()
{
    int x;
    int y;
}
funca()
{
    if (a == 0){
        int b;
        for(b = 0; b < 20; b++)
            printf("\nHello World");
    }
}
In this example the variable 'error' is accessible by all source code modules
compiled together to form the finished program. The variable 'a' is accessible
by statements in both functions 'main()' and 'funca()', but is invisible to
any other source module. Variables 'x' and 'y' are only accessible by
statements within function 'main()'. The variable 'b' is only accessible by
statements within the code block following the 'if' statement.
If a second source module wished to access the variable 'error' it would need
to declare 'error' as an 'extern' global variable thus:
extern int error;
funcb()
{
}
C will quite happily allow you, the programmer, to assign different data types
to each other. For example, you may declare a variable to be of type 'char' in
which case a single byte of data will be allocated to store the variable. To
this variable you can attempt to assign larger values, for example:
main()
{
    char x;
    x = 5000;
}
In this example the variable 'x' can typically only store a value between -128
and 127 (or between 0 and 255 if 'char' is unsigned), so the figure 5000 will
NOT be assigned to the variable 'x'. Rather, only the low order byte of 5000
(88 in hexadecimal) is normally kept, so 'x' receives 136 if 'char' is
unsigned and, on most compilers, -120 if it is signed!
Often you may wish to assign different data types to each other, and to
prevent the compiler from warning you of a possible error you can use a cast
to tell the compiler that you know what you're doing. A cast is a data type
in parentheses preceding a variable or expression:
main()
{
    float x;
    int y;
    x = 100 / 25;
    y = (int)x;
}
In this example the (int) cast tells the compiler to convert the value of the
floating point variable x to an integer before assigning it to the variable y.
Formal parameters:
A C function may receive parameters from a calling function. These
parameters are declared as variables within the parentheses following the
function name, thus:
int MULT(int x, int y)
{
    /* Return parameter x multiplied by parameter y */
    return(x * y);
}
main()
{
    int a;
    int b;
    int c;
    a = 5;
    b = 7;
    c = MULT(a,b);
    printf("%d multiplied by %d equals %d\n",a,b,c);
}
Access modifiers:
There are two access modifiers: 'const' and 'volatile'. A variable declared to
be 'const' may not be changed by the program, whereas a variable declared as
type 'volatile' may be changed in ways the compiler cannot foresee, for
example by hardware or by an interrupt routine. Declaring a variable to be
volatile therefore makes the compiler fetch it from memory each time it is
used rather than keeping a copy in a register, and reduces the optimization
carried out on the variable.
Storage class types:
C provides four storage types: 'extern', 'static', 'auto' and 'register'.
The extern storage type is used to allow a source module within a C program to
access a variable declared in another source module.
Static variables are only accessible within the code block or source module
which declared them, and additionally if the variable is local, rather than
global, it retains its old value between subsequent calls to the code block.
Register variables are stored within CPU registers wherever possible,
providing the fastest possible access to their values. The key word is only a
request, which the compiler is free to ignore.
The auto type is only used with local variables, and declares that the
variable exists only whilst its code block is being executed. Since this is
the default for local variables the auto storage type is very rarely used.
OPERATORS
Operators are tokens which cause a computation to occur when applied to
variables. C provides the following operators:
&       Address
*       Indirection
+       Unary plus
-       Unary minus
~       Bitwise complement
!       Logical negation
++      As a prefix: preincrement
        As a suffix: postincrement
--      As a prefix: predecrement
        As a suffix: postdecrement
+       Addition
-       Subtraction
*       Multiply
/       Divide
%       Remainder
<<      Shift left
>>      Shift right
&       Bitwise AND
|       Bitwise OR
^       Bitwise XOR
&&      Logical AND
||      Logical OR
=       Assignment
*=      Assign product
/=      Assign quotient
%=      Assign remainder (modulus)
+=      Assign sum
-=      Assign difference
<<=     Assign left shift
>>=     Assign right shift
&=      Assign bitwise AND
|=      Assign bitwise OR
^=      Assign bitwise XOR
<       Less than
>       Greater than
<=      Less than or equal to
>=      Greater than or equal to
==      Equal to
!=      Not equal to
.       Direct component selector
->      Indirect component selector
a ? x:y "If a is true then x else y"
[]      Define arrays and select array elements
()      Parentheses isolate conditions and expressions
...     An ellipsis is used in the formal parameter list of a
        function prototype to indicate a variable number of
        parameters or parameters of varying types.
To illustrate some of the more commonly used operators consider the following
short program:
main()
{
    int a;
    int b;
    int c;
    a = 5;          /* Assign a value of 5 to variable 'a' */
    b = a / 2;      /* Assign the value of 'a' divided by two to
                       variable 'b' */
    c = b * 2;      /* Assign the value of 'b' multiplied by two to
                       variable 'c' */
    if (a == c)     /* Test if 'a' holds the same value as 'c' */
        printf("Variable 'a' is an even number\n");
    else
        printf("Variable 'a' is an odd number\n");
}
Normally when incrementing the value of a variable you would write something
like:
x = x + 1
C provides the increment operator '++' as well so that you can write:
x++
Similarly you can decrement the value of a variable using '--' as:
x--
The other arithmetic operators may be combined with assignment in a similar
way, so in a C program you can write in shorthand:
NORMAL          C
x = x + 1       x++
x = x - 1       x--
x = x * 2       x *= 2
x = x / y       x /= y
x = x % 5       x %= 5
and so on.
INPUT AND OUTPUT
Input to a C program may come from the console, from the standard input
device (which, unless otherwise redirected, is the console), from a file or
from a data port.
The general input command for reading data from the standard input stream
'stdin' is scanf(). The scanf() function scans a series of input fields, one
character at a time. Each field is then converted according to the
appropriate format specifier passed to the scanf() function as a parameter,
and the result is stored at the ADDRESS passed to scanf() following the
format string.
For example, the following program will read a single integer from the stream
stdin:
main()
{
    int x;
    scanf("%d",&x);
}
Notice the address operator & prefixing the variable name 'x' in the scanf()
parameter list. This is because scanf() stores values at ADDRESSES rather than
assigning values to variables directly.
The format string is a character string that may contain three types of data:
whitespace characters (space, tab and newline), non-whitespace characters
(all ASCII characters EXCEPT %) and format specifiers.
Format specifiers have the general form:
%[*][width][h|l|L]type_character
After the % sign the format specifier is comprised of:
    an optional assignment suppression character, *, which suppresses
    assignment of the next input field.
    an optional width specifier, width, which declares the maximum number
    of characters to be read.
    an optional argument type modifier, h or l or L, where:
        h   is a short integer
        l   is a long
        L   is a long double
    the data type character, which is one of:
        d   Decimal integer
        o   Octal integer
        i   Decimal, octal or hexadecimal integer
        u   Decimal unsigned integer
        x   Hexadecimal integer
        e   Floating point
        f   Floating point
        g   Floating point
        s   Character string
        c   Character
        %   Matches a % character in the input; nothing is stored
Some compilers also accept the upper case characters D, O, I, U and X as long
integer versions of d, o, i, u and x. These are not part of the ANSI standard
(under which X means the same as x), so portable code should use the l
modifier instead, as in "%ld".
An example using scanf():
#include <stdio.h>
main()
{
    char name[30];
    int age;
    printf("\nEnter your name and age ");
    /* A width of 29 leaves room for the terminating null character */
    scanf("%29s%d",name,&age);
    printf("\n%s %d",name,age);
}
Notice the include line, "#include <stdio.h>". This tells the compiler to
also read the file stdio.h, which contains the function prototypes for scanf()
and printf().
If you type in and run this sample program you will see that only one name can
be entered. That is, you can't enter:
JOHN SMITH
because scanf() detects the whitespace between "JOHN" and "SMITH" and moves on
to the next input field, which is age, and attempts to assign the value
"SMITH" to the age field! The limitations of scanf() as an input function are
very obvious.