Compacting (Squashing) BASIC programs
By Jon Ripley (D5B)
Most programmers will have run short of memory at some point while developing
a program.
This is the first part of an article which will attempt to deal with the
compacting of a BASIC program.
One reason for compressing a program might be to fit it into 10 lines of
BASIC; many magazines had a monthly section devoted to this type of program.
Other reasons might be to save disc space, or to produce a copy of the program
which is difficult for other people (hackers) to understand.
This might form a part of program protection or just be something to do before
distributing a program to a magazine (such as 8BS) or a PD library (such as
8BS)!
More adventurous people may try to compress a program into 1 line of BASIC!
(It may seem impossible to get any decent program into such a small place but
it is possible!)
Packing a program is not just a nice way to save memory and help protect a
program. Shorter programs also tend to run faster than longer ones.
Later I will deal with joining lines of a program together to create
multi-statement lines.
Fewer lines also take up less memory. (3 bytes are saved for each line joined
onto another: the 4 bytes of overhead that every line carries are lost, but a
colon is gained.)
Fewer lines also make GOTO, GOSUB and RESTORE respond much quicker than if
there were a lot of lines in the program, even if the individual lines
themselves are very long.
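The 3-byte figure is easy to check for yourself. The lines below are only a
sketch, typed at the BASIC prompt, and the two-statement program is a made-up
example. TOP-PAGE gives the length of the program in memory, so the second
figure printed should be 3 less than the first.

```basic
NEW
10A%=1
20B%=2
PRINT TOP-PAGE
NEW
10A%=1:B%=2
PRINT TOP-PAGE
```

The same PRINT TOP-PAGE check can be used before and after any of the changes
described in this article.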
You might want to print this article and keep it for reference purposes.
Packing programs
----------------
In this article I will deal with simple ways to shorten a program.
Contents...
1) Remove spaces
2) Colons after REPEATs
3) Colons after DEF statements
4a) Semi-colons before TAB( statements
4b) Semi-colons before quote marks
5) REM statements
6) END and STOP at the end of a program
7) Variables after NEXT statements
8) Nested FOR loops
9) GOTO in IF statements
10) THEN statements
11) Colons before and after ELSE statements
12) Arrays
1) Remove spaces
It is amazing how many extra spaces can find their way into a program.
BASIC needs very few of the spaces in a program. Most of them are just there
for readability.
Remove all spaces between line numbers and the line itself.
e.g.
10 REM Hi!
would become
10REM Hi!
Remove all spaces at the end of lines. These are very hard to see but do
sometimes exist.
Remove all spaces before and after BASIC keywords.
e.g.
10FOR X = 0 TO 50
20PRINT X
30NEXT X
becomes
10FORX=0TO50
20PRINTX
30NEXTX
There is an exception to this rule, however, which is there to avoid ambiguity.
Where a variable comes directly before a keyword, the variable name must end
in '%' or '$'. If it does not, keep a space between them.
e.g.
FOR A=B TO C
becomes
FORA=B TOC
Instead of
FORA=BTOC
In the latter case the computer would look for the variable BTOC rather than
the variables B and C. The space following the keyword is not needed.
IF A$=B$ AND C$="A"...
becomes
IFA$=B$ANDC$="A"...
And...
IF A%=B% AND C%=43
becomes
IFA%=B%ANDC%=43
Of course if you changed...
A$="Hi! Everybody, have a nice day!"
to
A$="Hi!Everybody,haveaniceday!"
That would be silly!
2) Colons after REPEATs
After the REPEAT statement a colon is not needed to separate it from the next
part of the line. So...
REPEAT:PRINT "Hello World!":UNTIL FALSE
becomes
REPEATPRINT "Hello World!":UNTIL FALSE
3) Colons after DEF statements
After a DEF statement that defines a procedure, a colon is not needed to
separate it from the next statement, provided that parameters are passed to
the procedure. So...
DEFPROCprint(text$):PRINTtext$:ENDPROC
becomes
DEFPROCprint(text$)PRINTtext$:ENDPROC
However
DEFPROChello:PRINT"Hi!":ENDPROC
Would need to stay as it is. If you did change it to;
DEFPROChelloPRINT"Hi!":ENDPROC
The computer would call the procedure PROChelloPRINT, instead of PROChello.
The only way this could be avoided would be to replace the colon with a space
but that would be silly!
This is the same for functions. So
DEFFNwait(delay):T=TIME:REPEATUNTILTIME=T+delay:=0
Would become
DEFFNwait(delay)T=TIME:REPEATUNTILTIME=T+delay:=0
But
DEFFNpointless:PRINT"It's not worth it.":=0
Would need to stay the same to avoid ambiguity.
There is one exception to this, however. If the function value is returned
immediately, the colon can go even when there are no parameters.
e.g.
DEFFNadd(a,b):=a+b
becomes
DEFFNadd(a,b)=a+b
And
DEFFNversion:=1.51
becomes
DEFFNversion=1.51
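As a small sketch, here is the two-parameter case in a complete program. The
line numbers and the values passed are only examples. The END in line 20 is
there because a definition follows the main program.

```basic
10PRINT FNadd(2,3)
20END
30DEFFNadd(a,b)=a+b
```

RUNning this should print 5.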
4a) Semi-colons before TAB( statements
You do not need to precede a TAB( statement with a ';'. Any such semi-colons
can be safely removed. So...
PRINT TAB(0,5)"Please wait...";TAB(15)"5 seconds"
becomes
PRINT TAB(0,5)"Please wait..."TAB(15)"5 seconds"
4b) Semi-colons before quote marks
If a quote mark is preceded by a ';' then the ';' can be safely removed. There
is one rare exception that I will describe below.
So...
PRINT "There are ";count;" words."
becomes
PRINT "There are ";count" words."
Except, if the semi-colon is inside the string. So...
PRINT ASC";"
Would stay the same because ASC"" would return -1 instead of the code for ';'.
PRINT "They are as follows;"
Would also stay the same. Changing them would be silly!
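Both rules can be combined on one line. A small sketch, where the variable
count and the screen positions are just examples:

```basic
10count=3
20PRINT TAB(0,5)"There are ";count" words."TAB(0,7)"Finished"
```

The ';' before count is kept. It does not come before a quote mark or a TAB(,
and it stops the number being padded out with leading spaces.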
5) REM statements
In a program REM is only used for commenting purposes and doesn't affect the
running of the program. (Apart from slowing it down a bit!)
All REM statements should be removed.
Except where the line is referred to somewhere else in the program.
So...
10REM This program prints the numbers 1 to 20
15REM The variable 'X' is used as a counter
20FOR X=1TO20:REM The start of the FOR loop
30REM The next line displays the number
40PRINT X
50REM End the loop
60NEXT
70END:REM The end!
becomes
20FOR X=1TO20
40PRINT X
60NEXT
70END
The example is a bit extreme but it shows how much space the extra comments
take up.
Normally REMs are used by the programmer to insert comments into a program to
add readability, and they are useful when coming back to a program at a later
stage.
If the line is referred to by a GOTO, GOSUB or RESTORE statement elsewhere in
the program there are three things you can do.
If the line contains only the REM statement then you can...
Remove the comment leaving only the REM.
So...
10REM This is the start
20PRINT "Hello World!"
30GOTO 10
becomes
10REM
20PRINT "Hello World!"
30GOTO 10
This is good but still leaves a REM which could be removed.
You could change the line the GOTO (or GOSUB or RESTORE) refers to, to the
first non-REM line after the REM.
So the above program would become...
20PRINT "Hello World!"
30GOTO20
Thirdly you could remove the line containing the REM and then give the next
non-REM line the number of the line that you deleted.
So the above program would become...
10PRINT "Hello World!"
30GOTO 10
If a line ends with a REM statement and has statements before it then the REM
can be removed, thus...
10PRINT "Hello World!":REM Print forever
20GOTO 10
becomes
10PRINT "Hello World!"
20GOTO 10
The same goes for RESTORE and GOSUB.
The examples below illustrate the above methods with these statements.
10RESTORE 60
20REPEAT
30READ number
40PRINT number
50UNTIL number=5
60REM Data starts here...
70DATA 1,2,3,4,5
You could change the line number 60 in line 10 to 70, or replace line 60 with
line 70.
10GOSUB 30
20END
30REM Subroutine
35PRINT"Hello"
40RETURN
You could change the line number 30 in line 10 to 35, or replace line 30 with
line 35.
6) END and STOP at the end of a program
At the end of a program END or STOP is not needed to tell the computer that it
is the end of the program.
So...
10PRINT "Hi!"
20END
And
10PRINT "Hi!"
20STOP
Would both become
10PRINT "Hi!"
However if there are procedures or functions defined after the main program
then the END (or STOP) should be kept.
So...
10FOR X=1TO10
20PROCprint(X)
30NEXT
40END
50DEFPROCprint(number)
60PRINTnumber
70ENDPROC
Would stay the same.
But if only DATA statements (or REMs) follow the end of the program then END
(or STOP) can be removed.
So...
10READ name$
20PRINT name$
30STOP
40DATA Jon Ripley
Would become...
10READ name$
20PRINT name$
40DATA Jon Ripley
But, if both procedures (or functions) and DATA (or REM) statements follow a
program then the END (or STOP) should remain.
e.g.
10FOR X=1TO5
20READ name$
30PROCprint(name$)
40NEXT
50END
60DEFPROCprint(whatever$)
70PRINT whatever$
80ENDPROC
90DATA Jon,Chris,Steve,Martin,Fred
Would remain the same.
It is good programming practice to use END rather than STOP, except when
stopping the program at a particular point during testing.
However, END has a special use that rarely occurs. This is included only as a
note to more advanced programmers.
END causes BASIC to search through the program for a valid end marker and
update its internal pointers (TOP, LOMEM, VARTOP, etc). This is useful if PAGE
has been changed or if an unusual loading procedure is used, merging 2 BASIC
programs in memory for example.
7) Variables after NEXT statements
Including the loop variable after the NEXT statement is optional.
So...
10FOR loop=1 TO 10
20PRINT 2^loop
30NEXT loop
becomes
10FOR loop=1 TO 10
20PRINT 2^loop
30NEXT
And...
10FOR X=0 TO 20
20FOR Y=0 TO 20
30PRINT TAB(X,Y)"*"
40NEXT Y
50NEXT X
would become
10FOR X=0 TO 20
20FOR Y=0 TO 20
30PRINT TAB(X,Y)"*"
40NEXT
50NEXT
(See the next section for further tips when using NEXT.)
Generally it is best to leave the loop variable off NEXT altogether. This
avoids any chance of the variables being mismatched.
For example if the above program was entered as follows, with the variables
interchanged...
10FOR X=0 TO 20
20FOR Y=0 TO 20
30PRINT TAB(X,Y)"*"
40NEXT X
50NEXT Y
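Here the Y loop never gets past its first value. NEXT X abandons the unfinished
Y loop and moves on to the next X, and once the X loop has finished line 50
stops with an error because no FOR loop is left open. This would not occur if
the variables were omitted.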
8) Nested FOR loops
If there are two or more NEXT statements next to each other in a program then
they can be joined together.
For example
10FOR X=0 TO 12
20FOR Y=0 TO 12
30PRINT X*Y
40NEXT Y
50NEXT X
would become
10FOR X=0 TO 12
20FOR Y=0 TO 12
30PRINT X*Y
40NEXT Y,X
It is again possible to omit the loop variable and line 40 would become...
40NEXT,
Similarly for three NEXTs in succession. 'NEXT ,,' etc
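As a sketch of the three-loop case (the loop limits are only examples), the
single NEXT in line 50 closes the Z, Y and X loops in that order:

```basic
10FOR X=1 TO 2
20FOR Y=1 TO 2
30FOR Z=1 TO 2
40PRINT X*Y*Z
50NEXT,,
```

Line 40 is executed eight times, exactly as it would be with three separate
NEXT statements.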
9) GOTO in IF statements
If in an IF statement, a GOTO comes directly after THEN or ELSE then the GOTO
can be removed.
So...
IF A%=answer THEN GOTO 50 ELSE GOTO 100
becomes
IF A%=answer THEN 50 ELSE 100
Generally, the use of GOTO should be kept to a minimum. (Hopefully it should
not be used at all!)
10) THEN statements
In most cases in an IF statement the THEN can be left out.
So...
IF A%=B% THEN PRINT"Equal"
becomes
IF A%=B% PRINT"Equal"
One exception is if a pseudo variable is assigned. (Such as, LOMEM, PAGE,
TIME, etc)
For example
IF TIME=maximum THEN TIME=0
Would remain the same. An error would be caused if you did remove the THEN in
this instance. However, if a normal variable is assigned then the THEN can be
removed.
So...
IF x_position>30 THEN x_position=30
becomes
IF x_position>30 x_position=30
The second (and last!) exception is an extension to point 9, above, where a
GOTO after a THEN has been removed.
For example
IF A%=answer THEN 50 ELSE 100
The THEN should NOT be removed. If you do remove the THEN, then the GOTO
should be replaced.
So the above would become
IF A%=answer GOTO 50 ELSE 100
Again a bit silly!
11) Colons before and after ELSE statements
Remove all colons after ELSE statements. They aren't needed.
So...
IF A%=5 THEN PRINT"Hi!":ELSE:PRINT"Bye!"
becomes
IF A%=5 THEN PRINT"Hi!":ELSEPRINT"Bye!"
Also, you can remove all colons before ELSE statements.
Except where a variable comes directly before the ELSE. The variable name must
then end in '%' or '$'. If it does not, use a space in place of the colon.
e.g.
IF A=B THEN C=D:ELSE:C=F
becomes
IF A=B THEN C=D ELSEC=F
Instead of
IF A=B THEN C=DELSEC=F
In the latter case the computer would look for the variable DELSEC rather than
the variable D.
IF A$=B$ THEN A$=C$:ELSE:A$=D$
becomes
IF A$=B$ THEN A$=C$ELSEA$=D$
And...
IF A%=B% THEN A%=C%:ELSE:A%=D%
becomes
IF A%=B% THEN A%=C%ELSEA%=D%
12) Arrays
Less advanced programmers should skip this section.
This only really applies to one-dimensional (1D) integer arrays where all
values will be between 0 and 255 (&FF).
Bewildered?
DIM A(27)
The array A() is a 1D array and it is a real (floating point) array because it
can store decimals and fractions as well as whole numbers.
DIM A$(27)
The array A$() is a string array, again 1D. It can store strings.
DIM A%(27)
This is the integer array we were looking for. It can hold any 32-bit integer
number.
An integer is simply a whole number without a decimal or fractional part.
For example the following are all real (decimal) numbers;
1, 6.25, 100, -45.735, 0, 56000.0001.
The following are integers;
1, 6, 100, -45, 0, 56000.
We are only interested in arrays where every number stored will be between
(and including) 0 and 255 (&FF - hex).
The array itself can be as large or small as you like, as long as there is
enough memory! In fact this method should be used in preference for very large
arrays, as one byte is used for each element of the array rather than the four
bytes that each element of an integer array demands.
So...
DIM A%(1000)
becomes
DIM A%1000
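A% now holds the address of a block of 1001 bytes rather than being an array.
So in the program when you refer to an element you use the '?' (byte
indirection) operator instead of brackets.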
A%(38)=23+A%(2)
becomes
A%?38=23+A%?2
And...
A%(9)=A%(10)+11
becomes
A%?9=A%?10+11
This method saves 2 bytes when defining the array and 1 byte every time the
array is referred to in the program.
If the element number is calculated, the calculation has to be put in
brackets.
So...
A%(4)=A%(Z+4)+4
becomes
A%?4=A%?(Z+4)+4
(Incidentally this method uses less variable memory and also speeds up the
program!)
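If you want to save another 1 byte each time the array is referred to then,
instead of calling the array A%, call it A (or B, C, D, etc)!
So...
DIM A%1000
becomes
DIM A 1000
The space is needed here, otherwise BASIC would look for a variable called
A1000, so nothing is saved in the DIM itself.
Then remove the '%' from the above examples. Briefly, instead of the
previously used 'A%?' use 'A?'! Advanced note... This slows down a program
very slightly, because A is a real variable and has to be converted to an
integer address each time it is used.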
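A short sketch pulling this section together. The array size and the variable
names are only examples. Bear in mind that, unlike a real array, the block of
bytes is not cleared to zero when it is created, BASIC does no checking of the
element number, and only the lowest 8 bits of a value are stored.

```basic
10DIM A%10
20FOR I%=0 TO 10
30A%?I%=I%*2
40NEXT
50PRINT A%?5
```

RUNning this should print 10.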
The second part of this article will appear in a future issue.