Packing Programs

Compacting (Squashing) BASIC programs
By Jon Ripley (D5B)

Many programmers will have noticed that memory becomes short when developing a program. This is the first part of an article which deals with compacting a BASIC program.

One reason for compressing a program might be to fit it into 10 lines of BASIC; many magazines had a monthly section devoted to this type of program. Other reasons might be to save disc space, or to produce a copy of the program which is difficult for other people (hackers) to understand. This might form part of program protection, or just be something to do before distributing a program to a magazine (such as 8BS) or a PD library (such as 8BS)! More adventurous people may try to compress a program into 1 line of BASIC! (It may seem impossible to get any decent program into such a small space, but it can be done!)

Packing a program is not just a nice way to save memory and help protect a program. Shorter programs also run faster than longer ones. Later I will deal with joining lines of a program together to create multi-statement lines. Fewer lines take up less memory (3 bytes are saved for each line joined on to another). Fewer lines also make GOTO, GOSUB and RESTORE respond much more quickly than if there were a lot of lines in the program, even if the individual lines themselves are very long.

You might want to print this article and keep it for reference purposes.

Packing programs
----------------

In this article I will deal with simple ways to shorten a program.

Contents...
 1)  Remove spaces
 2)  Colons after REPEATs
 3)  Colons after DEF statements
 4a) Semi-colons before TAB( statements
 4b) Semi-colons before quote marks
 5)  REM statements
 6)  END and STOP at the end of a program
 7)  Variables after NEXT statements
 8)  Nested FOR loops
 9)  GOTO in IF statements
10)  THEN statements
11)  Colons before and after ELSE statements
12)  Arrays

1) Remove spaces

It is amazing how many extra spaces can find their way into a program. BASIC needs hardly any of the spaces in a program; most are just there for readability.

Remove all spaces between the line number and the line itself. e.g.

    10 REM Hi!
would become
    10REM Hi!

Remove all spaces at the end of lines. These are very hard to see but do sometimes exist.

Remove all spaces before and after BASIC keywords. e.g.

    10FOR X = 0 TO 50
    20PRINT X
    30NEXT X
becomes
    10FORX=0TO50
    20PRINTX
    30NEXTX

There is an exception to this rule, which is there to avoid ambiguity. Where a variable comes directly before a keyword, the variable name must end in '%' or '$'. If it does not, keep a space between the variable and the keyword. e.g.

    FOR A=B TO C
becomes
    FORA=B TOC
instead of
    FORA=BTOC

In the latter case the computer would look for the variable BTOC rather than the variables B and C. The space following the keyword is not needed.

    IF A$=B$ AND C$="A"...
becomes
    IFA$=B$ANDC$="A"...

And...

    IF A%=B% AND C%=43
becomes
    IFA%=B%ANDC%=43

Spaces inside quotes are part of the string and must be left alone. Of course if you changed...

    A$="Hi! Everybody, have a nice day!"
to
    A$="Hi!Everybody,haveaniceday!"

That would be silly!

2) Colons after REPEATs

After the REPEAT statement a colon is not needed to separate it from the next part of the line. So...

    REPEAT:PRINT "Hello World!":UNTIL FALSE
becomes
    REPEATPRINT "Hello World!":UNTIL FALSE

3) Colons after DEF statements

After a DEF statement, used when defining a procedure, a colon is not needed to separate it from the next statement, provided that parameters are passed to the procedure. So...

    DEFPROCprint(text$):PRINTtext$:ENDPROC
becomes
    DEFPROCprint(text$)PRINTtext$:ENDPROC

However

    DEFPROChello:PRINT"Hi!":ENDPROC

would need to stay as it is. If you did change it to

    DEFPROChelloPRINT"Hi!":ENDPROC

the computer would call the procedure PROChelloPRINT, instead of PROChello. The only way this could be avoided would be to replace the colon with a space, but that saves nothing and would be silly!

The same applies to functions. So

    DEFFNwait(delay):T=TIME:REPEATUNTILTIME=T+delay:=0
would become
    DEFFNwait(delay)T=TIME:REPEATUNTILTIME=T+delay:=0

But

    DEFFNpointless:PRINT"It's not worth it.":=0

would need to stay the same to avoid ambiguity. There is one exception to this: when the function value is returned immediately. e.g.

    DEFFNadd(a,b):=a+b
becomes
    DEFFNadd(a,b)=a+b

And

    DEFFNversion:=1.51
becomes
    DEFFNversion=1.51

4a) Semi-colons before TAB( statements

You do not need to precede a TAB( statement with a ';'. They can be safely removed. So...

    PRINT TAB(0,5)"Please wait...";TAB(15)"5 seconds"
becomes
    PRINT TAB(0,5)"Please wait..."TAB(15)"5 seconds"

4b) Semi-colons before quote marks

If a quote mark is preceded by a ';' then the ';' can be safely removed. There is one rare exception that I will describe below. So...

    PRINT "There are ";count;" words."
becomes
    PRINT "There are ";count" words."

The exception is where the semi-colon is itself inside the string. So...

    PRINT ASC";"

would stay the same, because ASC"" would give an undesirable number.

    PRINT "They are as follows;"

would also stay the same. Changing them would be silly!

5) REM statements

In a program REM is only used for commenting purposes and doesn't affect the running of the program (apart from slowing it down a bit!). All REM statements should be removed, taking care where the line is referred to somewhere else in the program. So...

    10REM This program prints the numbers 1 to 10
    15REM The variable 'X' is used as a counter
    20FOR X=1TO10:REM The start of the FOR loop
    30REM The next line displays the number
    40PRINT X
    50REM End the loop
    60NEXT
    70END:REM The end!
becomes
    20FOR X=1TO10
    40PRINT X
    60NEXT
    70END

The example is a bit extreme but it shows how much space the extra comments take up. Normally REMs are used by the programmer to insert comments into a program to add readability, and they are useful when coming back to a program at a later stage.

If a line that contains only a REM statement is referred to by a GOTO, GOSUB or RESTORE statement elsewhere in the program, there are three things you can do.

First, you can remove the comment leaving only the REM. So...

    10REM This is the start
    20PRINT "Hello World!"
    30GOTO 10
becomes
    10REM
    20PRINT "Hello World!"
    30GOTO 10

This is good but still leaves a REM which could be removed.

Secondly, you could change the line that the GOTO (or GOSUB or RESTORE) refers to, making it the first non-REM line after the REM. So the above program would become...

    20PRINT "Hello World!"
    30GOTO20

Thirdly, you could remove the line containing the REM and then renumber the next non-REM line to the number of the line that you deleted. So the above program would become...

    10PRINT "Hello World!"
    30GOTO 10

If a line ends with a REM statement and has statements before it then the REM (and the colon in front of it) can be removed, thus...

    10PRINT "Hello World!":REM Print forever
    20GOTO 10
becomes
    10PRINT "Hello World!"
    20GOTO 10

The same goes for RESTORE and GOSUB. The examples below illustrate the above methods with these statements.

    10RESTORE 60
    20REPEAT
    30READ number
    40PRINT number
    50UNTIL number=5
    60REM Data starts here...
    70DATA 1,2,3,4,5

You could change the line number 60 in line 10 to 70, or replace line 60 with line 70.

    10GOSUB 30
    20END
    30REM Subroutine
    35PRINT"Hello"
    40RETURN

You could change the line number 30 in line 10 to 35, or replace line 30 with line 35.
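
A quick way to check what removing REMs (or using any other tip in this article) has saved is to compare the length of the program before and after the change. This is only a sketch of a check, not one of the packing methods. It assumes the program has been loaded in the usual way, so that TOP-PAGE is its length in bytes.

```basic
REM Type this at the prompt, without a line number
PRINT TOP-PAGE
REM Now delete the unwanted REMs and type it again
PRINT TOP-PAGE
```

The difference between the two numbers printed is the number of bytes saved.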

6) END and STOP at the end of a program

At the end of a program, END or STOP is not needed to tell the computer that it is the end of the program. So...

    10PRINT "Hi!"
    20END
and
    10PRINT "Hi!"
    20STOP
would both become
    10PRINT "Hi!"

However, if there are procedures or functions defined after the main program then the END (or STOP) should be kept. So...

    10FOR X=1TO10
    20PROCprint(X)
    30NEXT
    40END
    50DEFPROCprint(number)
    60PRINTnumber
    70ENDPROC

would stay the same.

But if only DATA statements (or REMs) follow the end of the program then END (or STOP) can be removed. So...

    10READ name$
    20PRINT name$
    30STOP
    40DATA Jon Ripley
would become...
    10READ name$
    20PRINT name$
    40DATA Jon Ripley

But if both procedures (or functions) and DATA (or REM) statements follow a program then the END (or STOP) should remain. e.g.

    10FOR X=1TO5
    20READ name$
    30PROCprint(name$)
    40NEXT
    50END
    60DEFPROCprint(whatever$)
    70PRINT whatever$
    80ENDPROC
    90DATA Jon,Chris,Steve,Martin,Fred

would remain the same.

It is good programming practice to use END rather than STOP, except when stopping the program at a particular point during testing.

However, END has a special use that rarely occurs. This is included only as a note to more advanced programmers. END causes BASIC to search through the program for a valid end marker and update its internal pointers (TOP, LOMEM, VARTOP, etc). This is used if PAGE has been changed or if an unusual loading procedure is used, merging 2 BASIC programs in memory for example.

7) Variables after NEXT statements

Including the loop variable after the NEXT statement is optional. So...

    10FOR loop=1 TO 10
    20PRINT 2^loop
    30NEXT loop
becomes
    10FOR loop=1 TO 10
    20PRINT 2^loop
    30NEXT

And...

    10FOR X=0 TO 20
    20FOR Y=0 TO 20
    30PRINT TAB(X,Y)"*"
    40NEXT Y
    50NEXT X
would become
    10FOR X=0 TO 20
    20FOR Y=0 TO 20
    30PRINT TAB(X,Y)"*"
    40NEXT
    50NEXT

(See the next section for further tips when using NEXT.)

Generally the loop variable should not be named after NEXT at all.
This is because of possible ambiguity. For example, if the above program was entered as follows, with the variables interchanged...

    10FOR X=0 TO 20
    20FOR Y=0 TO 20
    30PRINT TAB(X,Y)"*"
    40NEXT X
    50NEXT Y

The program no longer works as intended, because the NEXT statements do not match the order in which the loops were opened. This cannot happen if the variables are omitted.

8) Nested FOR loops

If there are two or more NEXT statements next to each other in a program then they can be joined together. For example

    10FOR X=0 TO 12
    20FOR Y=0 TO 12
    30PRINT X*Y
    40NEXT Y
    50NEXT X
would become
    10FOR X=0 TO 12
    20FOR Y=0 TO 12
    30PRINT X*Y
    40NEXT Y,X

It is again possible to omit the loop variables, and line 40 would become...

    40NEXT,

Similarly for three NEXTs in succession, 'NEXT ,,' etc.

9) GOTO in IF statements

If, in an IF statement, a GOTO comes directly after THEN or ELSE then the GOTO can be removed. So...

    IF A%=answer THEN GOTO 50 ELSE GOTO 100
becomes
    IF A%=answer THEN 50 ELSE 100

Generally, the use of GOTO should be kept to a minimum. (Hopefully it should not be used at all!)

10) THEN statements

In most cases the THEN in an IF statement can be left out. So...

    IF A%=B% THEN PRINT"Equal"
becomes
    IF A%=B% PRINT"Equal"

One exception is if a pseudo variable is assigned (such as LOMEM, PAGE, TIME, etc). For example

    IF TIME=maximum THEN TIME=0

would remain the same. An error would be caused if you did remove the THEN in this instance. However, if a normal variable is assigned then the THEN can be removed. So...

    IF x_position>30 THEN x_position=30
becomes
    IF x_position>30 x_position=30

The second (and last!) exception is an extension to point 9, above, where a GOTO after a THEN has been removed. For example

    IF A%=answer THEN 50 ELSE 100

The THEN should NOT be removed. If you do remove the THEN, then the GOTO has to be put back. So the above would become

    IF A%=answer GOTO 50 ELSE 100

Again a bit silly!

11) Colons before and after ELSE statements

Remove all colons after ELSE statements. They aren't needed. So...

    IF A%=5 THEN PRINT"Hi!":ELSE:PRINT"Bye!"
becomes
    IF A%=5 THEN PRINT"Hi!":ELSEPRINT"Bye!"

Also, you can remove all colons before ELSE statements. The exception is where a variable comes directly before the ELSE: the variable name must end in '%' or '$'. If it does not, use a space in place of the colon. e.g.

    IF A=B THEN C=D:ELSE:C=F
becomes
    IF A=B THEN C=D ELSEC=F
instead of
    IF A=B THEN C=DELSEC=F

In the latter case the computer would look for the variable DELSEC rather than the variable D followed by ELSE.

    IF A$=B$ THEN A$=C$:ELSE:A$=D$
becomes
    IF A$=B$ THEN A$=C$ELSEA$=D$

And...

    IF A%=B% THEN A%=C%:ELSE:A%=D%
becomes
    IF A%=B% THEN A%=C%ELSEA%=D%

12) Arrays

Less advanced programmers should skip this section.

This only really applies to one-dimensional (1D) integer arrays where all values will be between 0 and 255 (&FF). Bewildered?

    DIM A(27)

The array A() is a 1D array, and it is a real (floating point) array because it can store decimals and fractions as well as whole numbers.

    DIM A$(27)

The array A$() is a string array, again 1D. It can store strings.

    DIM A%(27)

This is the integer array we were looking for. It can hold any 32 bit integer. An integer is simply a whole number without a decimal or fractional part. For example the following are decimal numbers: 1, 6.25, 100, -45.735, 0, 56000.0001. The following are integers: 1, 6, 100, -45, 0, 56000.

We are only interested in arrays where every number stored will be between (and including) 0 and 255 (&FF in hex). Although the numbers stored in the array can only be in this range, the array can be as large or small as you like, as long as there is enough memory! In fact this method should be used in preference for very large arrays, as one byte is used for each element of the array rather than the four bytes per element that an integer array needs. So...

    DIM A%(1000)
becomes
    DIM A%1000

In the program, when you refer to the array, you have to use a slightly different method.

    A%(38)=23+A%(2)
becomes
    A%?38=23+A%?2

And...

    A%(9)=A%(10)+11
becomes
    A%?9=A%?10+11

This method saves 2 bytes when defining the array and 1 byte every time the array is referred to in the program.

If the element number is calculated, the brackets around the calculation have to stay. So...

    A%(4)=A%(Z+4)+4
becomes
    A%?4=A%?(Z+4)+4

(Incidentally this method uses less variable memory and also speeds up the program!)

If you want to save another byte each time the array is referred to then, instead of calling the array A%, call it A (or B, C, D, etc)! So...

    DIM A%1000
becomes
    DIM A 1000

The space is needed here, because without it BASIC would read A1000 as a single variable name. The DIM itself is therefore no shorter, but every reference is. Remove the '%' from the above examples. Briefly, instead of the previously used 'A%?' use 'A?'!

Advanced note... This slows down a program very slightly.

The second part of this article will appear in a future issue.
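
A footnote to section 12: the short program below is a minimal sketch of the byte array method, not a listing from any real program. The names A% and I% and the values stored are only for illustration. As far as I know the reserved bytes are not cleared to zero in the way a normal array is, so the sketch fills every element before reading any of them back.

```basic
10REM Sketch: reserve 1001 single-byte elements, numbered 0 to 1000
20DIM A% 1000
30REM Fill every element with a value between 0 and 255
40FOR I%=0 TO 1000
50A%?I%=I% MOD 256
60NEXT
70REM Read two elements back
80PRINT A%?38
90PRINT A%?300
```

Lines 80 and 90 print 38 and 44 (300 MOD 256). The REMs are only there to explain the sketch and would of course be removed when packing.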