8-Bit Software Online Conversion

Allsorts of sorts An Article By H.S.Williams Sept. 1991 Welcome to this article on sorting and the routines employed. If you'll excuse the corny title, I'll explain the format of this article. The program that is no w running is a caretaker program. It will load and display all the text files th at form this article. The entire article comprises of a series of basic programs , and accompanying text files. To get the best of the article, I suggest that yo u read through the text files as presented by this program ( The title of each f ile is displayed at the top of the screen as each file is displayed for your ref erence. ) and then to go through the basic programs with reference to the text f iles. The text files were written using the Wordwise word processor and should easily load into any other processor for printing. If you do not have a word processor and you want to print out the articles then type in the following, ending each line by pressing the return key. MODE0 VDU2 *TYPE <filename> The purpose of this article, or so Duncan tells me is to show the technical si de of sorting. Also to provide the fastest sorting routines that I can, supposed ly in machine code. The reality may be somewhat different. I'll start with an ou tline of what I intend to cover. An introduction is always a good start. The first thing to do will be to expla in the basic methods behind some of the commonly used sorting routines; bubble, index, shell, and quick. Taking the simplest and most inefficient of these ( the bubble sort ), I'll show how by a sucession of simple modifications it can be m ade more and more effective. The modification of the bubble sort will lead neatl y onto the methods of comparison of sort routines. Also under which conditions d ifferent routines are the most effective. After this introduction I will present a sort routine of my own devising. This will demonstrate how memory and speed are often traded off against one another, incredible speed with many limitations. If I get this far I'll then translate the more suitable of the routines so far examined into machine code, this should provide you with some of the fastest so rting routines available. You may include any of the routines presented in this article in your own progr ams. Examples in the form of basic programs and accompanying text files will be presented along the way.

Program Format. Rather than cloud the programs up with endless REM statements which only confu se things, each demonstration program ( B.filename ) will also have a text file associated with it ( T.filename ). The text file will explain all the modificati ons. Also all the programs will use the same variable names to represent the sam e functions. For example every program will initially fill an array ' A%(size%) ' with random numbers between 0 and max% . Here size% represents the size of th e array minus one ( don't forget element A%(0) ), and max% represents the maximu m value that an element within that array can take. All the variable names commo n to two or more of the programs are listed below, along with an explanation of their function. In the following list n represents any integer ( whole number ) between two set numbers ( usually 0 and maximum value. ) seed% This variable is used to set the random number generator to give the sam e results every time. This is not to fiddle the results , but so that if I try t o explain the results, you have to have the same results that I am explaining. A%(n) Array containing random numbers between 0 and max%. size% is the size of the array. B%(n) Same size array as A%(n). Contains the position in the final sorted list that an element in the array A%(n) occupies. E.G. If B%(3)=0, then A%(3) is at the top of the sorted list.

O.K. so why is sorting so important, and why do people make so much fuss about the subject ? The idea of sorting is simple; you start with a group of objects i n a random order, and by applying a set of rules to the objects, the objects acq uire predictable relationships with respect to each other. Long winded or what ! Put simply, you arrange the objects into a logical order. Since all computers d eal exclusivly with numbers ( Even the words on this screen are just numbers in the computer's memory. ) we will only consider numbers. So consider a group of n umbers in any order, wouldn't it be usefull to know the largest, and smallest nu mbers, and also how large any number is in relation to the others. If we take our group of numbers then there are two obvious ways of organising them; in ascending, and descending order of magnitude. Any other more complicate d relationship is pointless.

Well I'm onto the fourth file and I haven't started sorting yet. If you're stil l here, then you've got more staying power than I have! Now onto the first sorti ng routine. ( Yaaaaaaayy! ) The simplest, most intuitive and in some ways the most important sorting routi ne is called the BUBBLE sort. ( For reasons that will become obvious later. ) It is important that you understand this routine because it is the most obvious, a nd logical of all sorting routines. Let's start with a small group of numbers in a random order. E.G. 3 9 6 7 1 Put simply the routine works by repeately comparing adjacent numbers, and excha nging them if they are in the wrong order. Firstly the top two numbers are compared, and if they are in the wrong order th en they are swapped around. Then routine moves down the list by one number and t he process is repeated. The process continues until the end of the list. The who le process is then repeated over and over until the routine has been through the entire list comparing adjacent numbers, and hasn't had to swap any pair aroud. Say I want to arrange the above numbers into descending order using the bubble sort. First thing to do is compare the top two numbers, the first is a 3 and the second, a 9. Since I want the list in descending order, these two are in the wr ong order and are swapped. The list now reads : 9 3 6 7 1 O.K. moving one down the list to the second number I have a 3 ( The one that ha s just been moved down. ), and a 7. They're in the wrong order so are swapped. A nd we have : 9 6 3 7 1 Moving down one again I have 3 and 7, again in the wrong order, so they are swa pped to give : 9 6 7 3 1 Comparing the 3 and the 1, we find that for the first time that they are alread y in the right order. Also we have just compared the last two numbers in the lis t. Since we have swapped at least one pair of numbers during the routine, we hav e to go through the same process again. Starting with the first two numbers we have a 9 and a 6 in the right order, so we move on. Next there are a 6 and a 7 in reverse order, so they are swapped, gi ving : 9 7 6 3 1 There will be no more swaps during this cycle since all the numbers are in the right order. But since SOME numbers were swapped around the routine is called ag ain. On the next pass however, since the list is in the correct order none of th e numbers are swapped around. The routine then recognises that the list isin or der since it dosen't have to swap any numbers. You may ask why the routine was called once more when we could see that the lis t was obviously sorted by the end of the second pass. The reason for this is tha t sorting routines usually work with much larger lists of numbers, and it is not so obvious when the list becomes finally sorted during a pass. But if a pass of the routine is completed without a single pair of numbers being swapped then it MUST be in order. The file B.BUBBLE on the disk is a basic program that sets up an array of rando m numbers ( A%(size%) ) and then sorts them using a bubbe sort routine. The prog ram prints out the numbers in their current order after each pass of the routine , so that you can see how each pass affects the order of the numbers.

If you've reached this stage, and you don't understand the bubble sort routine, or why it's called the bubble sort, then please go back and work it out. It may not seem important to you, but if this is so then you don't understand it! Still with me ? How about a natural sort routine. If I gave you a set of number s, you'd look through the list until you found the smallest number, then you'd p ut this number at the bottom of the list. Then you would look for the next smal lest number, and place it just above the first smallest number. Makes sense dose n't it ( I hope ) ? As far as computer talk goes, this as called a selection ( o r exchange ) sort. A selection sort looks through the group of numbers, finds the smallest, and th en swaps it with the last number in the list. The size of the list is then reduc ed by one to exclude the last number, and the routine moves onto the next smalle st. The BASIC program ( B.SELECT ) may not be as easy to understand as the B.BUBBLE program, but it follows the way that people sort more accurately. As before the arrangement of the numbers is shown after each pass of the sorting routine.

The SHELL sort. The shell sort is the hardest of the the three basic sorting routines to under stand. Don't worry if you have problems trying to make sense of it, just take yo ur time. The shell sort consists of a series of bubble sorts, performed on bigger and b igger groups of numbers. To start, the entire list in divided into two halves, a nd the first item in each half are compared and sorted. On the next pass of the routine, the entire list is divided into four quaters, and the first item of eve ry quarter is sorted using a bubble sort. Then the list is divided into eighths, and the first item of every eighth bubble sorted. And so on until, on the final pass the entire list is bubble sorted. The file B.SHELL on the disk is, unsurprisingly a shell sort program. Like the previous two example programs, the state of the partially sorted data is shown after each pass of the shell sort routine. The state of the data is not however shown during the 'sub' bubble sorts.

So far I've presented you with three different sorting methods, and also demon stration programs in basic. Now it's time to see how the different routines comp are against one another. The most obvious way of comparing performances is by me asuring the time each routine takes to sort a group of numbers. As the size of t he group of numbers to be sorted varies, different sorting methods become more e ffective. The basic program ( B.COMPARE ) tests all three of the sorts so far en countered against each other for different group sizes. To run it PAGE should be set to &1200, if it is not the program automatically reloads to the correct add ress. Once running, you will be prompted for firstly the maximun group size to b e sorted between two and ten. This represents the maximum power of 2^n that is s orted. If you don't understand this at first, just enter 6 or 7 or play with it and it'll become obvious, I hope. Next you must enter how many times that the program is sort groups of data. Th e more times that the program is run, the more statistically accurate the result s will be, just enter two or three if unsure. After that the program will switch to mode 0 and the results will be displayed and updated as they are calculated. Minimum, average, and maximum time taken to sort are given for each of the thre e sort routines, and for each size of group sorted up to the one selected. It takes about 72 minutes to bubble sort 1024 numbers so be prepared to wait i f you choose a large sort size! Well because I'm already behind in getting this to Duncan, I've decided to spi lt this article into two parts. You've just reached the end of the first half. T he next half, covering all the rest that I've promised will probably appear next month if there's room. It'll give you a chance to work out and understand the first half. The second half gets complicated, believe me, I'm the one who has to write it.