by Steven Flintham (15A)
Introduction
Accompanying this article are two
programs which produce 'silly
sentences'. I was reading through an
old VIC-20 guide (An introduction to
BASIC - part 1, for any VIC-20!) and
came across a description of such a
program. I decided to see if I could
write such a program and this is the
result.
To get an idea of what the programs
can do, just run either of them. Both
versions work in the same way, but
Sentenc should be a bit faster than
SentSlw for complex sentences. Any
bias
towards 8-bit machines in the example
sentences is deliberate!
The example sentence data supplied is
deliberately fairly short and simple -
just enough to demonstrate how the
program works. I think the programs
are fairly flexible and you could
almost certainly do better.
Changing the data
I would advise using SentSlw for any
initial experiments, since it is as
fast as (or faster than) Sentenc for
small quantities of data and you don't
have to wait before it starts. The
data is at the end of the program in
DATA statements and is in two parts,
the templates and the components.
The templates
These are the basic 'patterns' for the
sentences and are just single strings.
If there is more than one (as there
usually will be), a template will be
chosen at random. Any text in the
chosen template will be copied into
the final sentence, unless it is
enclosed in angle brackets <>. Text
between angle brackets is the name of
a component and will be replaced with
text according to the list of
components. The last 'template' should
be ZZZ to show that there are no more.
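As a rough illustration of the scheme
described above (not the code used in
SentSlw or Sentenc), this BBC BASIC
fragment counts the templates up to
ZZZ and then reads one at random. The
names count%, pick%, i% and t$ and
the two sample templates are invented
for the example, and line numbers are
left out as in the DATA lines
elsewhere in this article.

```basic
REM Sketch only: count the templates, then READ a random one.
REM count%, pick%, i% and t$ are hypothetical names.
RESTORE
count%=0
REPEAT
  READ t$
  IF t$<>"ZZZ" THEN count%=count%+1
UNTIL t$="ZZZ"
REM RND(1) gives a fraction, so only call RND for 2 or more templates
IF count%>1 THEN pick%=RND(count%) ELSE pick%=1
RESTORE
FOR i%=1 TO pick%
  READ t$
NEXT
PRINT t$
END
:
REM Sample templates, ending with the ZZZ marker
DATA "The <(old )>man sat down."
DATA "I saw <something>."
DATA "ZZZ"
```

The chosen template is printed as it
stands. Replacing the names in angle
brackets is a separate step.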
The components
These are the building blocks of
sentences. Every component consists of
three items of DATA, all of them
strings. The first is its name and is
the text which appears between angle
brackets. Ignoring the second for the
moment, the third is a list of
possible bits of text. For example, a
component line such as:
DATA "hello","","hello~hi~greetings"
will result in <hello> being replaced
with hello, hi or greetings (the
actual replacement is chosen at
random). The tilde ~ (shown as ÷ in
MODE 7) is used to separate items in
the list.
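One way of choosing an item might
look like the sketch below. It is not
taken from the programs themselves:
FNcount, FNitem and the variable
names are hypothetical, and line
numbers are left out.

```basic
REM Sketch: pick a random item from a list separated by "~".
REM FNcount, FNitem, list$, n% and c% are hypothetical names.
list$="hello~hi~greetings"
n%=FNcount(list$)
REM RND(1) gives a fraction, so treat a one-item list specially
IF n%>1 THEN c%=RND(n%) ELSE c%=1
PRINT FNitem(list$,c%)
END
:
REM Number of items = number of separators plus one
DEF FNcount(l$)
LOCAL i%,n%
n%=1
FOR i%=1 TO LEN(l$)
  IF MID$(l$,i%,1)="~" THEN n%=n%+1
NEXT
=n%
:
REM Item k% of the list, counting from 1 (k% is not range checked)
DEF FNitem(l$,k%)
LOCAL i%
REM A trailing separator means every item ends with "~"
l$=l$+"~"
REM Throw away the k%-1 items in front of the one wanted
IF k%>1 THEN FOR i%=2 TO k%:l$=MID$(l$,INSTR(l$,"~")+1):NEXT
=LEFT$(l$,INSTR(l$,"~")-1)
```

Each run prints hello, hi or
greetings.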
If you have too many items in a list
to fit on one line, you can split a
component over two lines by repeating
the first two items, for example:
DATA "hello","","hello~hi~greetings~lots more"
DATA "hello","","hi there~still more~good to see you"
and this will work in the same way as
if all of the items were on one line.
You can use component names in angle
brackets within the list of possible
replacements. This sounds rather
complicated, but it is quite simple in
practice. For example, the following
component lines:
DATA "He/she saw something","","<He/She> saw <something>"
DATA "He/She",,"He~She"
DATA "something",,"a fox~a cat~a dog~a cow"
would cause <He/she saw something> in
a template to be replaced by <He/She>
saw <something>, and <He/She> would
then be replaced by either "He" or
"She" and <something> would be
replaced by either "a fox", "a cat",
"a dog" or "a cow".
You can have 'null' items in a list -
items which are empty. For instance,
the following component line:
DATA "(old )","","~old "
would cause <(old )> to be replaced by
either nothing or "old ". Don't worry
about the round brackets - they are
just part of the name. I use them to
remind myself that the component may
be replaced with nothing. The extra
space is used here so that a sentence
template such as
DATA "The <(old )>man sat down."
can be used. If <(old )> is replaced
with nothing, the space before it
separates "The" and "man". If it is
replaced with "old ", that same space
separates "The" and "old", and the
extra space at the end of "old "
separates "old" and "man". This is
just a technique -
it is not a feature of the program, so
don't worry if you don't understand
it. It will become clear if you want
to use it.
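The replacement process described so
far (random choice, names within
lists and null items) can be sketched
as a loop that carries on until no <
is left in the sentence. This is only
an illustration, not the code from
SentSlw or Sentenc. PROCreplace,
FNlist and FNchoose are hypothetical
names, and the "ZZZ" which ends the
component list is an assumption made
for this sketch (the article does not
say how the real programs find the
end of the components). Line numbers
are left out.

```basic
REM Sketch: replace <names> until none are left.
REM PROCreplace, FNlist and FNchoose are hypothetical names.
s$="The <(old )>man sat down. <He/she saw something>."
REPEAT
  a%=INSTR(s$,"<")
  IF a%>0 THEN PROCreplace(a%)
UNTIL a%=0
PRINT s$
END
:
DATA "(old )","","~old "
DATA "He/she saw something","","<He/She> saw <something>"
DATA "He/She","","He~She"
DATA "something","","a fox~a cat~a dog~a cow"
REM Assumption of this sketch: "ZZZ" ends the components
DATA "ZZZ"
:
REM Swap the <name> starting at position p% of s$ for a replacement
DEF PROCreplace(p%)
LOCAL q%,n$
q%=INSTR(s$,">",p%)
n$=MID$(s$,p%+1,q%-p%-1)
s$=LEFT$(s$,p%-1)+FNchoose(FNlist(n$))+MID$(s$,q%+1)
ENDPROC
:
REM Look through the components in turn for name n$; return its list
DEF FNlist(n$)
LOCAL c$,v$,l$,r$
RESTORE
REPEAT
  READ c$
  IF c$<>"ZZZ" THEN READ v$,l$
  IF c$=n$ THEN r$=l$
UNTIL c$="ZZZ" OR c$=n$
=r$
:
REM Pick one item at random from a "~" list
DEF FNchoose(l$)
LOCAL i%,n%
n%=1
FOR i%=1 TO LEN(l$)
  IF MID$(l$,i%,1)="~" THEN n%=n%+1
NEXT
REM RND(1) gives a fraction, so only call RND for 2 or more items
IF n%>1 THEN n%=RND(n%)
l$=l$+"~"
IF n%>1 THEN FOR i%=2 TO n%:l$=MID$(l$,INSTR(l$,"~")+1):NEXT
=LEFT$(l$,INSTR(l$,"~")-1)
```

A run might print, for example, "The
old man sat down. She saw a cat."
The second item of each component is
read but ignored here, and a
component split over two DATA lines
is not handled.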
Finally, there is the second item,
which has been ignored so far. If this
is of the
form S=2, it means that as well as the
usual replacement, the variable S will
be set to 2. Only single capital
letters are allowed for variable
names. For instance, the component
lines
DATA "man or woman",,"<man>~<woman>"
DATA "man","S=1","man"
DATA "woman","S=2","woman"
will set S to 1 if <man or woman> is
replaced by "man" and 2 if <man or
woman> is replaced by "woman". Note
that two extra components are used
here, because several possible
replacements for <man> and <woman>
would usually be provided.
By itself, this is useless. However,
if the second item in a component is a
single letter (once again, it must be
a capital), it is the name of a
variable (as in BASIC) which contains
the number of the item to be chosen
from the list. For
instance, the following component line
DATA "he/she","S","he~she"
will replace <he/she> with "he" if S
is 1 or "she" if S is 2. If this is
used without a previous component
having set S, the result will be
unpredictable and the program may
'crash'.
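A minimal sketch of how the second
item might be handled is given below.
It assumes the variables are kept in
an array v%() with one element per
capital letter, which may well not be
how SentSlw and Sentenc do it. FNuse,
FNrandom and FNitem are hypothetical
names, no range checking is done and
line numbers are left out.

```basic
REM Sketch: acting on the second item of a component.
REM v%(), FNuse, FNrandom and FNitem are hypothetical names.
DIM v%(26)
PRINT FNuse("S=2","woman")
PRINT FNuse("S","he~she")
END
:
REM v$ is the second item, l$ the list of replacements
DEF FNuse(v$,l$)
LOCAL k%
REM "S=2" form: store the value under the letter (A=1 ... Z=26)
IF MID$(v$,2,1)="=" THEN v%(ASC(v$)-64)=VAL(MID$(v$,3))
REM Single letter: the stored value selects the item, else choose at random
IF LEN(v$)=1 THEN k%=v%(ASC(v$)-64) ELSE k%=FNrandom(l$)
=FNitem(l$,k%)
:
REM Random item number for a "~" list
DEF FNrandom(l$)
LOCAL i%,n%
n%=1
FOR i%=1 TO LEN(l$)
  IF MID$(l$,i%,1)="~" THEN n%=n%+1
NEXT
REM RND(1) gives a fraction, so a one-item list always gives 1
IF n%>1 THEN =RND(n%)
=1
:
REM Item k% of the list, counting from 1 (k% is not range checked)
DEF FNitem(l$,k%)
LOCAL i%
l$=l$+"~"
IF k%>1 THEN FOR i%=2 TO k%:l$=MID$(l$,INSTR(l$,"~")+1):NEXT
=LEFT$(l$,INSTR(l$,"~")-1)
```

The first call sets S to 2 and prints
woman. The second then uses S to
print she.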
A limited degree of error trapping is
provided, but this is not exhaustive.
If S had been three in the previous
he/she example, the program would
crash but it should be fairly obvious
why. It would not be too difficult to
provide full error trapping, but this
would probably slow the program down
still further.
I hope that these instructions are
sufficient and that you obtain at
least a few minutes' pleasure out of
fiddling with the sentence data. If
you are stuck, have a look at the
example data - it contains examples of
everything mentioned above. If you
come up with any good sets of data,
why not send your version in to 8BS?
Using 80-column text
If you have a Master or BBC with
shadow RAM, you may want to change the
line reading
MODE 7:width%=39
to
MODE 128:width%=79
(or whatever commands are required to
activate your shadow RAM).
The two different versions
SentSlw simply looks through the
entire list of components when one is
referred to. It is therefore quite
slow, particularly when there are a
lot of components to search through.
However, the code is simpler and if
you want to examine the program to see
how it works, you should look at this
one first.
Sentenc sorts the list of components
and then uses a binary search to find
a component. This gives about the same
speed as SentSlw for small numbers of
components, but Sentenc slows down far
less as the number grows.
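Sentenc itself is not listed here, so
the following is only a generic
sketch of a binary search in BBC
BASIC, not the author's routine. The
array name$(), the function FNfind
and the five sample names are made up
for the example, and the names are
assumed to be in ASCII order already.
Line numbers are left out.

```basic
REM Sketch: binary search of a sorted list of component names.
REM name$(), FNfind, n% and the sample names are hypothetical.
n%=5
DIM name$(n%)
FOR i%=1 TO n%:READ name$(i%):NEXT
PRINT FNfind("something")
PRINT FNfind("missing")
END
:
REM The names must already be in ascending ASCII order
DATA "(old )","He/She","man","something","woman"
:
REM Return the position of t$ in name$(), or 0 if it is not there
DEF FNfind(t$)
LOCAL lo%,hi%,m%,f%
lo%=1:hi%=n%
REPEAT
  REM Look at the middle entry, then keep the half that could hold t$
  m%=(lo%+hi%) DIV 2
  IF name$(m%)=t$ THEN f%=m%
  IF name$(m%)<t$ THEN lo%=m%+1 ELSE hi%=m%-1
UNTIL f%>0 OR lo%>hi%
=f%
```

This prints 4 for "something" and 0
for a name which is not in the list.
Each pass halves the range left to
search, which is why the time taken
grows so slowly as more components
are added.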
Both programs are a little on the slow
side, but bearably so and it seems
silly to spend ages speeding up
trivial programs like these.