PUNTER.DOC

The "C1" Protocol
-----------------

The "C1" description is exceedingly long, so it has been broken up into 11
more easily digestible sections. If you haven't read any of this before,
please read them ALL, in order.

Topic                            Section
-----                            -------
Inception & Concepts ........... C1-1
A Simple Conversation .......... C1-2
Communication Codes & Checksums. C1-3
Statement and Listen Loops ..... C1-4
Synchronization Lock ........... C1-5
Block Structure ................ C1-6
Varying Block Size ............. C1-7
Communication Syntax ........... C1-8
Syntax Description ............. C1-9
The "Endoff" Situation ......... C1-10
Transferring File Type ......... C1-11


Section C1-1
------------

Inception
---------
During the summer of 1981, when I first got the idea of putting up a BBS, I
started work on a simple protocol for transferring programs to and from the
BBS. This protocol was similar in structure to XMODEM, and had about the same
reliability. Under good line conditions, it would give error free transfers
(this was to be expected). Under moderate noise conditions, the protocol would
hold up, and would still give error free transmissions. It was under poor line
conditions that it, and XMODEM, would fall apart.

In the summer of 1984, I started work on a very ambitious project: to produce
a protocol that was both fast and extremely reliable, even under the worst of
line conditions. From this work came the "C1" protocol; not a simple
block/checksum affair, but a complete communication system for the computer.

Be warned, therefore, that understanding the ins and outs of "C1" will not be
easy, but with enough patience, there's no reason why even the least skilled
programmer cannot be comfortable with it.

Concepts
--------
The concept behind the "C1" protocol was simple: to allow two computers to
"talk" with one another (while transferring data) in such a way that nothing
short of a complete distortion of the transmission line could result in a
misunderstanding. If this concept could be realized, then files could be
transferred between computers without fear that line noise would cause a
breakdown in the protocol, or that the received data would differ in any way
from that which was sent. Nothing is perfect though, and I don't, for a
minute, claim that "C1" is completely infallible, but I can say, with
reasonable comfort, that "C1" can deliver accuracy over bad lines not found in
any other microcomputer transfer protocol. For this accuracy though, there is
a price to pay, and it is complexity; the protocol is extremely difficult to
duplicate without a complete and utter understanding of the intricate workings
of "C1". This document will attempt to give you that required understanding.



Section C1-2
------------

A Simple Conversation
---------------------
In first deciding how the protocol would function, I thought of how two people
could carry on a conversation under high noise conditions, where
misunderstanding would be the norm. The scenario I'm going to give differs from
the protocol in that the people talking have no way of verifying the accuracy
of what they believe they have heard. What it is meant to demonstrate is how
the two computers "talk" with one another, and discuss the necessary
repetition, or non-repetition, of each block of data (the cornerstone of a
checksum based transfer protocol).

Ken and John are attempting to assemble a machine in the middle of a very
noisy machine shop. John reads the instructions to Ken, who carries them out.
Even at close proximity, the two have difficulty hearing one another, so they
adopt a form of banter which allows each instruction to be verified and
acknowledged. Here is how the conversation might go:

John: Put part "A" in hole "D".

Ken: Understood, putting part "A" in hole "D".

John: Acknowledged, let me know when you are ready for the next instruction.

Ken: Go ahead, what do I do next?

John: Put screw "E" through slot "T".

Ken: I didn't understand that, could you please repeat.

John: Oh, ok, tell me when you're ready for that instruction again.

Ken: Ready now.

The conversation continues on in this fashion, guaranteeing that both John and
Ken are fully aware of what the other is doing. In real life, people wouldn't
have the patience to keep up that sort of banter, but that's why they make
more mistakes than a computer. It is just this sort of "conversation" that the
two computers have between each other, only the language is different; the
instruction is replaced by the block of data, and all other statements by
special codes.




Section C1-3
------------

Communication Codes
-------------------
One of the areas where simple protocols fall apart is in the transmission of
"handshaking codes". It's called handshaking because it implies that the two
computers are having a dialogue, rather than a monologue. These other
protocols rely on single byte (8 bit) words for their communication codes, and
that could spell trouble, since the likelihood of any one 8 bit code being
transposed into another is greater than for multiple byte codes. For this
reason, "C1" uses 3 byte (24 bit) codes which are sufficiently different that
the likelihood of a transposition is extremely low. Not only that, but as you
will soon learn, the method of receiving 3 byte codes takes advantage of the
fact that line noise heavy enough to make the necessary transpositions would
most likely produce extra characters as well; "C1" can detect this situation
and reject the code.

Five distinct codes are used in the protocol: "GOO", "BAD", "ACK", "S/B", and
"SYN". Each has its own meaning, just like any English word, and all are used
in a specific sequence such that synchronization difficulties are
automatically identified and corrected.
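
As a rough illustration of how different the five code words are from one
another, the Python sketch below counts the bits that would have to be
inverted to turn one code into another. It assumes the codes travel as plain
ASCII bytes; the names CODES, DISTANCES and bit_distance are hypothetical and
are not part of "C1".

```python
from itertools import combinations

# The five code words named above, assumed here to be plain ASCII bytes.
CODES = [b"GOO", b"BAD", b"ACK", b"S/B", b"SYN"]


def bit_distance(a: bytes, b: bytes) -> int:
    """Count the bits that must be inverted to turn code a into code b."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Bit distance for every pair of distinct code words.
DISTANCES = {(a, b): bit_distance(a, b) for a, b in combinations(CODES, 2)}

if __name__ == "__main__":
    # List the pairs, closest first.
    for (a, b), distance in sorted(DISTANCES.items(), key=lambda kv: kv[1]):
        print(a.decode(), b.decode(), distance)
```

The larger the smallest distance in this table, the more separate bit errors
noise must produce to change one valid code into another.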

Checksums
---------
When a block of data is sent, we must have a way of determining if it is
correctly received or not. This is accomplished by using what is known as a
checksum. Quite simply, a checksum is a number which is mathematically derived
from all the bytes within the block. The receiving computer recalculates the
sum and compares it with the sum it received along with the block.
Theoretically, any fault in the transmitted data will result in the two
checksums not matching; but that's theory. In reality, the accuracy of the
checksum is based on the type of mathematical operation used to calculate it,
and what kind of noise it encounters.

The simplest way to create a checksum is to add up the values of all the
bytes contained in the block. This catches many types of errors, but not
every error that inverts bits. Should the same bit position be inverted in two
bytes, turning on in one and off in the other, the sum will remain the same.
For example, take the following two bytes:

     11010011 = 211
Plus 01101101 = 109
     --------   ---
                320

Now assume that the fourth bit from the right of both of these bytes becomes
inverted by line noise:

     11011011 = 219
Plus 01100101 = 101
     --------   ---
                320

As you can see, the sum remains 320, even though line noise has made obvious
changes to the bytes. A better system is one called "Cyclic Redundancy", which
works on a somewhat different principle. The checksum is 16 bits long, and is
created in the following fashion: each byte from the block is Exclusive OR'ed
with the low order part of the checksum. The checksum is then ROTATED one bit
to the left, and the procedure repeated with the next byte. Even this highly
superior method can be tripped up, so I have combined BOTH an additive
checksum and a Cyclic Redundancy checksum to create one very hard to beat
32 bit "super" checksum.

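The two calculations described above can be sketched in Python as follows.
This is only an illustration built from the description in this section: the
starting value of zero, the 16 bit width of the additive sum, and the way the
two halves are packed into one 32 bit number are all assumptions, and the
function names are hypothetical.

```python
def additive_checksum(block: bytes) -> int:
    """Add up every byte value, keeping the result within 16 bits."""
    total = 0
    for value in block:
        total = (total + value) & 0xFFFF
    return total


def cyclic_checksum(block: bytes) -> int:
    """XOR each byte into the low order part, then rotate left one bit."""
    check = 0  # assumed starting value
    for value in block:
        check ^= value
        # 16 bit rotate: the top bit wraps around to the bottom.
        check = ((check << 1) | (check >> 15)) & 0xFFFF
    return check


def super_checksum(block: bytes) -> int:
    """Pack both 16 bit results into one 32 bit number (assumed layout)."""
    return (additive_checksum(block) << 16) | cyclic_checksum(block)


# The two byte example from the text, before and after the bit inversions.
original = bytes([0b11010011, 0b01101101])
damaged = bytes([0b11011011, 0b01100101])

if __name__ == "__main__":
    print(additive_checksum(original), additive_checksum(damaged))
    print(hex(cyclic_checksum(original)), hex(cyclic_checksum(damaged)))
```

For the two byte example, the additive sums agree while the rotated checksums
do not, so the combined value exposes the damage.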


Section C1-4
------------

Listening For Code Words
------------------------
Although 3 byte code words are more reliable than 1 byte code words, nothing
is perfect. It was once said that if you let an infinite number of monkeys
bash away at typewriters for an infinite amount of time, one of them would
eventually type "To be or not to be, that is the question". Although this
stretches statistical probability to its limit, this kind of thing can easily
happen on a smaller scale; the letters "GOO" could quite conceivably be
produced by purely random line noise.

To try and eliminate ALL possible errors isn't feasible, but "C1" attempts to
eliminate as many as possible. It is reasonable to assume that any noise
capable of randomly producing "GOO" would not stop there; more likely, it
would produce a string of characters, something like "HGOOEK". Were we to
allow the protocol to listen exclusively for three letter combinations, it
would most assuredly pick out the "GOO" in that string.

My specifications for "C1" call for a code recognition routine which will ONLY
make code word comparisons on the LAST 3 RECEIVED bytes. This is accomplished
in my coding by going back and testing for further characters after I have
identified a three byte code word. Should another byte be present, the
identified code word is thrown away, and the search will continue.
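
The recognition rule can be sketched in Python as shown below. This is not my
original coding, only an illustration of the rule: the argument "received"
stands in for every byte that arrived before the line went quiet (a real
routine would poll the modem with some timing, which is not shown), and the
names CODE_WORDS and recognize are hypothetical.

```python
from typing import Optional

# The five code words, assumed here to be plain ASCII bytes.
CODE_WORDS = (b"GOO", b"BAD", b"ACK", b"S/B", b"SYN")


def recognize(received: bytes) -> Optional[bytes]:
    """Return a code word only if it is the LAST 3 bytes received."""
    window = b""
    for position, value in enumerate(received):
        # Keep a sliding window of the most recent three bytes.
        window = (window + bytes([value]))[-3:]
        if window in CODE_WORDS:
            another_byte_present = position + 1 < len(received)
            if not another_byte_present:
                return window
            # A further byte follows, so the match is thrown away
            # and the search continues.
    return None


if __name__ == "__main__":
    print(recognize(b"GOO"))
    print(recognize(b"HGOOEK"))
```

A clean "GOO" is accepted, while the "GOO" buried in "HGOOEK" is rejected
because more bytes follow it.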

Statement and Listen Loops
--------------------------
One immediate drawback to the system described above is that a REAL code word,
masked within some random noise, would be rejected by the receiving computer.
This would also be true of a code word simply damaged by noise (like "GOE").
For a protocol to be impervious to this sort of corruption, it must be capable
of restating code words over and over until the receiving computer can
understand, yet it must also have a way of knowing whether the receiving
computer got the code word or not. This was a fact that eluded me when I wrote
the original protocol. When we talk to other people, the cornerstone of
understanding is recognition. If we ask "What do you think?", yet get no
reply, we ask again. Onl...