The "C1" Protocol
-----------------

The "C1" description is exceedingly long, so it has been broken up into 11 more easily digestible sections. If you haven't read any of this before, please read them ALL, in order.

Topic                            Section
-----                            -------
Inception & Concepts ........... C1-1
A Simple Conversation .......... C1-2
Communication Codes & Checksums. C1-3
Statement and Listen Loops ..... C1-4
Synchronization Lock ........... C1-5
Block Structure ................ C1-6
Varying Block Size ............. C1-7
Communication Syntax ........... C1-8
Syntax Description ............. C1-9
The "Endoff" Situation ......... C1-10
Transferring File Type ......... C1-11

Section C1-1
------------

Inception
---------

During the summer of 1981, when I first got the idea of putting up a BBS, I started work on a simple protocol for transferring programs to and from the BBS. This protocol was similar in structure to XMODEM, and had about the same reliability. Under good line conditions, it would give error free transfers (this was to be expected). Under moderate noise conditions, the protocol would hold up, and would still give error free transmissions. It was under poor line conditions that it, and XMODEM, would fall apart.

In the summer of 1984, I started work on a very ambitious project: to produce a protocol that was both fast and extremely reliable, even under the worst of line conditions. From this work came the "C1" protocol; not a simple block/checksum affair, but a complete communication system for the computer. Be warned, therefore, that understanding the ins and outs of "C1" will not be easy, but with enough patience, there's no reason why even the least skilled programmer cannot become comfortable with it.

Concepts
--------

The concept behind the "C1" protocol was simple: to allow two computers to "talk" with one another (while transferring data) in such a way that nothing short of a complete distortion of the transmission line could result in a misunderstanding.
If this concept could be realized, then files could be transferred between computers without fear that line noise would cause a breakdown in the protocol, or that the received data would differ, in any way, from that which was sent.

Nothing is perfect though, and I don't, for a minute, claim that "C1" is completely infallible, but I can say, with reasonable comfort, that "C1" can deliver an accuracy over bad lines not found in any other microcomputer transfer protocol. For this accuracy though, there is a price to pay, and it is complexity; the protocol is extremely difficult to duplicate without a complete and utter understanding of the intricate workings of "C1". This document will attempt to give you that required understanding.

Section C1-2
------------

A Simple Conversation
---------------------

In first deciding how the protocol would function, I thought of how two people could carry on a conversation under high noise conditions, where misunderstanding would be the norm. The scenario I'm going to give differs from the protocol in that the people talking have no way of verifying the accuracy of what they believe they have heard. What it is meant to demonstrate is how the two computers "talk" with one another, and discuss the necessary repetition, or non-repetition, of each block of data (the cornerstone of a checksum based transfer protocol).

John and Ken are attempting to assemble a machine in the middle of a very noisy machine shop. John reads the instructions to Ken, who carries them out. Even at close proximity, the two have difficulty hearing one another, so they adopt a form of banter which allows each instruction to be verified and acknowledged. Here is how the conversation might go:

John: Put part "A" in hole "D".
Ken:  Understood, putting part "A" in hole "D".
John: Acknowledged, let me know when you are ready for the next instruction.
Ken:  Go ahead, what do I do next?
John: Put screw "E" through slot "T".
Ken:  I didn't understand that, could you please repeat.
John: Oh, ok, tell me when you're ready for that instruction again.
Ken:  Ready now.

The conversation continues on in this fashion, guaranteeing that both John and Ken are fully aware of what the other is doing. In real life, people wouldn't have the patience to keep up that sort of banter, but that's why they make more mistakes than a computer. It is just this sort of "conversation" that the two computers have with each other, only the language is different; the instruction is replaced by the block of data, and all other statements by special codes.

Section C1-3
------------

Communication Codes
-------------------

One of the areas where simple protocols fall apart is in the transmission of "handshaking codes". It's called handshaking because it implies that the two computers are having a dialogue, rather than a monologue. These other protocols rely on single byte (8 bit) words for their communication codes, and that could spell trouble, since the likelihood of any one 8 bit code being transposed into another is greater than for multiple byte codes. For this reason, "C1" uses 3 byte (24 bit) codes which are sufficiently different that the likelihood of a transposition is extremely low. Not only that, but as you will soon learn, the method of receiving 3 byte codes is designed around the fact that line noise strong enough to make the necessary transpositions would most likely produce extra characters as well; "C1" can recognize and reject this situation.

Five distinct codes are used in the protocol: "GOO", "BAD", "ACK", "S/B", and "SYN". Each has its own meaning, just like any English word, and all are used in a specific sequence so that synchronization difficulties are automatically identified and corrected.

Checksums
---------

When a block of data is sent, we must have a way of determining if it is correctly received or not. This is accomplished by using what is known as a checksum.
Quite simply, a checksum is a number which is mathematically derived from all the bytes within the block. The receiving computer recalculates the sum and compares it with the sum it received along with the block. Theoretically, any fault in the transmitted data will result in the two checksums not matching; but that's theory. In reality, the accuracy of the checksum is based on the type of mathematical operation used to calculate it, and what kind of noise it encounters.

The simplest way to create a checksum is to add up the values of all the bytes contained in the block. This is fine for many types of errors, but not the type which inverts a particular bit. Should the same bit position be inverted in two bytes, in opposite directions (a 0 becoming a 1 in one byte, a 1 becoming a 0 in the other), the sum will remain the same. For example, take the following two bytes:

          11010011  =  211
    Plus  01101101  =  109
          --------     ---
                       320

Now assume that the fourth bit from the right of both of these bytes becomes inverted by line noise:

          11011011  =  219
    Plus  01100101  =  101
          --------     ---
                       320

As you can see, the sum remains 320, even though line noise has made obvious changes to the bytes.

A better system is one called "Cyclic Redundancy", which works on a somewhat different principle. The checksum is 16 bits long, and is created in the following fashion: each byte from the block is Exclusive OR'ed with the low order part of the checksum. The checksum is then ROTATED one bit to the left, and the procedure repeated with the next byte. Even this highly superior method can be tripped up, so I have combined BOTH an additive checksum and a Cyclic Redundancy checksum to create one very hard to beat 32 bit "super" checksum.

Section C1-4
------------

Listening For Code Words
------------------------

Although 3 byte code words are more reliable than 1 byte code words, nothing is perfect.
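
As a brief aside on the checksums of Section C1-3, here is a minimal Python sketch of the two methods described there, run against the 211/109 example. The function names, the starting value of zero, the 16 bit width of the additive sum, and the way the two halves are packed into 32 bits are all assumptions made for illustration; the text does not specify them. The rotate-and-XOR routine follows the description given under the name "Cyclic Redundancy"; it is not the polynomial CRC used by some other protocols.

```python
def additive_checksum(block: bytes) -> int:
    """Plain sum of the byte values, kept to 16 bits (the width is an assumption)."""
    return sum(block) & 0xFFFF


def rotating_checksum(block: bytes) -> int:
    """XOR each byte into the low-order byte, then rotate the 16 bit value left by one."""
    check = 0  # starting value of 0 is an assumption
    for byte in block:
        check ^= byte
        check = ((check << 1) | (check >> 15)) & 0xFFFF
    return check


def super_checksum(block: bytes) -> int:
    """Pack both 16 bit values into 32 bits (the packing order is an assumption)."""
    return (additive_checksum(block) << 16) | rotating_checksum(block)


good = bytes([0b11010011, 0b01101101])   # 211, 109
noisy = bytes([0b11011011, 0b01100101])  # fourth bit from the right inverted in both

print(additive_checksum(good), additive_checksum(noisy))    # 320 320
print(rotating_checksum(good) == rotating_checksum(noisy))  # False
```

The additive sum is fooled by the paired inversions, but the rotate between bytes moves the first inversion one bit to the left before the second one arrives, so the two no longer cancel.
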
It was once said that if you let an infinite number of monkeys bash away at typewriters for an infinite amount of time, one of them would eventually type "To be or not to be, that is the question". Although this stretches statistical probability to its limit, this kind of thing can easily happen on a smaller scale; the letters "GOO" could quite conceivably be produced by purely random line noise.

Eliminating ALL possible errors isn't feasible, but "C1" attempts to eliminate as many as possible. One reasonably safe assumption is that any noise capable of randomly producing "GOO" would not stop there; more likely, it would produce a string of characters, something like "HGOOEK". Were we to allow the protocol to listen exclusively for three letter combinations, it would most assuredly pick out the "GOO" in that string. My specifications for "C1" call for a code recognition routine which will ONLY make code word comparisons on the LAST 3 RECEIVED bytes. This is accomplished in my coding by going back and testing for further characters after I have identified a three byte code word. Should another byte be present, the identified code word is thrown away, and the search continues.

Statement and Listen Loops
--------------------------

One immediate drawback to the system described above is that a REAL code word, masked within some random noise, would be rejected by the receiving computer. This would also be true of a code word simply damaged by noise (like "GOE"). For a protocol to be impervious to this sort of corruption, it must be capable of restating code words over and over until the receiving computer can understand, yet it must also have a way of knowing whether the receiving computer got the code word or not. This was a fact that eluded me when I wrote the original protocol.

When we talk to other people, the cornerstone of understanding is recognition. If we ask "What do you think?", yet get no reply, we ask again. Onl...
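
The code word recognition rule from Section C1-4 (compare only the last 3 received bytes, and throw a match away if any further byte follows it) can be sketched in Python as below. This is only an illustration, not the original coding: the name `recognize` is hypothetical, and it is assumed that the caller hands over everything that arrived before the line went quiet, since the text does not say how "no further byte" is detected.

```python
from typing import Optional

# The five code words named in Section C1-3.
CODE_WORDS = (b"GOO", b"BAD", b"ACK", b"S/B", b"SYN")


def recognize(burst: bytes) -> Optional[bytes]:
    """Return a code word only if it is the last 3 bytes of the burst.

    `burst` is assumed to hold every byte received before the line
    went quiet (how quiet is detected is not covered here).
    """
    window = b""
    candidate = None
    for value in burst:
        # Keep only the 3 most recent bytes.
        window = (window + bytes([value]))[-3:]
        # Any new byte discards an earlier match; the search continues
        # with the new window.
        candidate = window if window in CODE_WORDS else None
    return candidate


print(recognize(b"GOO"))     # b'GOO'
print(recognize(b"HGOOEK"))  # None
print(recognize(b"GOE"))     # None
```

Note that under this sketch noise BEFORE a code word (as in "HGOO") does not prevent recognition, since only trailing bytes cause a match to be discarded.
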