CUTLASS PROTOCOL OVERVIEW

Proposed overall design decisions for Cutlass:
 - Cutlass will run entirely over UDP. This will enable us to handle
   our own session state. This adds programmer load but provides
   flexibility. Voice apps are often hindered by TCP, and there is
   no clean way around it. Running all protocol functionality over
   UDP also helps frustrate traffic analysis.

 - Cutlass will offer the capability to transmit text, audio, and files,
   and be expandable to support other data types. It is focused on
   providing a clean user experience, while still offering sufficient
   security to satiate tinfoil hat-wearing users.
  
 - Cutlass will offer the ability to frustrate traffic analysis by injecting
   chaff packets, but this ability will not be enabled by default.

CUTLASS KEY EXCHANGE

The Cutlass key exchange protocol is subject to three constraints
above and beyond what existing key exchange protocols offer:

 - Cutlass must offer the ability to not answer to remote scanners that
   do not have a preshared secret
 - Cutlass must not allow network observers to distinguish between
   an initial key exchange (where both sides do not know the other's
   public keys), and subsequent key exchanges (where both sides are
   merely verifying the other's public keys).
 - Cutlass must offer the ability to exchange previously unknown public
   keys, offering usability no worse than that of ssh.

Because the first and third requirements conflict (Cutlass must both
offer a mechanism for connecting where no preshared secret exists, and
Cutlass must not identify itself to other Cutlass peers that do not 
share a secret), Cutlass offers two modes, a stealth mode and a
normal mode. 

In stealth mode, the client must prove to the server that the client
knows the server's public key before the server will respond to any
traffic. The public key is transmitted in the clear during the
protocol, but an attacker in a position to sniff will likely be
able to determine the existence of a Cutlass server anyway. The
stealth option only protects against remote scan discovery.

Since key exchange packets are large, potentially thousands of bytes,
they may be too large to fit in a single UDP packet without fragmentation.
Because not all networks deal pleasantly with fragmentation, the
application may have to fragment packets at the application layer,
so that IP fragmentation does not come into play. Cutlass Key exchange
packets have the following header:

                          1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     | Frag Number   |Total Fragments|       Kex packet ...          |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     ~                       Key Exchange Continued                  ~
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

All fragments are expected to be the same size as the first fragment, with the
exception of the last fragment. Exceptions to this rule will be quietly
dropped. The sum of all fragments cannot be larger than 65,535 bytes. Frag
Number starts counting at 1, and unfragmented packets have both Frag Number and
Total Fragments set to zero.
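
The fragmentation rules above can be sketched roughly as follows. This is
only an illustration, not a reference implementation: the function names
are made up, the 65,535 byte limit is assumed to apply to the reassembled
key exchange packet (excluding the two header bytes), and the last fragment
is assumed to be no larger than the first.

```python
from typing import List, Optional

MAX_KEX_SIZE = 65535  # limit on the reassembled key exchange packet


def fragment_kex(packet: bytes, frag_size: int) -> List[bytes]:
    """Split a key exchange packet, prepending Frag Number / Total Fragments."""
    if len(packet) > MAX_KEX_SIZE:
        raise ValueError("key exchange packet exceeds 65,535 bytes")
    if len(packet) <= frag_size:
        # Unfragmented packets carry zero in both header bytes.
        return [bytes([0, 0]) + packet]
    chunks = [packet[i:i + frag_size] for i in range(0, len(packet), frag_size)]
    if len(chunks) > 255:
        raise ValueError("too many fragments for a one-byte counter")
    # Frag Number starts counting at 1.
    return [bytes([n, len(chunks)]) + c for n, c in enumerate(chunks, start=1)]


def reassemble_kex(fragments: List[bytes]) -> Optional[bytes]:
    """Return the original packet, or None if the set should be dropped."""
    if not fragments or any(len(f) < 2 for f in fragments):
        return None
    first = fragments[0]
    if first[0] == 0 and first[1] == 0:
        return first[2:] if len(fragments) == 1 else None
    total = first[1]
    parts = {}
    for frag in fragments:
        number, count, body = frag[0], frag[1], frag[2:]
        if count != total or not 1 <= number <= total:
            return None
        parts[number] = body
    if len(parts) != total:
        return None  # still missing fragments
    size = len(parts[1])
    # Every fragment except the last must match the first fragment's size.
    if any(len(parts[n]) != size for n in range(1, total)):
        return None
    if not 0 < len(parts[total]) <= size:
        return None
    data = b"".join(parts[n] for n in range(1, total + 1))
    return data if len(data) <= MAX_KEX_SIZE else None
```

Fragments may arrive in any order; the receiver only needs the complete
set before it can check sizes and reassemble.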

The following packets illustrate a Cutlass key exchange session. Client and
Server terminology is used, although a more accurate description would be
Initiating Peer for the Client, and Receiving Peer for the Server:

C->S: VERSION, C_NONCE, HASH(C_NONCE, CPK, SPK), CPK
The client sends a nonce, its public key, and a hash of the nonce and
the server key. If the client does not know the server key, it should
replace the server key with a random string.

The VERSION field is a single byte. If the server doesn't recognize the version
given, it b0rks; otherwise it continues. The current version is 0.

S: The server must now make a decision based on whether it is in stealth mode
or not. If the server is both in stealth mode and the hash from the client does
not match the server's hash of C_NONCE and its public key, the server should
quietly drop all state regarding the connection.  Otherwise, the server will
respond with the following:

S->C: S_NONCE, C_NONCE, SPK
The server sends a new nonce, the client's nonce, and its own public key. Including
the client's nonce lets us tie sessions together.
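
As a rough sketch of the first message and the server's stealth decision:
the code below follows the HASH(C_NONCE, CPK, SPK) notation (the prose
mentions only the nonce and the server key), and assumes that HASH is
SHA-256 over a plain concatenation of the fields, that keys are opaque
byte strings, and that the random stand-in for an unknown server key is
32 bytes long. None of those details are fixed by this document, and the
function names are hypothetical.

```python
import hashlib
import hmac
import os
from typing import Optional

VERSION = 0
NONCE_LEN = 16   # 128-bit nonce
HASH_LEN = 32    # SHA-256 output


def build_hello(cpk: bytes, spk: Optional[bytes]) -> bytes:
    """Build VERSION, C_NONCE, HASH(C_NONCE, CPK, SPK), CPK."""
    c_nonce = os.urandom(NONCE_LEN)
    # If the server key is unknown, hash a random string in its place.
    key_for_hash = spk if spk is not None else os.urandom(32)
    digest = hashlib.sha256(c_nonce + cpk + key_for_hash).digest()
    return bytes([VERSION]) + c_nonce + digest + cpk


def server_should_respond(hello: bytes, spk: bytes, stealth: bool) -> bool:
    """Decide whether the server answers or quietly drops the request."""
    if len(hello) < 1 + NONCE_LEN + HASH_LEN or hello[0] != VERSION:
        return False
    c_nonce = hello[1:1 + NONCE_LEN]
    digest = hello[1 + NONCE_LEN:1 + NONCE_LEN + HASH_LEN]
    cpk = hello[1 + NONCE_LEN + HASH_LEN:]
    if not stealth:
        return True
    # In stealth mode the client must prove it knows our public key.
    expected = hashlib.sha256(c_nonce + cpk + spk).digest()
    return hmac.compare_digest(digest, expected)
```

Because a client that does not know the server key still sends a
well-formed hash, both cases look the same on the wire.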

C: If the client has a copy of the server's public key already stored, and the
key sent by the server does not match, or if no current server key is stored,
it's time to make like ssh and ask the user. Assuming they say yes or the key
matches:

C->S: DHGROUP, DH_C_LENGTH, DH_C
      SIG_LENGTH, RSA_SIGN(S_NONCE, C_NONCE, DHGROUP, DH_C)

The client sends its ephemeral DH key (along with the group it's using), and a
signature of said key, along with the nonces (to prevent replay). The length is
just a 16 bit field prepended to the front so we know how long the ephemeral key
is. Whatever is left is the signature.
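
A minimal sketch of how this message might be packed and parsed. It
follows the field list in the notation (including SIG_LENGTH), and it
assumes a one-byte DHGROUP identifier and network byte order for the
16 bit lengths; both are guesses, since neither is pinned down here.
The signature is treated as an opaque byte string.

```python
import struct
from typing import Tuple


def pack_dh_message(dhgroup: int, dh_c: bytes, signature: bytes) -> bytes:
    """Pack DHGROUP, DH_C_LENGTH, DH_C, SIG_LENGTH, signature."""
    return (bytes([dhgroup])
            + struct.pack("!H", len(dh_c)) + dh_c
            + struct.pack("!H", len(signature)) + signature)


def unpack_dh_message(data: bytes) -> Tuple[int, bytes, bytes]:
    """Parse the message, rejecting truncated or oversized input."""
    if len(data) < 3:
        raise ValueError("message too short")
    dhgroup = data[0]
    (dh_len,) = struct.unpack_from("!H", data, 1)
    pos = 3
    dh_c = data[pos:pos + dh_len]
    if len(dh_c) != dh_len:
        raise ValueError("truncated DH key")
    pos += dh_len
    if len(data) < pos + 2:
        raise ValueError("missing signature length")
    (sig_len,) = struct.unpack_from("!H", data, pos)
    pos += 2
    signature = data[pos:pos + sig_len]
    # The signature must account for exactly the rest of the message.
    if len(signature) != sig_len or pos + sig_len != len(data):
        raise ValueError("bad signature length")
    return dhgroup, dh_c, signature
```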

S: The server now branches based on authentication method. If we're using no
auth or RSA key auth, we have enough information to authenticate. If we're
password-authenticated, we have to issue a challenge. We need to work out
challenge mechanisms. Assuming we're authenticated:

S->C: DH_S_LENGTH, DH_S, SIG_LENGTH, RSA_SIGN(C_NONCE, S_NONCE, DHGROUP, DH_C, DH_S)

The server generates its ephemeral DH key (in the group the client chose), and
using the client's and its own, generates the four main symmetric crypto-keys
our Cutlass connection will use (using KDF2/SHA-256). It signs its DH parameter, along
with the group ID, the client's ephemeral key, and the nonces.
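
For illustration, here is one way the four keys could be pulled out of
KDF2/SHA-256. KDF2 itself hashes the shared secret, a four-byte counter
starting at 1, and optional extra input. Everything else in this sketch
is an assumption: the key lengths (16 byte cipher keys, 20 byte MAC keys),
the order in which the keys are cut from the output, and the use of the
two nonces as the extra input.

```python
import hashlib
from typing import Dict

CKEY_LEN = 16  # assumed AES key length
MKEY_LEN = 20  # assumed HMAC/SHA-1 key length


def kdf2_sha256(secret: bytes, length: int, other_info: bytes = b"") -> bytes:
    """KDF2: concatenate SHA-256(secret || counter || other_info) blocks."""
    out = b""
    counter = 1  # KDF2 counts from 1
    while len(out) < length:
        block = secret + counter.to_bytes(4, "big") + other_info
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:length]


def derive_session_keys(dh_shared: bytes, c_nonce: bytes,
                        s_nonce: bytes) -> Dict[str, bytes]:
    """Cut the KDF output into the four session keys (order is assumed)."""
    total = 2 * CKEY_LEN + 2 * MKEY_LEN
    material = kdf2_sha256(dh_shared, total, c_nonce + s_nonce)
    keys = {}
    pos = 0
    for name, size in (("CKEY_C2S", CKEY_LEN), ("CKEY_S2C", CKEY_LEN),
                       ("MKEY_C2S", MKEY_LEN), ("MKEY_S2C", MKEY_LEN)):
        keys[name] = material[pos:pos + size]
        pos += size
    return keys
```

Both sides run the same derivation on the shared DH secret, so no key
material ever crosses the wire.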

Notation:
C_NONCE - A nonce sent by the client (128 bits)
S_NONCE - A nonce sent by the server (128 bits)
CPK - The client's public RSA key
SPK - The server's public RSA key
KEX_K - The temporary symmetric key only used during key exchange
DHGROUP - The Diffie-Hellman group used by both sides. A widely known value
DH_C - The client's ephemeral DH key
DH_S - The server's ephemeral DH key
CKEY_S2C - The symmetric key used to encrypt server->client communications
CKEY_C2S - The symmetric key used to encrypt client->server communications
MKEY_S2C - The symmetric key used to validate server->client communications
MKEY_C2S - The symmetric key used to validate client->server communications
HASH(X) - A SHA-256 hash of the elements X contained within the parentheses
RSA_ENCRYPT(X) - The RSA encryption of X by the recipient's public key,
                 using OAEP/SHA-256
RSA_SIGN(X) - The signature of the elements X by the transmitter's private key,
              using PSS padding and SHA-256
AES_<KEY>(X) - The AES encryption of X by the symmetric key <KEY> (CTR mode)
MAC_<KEY>(X) - The HMAC/SHA-1 of X, using the symmetric key <KEY>

CUTLASS DATA TRANSFER
   
Cutlass will support the concept of group channels for both audio and
text transfer. Groups may be created by any user, and offer the
following authentication methods:
 - None
 - Passphrase
 - RSA keypair

RSA keypair and passphrase authentication mechanisms will both be based
on a challenge-response mechanism.
   
Each Cutlass packet will describe a complete data unit, and the proposed
packet will be of the following format:

 - 16 bytes of nonce material, pseudorandom.
 - A 2 byte type field, describing what packet family this belongs to.
 - A 2 byte stream ID number, assigning which stream this packet is in
   between the two systems. Each file transfer, each voice channel will
   be a separate stream.
 - A 4 byte sequence number, starting from zero, and incrementing for
   each stream.
 - A 2 byte data length, which must not be longer than the packet length,
   minus the header length and MAC length. Any data longer than this
   length is considered padding, and this feature can be used to
   frustrate traffic analysis. A zero data length implies that this
   packet is a NULL packet, and is being used as chaff.
 - A payload of variable length, no longer than a maximum IP packet length
   (minus headers and MAC).
 - A 20 byte SHA-1 based MAC.
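
The proposed packet layout can be sketched with a fixed 26 byte header
followed by payload, optional padding, and the MAC. This is a sketch
under assumptions: fields are taken to be in network byte order, the MAC
is assumed to cover everything before it, the type values are
placeholders, and the AES/CTR encryption step is left out entirely.

```python
import hashlib
import hmac
import os
import struct
from typing import Optional, Tuple

# nonce (16), type (2), stream ID (2), sequence number (4), data length (2)
HEADER = struct.Struct("!16sHHIH")
MAC_LEN = 20  # HMAC/SHA-1 output size


def build_packet(mkey: bytes, ptype: int, stream_id: int, seq: int,
                 data: bytes, pad_to: int = 0) -> bytes:
    """Assemble header + data + padding, then append the MAC."""
    nonce = os.urandom(16)
    padding = os.urandom(max(0, pad_to - len(data)))
    body = HEADER.pack(nonce, ptype, stream_id, seq, len(data)) + data + padding
    return body + hmac.new(mkey, body, hashlib.sha1).digest()


def parse_packet(mkey: bytes,
                 packet: bytes) -> Optional[Tuple[int, int, int, bytes]]:
    """Validate the MAC first; return (type, stream, seq, data) or None."""
    if len(packet) < HEADER.size + MAC_LEN:
        return None
    body, mac = packet[:-MAC_LEN], packet[-MAC_LEN:]
    expected = hmac.new(mkey, body, hashlib.sha1).digest()
    if not hmac.compare_digest(mac, expected):
        return None
    _nonce, ptype, stream_id, seq, length = HEADER.unpack_from(body)
    if length > len(body) - HEADER.size:
        return None  # data length runs past the end of the packet
    # Anything after `length` bytes of data is padding and is ignored.
    data = body[HEADER.size:HEADER.size + length]
    return ptype, stream_id, seq, data
```

A chaff packet is simply one built with empty data and some padding; the
receiver sees a zero data length and discards it.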

Upon receiving a packet, the client must validate the MAC before any
further processing.

Valid packet types are:
 - Text packet
 - Audio packet
 - File transfer packet
 - Ping packet
 - Ping reply packet
 - Acknowledgement/Control packet
   (Actually, I'm starting to think each major "protocol" needs its
    own ACK-style packet, as they have differing needs).
 - Request to Forward packet (Onion routing or NAT addressing packet).

--------------------------------------------------------------------------
   
Text packets will be based on the UTF-8 character set. Text packets will 
contain a recipient ID and text data.

-------------------------------------------------------------------------
   
Audio packets will contain a one-byte subtype field. Valid subtypes
discussed were:
 - Session info request (Requesting supported codec list).
 - Session info reply
 - Session initiation packet 
 - Session data
 - Session termination

Session data audio packets will contain recipient ID fields in the header.

----------------------------------------------------------------------------

File transfer packet headers have the following format:
A 32-bit file offset
A variable length of file data.

--------------------------------------------------------------------------


One thing we forgot about; in addition to negotiating a connection
through a relay, we need to have a way for two otherwise unconnected
nodes to be able to negotiate a point to point link between themselves
through a relay. So if we have:

 A <----> B <----> C <----> [a bunch of nodes over here]

Then A can first negotiate a session with C through B, and at some point say
"Hey, I'd like to get an actual connection with you", a request that C can
accept or refuse based on local policy (for example, if C knows and trusts
A). At that point (if C agrees) they trade IPs [*] and renegotiate a session
directly, then (probably) tear down the connection they had through B.

[*]: Hello problems with NAT! I'm open to a cleaner way to handle this. :)

We really want this because if B happens to go down, and B is the only node
that A knows, A loses its connection to everything. And B might be OK with
being an introducer, but not have the bandwidth to forward everything between A
and C.

-J

CUTLASS TRANSPORT LAYER

Cutlass makes use of a reliable transport layer for any types of
data that have to be reliably transmitted, and are potentially too
large for one packet to contain. These are the expected types of
transport-layer consumers:

  File transfer
  Text messages
  Group information
  Directory information

The transport protocol is almost completely unlike TCP, but it has
similar goals. That is, get everything to the other side quickly,
with data integrity assured, but play nicely with others when
it comes to congestion.

One notable difference between the cutlass transfer protocol 
(hereafter referred to as CTP) and TCP is that TCP is stream based,
with streams that have no defined beginning or end point. CTP
transmits messages of fixed size, and the size is stated up front.
There can be multiple messages in parallel transit, but no
resizing once the transmission starts. For the expected consumers
(files, text messages, group info, dir info), that's not a problem.

The most important concept to grasp is that CTP is focused around
the concept of "gaps." Gaps are the data that has not yet been
transmitted. When a transfer starts, say a 50,000 byte transfer,
we will have one gap, from 0-50000. As data shows up, gaps get filled
in. Sequential transfer is not required. This will make it easier
to do bittorrenty style things in the future.
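
To make the gap idea concrete, here is a small sketch of how a receiver
might track them. The class and method names are made up for
illustration, and gaps are stored as half-open (start, end) byte ranges,
which is a choice made for this sketch.

```python
from typing import List, Tuple


class GapList:
    """Tracks which byte ranges of a transfer have not arrived yet."""

    def __init__(self, total_size: int) -> None:
        # A new transfer starts as a single gap covering everything.
        self.gaps: List[Tuple[int, int]] = (
            [(0, total_size)] if total_size > 0 else [])

    def fill(self, offset: int, length: int) -> None:
        """Record that `length` bytes starting at `offset` have arrived."""
        lo, hi = offset, offset + length
        remaining = []
        for start, end in self.gaps:
            if hi <= start or lo >= end:
                remaining.append((start, end))  # no overlap with this gap
                continue
            if start < lo:
                remaining.append((start, lo))   # piece left before the data
            if hi < end:
                remaining.append((hi, end))     # piece left after the data
        self.gaps = remaining

    def complete(self) -> bool:
        return not self.gaps
```

Data can arrive in any order and may overlap earlier data; the list just
shrinks or splits until nothing is left.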

CUTLASS TRANSFER PROTOCOL PROCESS FLOW

There are two fundamental process flow types. Let's cover the
most generic case first. 

            CUT_TRANS_INIT -----> 
(optional)                 <----- CUT_TRANS_INIT_ACK
                           <----- CUT_TRANS_ACK
            CUT_TRANS_SEND ----->
                           <----- CUT_TRANS_ACK

                           <. . .>

                           <----- CUT_TRANS_ACK
         CUT_CHANNEL_CLOSE ----->

Let's break down the steps below.

CUT_TRANS_INIT ----->
The init packet lets the remote side know a transfer is being initiated.
The init packet contains the type of transfer (file, text, dir, whatever),
the total size of the buffer/file, and the size of the init_header (which
is at least 7 bytes, but can be more). Depending on the type of transfer,
there will be additional information tacked on. (A message init header,
a file init header, etc. These headers are covered in the type-specific
sections below).

(optional) <----- CUT_TRANS_INIT_ACK 
The side that we are offering the file to can either reject it
(CUT_CHANNEL_CLOSE), accept it, (CUT_TRANS_ACK), or go ask the
user and send this instead. When I say file, that's all that this
is implemented for currently. I mean seriously, are you gonna pop
up a dialog box asking the user if they'll accept the message "foo?"
The CUT_TRANS_INIT_ACK lets the sender know that they got the
CUT_TRANS_INIT packet, and please stop pestering them with retransmits
until their user makes up his mind.

<----- CUT_TRANS_ACK
Once the transfer has been accepted, either automatically or via
user interaction, a CUT_TRANS_ACK packet is sent. Remember the
gaps that I mentioned above? CUT_TRANS_ACK contains a list of
all the gaps remaining in our file. 

CUT_TRANS_SEND ----->
The CUT_TRANS_SEND contains the offset into the file/buffer that it
starts at, and then pure data goodness. The ACK and SEND go back
and forth until all gaps have been filled. Then...

<----- CUT_TRANS_ACK
The last CUT_TRANS_ACK contains zero gaps. This is a signal to
the initiating side that we have received all the data, and it's OK
to tear down the connection.

CUT_CHANNEL_CLOSE ----->
This lets the remote side know that we're cool with the connection
completing, and that we have removed all state regarding the connection.
This way, they can stop retransmitting that last CUT_TRANS_ACK.
Coincidentally, CUT_CHANNEL_CLOSE is what gets transmitted on
unsolicited packets as well, so we can safely lose all state when
we transmit this.

Now, there's a bit of logic in the interplay of ACK and SEND that
deserves a bit more attention. There's a balance that must be
struck to be both speedy and non-congesting. The rules as they are 
currently laid out are as follows:

 - An ACK will always be immediately responded to with a SEND.
 - There will be (initially) one unsolicited SEND per second.
 - Each received ACK will increase the rate of unsolicited SENDs by
   1 per second.
 - The sending side will keep track of the most recent SEND's
   offset. When sending new data, by default, we will write
   starting from the end of the last send, and keep going from there.
 - If the number of gaps in an ACK increases, we have lost a packet or
   things got out of order. In that case, we will "switch ends" on the
   send, transmitting from the back to the front, or vice versa. We
   will also cut the rate of unsolicited SENDs per second by 40% (going
   no lower than 1).
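
The pacing rules above might be tracked with something like the
following. The names are hypothetical, and one detail is assumed: an ACK
that reports more gaps only triggers the cut and the end switch, without
also getting the usual +1 increase. Sending the immediate SEND in
response to each ACK is left to the caller.

```python
from typing import Optional


class SendPacer:
    """Sender-side state for unsolicited SEND pacing and end switching."""

    def __init__(self) -> None:
        self.rate = 1.0          # unsolicited SENDs per second
        self.from_front = True   # which end we are currently sending from
        self.last_gap_count: Optional[int] = None

    def on_ack(self, gap_count: int) -> None:
        """Update rate and direction from the gap count in a received ACK."""
        lost = (self.last_gap_count is not None
                and gap_count > self.last_gap_count)
        if lost:
            # More gaps than before: switch ends and cut the rate by 40%,
            # never dropping below one unsolicited SEND per second.
            self.from_front = not self.from_front
            self.rate = max(1.0, self.rate * 0.6)
        else:
            self.rate += 1.0
        self.last_gap_count = gap_count
```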

Let's talk about the end switching. The idea is that we want to
minimize the number of retransmitted packets that are redundant.
No point in resending something the remote side has already received, 
right? TCP solves this with the selective ACK option; we do it by
end switching.

So if we lose a packet, we start sending again from the other end. 
Meanwhile, any packets that were still in transit can filter through,
and hopefully by the time we lose another packet, all the packets
that were in transit from _that_ end of the transmission will have
filtered through, and we'll have an accurate picture of what gaps
need to be filled in on that end. Let's run through an example:

EXAMPLE: 10,000 bytes, 500 byte MTU.

We view this from the sender's side.

ACK advertises gap 0-10000.
SEND sends 0-499
SEND sends 500-999 (unsolicited)
ACK advertises gap 500-10000
SEND sends 1000-1499
SEND sends 1500-1999 (unsolicited)
SEND sends 2000-2499 (unsolicited)

(Meanwhile, packet 1000-1499 is lost! The horror!)

ACK advertises gaps 1000-1499, 2000-10000

We cut our unsolicited rate, and start in from the other end.

SEND sends 9501-10000
SEND sends 9001-9500 (unsolicited)

(Packet 2000-2499 and 9501-10000 arrived in the meantime)

ACK advertises gaps 1000-1499, 2500-9500

SEND sends 8501-9000
SEND sends 8001-8500 (unsolicited)
SEND sends 7501-8000 (unsolicited)

(Packet 8501-9000 is lost)

ACK advertises gaps 1000-1499, 2500-7500, 8501-9000

We switch ends again, and start filling in from the beginning

SEND sends 1000-1499

ACK advertises gaps 2500-7500, 8501-9000

And so on, and so forth, until all gaps are filled.
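
Picking the next chunk to send, given the latest gap list and the current
end, could look roughly like this. It is a sketch with assumed names; it
uses half-open (start, end) ranges, so the numbers differ slightly from
the example above, and it assumes the sender forgets its cursor (passes
None) right after switching ends so that it starts from the outermost
gap on the new end.

```python
from typing import List, Optional, Tuple


def next_chunk(gaps: List[Tuple[int, int]], mtu: int, from_front: bool,
               cursor: Optional[int]) -> Optional[Tuple[int, int]]:
    """Return (offset, length) of the next SEND, or None if nothing is left.

    gaps:   sorted half-open ranges still advertised by the receiver
    cursor: where the previous SEND on this end stopped, or None
    """
    if from_front:
        for start, end in gaps:
            # Skip data we already sent past on this end.
            begin = start if cursor is None else max(start, cursor)
            if begin < end:
                return begin, min(mtu, end - begin)
    else:
        for start, end in reversed(gaps):
            stop = end if cursor is None else min(end, cursor)
            if start < stop:
                length = min(mtu, stop - start)
                return stop - length, length
    return None
```

When sending from the front, the new cursor is offset + length; from the
back, it is offset. If this returns None while gaps remain, everything on
this end is already in flight and the cursor can be reset.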

Oh yeah, I mentioned another process flow way back when. There are two.
The other process flow is for buffers that can be short, and thus
contained in one packet. It looks like the following:

            CUT_TRANS_INIT ----->
                           <----- CUT_TRANS_ACK
         CUT_CHANNEL_CLOSE ----->

In this case, the CUT_TRANS_INIT packet carries the data with it,
the CUT_TRANS_ACK packet advertises zero gaps, and CUT_CHANNEL_CLOSE
closes the connection. This means that for short messages and the like
we can complete message transfer in only 3 packets, giving somewhat
zippy response time.