INTERNALS INTRO

The Cutlass internal architecture doc, the code logic, the guts.

This doc is intended to be used to document what the internals
of libcutlass look like, for the purpose of reminding the original 
authors if they forget what the internals look like, for facilitating
communication between said authors, and to get outsiders who may wish 
to hack up to speed quickly. Reading this before hacking should help 
greatly.

If you are programming a client for cutlass, rather than working
on the underlying library functionality, please read api.txt
instead. (Although it won't hurt to read this, it is not
necessary, and the code details here should be hidden from
you).

LOCK RULES
The following rules apply only if you are hacking on libcutlass
directly. If you're hacking a client for Cutlass, please do not
be locking ANY locks directly. Use the API.

Handle or hashtable locks must be locked BEFORE connection locks. If you have
a connection lock, you must drop it before locking the handle.

There are 4 mutexes contained within the cutlass_t handle. These are:
handle_mutex - controls handle settings
htab_mutex - controls the connection hash table entries
del_stack_mutex - Controls access to the stack of connections
		  "To be deleted"
fd_set_mutex - Controls access to the set of select()ed across
	       file descriptors

As a matter of fact, if you have any locks at all, you must drop
them before locking handle or hashtable  locks. Priority goes:

Handle > Hashtable > Connection > FD_SET
                                > del_stack

You make "lock down" this chain, but you should never attempt to
"lock up," i.e., obtain a connection lock, and then attempt to
obtain a hashtable lock. Deadlocks may result.

DELETION RULES

If you are deleting a connection, you must pass it to the housekeeping
thread for deletion. Call cutlass_connection_dispose(). Do NOT simply
call free()!

PROCESS FLOW

Cutlass currently is architected around a three-thread model. These
threads separate after some initialization logic, and sub-threads
may spawn, but at any given time, three threads will likely be
blocking.

One thread is primarily responsible for user input, and waits for
user input events. This is the "main thread," and is the central
execution flow. This thread is the only thread that can terminate the
program. This is the thread that API users will write.

One thread is a housekeeping thread, and will periodically sleep, then
iterate through all connections, sending outstanding traffic as needed,
then sleep again.

The last thread is the listening thread, and it spends all of its
idle time in a select() call, listening on all the sockets we have
open.

Subthreads may spawn off in the case of action handlers, because
action handlers may block on user input, and we don't want to
block our listening thread on that input.

All three threads share a central data structure, called a cutlass_t
data structure. This handle stores the hash table of all connections,
information about this instance of cutlass, the set of file descriptors,
and a series of message stacks, that the three threads can leave
messages for each other on. The cutlass_t also contains mutexes to
protect the message queues from concurrent thread access.

Each thread "owns" a certain area of the central structure. The
housekeeping thread, for example, owns the hash table of connections,
and only it is responsible for adding and deleting connections to
this table. Other threads may initialize connection structures, and
hand them to the housekeeping thread to add to the hash table, but
the other threads should not edit the hash table directly.

The reason for this message passing is to prevent deadlock conditions.
I originally wrote cutlass to have each thread lock on the hashtable, 
and lock on each connection, and a lock on the file descriptor sets, 
but I kept running into deadlock conditions. Specifically, adding to 
the file descriptor set while the listener was going to be stuck in 
select() most of the time was a real bear. So now, the listener is 
responsible for adding a file descriptor to the select() list, and the 
other threads simple stack up the file descriptors for it. Keeps lock 
contention minimal.

Cutlass uses the following hash tables:

Connections - Keyed off of remote public key, and also potentially keyed
off of IP address/port/protocol group if we have a direct connection. 
Keys of the form <remoteIP>:<localIP>:<remotePort>:<localPort>:<protocol> 
All values in network byte order. Each connection value is a structure 
pointer, with the structure containing the following:

 - A linked list of group structures that the remote client belongs to.
 - The crypto key material structure, containing both public and session keys
 - The UTF-8 "Name" associated with the remote client.
 - A linked list of file structures containing info about in-transit
   files. We will need to hold some state about what fragments we're
   currently missing.
 - A counter for the number of in-transit files on this connection. (Not
   to exceed some sane value for number of simultaneous file transfers).
 - A linked list of message structures containing info about message
   fragments (for long messages).
 - A counter for the number of fragmented messages currently in
   transit, not to exceed some sane value.
 - A permissions/capabilities structure.
 - A struct timeval with last activity on this connection
 - A struct timeval with last ping sent on this connection.
 - The socket we're listening on, if a directly connected connection.
 - The remote struct sockaddr_in and socklen_t, if we did not initiate
   the connection.
 - Current routing info for non-directly connected connections (This
   is fuzzy, needs detail).

Groups - a hashtable of all groups our client is currently part of. Keys
are the groupID, a (128 bit?) value. Group values are structure pointers,
with the structure containing the following:

 - A linked list of connections belonging to the group.
 - the UTF-8 "Name" of the group.

-----------------------------------------------------------------------

The following are structures containing the layout of needed linked list nodes
(all nodes have a *next pointer). Linked lists should be OK, because each
connection can only have a finite, and small, number of any of these at any
given time:

Files - We will require a structure with information for each file
currently in transit. The entry will contain:

 - The file descriptor for the local file
 - The SHA-1 hash for the expected file
 - The total expected file length
 - The list of gaps currently in the file. This will likely be a sorted
   red-black tree containing start and end gaps eventually, but right now
   is a doubly linked list. <Mumble, grumble>

Message fragments - Messages are limited to some sane value in length,
any longer and you need to send a file. 64K, perhaps?

------------------------------------------------------------------------

Structures we need:
 - Connection structure
 - Group structure
 - File structure
 - Fragmentary message structure
 - Cutlass packet message structure(s)
 - Capabilities/permissions structure

-----------------------------------------------------------------------

Process flow:
Code execution can get kicked off by one of three things. These things are:
 - User event (button click, keypress, sound input).
 - Network event (socket read)
 - Housekeeping thread (periodic timed thread).

We're multithreaded, gotta worry about mutexen. 

Threads each have ownership over different data structures. The housekeeping
thread owns the connection hash table, the network event thread owns the
set of open file descriptors. In order to keep each other in synch,
the

Housekeeping will do the following:
 - Check all connections at regular intervals, if it's been quiescent 
   for > <timeval foo> and last ping sent was > <timeval bar> send a ping.
   If it's been quiescent since > <timeval baz>, initiate connection
   shutdown/cleanup.
 - Check all connection in-transit message frags and files in transit
   to ensure they're not stuck. Clean up as appropriate.
 - Sleep for some sane value.

Network event thread will select() across the plethora of open sockets we have,
and when that select() comes back, it'll read the packet into a buffer. Once we
do that, retrieve the corresponding connection structure, overlay a Cutlass
packet structure on it, and first thing, check the MAC.  If the MAC is good,
decrypt, and then check the packet type. Take action based on the packet type
(Each one of these should be a separate subroutine, possibly calling
architecture-specific modules):
 - Text packet - If complete message, display and ACK. Otherwise, place 
   into fragmented message list for connection and ACK.
 - Audio packet - Play and ACK, or if out of sequence, discard and ACK.
   We may want to only ACK if there's been a sane interval since the last 
   ACK, letting the remote connection know how successful we're being at
   receiving their audio, i.e. so many lost packets out of so many seq.
 - File transfer packet - Write and update file structure, and ACK. 
   We may want to wait on the ACKs to so many per timeframe, so we're not
   overly chatty. If that's the case, housekeeping may need to ACK for
   us occasionally.
 - Ping packet. PONG.
 - Pong packet. Update activity counter in connection structure. (Actually,
   all of these packets should update activity counters). 
 - ACK packet - Update in-transit values. We may need to make additional
   changes, such as switching to a lower bandwidth codec if the loss rate
   is too high, cutting back the "window size", cutting back the send
   rate, or switching relays. If the "window" is clear, we may want to
   send more data on a file transfer.
 - Request to Forward - Either forward or don't, depending on permissions.
   ACK failure to forward.

User input can trigger a whole slew of things. As a matter of fact, I'm
going to punt on this until I write the "user experience" doc. 

------------------------------------------------------------------------

How connections are born, and how we keep track of them:

Unfortunately, UDP doesn't have quite the simple socket structure
that TCP has. You can bind() to a port, but there is no equivalent
of accept() that allows you to "lock in" a 
localaddr:localport/remoteaddr:remoteport pairing to a single file
descriptor.

So, in order to keep process flows sane, we have two options:

1: You can connect() to a remote UDP port. This is great, as you
have a 1->1 mapping with sockets and connections, and the remote
side port is known. You don't know your own ephemeral port, but you
don't have to, as return traffic automatically comes to you. In
addition, ICMP_PORT_UNREACHABLE error messages get returned to you.

2: You can bind() to a local port, and when inbound traffic comes
in, recvfrom() the traffic and SAVE THE REMOTE SOCKADDR_IN! After
you've processed the proper response, sendto() the saved sockaddr_in.
This means there is only one socket for many connections, and
in addition, our process can't get errors returned via ICMP.

Luckily, the one socket->many connection mappings isn't as much
of a problem as it might be, as our hashtable structure is exactly
what we need in order to cleanly resolve addressing. The ICMP
problem is a bit more troubling.

So, I propose that every time we get an inbound connection, we
try to back-connect() to get an outbound socket to the remote host.
This enables us to know more quickly if a host has rebooted or 
whatnot on us. This won't work through NAT, however. In that case,
we're going to just have to issue response packets back through the
one great all-listening socket, with the sendto() remote address
structure filled in from the hashtable.

We're also going to need to more aggressively keep track of
retransmission states through some kind of loop. We can't afford
to block on key exchange with one guy forever, and the packet that
comes back to us through the one great socket might not even be
from that guy, it may be someone else. So unfortunately we can't
just fork off a thread for each key-echange, and have it set a variable
when it's done. It's parallel processing at its bloodiest!

Next function up: looping through all existing connections.