This document is intended for those who wish to read the ssh source
code. It tries to give an overview of the structure of the code.

Copyright (c) 1995 Tatu Ylonen
Updated 17 Nov 1995.

The software consists of ssh (client), sshd (server), scp, sdist, and
the auxiliary programs ssh-keygen, ssh-agent, ssh-add, and
make-ssh-known-hosts. The main program for each of these is in a .c
file with the same name.

There are some subsystems/abstractions that are used by a number of
these programs.

  Configuration

    - Ssh configures itself with "./configure" in the source directory.
      The configuration system is based on GNU Autoconf. The
      "configure" script is generated from configure.in using the
      autoconf tool (2.4 required, plus a patch for AC_ARG_PROGRAM).
      Additionally, the file config.h.in is generated from configure.in
      and acconfig.h with the autoheader tool (also part of GNU
      Autoconf). The configure process generates Makefile from
      Makefile.in, the file config.h, and a few other files from their
      respective .in files. The configure script uses the files
      config.guess (to determine the current host type) and config.sub
      (to validate a given host type). It may also arrange to use the
      install.sh script from the Makefile when installing programs.

  Makefile

    The Makefile (generated from Makefile.in with configure) contains
    the following major targets:

      all:        (default) compile all executables.
      install:    installs executables and manual pages; installs
                  configuration files and generates host key if they
                  don't exist.
      uninstall:  removes installed executables and manual pages.
      clean:      removes object files, executables, and some temporary
                  files.
      distclean:  removes all generated files ("distribution state").
      depend:     updates dependencies.
      RFC:        run RFC.nroff through nroff to generate a new version
                  of the RFC.
      dist:       generate a distribution ".tar.gz" file.

  Buffer manipulation routines

    - These provide an arbitrary-size buffer, where data can be
      appended. Data can be consumed from either end.
      The code is used heavily throughout ssh. The basic buffer
      manipulation functions are in buffer.c (header buffer.h), and
      additional code to manipulate specific data types is in bufaux.c.

  Compression Library

    - Ssh uses the GNU GZIP compression library (ZLIB). It resides in
      the zlib095 subdirectory.

  Encryption/Decryption

    - Ssh contains several encryption algorithms. These are all
      accessed through the cipher.h interface. The interface code is
      in cipher.c, and the implementations are in des.c, idea.c, tss.c,
      md5.c, and rc4.c.

  Multiple Precision Integer Library

    - Ssh uses the GNU Multiple Precision Library (gmp). The code is
      in the gmp-1.3.2 subdirectory.

    - Some auxiliary functions for mp-int manipulation are in mpaux.c.

  Random Numbers

    - The random numbers for cryptographic use are generated by a
      generator with a 1024-byte pool that uses MD5 to stir the pool.
      The generator acquires a small amount of new entropy every time
      it stirs the pool.

    - The distribution also contains the file random.c, which offers a
      substitute for the BSD random() function. This function is
      referenced only from the gmp library, and there only by the gmp
      primality checking functions (ssh uses the Fermat test in
      addition to the gmp primality test).

  RSA key generation, encryption, decryption

    - Ssh contains its own RSA routines. It can, however, also be
      compiled to use RSAREF. The interface to RSA
      encryption/decryption is in rsaglue.c; it calls either RSAREF
      (which must be in the rsaref2 subdirectory if it is used; it
      does not come with ssh) or, normally, the functions in rsa.c.
      The file rsa.c also contains prime generation code and RSA key
      generation.

  RSA key files

    - RSA keys are stored in files with a special format. The code to
      read/write these files is in authfile.c. The files are normally
      encrypted with a passphrase. The functions to read passphrases
      are in readpass.c (the same code is used to read passwords).
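
The Fermat test mentioned above in connection with primality checking
can be illustrated with a small sketch. This is not the code from rsa.c:
the real routines operate on gmp multiple-precision integers, while the
sketch below assumes 32-bit values, a fixed base of 2, and hypothetical
function names (mulmod, powmod, fermat_base2) chosen only for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: (a * b) mod m, using a 64-bit intermediate so
   the product of two 32-bit values cannot overflow. */
uint32_t mulmod(uint32_t a, uint32_t b, uint32_t m)
{
    assert(m != 0);
    return (uint32_t)(((uint64_t)a * b) % m);
}

/* Hypothetical helper: base^exp mod m by square-and-multiply. */
uint32_t powmod(uint32_t base, uint32_t exp, uint32_t m)
{
    uint32_t result = 1 % m;

    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = mulmod(result, base, m);
        base = mulmod(base, base, m);
        exp >>= 1;
    }
    return result;
}

/* Fermat test to base 2 (the base is an assumption of this sketch).
   Returns 0 if n is certainly composite, 1 if n passes the test. */
int fermat_base2(uint32_t n)
{
    if (n < 2)
        return 0;
    if (n == 2)
        return 1;
    if ((n & 1) == 0)
        return 0;
    return powmod(2, n - 1, n) == 1;
}
```

Passing the test means only "probably prime": fermat_base2(341) returns
1 although 341 = 11 * 31 is composite, which is one reason a Fermat test
is typically combined with another primality test.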

  Binary packet protocol

    - The ssh binary packet protocol is implemented in packet.c. The
      code in packet.c does not concern itself with packet types or
      their execution; it contains code to build packets, to receive
      them and extract data from them, and the code to compress and/or
      encrypt packets. CRC code comes from crc32.c.

    - The code in packet.c calls the buffer manipulation routines
      (buffer.c, bufaux.c), compression routines (compress.c, zlib),
      and the encryption routines.

  X11, TCP/IP, and Agent forwarding

    - Code for various types of channel forwarding is in channels.c.
      The file defines a generic framework for arbitrary communication
      channels inside the secure channel, and uses this framework to
      implement X11 forwarding, TCP/IP forwarding, and authentication
      agent forwarding.

  Authentication agent

    - Code to communicate with the authentication agent is in authfd.c.
      The files gen-minfd.c, minfd.h, minfd.c

  Authentication methods

    - Code for various authentication methods resides in auth-*.c
      (auth-passwd.c, auth-rh-rsa.c, auth-rhosts.c, auth-rsa.c). This
      code is linked into the server. The routines also manipulate
      known hosts files using code in hostfile.c. Code in canohost.c
      is used to retrieve the canonical host name of the remote host.
      Code in match.c is used to match host names. Code for osf C2
      extended security is in osfc2.c.

    - In the client end, authentication code is in sshconnect.c. It
      reads passwords/passphrases using code in readpass.c. It reads
      RSA key files with authfile.c. It communicates with the
      authentication agent using authfd.c.

  The ssh client

    - The client main program is in ssh.c. It first parses arguments
      and reads configuration (readconf.c), then calls ssh_connect (in
      sshconnect.c) to open a connection to the server (possibly via a
      proxy), and performs authentication (ssh_login in sshconnect.c).
      It then makes any pty, forwarding, etc. requests. It may call
      code in ttymodes.c to encode current tty modes. Finally it calls
      client_loop in clientloop.c.
      This does the real work for the session.

    - The client is suid root. It tries to temporarily give up these
      rights while reading the configuration data. The root privileges
      are only used to make the connection (from a privileged socket).
      Any extra privileges are dropped before calling ssh_login.

  Pseudo-tty manipulation and tty modes

    - Code to allocate and use a pseudo tty is in pty.c. Code to
      encode and set terminal modes is in ttymodes.c.

  Logging in (updating utmp, lastlog, etc.)

    - The code for things that are done when a user logs in is in
      login.c. This includes things such as updating the utmp, wtmp,
      and lastlog files. Some of the code is in sshd.c.

  Writing to the system log and terminal

    - The programs use the functions fatal(), log(), debug(), error()
      in many places to write messages to the system log or the user's
      terminal. The implementation that logs to the system log is in
      log-server.c; it is used in the server program. The other
      programs use an implementation that sends output to stderr; it
      is in log-client.c. The definitions are in ssh.h.

  The sshd server (daemon)

    - The sshd daemon starts by processing arguments and reading the
      configuration file (servconf.c). It then reads the host key,
      starts listening for connections, and generates the server key.
      The server key will be regenerated every hour by an alarm.

    - When the server receives a connection, it forks, disables the
      regeneration alarm, and starts communicating with the client.
      They first perform identification string exchange, then
      negotiate encryption, then perform authentication and
      preparatory operations, and finally the server enters the normal
      session mode by calling server_loop in serverloop.c. This does
      the real work, calling functions in other modules.

    - The code for the server is in sshd.c.
      It contains a lot of stuff, including:
        - server main program
        - waiting for connections
        - processing new connection
        - authentication
        - preparatory operations
        - building up the execution environment for the user program
        - starting the user program.

  Auxiliary files

    - There are several other files in the distribution that contain
      various auxiliary routines:
        ssh.h         the main header file for ssh (various definitions)
        getput.h      byte-order independent storage of integers
        includes.h    includes most system headers. Lots of #ifdefs.
        tildexpand.c  expand tilde in file names
        uidswap.c     uid-swapping
        xmalloc.c     "safe" malloc routines

  Substitution files for missing functions

    - To ease porting, the distribution contains alternative versions
      of some functions for those platforms that don't have them.
      These are added to the CONFOBJS target by configure where needed.
        crypt.c       password encryption crypt() (Convex needs this)
        memmove.c     memmove() function
        putenv.c      putenv() function
        random.c      random() function (see discussion above)
        remove.c      remove() function
        socketpair.c  socketpair() function (kludge; SCO needs it)
        strerror.c    strerror() function
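
The "byte-order independent storage of integers" listed for getput.h
under Auxiliary files can be illustrated as follows. This is a minimal
sketch, not the contents of getput.h: the function names
(sketch_put_u32, sketch_get_u32) are hypothetical, and storing the most
significant byte first is an assumption of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store v into p[0..3], most significant byte
   first, using shifts so the result does not depend on the byte
   order of the host. */
void sketch_put_u32(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Hypothetical helper: read back a value stored by sketch_put_u32. */
uint32_t sketch_get_u32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Because each byte is extracted or combined with explicit shifts rather
than by copying the in-memory representation of the integer, the same
byte sequence results on big-endian and little-endian machines.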