Annotation of src/usr.bin/rsync/rsync.5, Revision 1.6
1.6 ! florian 1: .\" $OpenBSD: rsync.5,v 1.5 2019/02/12 19:13:03 benno Exp $
1.1 benno 2: .\"
3: .\" Copyright (c) 2019 Kristaps Dzonsons <kristaps@bsd.lv>
4: .\"
5: .\" Permission to use, copy, modify, and distribute this software for any
6: .\" purpose with or without fee is hereby granted, provided that the above
7: .\" copyright notice and this permission notice appear in all copies.
8: .\"
9: .\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
10: .\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
11: .\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
12: .\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
13: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
14: .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
15: .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
16: .\"
1.4 benno 17: .Dd $Mdocdate: February 12 2019 $
1.1 benno 18: .Dt RSYNC 5
19: .Os
20: .Sh NAME
21: .Nm rsync
22: .Nd rsync wire protocol
23: .Sh DESCRIPTION
24: The
25: .Nm
26: protocol described in this document relates to the BSD-licensed
27: .Xr openrsync 1 ,
28: a re-implementation of the GPL-licensed reference utility
29: .Xr rsync 1 .
30: It is compatible with version 27 of the reference.
31: .Pp
32: In this document, the
33: .Qq client process
34: refers to the utility as run on the operator's local computer.
35: The
36: .Qq server process
37: is run on either the local or a remote computer, depending upon the
38: file locations given on the command line.
39: .Pp
40: There are a number of options in the protocol that are dictated by command-line
41: flags.
42: These will be noted as
43: .Fl n
44: for dry-run,
1.3 benno 45: .Fl g
46: for group ids,
1.1 benno 47: .Fl l
48: for links,
1.6 ! florian 49: .Fl o
! 50: for user ids,
1.1 benno 51: .Fl r
52: for recursion,
53: .Fl v
54: for verbose, and
55: .Fl -delete
56: for deletion (before).
57: .Ss Data types
58: The binary protocol encodes all data in little-endian format.
59: Integers are signed 32-bit, shorts are signed 16-bit, bytes are unsigned
60: 8-bit.
61: A long is variable-length.
62: For values less than the maximum integer, the value is transmitted and
63: read as a 32-bit integer.
64: For values greater, the value is transmitted first as a maximum integer,
65: then a 64-bit signed integer.
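The variable-length long described above can be sketched as follows. This is a minimal illustration, not the openrsync code: the names encode_long and decode_long are hypothetical, and sending INT32_MAX itself and negative values in the long form is an assumption, since the text only covers values below and above the maximum.

```python
import struct

INT32_MAX = 0x7FFFFFFF  # the "maximum integer" used as the long-form marker


def encode_long(value):
    """Encode a long: plain 32-bit form below INT32_MAX, else marker + 64-bit."""
    if 0 <= value < INT32_MAX:
        return struct.pack("<i", value)
    # Assumption: INT32_MAX itself and negative values take the long form.
    return struct.pack("<i", INT32_MAX) + struct.pack("<q", value)


def decode_long(buf, offset=0):
    """Return (value, next_offset) for a long starting at offset in buf."""
    (first,) = struct.unpack_from("<i", buf, offset)
    if first != INT32_MAX:
        return first, offset + 4
    (value,) = struct.unpack_from("<q", buf, offset + 4)
    return value, offset + 12
```

A small value therefore costs 4 bytes on the wire and a large one 12.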
66: .Pp
67: There are three types of checksums: long (slow), short (fast), and
68: whole-file.
69: The fast checksum is a derivative of Adler-32.
70: The slow checksum is MD4,
71: made over the checksum seed first (serialised in little-endian format),
72: then the data.
73: The whole-file applies MD4 to the file first, then the checksum seed at
74: the end (also serialised in little-endian format).
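The difference between the slow and whole-file checksums is only the order in which the seed and the data are hashed. The sketch below shows that ordering; the function names and the new_hash parameter are hypothetical, and the md4 helper only works if the local OpenSSL build still provides MD4 to hashlib, which is not guaranteed.

```python
import hashlib
import struct


def slow_checksum(seed, block, new_hash):
    """Long (slow) checksum: hash the seed first, then the block data."""
    h = new_hash()
    h.update(struct.pack("<i", seed))  # seed serialised little-endian
    h.update(block)
    return h.digest()


def whole_file_checksum(seed, data, new_hash):
    """Whole-file checksum: hash the file data first, then the seed."""
    h = new_hash()
    h.update(data)
    h.update(struct.pack("<i", seed))  # seed appended at the end
    return h.digest()


def md4():
    # Raises ValueError when hashlib's OpenSSL backend lacks MD4.
    return hashlib.new("md4")
```

Where MD4 is available, pass md4 as new_hash; any other hashlib constructor can stand in to check the ordering.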
75: .Ss Multiplexing
76: Most
77: .Nm
78: transmissions are wrapped in a multiplexing envelope protocol.
79: It is composed as follows:
80: .Pp
81: .Bl -enum -compact
82: .It
83: envelope header (4 bytes)
84: .It
85: envelope payload (arbitrary length)
86: .El
87: .Pp
88: The first byte of the envelope header consists of a tag.
89: If the tag is 7, the payload is normal data.
90: Otherwise, the payload is out-of-band server messages.
91: If the tag is 1, it is an error on the sender's part and must trigger an
92: exit.
93: The rest of the header holds the payload length, limited to 24 bits,
94: .Li 0x00ffffff .
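A sketch of packing and unpacking the envelope header follows. It assumes the four header bytes are read as one little-endian 32-bit integer whose top eight bits are the tag and whose low 24 bits are the payload length; that layout and the names parse_header and make_header are assumptions for illustration, not taken from the text.

```python
import struct

TAG_DATA = 7    # normal payload
TAG_ERROR = 1   # sender-side error, must trigger an exit
MAX_PAYLOAD = (1 << 24) - 1  # largest length that fits in 24 bits


def parse_header(header):
    """Split a 4-byte envelope header into (tag, payload_length)."""
    (word,) = struct.unpack("<I", header)
    return word >> 24, word & MAX_PAYLOAD


def make_header(tag, length):
    """Build a 4-byte envelope header, rejecting oversized payloads."""
    if not 0 <= length <= MAX_PAYLOAD:
        raise ValueError("payload too large for one envelope")
    return struct.pack("<I", (tag << 24) | length)
```

A payload longer than the 24-bit limit would have to be split across several envelopes.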
95: .Pp
96: The only data not using this envelope are the initial handshake between
97: client and server.
98: .Ss File list
99: A central part of the protocol is the file list, which is generated by
100: the sender.
101: It consists of all files that must be sent to the receiver, either
102: explicitly as given or recursively generated.
103: .Pp
104: The file list itself consists of filenames and attributes (mode, time,
105: size, etc.).
106: Filenames must be relative to the destination root and not be absolute
107: or contain backtracking.
108: So if a file is given to the sender as
109: .Pa ../../foo/bar ,
110: it must be sent as
111: .Pa foo/bar .
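The example above can be reproduced with a small helper. This is only a sketch: clean_path is a hypothetical name, and dropping every ".." component (including ones in the middle of a path) is an assumption that goes beyond the single case given in the text.

```python
def clean_path(path):
    """Drop empty, '.', and '..' components so the result stays relative."""
    parts = [p for p in path.split("/") if p not in ("", ".", "..")]
    return "/".join(parts)
```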
112: .Pp
113: The file list should be cleaned of inappropriate files prior to sending.
114: For example, if
115: .Fl l
116: is not specified, symbolic links may be omitted.
117: Directory entries without
118: .Fl r
119: may also be omitted.
120: Duplicates may be omitted.
121: .Pp
122: The receiver
123: .Em must not
124: assume that the file list is clean.
125: It should not omit inappropriate files from the file list (which would
126: affect the indexing), but may omit them during processing.
127: .Pp
128: Prior to being sent from sender to receiver, and upon being received, the
129: file list must be lexicographically sorted such as with
130: .Xr strcmp 3 .
131: Subsequent references to the file are by index in the sorted list.
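Because both sides sort the same way, an index means the same file on each end. A minimal sketch, assuming filenames are held as byte strings (Python compares bytes by unsigned byte value, which matches strcmp ordering for names without embedded NULs); sort_file_list is a hypothetical name.

```python
def sort_file_list(names):
    """Sort byte-string filenames in strcmp(3) order."""
    return sorted(names)


names = [b"foo/bar", b"Foo", b"foo", b"foo-x"]
# Index into the sorted list, as later references to a file would use it.
index_to_name = dict(enumerate(sort_file_list(names)))
```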
132: .Ss Client process
133: The client can operate in sender or receiver mode depending upon the
134: command-line source and destination.
135: .Pp
136: If the destination directory (sink) is remote, the client is in sender
137: mode: the client will push its data to the server.
138: If the source file is remote, it is in receiver mode: the server pushes
139: to the client.
140: If neither is remote, the client operates in sender mode.
141: These are all mutually exclusive.
142: .Pp
143: When the client starts, regardless of its mode, it first handshakes with the
144: server.
145: This exchange is
146: .Em not
147: multiplexed.
148: .Pp
149: .Bl -enum -compact
150: .It
151: send local version (integer)
152: .It
153: receive remote version (integer)
154: .It
155: receive random seed (integer)
156: .El
157: .Pp
158: Following this, the client multiplexes when reading from the server.
159: Transmissions sent from client to server are not multiplexed.
160: It then enters the
161: .Sx Update exchange
162: protocol.
163: .Ss Server process
164: The server can operate in sender or receiver mode depending upon how the
165: client starts the server.
166: This may be directly from the parent process (when invoked for local
167: files) or indirectly via a remote shell.
168: .Pp
169: When in sender mode, the server pushes data to the client.
170: (This is equivalent to receiver mode for the client.)
171: In receiver mode, the opposite is true.
172: .Pp
173: When the server starts, regardless of its mode, it first handshakes with the
174: client.
175: This exchange is
176: .Em not
177: multiplexed.
178: .Pp
179: .Bl -enum -compact
180: .It
181: send local version (integer)
182: .It
183: receive remote version (integer)
184: .It
185: send random seed (integer)
186: .El
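The client and server handshakes mirror each other: both send their version and read the peer's, and only the direction of the seed differs. The sketch below shows both sides over a connected socket; the helper names are hypothetical, and using 27 as the local version is an assumption based on the compatibility statement in the DESCRIPTION.

```python
import socket
import struct

PROTOCOL_VERSION = 27  # assumed local version


def recv_int(sock):
    """Read exactly one little-endian signed 32-bit integer."""
    buf = b""
    while len(buf) < 4:
        chunk = sock.recv(4 - len(buf))
        if not chunk:
            raise EOFError("peer closed during handshake")
        buf += chunk
    return struct.unpack("<i", buf)[0]


def send_int(sock, value):
    """Write one little-endian signed 32-bit integer, unmultiplexed."""
    sock.sendall(struct.pack("<i", value))


def client_handshake(sock):
    """Send local version, then read remote version and seed."""
    send_int(sock, PROTOCOL_VERSION)
    remote_version = recv_int(sock)
    seed = recv_int(sock)
    return remote_version, seed


def server_handshake(sock, seed):
    """Send local version, read remote version, then send the seed."""
    send_int(sock, PROTOCOL_VERSION)
    remote_version = recv_int(sock)
    send_int(sock, seed)
    return remote_version
```

Neither function wraps its traffic in the multiplexing envelope, matching the note that the handshake is not multiplexed.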
187: .Pp
188: Following this, the server multiplexes when writing to the client.
189: (Transmissions received from the client are not multiplexed.)
190: It then enters the
191: .Sx Update exchange
192: protocol.
193: .Ss Update exchange
194: When the client or server is in sender mode, it begins by conditionally
195: sending the exclusion list.
196: At this time, this is always empty.
197: .Pp
198: .Bl -enum -compact
199: .It
200: if
201: .Fl -delete
202: and the sender is the client, an exclusion list count of zero (integer)
203: .El
204: .Pp
205: It then sends the
206: .Sx File list .
207: Prior to being sent, the file list should be lexicographically sorted.
208: .Pp
209: .Bl -enum -compact
210: .It
211: status byte (byte)
212: .It
213: inherited filename length (optional, byte)
214: .It
215: filename length (integer or byte)
216: .It
217: file (byte array)
218: .It
219: file length (long)
220: .It
221: file modification time (optional, time_t, integer)
222: .It
223: file mode (optional, mode_t, integer)
224: .It
1.3 benno 225: if
1.6 ! florian 226: .Fl o ,
! 227: the user id (integer)
! 228: .It
! 229: if
1.3 benno 230: .Fl g ,
231: the group id (integer)
232: .It
1.1 benno 233: if a symbolic link and
234: .Fl l ,
235: the link target's length (integer)
236: .It
237: if a symbolic link and
238: .Fl l ,
239: the link target (byte array)
240: .El
241: .Pp
242: The status byte may consist of the following bits and determines which
243: of the optional fields are transmitted.
244: .Pp
245: .Bl -tag -compact -width Ds
246: .It 0x02
247: Do not send the file mode: it is a repeat of the last file's mode.
1.6 ! florian 248: .It 0x08
! 249: Like
! 250: .Li 0x02 ,
! 251: but for the user id.
1.3 benno 252: .It 0x10
253: Like
254: .Li 0x02 ,
255: but for the group id.
1.1 benno 256: .It 0x20
257: Inherit some of the prior file name.
258: Enables the inherited filename length transmission.
259: .It 0x40
260: Use full integer length for file name.
261: Otherwise, use only the byte length.
262: .It 0x80
263: Do not send the file modification time: it is a repeat of the last
264: file's.
265: .El
266: .Pp
267: If the status byte is zero, the file-list has terminated.
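The status bits can be read as a small decision table for which optional fields follow. In this sketch the constant and key names are hypothetical; whether the user and group ids are sent at all also depends on -o and -g, which the function does not model.

```python
FLAG_SAME_MODE = 0x02
FLAG_SAME_UID = 0x08
FLAG_SAME_GID = 0x10
FLAG_INHERIT_NAME = 0x20
FLAG_LONG_NAME = 0x40
FLAG_SAME_TIME = 0x80


def describe_status(status):
    """Map a status byte to the fields that follow, or None at end of list."""
    if status == 0:
        return None  # zero terminates the file list
    return {
        "inherited_length": bool(status & FLAG_INHERIT_NAME),
        "name_length_is_integer": bool(status & FLAG_LONG_NAME),
        "mtime": not status & FLAG_SAME_TIME,
        "mode": not status & FLAG_SAME_MODE,
        "uid": not status & FLAG_SAME_UID,
        "gid": not status & FLAG_SAME_GID,
    }
```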
1.6 ! florian 268: .Pp
1.4 benno 269: If
1.6 ! florian 270: .Fl o
! 271: has been specified, the sender sends the list of all users encountered
1.5 benno 272: in the file list.
1.6 ! florian 273: Identifier zero
! 274: .Pq Qq root
! 275: is never transmitted, as it would prematurely end the list.
1.4 benno 276: .Pp
277: .Bl -enum -compact
278: .It
1.6 ! florian 279: user identifier or zero to indicate end of set (integer)
1.4 benno 280: .It
1.6 ! florian 281: non-zero length of user name (byte)
1.4 benno 282: .It
1.6 ! florian 283: user name (prior length)
1.4 benno 284: .El
1.6 ! florian 285: .Pp
! 286: The same sequence is then sent for groups if
! 287: .Fl g
! 288: has been specified.
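Reading one such identifier list might look like the sketch below. parse_id_list is a hypothetical name, the input is assumed to be already demultiplexed into a byte buffer, and names are kept as raw bytes because the text does not specify an encoding.

```python
import struct


def parse_id_list(buf, offset=0):
    """Parse (id, name) records up to the zero terminator.

    Returns (mapping of id to name bytes, next_offset).
    """
    names = {}
    while True:
        (ident,) = struct.unpack_from("<i", buf, offset)
        offset += 4
        if ident == 0:
            return names, offset  # zero marks the end of the set
        length = buf[offset]      # single-byte, non-zero name length
        offset += 1
        names[ident] = buf[offset:offset + length]
        offset += length
```

The same function would serve for the group list, since the wire sequence is identical.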
1.4 benno 289: .Pp
1.1 benno 290: The sender then sends any IO error values, which for
291: .Xr openrsync 1
292: is always zero.
293: .Pp
294: .Bl -enum -compact
295: .It
296: constant zero (integer)
297: .El
298: .Pp
299: The server sender then reads the exclusion list, which is always zero.
300: .Pp
301: .Bl -enum -compact
302: .It
303: if server, constant zero (integer)
304: .El
305: .Pp
306: Following that, the sender receives data regarding the receiver's copy
307: of the file list contents.
308: This data is not ordered in any way.
309: Each of these requests starts as follows:
310: .Pp
311: .Bl -enum -compact
312: .It
313: file index or -1 to signal a change of phase (integer)
314: .El
315: .Pp
316: The exchange starts in phase 1, then proceeds to phase 2, and phase 3
317: signals an end of transmission (no subsequent blocks).
318: If a phase change occurs, the sender must write back the -1 constant
319: integer value and increment its phase state.
320: .Pp
321: Blocks are read as follows:
322: .Pp
323: .Bl -enum -compact
324: .It
325: block index (integer)
326: .El
327: .Pp
328: In dry-run
329: .Pq Fl n
330: mode, the sender may immediately write back the index (integer) to skip
331: the following.
332: .Pp
333: .Bl -enum -compact
334: .It
335: number of blocks (integer)
336: .It
337: block length in the file (integer)
338: .It
339: long checksum length (integer)
340: .It
341: terminal (remainder) block length (integer)
342: .El
343: .Pp
344: And for each block:
345: .Pp
346: .Bl -enum -compact
347: .It
348: short checksum (integer)
349: .It
350: long checksum (bytes of checksum length)
351: .El
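The block header and the per-block checksums above could be parsed as sketched here. parse_block_set and the dictionary keys are hypothetical names, and the input is assumed to be a demultiplexed byte buffer positioned just after the block index.

```python
import struct


def parse_block_set(buf, offset=0):
    """Parse the four header integers and each block's checksums.

    Returns (info dict, next_offset).
    """
    count, block_len, csum_len, remainder = struct.unpack_from("<4i", buf, offset)
    offset += 16
    blocks = []
    for _ in range(count):
        (fast,) = struct.unpack_from("<i", buf, offset)   # short checksum
        slow = buf[offset + 4:offset + 4 + csum_len]      # long checksum
        blocks.append((fast, slow))
        offset += 4 + csum_len
    info = {
        "block_length": block_len,
        "checksum_length": csum_len,
        "remainder": remainder,
        "blocks": blocks,
    }
    return info, offset
```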
352: .Pp
353: The sender then compares the two files, block by block, and updates the
354: receiver with mismatches as follows.
355: .Pp
356: .Bl -enum -compact
357: .It
358: file index (integer)
359: .It
360: number of blocks (integer)
361: .It
362: block length (integer)
363: .It
364: long checksum length (integer)
365: .It
366: remainder block length (integer)
367: .El
368: .Pp
369: Then for each block:
370: .Pp
371: .Bl -enum -compact
372: .It
373: data chunk size (integer)
374: .It
375: data chunk (bytes)
376: .It
377: block index subsequent to chunk or zero for finished (integer)
378: .El
379: .Pp
380: Following this sequence, the sender sends the following:
381: .Pp
382: .Bl -enum -compact
383: .It
384: whole-file long checksum (16 bytes)
385: .El
386: .Pp
387: The sender then either handles the next queued file or, if the receiver
388: has written a phase change, the phase change step.
389: .Pp
390: If the sender is the server and
391: .Fl v
392: has been specified, the sender must send statistics.
393: .Pp
394: .Bl -enum -compact
395: .It
396: total bytes read (long)
397: .It
398: total bytes written (long)
399: .It
400: total size of files (long)
401: .El
402: .Pp
403: Finally, the sender must read a final constant-value integer.
404: .Pp
405: .Bl -enum -compact
406: .It
407: end-of-sequence -1 value (integer)
408: .El
409: .Pp
410: If in receiver mode, the inverse of the above (write instead of read, read
411: instead of write) is performed.
412: .Pp
413: The receiver begins by conditionally writing, then reading, the
414: exclusion list count, which is always zero.
415: .Pp
416: .Bl -enum -compact
417: .It
418: if client, send zero (integer)
419: .It
420: if receiver and
421: .Fl -delete ,
422: read zero (integer)
423: .El
424: .Pp
425: The receiver then proceeds with reading the
426: .Sx File list
427: as already
428: defined.
429: Following the list, the receiver reads the IO error, which must be zero.
430: .Pp
431: .Bl -enum -compact
432: .It
433: constant zero (integer)
434: .El
435: .Pp
436: The receiver must then sort the file names lexicographically.
437: .Pp
438: If there are no files in the file list at this time, the receiver must
439: exit prior to sending per-file data.
440: It then proceeds with the file blocks.
441: .Pp
442: For file blocks, the receiver must look at each file that is not up to
443: date (up to date meaning the same file size and timestamp) and send it to
444: the sender.
445: Symbolic links and directory entries are never sent to the sender.
446: .Pp
447: After the second phase has completed and prior to writing the
448: end-of-data signal, the client receiver reads statistics.
449: This is only performed with
450: .Pq Fl v .
451: .Pp
452: .Bl -enum -compact
453: .It
454: total bytes read (long)
455: .It
456: total bytes written (long)
457: .It
458: total size of files (long)
459: .El
460: .Pp
461: Finally, the receiver must send the constant end-of-sequence marker.
462: .Pp
463: .Bl -enum -compact
464: .It
465: end-of-sequence -1 value (integer)
466: .El
467: .Ss Sender and receiver asynchrony
468: The sender and receiver need not work in lockstep.
469: The receiver may send file update requests as quickly as it parses them,
470: and respond to the sender's update notices on demand.
471: Similarly, the sender may read as many update requests as it can, and
472: service them in any order it wishes.
473: .Pp
474: The sender and receiver synchronise state only at the end of each phase.
475: .Pp
476: The reference
477: .Xr rsync 1
478: takes advantage of this with a two-process receiver, one for sending
479: update requests (the generator) and another for receiving.
480: .Xr openrsync 1
481: uses an event-loop model instead.
482: .\" .Sh CONTEXT
483: .\" For section 9 functions only.
484: .\" .Sh RETURN VALUES
485: .\" For sections 2, 3, and 9 function return values only.
486: .\" .Sh ENVIRONMENT
487: .\" For sections 1, 6, 7, and 8 only.
488: .\" .Sh FILES
489: .\" .Sh EXIT STATUS
490: .\" For sections 1, 6, and 8 only.
491: .\" .Sh EXAMPLES
492: .\" .Sh DIAGNOSTICS
493: .\" For sections 1, 4, 6, 7, 8, and 9 printf/stderr messages only.
494: .\" .Sh ERRORS
495: .\" For sections 2, 3, 4, and 9 errno settings only.
496: .Sh SEE ALSO
497: .Xr openrsync 1 ,
498: .Xr rsync 1 ,
499: .Xr rsyncd 5
500: .\" .Sh STANDARDS
501: .\" .Sh HISTORY
502: .\" .Sh AUTHORS
503: .\" .Sh CAVEATS
504: .Sh BUGS
505: Time values are sent as 32-bit integers.
506: .Pp
507: When in server mode
508: .Em and
509: when communicating with a client with a newer protocol (>27), the phase
510: change integer (-1) acknowledgement must be sent twice by the sender.
511: This is probably a bug in the reference implementation.