Annotation of src/usr.bin/rsync/rsync.5, Revision 1.14
1.14 ! claudio 1: .\" $OpenBSD: rsync.5,v 1.13 2021/11/26 03:42:33 jsg Exp $
1.1 benno 2: .\"
3: .\" Copyright (c) 2019 Kristaps Dzonsons <kristaps@bsd.lv>
4: .\"
5: .\" Permission to use, copy, modify, and distribute this software for any
6: .\" purpose with or without fee is hereby granted, provided that the above
7: .\" copyright notice and this permission notice appear in all copies.
8: .\"
9: .\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
10: .\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
11: .\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
12: .\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
13: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
14: .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
15: .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
16: .\"
1.14 ! claudio 17: .Dd $Mdocdate: November 26 2021 $
1.1 benno 18: .Dt RSYNC 5
19: .Os
20: .Sh NAME
21: .Nm rsync
22: .Nd rsync wire protocol
23: .Sh DESCRIPTION
24: The
25: .Nm
26: protocol described in this document relates to the BSD-licensed
27: .Xr openrsync 1 ,
28: a re-implementation of the GPL-licensed reference utility
29: .Xr rsync 1 .
30: It is compatible with version 27 of the reference.
31: .Pp
32: In this document, the
33: .Qq client process
34: refers to the utility as run on the operator's local computer.
35: The
36: .Qq server process
37: is run either on the local or remote computer, depending upon the
38: file locations given on the command line.
39: .Pp
40: There are a number of options in the protocol that are dictated by command-line
41: flags.
42: These will be noted as
1.7 florian 43: .Fl D
44: for devices,
1.3 benno 45: .Fl g
46: for group ids,
1.1 benno 47: .Fl l
48: for links,
1.7 florian 49: .Fl n
50: for dry-run,
1.6 florian 51: .Fl o
52: for user ids,
1.1 benno 53: .Fl r
54: for recursion,
55: .Fl v
56: for verbose, and
57: .Fl -delete
58: for deletion (before).
59: .Ss Data types
60: The binary protocol encodes all data in little-endian format.
61: Integers are signed 32-bit, shorts are signed 16-bit, bytes are unsigned
62: 8-bit.
63: A long is variable-length.
64: For values less than the maximum integer, the value is transmitted and
65: read as a 32-bit integer.
66: For values greater, the value is transmitted first as a maximum integer,
67: then a 64-bit signed integer.
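.Pp
The variable-length long described above can be sketched as follows.
This is a minimal illustration, not the openrsync code: the names
encode_long and decode_long are hypothetical, and it assumes that a value
equal to the maximum integer (and any negative value) takes the wide form so
that decoding stays unambiguous.

```python
import struct

INT32_MAX = 0x7FFFFFFF  # the "maximum integer" used as a marker


def encode_long(value):
    """Encode a long: 4 bytes if it fits, else the marker plus 8 bytes."""
    # Assumption: only values in [0, INT32_MAX) use the short form.
    if 0 <= value < INT32_MAX:
        return struct.pack("<i", value)
    return struct.pack("<iq", INT32_MAX, value)


def decode_long(buf, offset=0):
    """Return (value, new_offset) for a long starting at offset."""
    (first,) = struct.unpack_from("<i", buf, offset)
    if first != INT32_MAX:
        return first, offset + 4
    # The marker was seen: the real value follows as a signed 64-bit integer.
    (wide,) = struct.unpack_from("<q", buf, offset + 4)
    return wide, offset + 12
```

A small file size such as 5 occupies four bytes on the wire, while a size of
2**40 occupies twelve.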
68: .Pp
69: There are three types of checksums: long (slow), short (fast), and
70: whole-file.
71: The fast checksum is a derivative of Adler-32.
72: The slow checksum is MD4,
73: made over the checksum seed first (serialised in little-endian format),
74: then the data.
75: The whole-file applies MD4 to the file first, then the checksum seed at
76: the end (also serialised in little-endian format).
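.Pp
The difference between the slow and whole-file checksums is only where the
seed is hashed.
The sketch below shows that ordering with a pluggable hash constructor.
The function names and the new_hash parameter are hypothetical; a real
implementation would supply MD4, which is not assumed to be available in
every Python build.

```python
import hashlib
import struct


def slow_checksum(seed, data, new_hash):
    """Hash the little-endian seed first, then the data."""
    h = new_hash()
    h.update(struct.pack("<i", seed))
    h.update(data)
    return h.digest()


def whole_file_checksum(seed, data, new_hash):
    """Hash the file data first, then the little-endian seed."""
    h = new_hash()
    h.update(data)
    h.update(struct.pack("<i", seed))
    return h.digest()


# hashlib.md5 is only a stand-in here to exercise the ordering;
# the protocol itself uses MD4.
block_sum = slow_checksum(1, b"abc", hashlib.md5)
file_sum = whole_file_checksum(1, b"abc", hashlib.md5)
```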
77: .Ss Multiplexing
78: Most
79: .Nm
80: transmissions are wrapped in a multiplexing envelope protocol.
81: It is composed as follows:
82: .Pp
83: .Bl -enum -compact
84: .It
85: envelope header (4 bytes)
86: .It
87: envelope payload (arbitrary length)
88: .El
89: .Pp
90: The first byte of the envelope header consists of a tag.
91: If the tag is 7, the payload is normal data.
92: Otherwise, the payload is out-of-band server messages.
93: If the tag is 1, it is an error on the sender's part and must trigger an
94: exit.
95: The remaining three bytes of the header hold the payload length, which limits message payloads to 24-bit integer size,
1.14 ! claudio 96: .Li 0x00ffffff .
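.Pp
A minimal sketch of building and splitting the envelope header follows.
It assumes the four header bytes are read as one little-endian 32-bit
integer whose top eight bits are the tag and whose low 24 bits are the
payload length; that layout and the names build_envelope and parse_envelope
are assumptions of this sketch.

```python
import struct

TAG_DATA = 7              # tag for normal data
MAX_PAYLOAD = 0x00FFFFFF  # largest length that fits in 24 bits


def build_envelope(tag, payload):
    """Prefix payload with a 4-byte header holding the tag and length."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds 24-bit length")
    return struct.pack("<I", (tag << 24) | len(payload)) + payload


def parse_envelope(header):
    """Split a 4-byte header into (tag, payload_length)."""
    (word,) = struct.unpack("<I", header)
    return word >> 24, word & MAX_PAYLOAD


frame = build_envelope(TAG_DATA, b"hello")
tag, length = parse_envelope(frame[:4])
```

A reader would consume the header, then read exactly the stated number of
payload bytes before looking for the next header.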
1.1 benno 97: .Pp
98: The only data not using this envelope are the initial handshake between
99: client and server.
100: .Ss File list
101: A central part of the protocol is the file list, which is generated by
102: the sender.
103: It consists of all files that must be sent to the receiver, either
104: explicitly as given or recursively generated.
105: .Pp
106: The file list itself consists of filenames and attributes (mode, time,
107: size, etc.).
108: Filenames must be relative to the destination root and not be absolute
109: or contain backtracking.
110: So if a file is given to the sender as
111: .Pa ../../foo/bar ,
112: it must be sent as
113: .Pa foo/bar .
114: .Pp
115: The file list should be cleaned of inappropriate files prior to sending.
116: For example, if
117: .Fl l
118: is not specified, symbolic links may be omitted.
119: Directory entries without
120: .Fl r
121: may also be omitted.
122: Duplicates may be omitted.
123: .Pp
124: The receiver
125: .Em must not
126: assume that the file list is clean.
127: It should not omit inappropriate files from the file list (which would
128: affect the indexing), but may omit them during processing.
129: .Pp
130: Prior to being sent from sender to receiver, and upon being received, the
131: file list must be lexicographically sorted such as with
132: .Xr strcmp 3 .
133: Subsequent references to the file are by index in the sorted list.
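.Pp
Because both sides sort the list the same way, an index identifies the same
file on each end.
The following sketch illustrates this with byte strings, whose ordering in
Python is byte-wise and so matches strcmp(3) for names without embedded NUL
bytes; the sample names are made up.

```python
def sort_file_list(names):
    """Sort file names byte-wise, as strcmp(3) would."""
    return sorted(names)


# Both sender and receiver start from the same (unsorted) set of names.
received = [b"b", b"a/c", b"a"]
flist = sort_file_list(received)

# Later requests refer to a file by its position in the sorted list.
index_of_a_c = flist.index(b"a/c")
```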
134: .Ss Client process
135: The client can operate in sender or receiver mode depending upon the
136: command-line source and destination.
137: .Pp
138: If the destination directory (sink) is remote, the client is in sender
139: mode: the client will push its data to the server.
140: If the source file is remote, it is in receiver mode: the server pushes
141: to the client.
142: If neither is remote, the client operates in sender mode.
143: These are all mutually exclusive.
144: .Pp
145: When the client starts, regardless of its mode, it first handshakes with the
146: server.
147: This exchange is
148: .Em not
149: multiplexed.
150: .Pp
151: .Bl -enum -compact
152: .It
153: send local version (integer)
154: .It
155: receive remote version (integer)
156: .It
157: receive random seed (integer)
158: .El
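.Pp
The three handshake steps above might look like this from the client's side.
This is a sketch only: client_handshake is a hypothetical name, and the
reader and writer arguments stand in for the two directions of the
connection.

```python
import io
import struct


def client_handshake(reader, writer, local_version=27):
    """Send our version, then read the peer's version and the seed."""
    writer.write(struct.pack("<i", local_version))
    (remote_version,) = struct.unpack("<i", reader.read(4))
    (seed,) = struct.unpack("<i", reader.read(4))
    return remote_version, seed


# Simulate a server that answers with version 27 and seed 12345.
from_server = io.BytesIO(struct.pack("<ii", 27, 12345))
to_server = io.BytesIO()
remote_version, seed = client_handshake(from_server, to_server)
```

The server side performs the same exchange but writes the seed instead of
reading it.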
159: .Pp
160: Following this, the client multiplexes when reading from the server.
161: Transmissions sent from client to server are not multiplexed.
162: It then enters the
163: .Sx Update exchange
164: protocol.
165: .Ss Server process
166: The server can operate in sender or receiver mode depending upon how the
167: client starts the server.
168: This may be directly from the parent process (when invoked for local
169: files) or indirectly via a remote shell.
170: .Pp
171: When in sender mode, the server pushes data to the client.
172: (This is equivalent to receiver mode for the client.)
173: In receiver mode, the opposite is true.
174: .Pp
175: When the server starts, regardless of the mode, it first handshakes with the
176: client.
177: This exchange is
178: .Em not
179: multiplexed.
180: .Pp
181: .Bl -enum -compact
182: .It
183: send local version (integer)
184: .It
185: receive remote version (integer)
186: .It
187: send random seed (integer)
188: .El
189: .Pp
190: Following this, the server multiplexes when writing to the client.
191: (Transmissions received from the client are not multiplexed.)
192: It then enters the
193: .Sx Update exchange
194: protocol.
195: .Ss Update exchange
196: When the client or server is in sender mode, it begins by conditionally
197: sending the exclusion list.
198: At this time, this is always empty.
199: .Pp
200: .Bl -enum -compact
201: .It
202: if
203: .Fl -delete
204: and the client, exclusion list zero (integer)
205: .El
206: .Pp
207: It then sends the
208: .Sx File list .
209: Prior to being sent, the file list should be lexicographically sorted.
210: .Pp
211: .Bl -enum -compact
212: .It
213: status byte (byte)
214: .It
215: inherited filename length (optional, byte)
216: .It
217: filename length (integer or byte)
218: .It
219: file (byte array)
220: .It
221: file length (long)
222: .It
223: file modification time (optional, time_t, integer)
224: .It
225: file mode (optional, mode_t, integer)
226: .It
1.3 benno 227: if
1.6 florian 228: .Fl o ,
229: the user id (integer)
230: .It
231: if
1.3 benno 232: .Fl g ,
233: the group id (integer)
1.7 florian 234: .It
235: if a special file and
236: .Fl D ,
1.10 benno 237: the device
1.7 florian 238: .Dq rdev
239: type (integer)
1.3 benno 240: .It
1.1 benno 241: if a symbolic link and
242: .Fl l ,
243: the link target's length (integer)
244: .It
245: if a symbolic link and
246: .Fl l ,
247: the link target (byte array)
248: .El
249: .Pp
250: The status byte may consist of the following bits and determines which
251: of the optional fields are transmitted.
252: .Pp
253: .Bl -tag -compact -width Ds
1.11 benno 254: .It 0x01
255: A top-level directory.
256: (Only applies to directory files.)
257: If specified, the matching local directory is for deletions.
1.1 benno 258: .It 0x02
259: Do not send the file mode: it is a repeat of the last file's mode.
1.6 florian 260: .It 0x08
261: Like
262: .Li 0x02 ,
263: but for the user id.
1.3 benno 264: .It 0x10
265: Like
266: .Li 0x02 ,
267: but for the group id.
1.1 benno 268: .It 0x20
269: Inherit some of the prior file name.
270: Enables the inherited filename length transmission.
271: .It 0x40
272: Use full integer length for file name.
273: Otherwise, use only the byte length.
274: .It 0x80
275: Do not send the file modification time: it is a repeat of the last
276: file's.
277: .El
278: .Pp
279: If the status byte is zero, the file-list has terminated.
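.Pp
The way the status byte selects the optional fields can be sketched as below.
The constant names and the optional_fields helper are hypothetical, and the
device and symbolic link fields are left out because they depend on the file
mode rather than on the status byte.

```python
FLAG_TOP_DIR = 0x01
FLAG_SAME_MODE = 0x02
FLAG_SAME_UID = 0x08
FLAG_SAME_GID = 0x10
FLAG_INHERIT_NAME = 0x20
FLAG_LONG_NAME = 0x40
FLAG_SAME_TIME = 0x80


def optional_fields(status, want_uid=False, want_gid=False):
    """List the fields that follow a status byte, or None at end of list."""
    if status == 0:
        return None  # a zero status byte terminates the file list
    fields = []
    if status & FLAG_INHERIT_NAME:
        fields.append("inherited filename length")
    if status & FLAG_LONG_NAME:
        fields.append("filename length (integer)")
    else:
        fields.append("filename length (byte)")
    fields += ["file", "file length"]
    if not status & FLAG_SAME_TIME:
        fields.append("modification time")
    if not status & FLAG_SAME_MODE:
        fields.append("mode")
    if want_uid and not status & FLAG_SAME_UID:
        fields.append("user id")
    if want_gid and not status & FLAG_SAME_GID:
        fields.append("group id")
    return fields
```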
1.6 florian 280: .Pp
1.4 benno 281: If
1.6 florian 282: .Fl o
283: has been specified, the sender sends the list of all users encountered
1.5 benno 284: in the file list.
1.6 florian 285: Identifier zero
286: .Pq Qq root
287: is never transmitted, as it would prematurely end the list.
1.12 benno 288: This list may be incomplete or empty: the server is not obligated to
289: properly fill it in with all relevant users.
1.4 benno 290: .Pp
291: .Bl -enum -compact
292: .It
1.6 florian 293: user identifier or zero to indicate end of set (integer)
1.4 benno 294: .It
1.6 florian 295: non-zero length of user name (byte)
1.4 benno 296: .It
1.6 florian 297: user name (prior length)
1.4 benno 298: .El
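.Pp
Reading the identifier list amounts to a loop that stops at the zero
identifier.
A minimal sketch, with the hypothetical name parse_id_list and an in-memory
buffer in place of the connection:

```python
import struct


def parse_id_list(buf, offset=0):
    """Return ({identifier: name}, new_offset) for a user or group list."""
    ids = {}
    while True:
        (ident,) = struct.unpack_from("<i", buf, offset)
        offset += 4
        if ident == 0:
            return ids, offset  # zero marks the end of the set
        length = buf[offset]    # one-byte, non-zero name length
        offset += 1
        ids[ident] = buf[offset:offset + length]
        offset += length


# One entry (1000 -> "alice") followed by the terminating zero.
wire = struct.pack("<i", 1000) + bytes([5]) + b"alice" + struct.pack("<i", 0)
users, end = parse_id_list(wire)
```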
1.6 florian 299: .Pp
300: The same sequence is then sent for groups if
301: .Fl g
302: has been specified.
1.4 benno 303: .Pp
1.1 benno 304: The sender then sends any IO error values, which for
305: .Xr openrsync 1
306: is always zero.
307: .Pp
308: .Bl -enum -compact
309: .It
310: constant zero (integer)
311: .El
312: .Pp
313: The server sender then reads the exclusion list, which is always zero.
314: .Pp
315: .Bl -enum -compact
316: .It
317: if server, constant zero (integer)
318: .El
319: .Pp
320: Following that, the sender receives data regarding the receiver's copy
321: of the file list contents.
322: This data is not ordered in any way.
323: Each of these requests starts as follows:
324: .Pp
325: .Bl -enum -compact
326: .It
327: file index or -1 to signal a change of phase (integer)
328: .El
329: .Pp
330: The phase starts in phase 1, then proceeds to phase 2, and phase 3
331: signals an end of transmission (no subsequent blocks).
332: If a phase change occurs, the sender must write back the -1 constant
333: integer value and increment its phase state.
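.Pp
The phase handling on the sender can be sketched as a small state step.
The name on_request and its return shape are hypothetical; the sketch only
covers what is described above, namely echoing the -1 and advancing the
phase.

```python
import struct

PHASE_CHANGE = -1


def on_request(index, phase):
    """Return (new_phase, reply_bytes, finished) for one request index."""
    if index != PHASE_CHANGE:
        # A regular file index: the phase is unchanged, nothing to echo.
        return phase, b"", False
    phase += 1
    # Write the -1 back to the receiver; phase 3 ends the transmission.
    return phase, struct.pack("<i", PHASE_CHANGE), phase >= 3


phase = 1
phase, reply, finished = on_request(-1, phase)
```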
334: .Pp
335: Blocks are read as follows:
336: .Pp
337: .Bl -enum -compact
338: .It
339: block index (integer)
340: .El
341: .Pp
342: In
343: .Pq Fl n
344: mode, the sender may immediately write back the index (integer) to skip
345: the following.
346: .Pp
347: .Bl -enum -compact
348: .It
349: number of blocks (integer)
350: .It
351: block length in the file (integer)
352: .It
353: long checksum length (integer)
354: .It
355: terminal (remainder) block length (integer)
356: .El
357: .Pp
358: And for each block:
359: .Pp
360: .Bl -enum -compact
361: .It
362: short checksum (integer)
363: .It
364: long checksum (bytes of checksum length)
365: .El
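.Pp
Parsing the block header and the per-block checksums listed above might look
like the following.
This is a sketch with hypothetical names; the short checksum is read as an
unsigned value purely for convenience, since the same four bytes are on the
wire either way.

```python
import struct


def parse_block_set(buf, offset=0):
    """Return (block_set, new_offset) for one file's block description."""
    count, block_len, csum_len, remainder = struct.unpack_from(
        "<iiii", buf, offset)
    offset += 16
    blocks = []
    for _ in range(count):
        (short_sum,) = struct.unpack_from("<I", buf, offset)
        offset += 4
        long_sum = buf[offset:offset + csum_len]
        offset += csum_len
        blocks.append((short_sum, long_sum))
    block_set = {
        "block_length": block_len,
        "checksum_length": csum_len,
        "remainder_length": remainder,
        "blocks": blocks,
    }
    return block_set, offset


# Two blocks of 700 bytes, 2-byte long checksums, 100-byte remainder.
wire = (struct.pack("<iiii", 2, 700, 2, 100)
        + struct.pack("<I", 1) + b"ab"
        + struct.pack("<I", 2) + b"cd")
block_set, end = parse_block_set(wire)
```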
366: .Pp
367: The sender then compares the two files, block by block, and updates the
368: receiver with mismatches as follows.
369: .Pp
370: .Bl -enum -compact
371: .It
372: file index (integer)
373: .It
374: number of blocks (integer)
375: .It
376: block length (integer)
377: .It
378: long checksum length (integer)
379: .It
380: remainder block length (integer)
381: .El
382: .Pp
383: Then for each block:
384: .Pp
385: .Bl -enum -compact
386: .It
387: data chunk size (integer)
388: .It
389: data chunk (bytes)
390: .It
391: block index subsequent to chunk or zero for finished (integer)
392: .El
393: .Pp
1.13 jsg 394: Following this sequence, the sender sends the following:
1.1 benno 395: .Pp
396: .Bl -enum -compact
397: .It
398: whole-file long checksum (16 bytes)
399: .El
400: .Pp
401: The sender then either handles the next queued file or, if the receiver
402: has written a phase change, the phase change step.
403: .Pp
404: If the sender is the server and
405: .Fl v
406: has been specified, the sender must send statistics.
407: .Pp
408: .Bl -enum -compact
409: .It
410: total bytes read (long)
411: .It
412: total bytes written (long)
413: .It
414: total size of files (long)
415: .El
416: .Pp
417: Finally, the sender must read a final constant-value integer.
418: .Pp
419: .Bl -enum -compact
420: .It
421: end-of-sequence -1 value (integer)
422: .El
423: .Pp
424: If in receiver mode, the inverse above (write instead of read, read
425: instead of write) is performed.
426: .Pp
427: The receiver begins by conditionally writing, then reading, the
428: exclusion list count, which is always zero.
429: .Pp
430: .Bl -enum -compact
431: .It
432: if client, send zero (integer)
433: .It
434: if receiver and
435: .Fl -delete ,
436: read zero (integer)
437: .El
438: .Pp
439: The receiver then proceeds with reading the
440: .Sx File list
441: as already
442: defined.
443: Following the list, the receiver reads the IO error, which must be zero.
444: .Pp
445: .Bl -enum -compact
446: .It
447: constant zero (integer)
448: .El
449: .Pp
450: The receiver must then sort the file names lexicographically.
451: .Pp
452: If there are no files in the file list at this time, the receiver must
453: exit prior to sending per-file data.
454: It then proceeds with the file blocks.
455: .Pp
456: For file blocks, the receiver must look at each file that is not up to
457: date, where up to date means having the same file size and timestamp, and
458: send it to the server.
459: Symbolic links and directory entries are never sent to the server.
460: .Pp
461: After the second phase has completed and prior to writing the
462: end-of-data signal, the client receiver reads statistics.
463: This is only performed with
464: .Pq Fl v .
465: .Pp
466: .Bl -enum -compact
467: .It
468: total bytes read (long)
469: .It
470: total bytes written (long)
471: .It
472: total size of files (long)
473: .El
474: .Pp
475: Finally, the receiver must send the constant end-of-sequence marker.
476: .Pp
477: .Bl -enum -compact
478: .It
479: end-of-sequence -1 value (integer)
480: .El
481: .Ss Sender and receiver asynchrony
482: The sender and receiver need not work in lockstep.
483: The receiver may send file update requests as quickly as it parses them,
484: and respond to the sender's update notices on demand.
485: Similarly, the sender may read as many update requests as it can, and
486: service them in any order it wishes.
487: .Pp
488: The sender and receiver synchronise state only at the end of phase.
489: .Pp
490: The reference
491: .Xr rsync 1
492: takes advantage of this with a two-process receiver, one for sending
493: update requests (the generator) and another for receiving.
494: .Xr openrsync 1
495: uses an event-loop model instead.
496: .\" .Sh CONTEXT
497: .\" For section 9 functions only.
498: .\" .Sh RETURN VALUES
499: .\" For sections 2, 3, and 9 function return values only.
500: .\" .Sh ENVIRONMENT
501: .\" For sections 1, 6, 7, and 8 only.
502: .\" .Sh FILES
503: .\" .Sh EXIT STATUS
504: .\" For sections 1, 6, and 8 only.
505: .\" .Sh EXAMPLES
506: .\" .Sh DIAGNOSTICS
507: .\" For sections 1, 4, 6, 7, 8, and 9 printf/stderr messages only.
508: .\" .Sh ERRORS
509: .\" For sections 2, 3, 4, and 9 errno settings only.
510: .Sh SEE ALSO
511: .Xr openrsync 1 ,
512: .Xr rsync 1 ,
513: .Xr rsyncd 5
514: .\" .Sh STANDARDS
515: .\" .Sh HISTORY
516: .\" .Sh AUTHORS
517: .\" .Sh CAVEATS
518: .Sh BUGS
519: Time values are sent as 32-bit integers.
520: .Pp
521: When in server mode
522: .Em and
523: when communicating with a client with a newer protocol (>27), the phase
524: change integer (-1) acknowledgement must be sent twice by the sender.
525: This is probably a bug in the reference implementation.