Annotation of src/usr.bin/locale/locale.1, Revision 1.10
1.10 ! jmc 1: .\" $OpenBSD: locale.1,v 1.9 2023/03/05 10:11:29 ajacoutot Exp $
1.1 stsp 2: .\"
1.8 schwarze 3: .\" Copyright 2016, 2020 Ingo Schwarze <schwarze@openbsd.org>
1.1 stsp 4: .\" Copyright 2013 Stefan Sperling <stsp@openbsd.org>
5: .\"
6: .\" Permission to use, copy, modify, and distribute this software for any
7: .\" purpose with or without fee is hereby granted, provided that the above
8: .\" copyright notice and this permission notice appear in all copies.
9: .\"
10: .\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
11: .\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
12: .\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
13: .\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
14: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
15: .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
16: .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
17: .\"
1.10 ! jmc 18: .Dd $Mdocdate: March 5 2023 $
1.1 stsp 19: .Dt LOCALE 1
20: .Os
21: .Sh NAME
22: .Nm locale
1.6 schwarze 23: .Nd character encoding and localization conventions
1.1 stsp 24: .Sh SYNOPSIS
25: .Nm locale
1.8 schwarze 26: .Op Fl a | Fl m | Cm charmap
1.1 stsp 27: .Sh DESCRIPTION
1.6 schwarze 28: If the
1.1 stsp 29: .Nm
1.6 schwarze 30: utility is invoked without any arguments, the current locale
31: configuration is shown.
1.9 ajacouto 32: Values for categories that are not set in the environment or that are
1.10 ! jmc 33: overridden by
1.9 ajacouto 34: .Ev LANG
35: or
36: .Ev LC_ALL
37: are displayed between double quotes.
1.1 stsp 38: .Pp
39: The options are as follows:
1.8 schwarze 40: .Bl -tag -width charmap
1.1 stsp 41: .It Fl a
42: Display a list of supported locales.
43: .It Fl m
1.6 schwarze 44: Display a list of supported character encodings.
45: On
46: .Ox ,
47: this always lists UTF-8 only.
1.8 schwarze 48: .It Cm charmap
49: Display the currently selected character encoding.
50: On
51: .Ox ,
52: this returns either US-ASCII or UTF-8.
1.1 stsp 53: .El
1.6 schwarze 54: .Pp
1.7 schwarze 55: A locale is a set of environment variables telling programs which
56: character encoding, language and cultural conventions the user
57: prefers.
1.6 schwarze 58: Programs in the
59: .Ox
1.7 schwarze 60: base system ignore the locale except for the character encoding.
61: Setting any of these variables is not recommended, except that
62: the following non-default setting is supported as an option:
63: .Pp
64: .Dl export LC_CTYPE=en_US.UTF-8
65: .Pp
1.6 schwarze 66: Programs installed from
67: .Xr packages 7
68: may or may not change behavior according to the locale.
69: Many programs use the X/Open System Interfaces naming scheme
70: for the contents of the variables listed below, which is
71: .Sm off
72: .Ar language
73: .Op _ Ar TERRITORY
74: .Op \&. Ar encoding
75: .Op @ Ar modifier
76: .Sm on
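As an illustration of the naming scheme above, the following Python sketch splits a value into its four parts. The names LocaleName and parse_locale_name are hypothetical, and the regular expression is an assumption about how strictly to parse; it is not how any particular library does it.

```python
import re
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    """Parts of a language[_TERRITORY][.encoding][@modifier] value."""
    language: str
    territory: Optional[str]
    encoding: Optional[str]
    modifier: Optional[str]


# Hypothetical pattern: each optional part is introduced by its
# delimiter ('_', '.', '@') and runs up to the next delimiter.
_LOCALE_RE = re.compile(
    r"(?P<language>[^_.@]+)"
    r"(?:_(?P<territory>[^.@]+))?"
    r"(?:\.(?P<encoding>[^@]+))?"
    r"(?:@(?P<modifier>.+))?"
)


def parse_locale_name(value: str) -> Optional[LocaleName]:
    """Return the parsed parts, or None if value does not fit the scheme."""
    m = _LOCALE_RE.fullmatch(value)
    if m is None:
        return None
    # Parts that are absent from the value come back as None.
    return LocaleName(**m.groupdict())
```

A value such as C consists of the language part only, so the other three fields are None.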
77: .Pp
78: The behavior of some library functions may also depend on the locale,
79: and it does on most other operating systems.
80: The
81: .Ox
82: C library tends to avoid locale-dependent behavior except with
83: respect to character encoding.
84: See the manual pages of individual functions for details.
85: .Pp
86: The character encoding locale
87: .Ev LC_CTYPE
88: tells programs which character encoding to assume for text input
89: and to use for text output.
90: A character encoding maps each character of a given character set
91: to a byte sequence suitable for storing or transmitting the character.
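To make the mapping from characters to byte sequences concrete, here is a minimal Python sketch. The helper name encode_char is hypothetical; it only wraps the standard str.encode method.

```python
def encode_char(ch: str, encoding: str) -> bytes:
    """Return the byte sequence that the given encoding assigns to one character."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return ch.encode(encoding)


# Print the UTF-8 byte sequences of three characters from the
# Unicode character set: A, e with acute accent, and the euro sign.
for ch in ("A", "\u00e9", "\u20ac"):
    print(f"U+{ord(ch):04X} -> {encode_char(ch, 'utf-8').hex(' ')}")
```

US-ASCII covers only the first of these three characters, so encode_char("\u00e9", "ascii") raises UnicodeEncodeError.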
92: .Pp
93: The
94: .Ox
95: base system supports two locales: the default of
96: .Li LC_CTYPE=C
97: selects the US-ASCII character set and encoding, treating the bytes
98: 0x80 to 0xff as non-printable characters of application-specific
1.7 schwarze 99: meaning.
1.6 schwarze 100: .Li LC_CTYPE=POSIX
101: is an alias for
102: .Li LC_CTYPE=C .
1.7 schwarze 103: The alternative of
104: .Li LC_CTYPE=en_US.UTF-8
105: selects the UTF-8 encoding of the Unicode character set, which is
106: supported by many parts of the system, but not yet fully supported
107: by all parts.
1.6 schwarze 108: .Pp
109: If the value of
110: .Ev LC_CTYPE
111: ends in
112: .Ql .UTF-8 ,
113: programs in the
114: .Ox
115: base system ignore the beginning of it, treating for example zh_CN.UTF-8
116: exactly like en_US.UTF-8.
117: Programs from
118: .Xr packages 7
119: may, however, distinguish between them.
120: If the value of
121: .Ev LC_CTYPE
122: is unsupported, programs and libraries in the
123: .Ox
124: base system fall back to
125: .Li LC_CTYPE=C .
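The selection rule described above can be sketched in Python as follows. The function name base_system_encoding is hypothetical, and the exact, case-sensitive suffix comparison is an assumption made for illustration; this models the documented behavior and is not the C library's code.

```python
def base_system_encoding(lc_ctype: str) -> str:
    """Model which encoding the base system would assume for an LC_CTYPE value."""
    # Anything ending in ".UTF-8" selects UTF-8; the part before
    # the suffix is ignored.
    if lc_ctype.endswith(".UTF-8"):
        return "UTF-8"
    # "C", "POSIX", and every unsupported value behave like LC_CTYPE=C.
    return "US-ASCII"
```

Under this model, zh_CN.UTF-8 and en_US.UTF-8 give the same result, and an unsupported value such as de_DE.ISO8859-1 is treated like C.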
126: .Pp
127: Some programs, for example
128: .Xr write 1 ,
129: deliberately ignore the locale and always use US-ASCII only.
130: See the manual pages of individual programs for details.
1.1 stsp 131: .Sh ENVIRONMENT
132: The locale configuration consists of the following environment variables:
1.6 schwarze 133: .Bl -tag -width LC_MONETARYX
134: .It Ev LC_ALL
135: Overrides all other
136: .Ev LC_*
137: variables below.
138: .It Ev LC_COLLATE
139: Intended to affect collation order.
140: It may for example affect alphabetic sorting, regular expressions
141: including equivalence classes, and the
142: .Xr strcoll 3
143: and
144: .Xr strxfrm 3
145: functions.
146: .It Ev LC_CTYPE
147: Intended to affect character encoding, character classification,
148: and case conversion.
149: For example, it is used by
150: .Xr mbtowc 3 ,
151: .Xr iswctype 3 ,
152: .Xr iswalnum 3 ,
153: .Xr towlower 3 ,
154: .Xr fgetwc 3 ,
155: .Xr fputwc 3 ,
156: .Xr printf 3 ,
157: and
158: .Xr scanf 3 .
159: .It Ev LC_MESSAGES
160: Intended to affect the output of informative and diagnostic messages
161: and the interpretation of interactive responses, in particular
162: regarding the language.
163: It is used by
164: .Xr catopen 3 .
165: .It Ev LC_MONETARY
166: Intended to affect monetary formatting.
167: .It Ev LC_NUMERIC
168: Intended to affect numeric, non-monetary formatting, for example
169: the radix character and thousands separators.
170: On other operating systems, it may for example affect
171: .Xr printf 3 ,
172: .Xr scanf 3 ,
173: and
174: .Xr strtod 3 .
175: .It Ev LC_TIME
176: Intended to affect date and time formats.
177: It may for example affect
178: .Xr strftime 3 .
179: .It Ev LANG
1.1 stsp 180: Fallback if any of the above is unset.
1.6 schwarze 181: .It Ev NLSPATH
182: Used by
183: .Xr catopen 3
184: to locate message catalogs.
185: .El
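The precedence among these variables can be sketched in Python as follows. The function name effective_value is hypothetical. Treating an empty string like an unset variable and ending at the default of C are assumptions based on the POSIX description, not statements about any one implementation.

```python
from typing import Mapping


def effective_value(category: str, env: Mapping[str, str]) -> str:
    """Return the locale name in effect for one LC_* category."""
    # LC_ALL overrides the category variable, which in turn
    # takes precedence over the LANG fallback.
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:  # assumption: an empty string counts as unset
            return value
    return "C"  # default when nothing is set
```

For example, effective_value("LC_CTYPE", os.environ) reports the value a program would see for the character encoding category, provided the os module has been imported.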
186: .Sh FILES
187: .Bl -tag -width Ds
188: .It Pa /usr/share/locale/UTF-8/LC_CTYPE
189: Character classification, case conversion, and character display
190: width database in
191: .Xr mklocale 1
192: binary output format used by
193: .Xr setlocale 3 .
194: .It Pa /usr/local/share/locale/
195: Localization data for
196: .Xr packages 7 ,
197: in particular
198: .Ev LC_MESSAGES
199: catalogs in GNU gettext format.
200: .It Pa /usr/local/share/nls/
201: Localization data for
202: .Xr packages 7 ,
203: in particular
204: .Ev LC_MESSAGES
205: catalogs in
206: .Xr catopen 3
207: format.
208: .It Pa /usr/src/share/locale/ctype/en_US.UTF-8.src
209: Character classification, case conversion, and character display
210: width database in
211: .Xr mklocale 1
212: input format.
213: .It Pa /usr/libdata/perl5/unicore/
214: Complete Unicode data used for generating the above database.
215: .It Pa /usr/src/gnu/usr.bin/perl/lib/unicore/UnicodeData.txt
216: The most important parts of Unicode data in a compact, more easily
217: human-readable format.
1.1 stsp 218: .El
1.4 jmc 219: .Sh EXIT STATUS
220: .Ex -std locale
1.1 stsp 221: .Sh SEE ALSO
1.6 schwarze 222: .Xr mklocale 1 ,
223: .Xr setlocale 3 ,
224: .Xr Unicode::UCD 3p
225: .Pp
226: Related ports: converters/libiconv, devel/gettext, textproc/icu4c
1.1 stsp 227: .Sh STANDARDS
1.6 schwarze 228: With respect to locale support, most libraries and programs in the
229: .Ox
230: base system, including the
1.1 stsp 231: .Nm
1.6 schwarze 232: utility, implement a subset of the
1.4 jmc 233: .St -p1003.1-2008
234: specification.
1.3 schwarze 235: .Sh HISTORY
236: The
237: .Nm
238: utility was first standardized in the
239: .St -xpg4 .
240: .Pp
241: It was rewritten from scratch for
242: .Ox 5.4
243: during the 2013 Toronto hackathon.
244: .Sh AUTHORS
1.4 jmc 245: .An -nosplit
1.3 schwarze 246: .An Stefan Sperling Aq Mt stsp@openbsd.org
247: with contributions from
248: .An Philip Guenther Aq Mt guenther@openbsd.org
249: and
250: .An Jeremie Courreges-Anglas Aq Mt jca@openbsd.org .
1.6 schwarze 251: This manual page was written by
252: .An Ingo Schwarze Aq Mt schwarze@openbsd.org .
1.1 stsp 253: .Sh BUGS
1.6 schwarze 254: The
255: .Nm
256: concept is inadequate for inter-process communication.
257: Two processes exchanging text, for example over a network, using
258: sockets, in shared memory, or even using plain text files always
259: need a protocol-specific way to negotiate the character encoding
260: used.
261: .Pp
1.1 stsp 262: The list of supported locales is perpetually incomplete.