SYNOPSIS
uniname ([option flags]) (<file name>)
If no input file name is supplied, uniname reads from the standard
input.
DESCRIPTION
uniname names the characters in a Unicode text file. For each character,
uniname by default prints the character offset, the byte offset, the
hexadecimal UTF-32 character code, the encoding as a sequence of hex
byte values, the glyph, and the character's Unicode name. Command line
flags allow undesired information to be suppressed. Glyphs that do not
display nicely, such as control characters and spaces, are not
displayed. For the Latin-1 control characters, whose official Unicode
name is "control", the real name is given. Character and byte offsets
both start from 0.
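The per-character fields described above can be illustrated with a short Python sketch. This is not uniname's implementation; the function name describe and the "<no name>" fallback are assumptions made for illustration. Unlike uniname, the sketch does not supply real names for control characters, since Python's unicodedata module has no name for them.

```python
import unicodedata

def describe(text):
    """Yield (char_offset, byte_offset, code, encoding, glyph, name) rows.

    Hypothetical helper: it approximates the default columns listed in
    the description, assuming the input has already been decoded.
    """
    byte_offset = 0
    for char_offset, ch in enumerate(text):
        encoded = ch.encode("utf-8")
        # Show a glyph only for printable, non-space characters.
        glyph = ch if ch.isprintable() and not ch.isspace() else ""
        # Control characters have no name in unicodedata, so fall back.
        name = unicodedata.name(ch, "<no name>")
        yield (
            char_offset,
            byte_offset,
            f"{ord(ch):06X}",                      # UTF-32 code in hex
            " ".join(f"{b:02X}" for b in encoded),  # UTF-8 bytes in hex
            glyph,
            name,
        )
        byte_offset += len(encoded)

# "a", e with acute accent, euro sign: 1, 2, and 3 bytes in UTF-8.
rows = list(describe("a\u00e9\u20ac"))
```

Each row can then be printed tab-separated. Both offsets start from 0, and the byte offset advances by the length of each character's UTF-8 encoding.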
Where a character does not have a unique Unicode name, as is the case
with Chinese characters, the character is identified as "character in
such-and-such a range". However, if the character is a Chinese
character listed in Nelson's dictionary, the Nelson number is supplied.
By default, input is expected to be UTF-8. Native order UTF-32 may be
specified via the command line flag -q. If invalid UTF-8 is
encountered, an explanation is printed as to why it is invalid.
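"Native order" means the byte order of the machine running the program. A minimal Python sketch of decoding such input follows; the function name decode_native_utf32 is an assumption for illustration and says nothing about how uniname itself reads UTF-32.

```python
import sys

def native_utf32_codec():
    """Return the Python codec name matching this machine's byte order."""
    return "utf-32-le" if sys.byteorder == "little" else "utf-32-be"

def decode_native_utf32(data):
    """Decode bytes as UTF-32 in native byte order (hypothetical helper)."""
    return data.decode(native_utf32_codec())

# Every character occupies exactly four bytes in UTF-32.
sample = "A\u20ac".encode(native_utf32_codec())
text = decode_native_utf32(sample)
```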
COMMAND LINE FLAGS
-A Skip ASCII whitespace characters.
-a Skip ASCII characters.
-B Skip characters within the Basic Multilingual Plane.
-b Suppress printing of byte offset.
-c Suppress printing of character offset.
-e Suppress printing of encoding.
-g Suppress printing of glyph.
-h Print usage information.
-l Print line number.
-n Suppress printing of Unicode name.
-p Suppress printing of headers every screenful.
-q Input is native order UTF-32.
-r Print Unicode range. The ranges reported include both official
more appropriate. If a byte offset is used, the character offsets
shown are with respect to the beginning of the section of the
file examined rather than the beginning of the file.
-u Suppress printing of UTF-32 code.
-V Validate the input. In this case, nothing is done other than
determine whether the input is valid UTF-8 Unicode. If it is, no
output is produced and the program exits with status 0. If
invalid UTF-8 is encountered, the program reports the location
of the first invalid UTF-8 encountered, explains why it is
invalid, and exits with status 1.
-v Print version information.
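The behavior described for -V (no output and status 0 on valid input; location, reason, and status 1 on the first invalid sequence) can be approximated in Python. This is a sketch under stated assumptions, not uniname's code: the function names are hypothetical, and the reason text comes from Python's decoder rather than from uniname.

```python
def first_invalid_utf8(data):
    """Return None if data is valid UTF-8, else (byte_offset, reason)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the byte offset of the first offending byte.
        return err.start, err.reason
    return None

def validate(data):
    """Return 0 for valid UTF-8 and 1 otherwise, mirroring the -V statuses."""
    problem = first_invalid_utf8(data)
    if problem is None:
        return 0  # valid input: nothing is printed
    offset, reason = problem
    print(f"invalid UTF-8 at byte offset {offset}: {reason}")
    return 1
```

A caller could pass the result of validate to sys.exit to reproduce the exit statuses.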
SEE ALSO
unidesc
REFERENCES
Unicode Standard, version 5.0
AUTHOR
Bill Poser
billposer@alum.mit.edu
LICENSE
GNU General Public License
June, 2007 uniname(1)