SYNOPSIS

       uniname ([option flags]) (<file name>)

       If  no  input  file  name  is supplied, uniname reads from the standard
       input.


DESCRIPTION

       uniname names the characters in a Unicode text file.  For each  charac-
       ter,  uniname  defaults to printing the character offset, the byte off-
       set, the hexadecimal UTF-32 character code, the encoding as a  sequence
       of  hex  byte values, the glyph, and the character's Unicode name. Com-
       mand line flags allow undesired information to be  suppressed.   Glyphs
       that  do not display nicely, such as control characters and spaces, are
       not displayed.  For the Latin-1 control characters, whose official Uni-
       code name is "control", the real name is given. Character and byte off-
       sets both start from 0.
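
       The default columns described above can be approximated with a  short
       Python  sketch.  This  is  not  uniname's implementation and does not
       reproduce its exact output format; the function  name  describe  and
       the  tab-separated  layout  are  assumptions made for illustration.

```python
import unicodedata

def describe(text):
    """Return one row per character:
    (char offset, byte offset, UTF-32 code, UTF-8 bytes, glyph, name)."""
    rows = []
    byte_offset = 0  # both offsets start from 0
    for char_offset, ch in enumerate(text):
        encoded = ch.encode("utf-8")
        # Hide glyphs that do not display nicely (control characters, spaces).
        glyph = ch if ch.isprintable() and not ch.isspace() else ""
        # Fall back to a placeholder when the database has no name.
        name = unicodedata.name(ch, "(no name available)")
        rows.append((
            char_offset,
            byte_offset,
            f"{ord(ch):06X}",
            " ".join(f"{b:02X}" for b in encoded),
            glyph,
            name,
        ))
        byte_offset += len(encoded)
    return rows

if __name__ == "__main__":
    for row in describe("a\u00e9\u20ac"):
        print(*row, sep="\t")
```

       The character offset advances by one per character,  while  the  byte
       offset advances by the length of each character's UTF-8 encoding.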

       Where a character does not have a unique Unicode name, as is  the  case
       with  Chinese  characters, the character is identified as "character in
       such-and-such a range".  However, if the character is a Chinese charac-
       ter listed in Nelson's dictionary, the Nelson number is supplied.

       By  default,  input is expected to be UTF-8. Native order UTF-32 may be
       specified via the command line flag -q. If invalid UTF-8 is encountered,
       an explanation is printed as to why it is invalid.
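
       "Native  order" means the byte order of the machine running the pro-
       gram. As a rough illustration only (not uniname's code), the  follow-
       ing  Python sketch decodes UTF-32 data in the host's byte order; the
       function name decode_native_utf32 is hypothetical.

```python
import sys

def decode_native_utf32(data):
    """Decode bytes as UTF-32 in the byte order of the current machine."""
    # sys.byteorder is "little" or "big"; pick the matching codec so that
    # no byte order mark is required in the input.
    codec = "utf-32-le" if sys.byteorder == "little" else "utf-32-be"
    return data.decode(codec)
```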


COMMAND LINE FLAGS

       -A     Skip ASCII whitespace characters.

       -a     Skip ASCII characters.

       -B     Skip characters within the Basic Multilingual Plane.

       -b     Suppress printing of byte offset.

       -c     Suppress printing of character offset.

       -e     Suppress printing of encoding.

       -g     Suppress printing of glyph.

       -h     Print usage information.

       -l     Print line number.

       -n     Suppress printing of Unicode name.

       -p     Suppress printing of headers every screenful.

       -q     Input is native order UTF-32.

       -r     Print  Unicode range.  The ranges reported include both official
              more appropriate. If a byte offset is used, the character offsets
              shown  are  with  respect to the beginning of the section of the
              file examined rather than the beginning of the file.

       -u     Suppress printing of UTF-32 code.

       -V     Validate the input. In this case, nothing  is  done  other  than
              determine whether the input is valid UTF-8 Unicode. If it is, no
              output is produced and the  program  exits  with  status  0.  If
              invalid  UTF-8  is encountered, the program reports the location
              of the first invalid  UTF-8  encountered,  explains  why  it  is
              invalid, and exits with status 1.

       -v     Print version information.
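
       The behaviour described for the -V flag (no output and exit status  0
       for  valid  input,  a  report  and  exit status 1 otherwise) can be
       sketched in Python as follows. This is an illustration  under  stated
       assumptions, not uniname's implementation: the function names and the
       wording of the message are hypothetical, and the  reason  text  comes
       from Python's own decoder rather than from uniname.

```python
import sys

def first_invalid_utf8(data):
    """Return (byte_offset, reason) for the first invalid sequence,
    or None if the data is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the byte offset where decoding failed.
        return err.start, err.reason
    return None

def main(path):
    """Validate one file; return the exit status (0 valid, 1 invalid)."""
    with open(path, "rb") as f:
        problem = first_invalid_utf8(f.read())
    if problem is None:
        return 0  # valid input: produce no output
    offset, reason = problem
    print(f"invalid UTF-8 at byte offset {offset}: {reason}", file=sys.stderr)
    return 1

if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv[1]))
```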



SEE ALSO

       unidesc


REFERENCES

       Unicode Standard, version 5.0


AUTHOR

       Bill Poser
       billposer@alum.mit.edu


LICENSE

       GNU General Public License








                                  June, 2007                        uniname(1)
