FARSI.TXT

##Adobe File Version: 1.000
#=======================================================================
#   FTP file name:  FARSI.TXT
#
#   Contents:       Map (external version) from Mac OS Farsi
#                   character set to Unicode 2.1
#
#   Copyright:      (c) 1997-1999 by Apple Computer, Inc., all rights
#                   reserved.
#
#   Contact:        charsets@apple.com
#
#   Changes:
#
#       b02  1999-Sep-22    Update contact e-mail address. Matches
#                           internal utom<b1>, ufrm<b1>, and Text
#                           Encoding Converter version 1.5.
#       n04  1998-Feb-05    Show required Unicode character
#                           directionality in a different way. Matches
#                           internal utom<n3>, ufrm<n9>, and Text
#                           Encoding Converter version 1.3. Update
#                           header comments; include information on
#                           loose mapping of digits, and changes to
#                           mapping for the TrueType variant.
#       n01  1997-Jul-17    First version. Matches internal utom<n1>,
#                           ufrm<n2>.
#
# Standard header:
# ----------------
#
#   Apple, the Apple logo, and Macintosh are trademarks of Apple
#   Computer, Inc., registered in the United States and other countries.
#   Unicode is a trademark of Unicode Inc. For the sake of brevity,
#   throughout this document, "Macintosh" can be used to refer to
#   Macintosh computers and "Unicode" can be used to refer to the
#   Unicode standard.
#
#   Apple makes no warranty or representation, either express or
#   implied, with respect to these tables, their quality, accuracy, or
#   fitness for a particular purpose. In no event will Apple be liable
#   for direct, indirect, special, incidental, or consequential damages 
#   resulting from any defect or inaccuracy in this document or the
#   accompanying tables.
#
#   These mapping tables and character lists are subject to change.
#   The latest tables should be available from the following:
#
#   <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/>
#   <ftp://dev.apple.com/devworld/Technical_Documentation/Misc._Standards/>
#
#   For general information about Mac OS encodings and these mapping
#   tables, see the file "README.TXT".
#
# Format:
# -------
#
#   Three tab-separated columns;
#   '#' begins a comment which continues to the end of the line.
#     Column #1 is the Mac OS Farsi code (in hex as 0xNN)
#     Column #2 is the corresponding Unicode (in hex as 0xNNNN),
#       possibly preceded by a tag indicating required directionality
#       (i.e. <LR>+0xNNNN or <RL>+0xNNNN).
#     Column #3 is a comment containing the Unicode name.
#
#   The entries are in Mac OS Farsi code order.
#
#   Control character mappings are not shown in this table, following
#   the conventions of the standard UTC mapping tables. However, the
#   Mac OS Farsi character set uses the standard control characters at
#   0x00-0x1F and 0x7F.
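#
#   The column layout above can be read with a few lines of code. The
#   following is a minimal sketch, not a reference parser: the name
#   parse_line and the returned tuple shape are assumptions made for
#   illustration, and it splits on any whitespace rather than strictly
#   on tabs.

```python
import re

# Optional <LR>+ or <RL>+ tag, then one or more 0xNNNN values joined by '+'.
_UNICODE_FIELD = re.compile(
    r"(?:<(LR|RL)>\+)?(0x[0-9A-Fa-f]+(?:\+0x[0-9A-Fa-f]+)*)"
)


def parse_line(line):
    """Parse one table line.

    Returns (mac_code, direction, code_points), or None for a line that
    holds only a comment or is blank. direction is "LR", "RL", or None.
    """
    body = line.split("#", 1)[0].strip()  # '#' starts a comment
    if not body:
        return None
    fields = body.split()
    if len(fields) < 2:
        raise ValueError("expected two data columns: %r" % line)
    match = _UNICODE_FIELD.fullmatch(fields[1])
    if match is None:
        raise ValueError("unrecognized Unicode column: %r" % fields[1])
    mac_code = int(fields[0], 16)
    code_points = [int(part, 16) for part in match.group(2).split("+")]
    return mac_code, match.group(1), code_points
```

#   For instance, parse_line("0x2B\t<LR>+0x002B\t# PLUS SIGN") yields
#   the code 0x2B, the tag "LR", and the one-element list [0x002B].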
#
# Notes on Mac OS Farsi:
# ----------------------
#
#   1. General
#
#   The Mac OS Farsi character set is used for the Farsi (Persian)
#   localizations, and for the Persian support in the Arabic Language
#   Kit.
#
#   The Mac OS Farsi character set is based on the Mac OS Arabic
#   character set. The main difference is in the right-to-left digits
#   0xB0-0xB9: For Mac OS Arabic these correspond to right-left
#   versions of the Unicode ARABIC-INDIC DIGITs 0660-0669; for
#   Mac OS Farsi these correspond to right-left versions of the
#   Unicode EXTENDED ARABIC-INDIC DIGITs 06F0-06F9. The other
#   difference is in the nature of the font variants.
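#
#   As a rough illustration of the digit difference, the right-left
#   digit codes can be mapped arithmetically. This is only a sketch:
#   the function name and the "arabic"/"farsi" labels are assumptions,
#   not part of any Apple API, and the required right-left direction
#   is not represented in the returned value.

```python
def rl_digit_to_unicode(code, charset="farsi"):
    """Return the Unicode code point for a right-left digit 0xB0-0xB9."""
    if not 0xB0 <= code <= 0xB9:
        raise ValueError("not a right-left digit code: 0x%02X" % code)
    # Mac OS Arabic uses ARABIC-INDIC DIGITs, Mac OS Farsi uses
    # EXTENDED ARABIC-INDIC DIGITs; the offset within the block is the same.
    base = {"arabic": 0x0660, "farsi": 0x06F0}[charset]
    return base + (code - 0xB0)
```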
#
#   For more information, see the comments in the mapping table for
#   Mac OS Arabic.
#
#   Mac OS Farsi characters 0xEB-0xF2 are non-spacing/combining marks.
#
#   2. Directional characters and roundtrip fidelity
#
#   The Mac OS Arabic character set (on which Mac OS Farsi is based)
#   was developed in 1986-1987. At that time the bidirectional line
#   layout algorithm used in the Mac OS Arabic system was fairly simple;
#   it used only a few direction classes (instead of the 13 or so now
#   used in the Unicode bidirectional algorithm). In order to permit
#   users to handle some tricky layout problems, certain punctuation
#   and symbol characters have duplicate code points, one with a
#   left-right direction attribute and the other with a right-left
#   direction attribute. This is true in Mac OS Farsi too.
#
#   For example, plus sign is encoded at 0x2B with a left-right
#   attribute, and at 0xAB with a right-left attribute. However, there
#   is only one PLUS SIGN character in Unicode. This leads to some
#   interesting problems when mapping between Mac OS Farsi and Unicode;
#   see below.
#
#   A related problem is that even when a particular character is
#   encoded only once in Mac OS Farsi, it may have a different
#   direction attribute than the corresponding Unicode character.
#
#   For example, the Mac OS Farsi character at 0x93 is HORIZONTAL
#   ELLIPSIS with strong right-left direction. However, the Unicode
#   character HORIZONTAL ELLIPSIS has direction class neutral.
#
#   3. Behavior of ASCII-range numbers
#
#   Mac OS Farsi also has two sets of digit codes.
#
#   The digits at 0x30-0x39 may be displayed using either European
#   digit shapes or Persian digit shapes, depending on context. If there
#   is a "strong European" character such as a Latin letter on either
#   side of a sequence consisting of digits 0x30-0x39 and possibly comma
#   0x2C or period 0x2E, then the characters will be displayed using
#   European shapes. (This will happen even if there are neutral
#   characters between the digits and the strong European character.)
#   Otherwise, the digits will be displayed using Persian shapes, the
#   comma will be displayed as Arabic thousands separator, and the
#   period as Arabic decimal separator. In any case, 0x2C, 0x2E, and
#   0x30-0x39 are always left-right.
#
#   The digits at 0xB0-0xB9 are always displayed using Persian digit
#   shapes, and moreover, these digits always have strong right-left
#   directionality. These are mainly intended for special layout
#   purposes such as part numbers, etc.
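#
#   The contextual rule for 0x30-0x39 can be approximated in code. The
#   sketch below works on text already converted to Unicode and makes
#   several assumptions that are not from this file: a run with a
#   strong European neighbor on either side keeps European shapes and
#   every other run takes Persian shapes with Arabic separators;
#   Unicode bidi class L stands in for "strong European"; R and AL
#   stand in for strong right-left; and a comma or period counts as
#   part of a run only between two digits. All names are hypothetical.

```python
import re
import unicodedata

# ASCII digits -> EXTENDED ARABIC-INDIC DIGITs U+06F0-U+06F9.
_PERSIAN_DIGITS = {ord("0") + i: 0x06F0 + i for i in range(10)}
# Comma -> ARABIC THOUSANDS SEPARATOR, period -> ARABIC DECIMAL SEPARATOR.
_ARABIC_SEPARATORS = {ord(","): 0x066C, ord("."): 0x066B}
_DIGIT_RUN = re.compile(r"[0-9]+(?:[.,][0-9]+)*")


def _nearest_strong(text, index, step):
    """Bidi class of the nearest strong character, skipping neutrals."""
    while 0 <= index < len(text):
        bidi_class = unicodedata.bidirectional(text[index])
        if bidi_class in ("L", "R", "AL"):
            return bidi_class
        index += step
    return None


def display_form(text):
    """Return text with each digit run in the shapes it would display in."""
    def render(match):
        left = _nearest_strong(text, match.start() - 1, -1)
        right = _nearest_strong(text, match.end(), 1)
        if left == "L" or right == "L":
            return match.group(0)  # European context: leave unchanged
        table = dict(_PERSIAN_DIGITS)
        table.update(_ARABIC_SEPARATORS)
        return match.group(0).translate(table)

    return _DIGIT_RUN.sub(render, text)
```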
#
#   4. Font variants
#
#   The table in this file gives the Unicode mappings for the standard
#   Mac OS Farsi encoding. This encoding is supported by the Tehran font
#   (the system font for Farsi), and is the encoding supported by the
#   text processing utilities. However, the other Farsi fonts actually
#   implement a somewhat different encoding; this affects nine code
#   points including 0xAA and 0xC0 (which are also affected by font
#   variants in Mac OS Arabic). For these nine code points the standard
#   Mac OS Farsi encoding has the following mappings:
#       0x8B -> 0x06BA ARABIC LETTER NOON GHUNNA (Urdu)
#       0xA4 -> <RL>+0x0024 DOLLAR SIGN, right-left
#       0xAA -> <RL>+0x002A ASTERISK, right-left
#       0xC0 -> <RL>+0x274A EIGHT TEARDROP-SPOKED PROPELLER ASTERISK,
#               right-left
#       0xF4 -> 0x0679 ARABIC LETTER TTEH (Urdu)
#       0xF7 -> 0x06A4 ARABIC LETTER VEH (for transliteration)
#       0xF9 -> 0x0688 ARABIC LETTER DDAL (Urdu)
#       0xFA -> 0x0691 ARABIC LETTER RREH (Urdu)
#       0xFF -> 0x06D2 ARABIC LETTER YEH BARREE (Urdu)
#
#   The TrueType variant is used for the Farsi TrueType fonts: Ashfahan,
#   Amir, Kamran, Mashad, NadeemFarsi. It differs from the standard
#   variant in the following ways:
#       0x8B -> 0xF882 Arabic ligature "peace on him" (corporate char.)
#       0xA4 -> 0xF86B+0x0631+0x064A+0x0627+0x0644 Arabic ligature rial,
#               currency sign (uses transcoding hint, see below)
#       0xAA -> 0x00D7 MULTIPLICATION SIGN (RL)
#       0xC0 -> 0x002A ASTERISK (RL)
#       0xF4 -> 0x00B0 DEGREE SIGN (RL)
#       0xF7 -> 0xFDFA ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
#       0xF9 -> 0x25CF BLACK CIRCLE (RL)
#       0xFA -> 0x25A0 BLACK SQUARE (RL)
#       0xFF -> 0x25B2 BLACK UP-POINTING TRIANGLE (RL)
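#
#   The two lists above can be held as small override tables. This is
#   a sketch under stated assumptions: the names are hypothetical, the
#   "(RL)" notes in the TrueType list are treated as equivalent to an
#   <RL> tag, and codes outside the nine return None so that a caller
#   can fall back to the main table.

```python
# Each value is (direction tag or None, tuple of Unicode code points).
STANDARD_VARIANT = {
    0x8B: (None, (0x06BA,)),   # ARABIC LETTER NOON GHUNNA
    0xA4: ("RL", (0x0024,)),   # DOLLAR SIGN, right-left
    0xAA: ("RL", (0x002A,)),   # ASTERISK, right-left
    0xC0: ("RL", (0x274A,)),   # EIGHT TEARDROP-SPOKED PROPELLER ASTERISK
    0xF4: (None, (0x0679,)),   # ARABIC LETTER TTEH
    0xF7: (None, (0x06A4,)),   # ARABIC LETTER VEH
    0xF9: (None, (0x0688,)),   # ARABIC LETTER DDAL
    0xFA: (None, (0x0691,)),   # ARABIC LETTER RREH
    0xFF: (None, (0x06D2,)),   # ARABIC LETTER YEH BARREE
}

TRUETYPE_VARIANT = {
    0x8B: (None, (0xF882,)),   # ligature "peace on him" (corporate char.)
    0xA4: (None, (0xF86B, 0x0631, 0x064A, 0x0627, 0x0644)),  # rial
    0xAA: ("RL", (0x00D7,)),   # MULTIPLICATION SIGN
    0xC0: ("RL", (0x002A,)),   # ASTERISK
    0xF4: ("RL", (0x00B0,)),   # DEGREE SIGN
    0xF7: (None, (0xFDFA,)),   # SALLALLAHOU ALAYHE WASALLAM ligature
    0xF9: ("RL", (0x25CF,)),   # BLACK CIRCLE
    0xFA: ("RL", (0x25A0,)),   # BLACK SQUARE
    0xFF: ("RL", (0x25B2,)),   # BLACK UP-POINTING TRIANGLE
}


def variant_mapping(code, truetype=False):
    """Return the variant-specific mapping for code, or None if unaffected."""
    table = TRUETYPE_VARIANT if truetype else STANDARD_VARIANT
    return table.get(code)
```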
#
# Unicode mapping issues and notes:
# ---------------------------------
#
#   1. Matching the direction of Mac OS Farsi characters
#
#   When Mac OS Farsi encodes a character twice but with different
#   direction attributes for the two code points - as in the case of
#   plus sign mentioned above - we need a way to map both Mac OS Farsi
#   code points to Unicode and back again without loss of information.
#   With the plus sign, for example, mapping one of the Mac OS Farsi
#   characters to a code in the Unicode corporate use zone is
#   undesirable, since both of the plus sign characters are likely to
#   be used in text that is interchanged.
#
#   The problem is solved with the use of direction override characters
#   and direction-dependent mappings. When mapping from Mac OS Farsi
#   to Unicode, we use direction overrides as necessary to force the
#   direction of the resulting Unicode characters.
#
#   The required direction is indicated by a direction tag in the
#   mappings. A tag of <LR> means the corresponding Unicode character
#   must have a strong left-right context, and a tag of <RL> indicates
#   a right-left context.
#
#   For example, the mapping of 0x2B is given as <LR>+0x002B; the
#   mapping of 0xAB is given as <RL>+0x002B. If we map an isolated
#   instance of 0x2B to Unicode, it should be mapped as follows (LRO
#   indicates LEFT-TO-RIGHT OVERRIDE, PDF indicates POP DIRECTIONAL
#   FORMATTING):
#
#     0x2B ->  0x202D (LRO) + 0x002B (PLUS SIGN) + 0x202C (PDF)
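#
#   The isolated-character case can be sketched as follows. The table
#   holds only the two plus sign entries, the names are hypothetical,
#   and wrapping the <RL> case in 0x202E (RLO, RIGHT-TO-LEFT OVERRIDE)
#   is an assumption made by symmetry with the <LR> example above.

```python
LRO = "\u202D"  # LEFT-TO-RIGHT OVERRIDE
RLO = "\u202E"  # RIGHT-TO-LEFT OVERRIDE (assumed counterpart for <RL>)
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING

# Tiny sample table: Mac OS Farsi code -> (direction tag, Unicode text).
SAMPLE_TABLE = {
    0x2B: ("LR", "\u002B"),  # PLUS SIGN, left-right
    0xAB: ("RL", "\u002B"),  # PLUS SIGN, right-left
}


def map_isolated(code, table=SAMPLE_TABLE):
    """Map one isolated Mac OS Farsi code, forcing its direction."""
    direction, chars = table[code]
    if direction == "LR":
        return LRO + chars + PDF
    if direction == "RL":
        return RLO + chars + PDF
    return chars  # untagged mappings need no override
```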
#
#   When mapping sev...