Unicode Demystified: A Practical Programmer's Guide to the Encoding Standard




About the Book

Unicode is a critical enabling technology for developers who want to internationalize applications for global environments. Until now, however, developers have had to turn to the standards documents themselves for crucial information on using Unicode. In Unicode Demystified, one of IBM's leading software internationalization experts covers every key aspect of Unicode development, offering practical examples and detailed guidance for integrating Unicode 3.0 into virtually any application or environment. Writing from a developer's point of view, Rich Gillam presents a systematic introduction to Unicode's goals, evolution, and key elements. Gillam illuminates the Unicode standards documents with insightful discussions of character properties, the Unicode character database, storage formats, character sequences, Unicode normalization, character encoding conversion, and more. He presents practical techniques for text processing, locating text boundaries, searching, sorting, rendering text, accepting user input, and other key development tasks. Along the way, he offers specific guidance on integrating Unicode with other technologies, including Java, JavaScript, XML, and the Web. The book is intended for every developer who is building internationalized applications, internationalizing existing applications, or interfacing with systems that already use Unicode.

Table of Contents:
Preface.

I. UNICODE IN ESSENCE: AN ARCHITECTURAL OVERVIEW OF THE UNICODE STANDARD.
1. Language, Computers, and Unicode. What Unicode Is. What Unicode Isn't. The Challenge of Representing Text in Computers. What This Book Does. How This Book Is Organized. Part I: Unicode in Essence. Part II: Unicode in Depth. Part III: Unicode in Action.
2. A Brief History of Character Encoding. Prehistory. The Telegraph and Morse Code. The Teletypewriter and Baudot Code. Other Teletype and Telegraphy Codes. FIELDATA and ASCII. Hollerith and EBCDIC. Single-Byte Encoding Systems. Eight-Bit Encoding Schemes and the ISO 2022 Model. ISO 8859. Other 8-Bit Encoding Schemes. Character Encoding Terminology. Multiple-Byte Encoding Systems. East Asian Coded Character Sets. Character Encoding Schemes for East Asian Coded Character Sets. Other East Asian Encoding Systems. ISO 10646 and Unicode. How the Unicode Standard Is Maintained.
3. Architecture: Not Just a Pile of Code Charts. The Unicode Character-Glyph Model. Character Positioning. The Principle of Unification. Alternate-Glyph Selection. Multiple Representations. Flavors of Unicode. Character Semantics. Unicode Versions and Unicode Technical Reports. Unicode Standard Annexes. Unicode Technical Standards. Unicode Technical Reports. Draft and Proposed Draft Technical Reports. Superseded Technical Reports. Unicode Versions. Unicode Stability Policies. Arrangement of the Encoding Space. Organization of the Planes. The Basic Multilingual Plane. The Supplementary Planes. Noncharacter Code Point Values. Conforming to the Standard. General. Producing Text as Output. Interpreting Text from the Outside World. Passing Text Through. Drawing Text on the Screen or Other Output Devices. Comparing Character Strings. Summary.
4. Combining Character Sequences and Unicode Normalization. How Unicode Non-spacing Marks Work. Dealing Properly with Combining Character Sequences. Canonical Decompositions. Canonical Accent Ordering. Double Diacritics. Compatibility Decompositions. Singleton Decompositions. Hangul. Unicode Normalization Forms. Grapheme Clusters.
5. Character Properties and the Unicode Character Database. Where to Get the Unicode Character Database. The UNIDATA Directory. UnicodeData.txt. PropList.txt. General Character Properties. Standard Character Names. Algorithmically Derived Names. Control-Character Names. ISO 10646 Comments. Block and Script. General Category. Letters. Marks. Numbers. Punctuation. Symbols. Separators. Miscellaneous. Other Categories. Properties of Letters. SpecialCasing.txt. CaseFolding.txt. Properties of Digits, Numerals, and Mathematical Symbols. Layout-Related Properties. Bidirectional Layout. Mirroring. Arabic Contextual Shaping. East Asian Width. Line-Breaking Property. Normalization-Related Properties. Decomposition. Decomposition Type. Combining Class. Composition Exclusion List. Normalization Test File. Derived Normalization Properties. Grapheme Cluster-Related Properties. Unihan.txt.
6. Unicode Storage and Serialization Formats. A Historical Note. UTF-32. UTF-16 and the Surrogate Mechanism. Endian-ness and the Byte Order Mark. UTF-8. CESU-8. UTF-EBCDIC. UTF-7. Standard Compression Scheme for Unicode. BOCU. Detecting Unicode Storage Formats.

II. UNICODE IN DEPTH: A GUIDED TOUR OF THE CHARACTER REPERTOIRE.
7. Scripts of Europe. The Western Alphabetic Scripts. The Latin Alphabet. The Latin-1 Characters. The Latin Extended A Block. The Latin Extended B Block. The Latin Extended Additional Block. The International Phonetic Alphabet. Diacritical Marks. Isolated Combining Marks. Spacing Modifier Letters. The Greek Alphabet. The Greek Block. The Greek Extended Block. The Coptic Alphabet. The Cyrillic Alphabet. The Cyrillic Block. The Cyrillic Supplementary Block. The Armenian Alphabet. The Georgian Alphabet.
8. Scripts of the Middle East. Bidirectional Text Layout. The Unicode Bidirectional Layout Algorithm. Inherent Directionality. Neutrals. Numbers. The Left-to-Right and Right-to-Left Marks. The Explicit Override Characters. The Explicit Embedding Characters. Mirroring Characters. Line and Paragraph Boundaries. Bidirectional Text in a Text-Editing Environment. The Hebrew Alphabet. The Hebrew Block. The Arabic Alphabet. The Arabic Block. Joiners and Non-joiners. The Arabic Presentation Forms B Block. The Arabic Presentation Forms A Block. The Syriac Alphabet. The Syriac Block. The Thaana Script. The Thaana Block.
9. Scripts of India and Southeast Asia. Devanagari. The Devanagari Block. Bengali. The Bengali Block. Gurmukhi. The Gurmukhi Block. Gujarati. The Gujarati Block. Oriya. The Oriya Block. Tamil. The Tamil Block. Telugu. The Telugu Block. Kannada. The Kannada Block. Malayalam. The Malayalam Block. Sinhala. The Sinhala Block. Thai. The Thai Block. Lao. The Lao Block. Khmer. The Khmer Block. Myanmar. The Myanmar Block. Tibetan. The Tibetan Block. The Philippine Scripts.
10. Scripts of East Asia. The Han Characters. Variant Forms of Han Characters. Han Characters in Unicode. The CJK Unified Ideographs Area. The CJK Unified Ideographs Extension A Area. The CJK Unified Ideographs Extension B Area. The CJK Compatibility Ideographs Block. The CJK Compatibility Ideographs Supplement Block. The Kangxi Radicals Block. The CJK Radicals Supplement Block. Ideographic Description Sequences. Bopomofo. The Bopomofo Block. The Bopomofo Extended Block. Japanese. The Hiragana Block. The Katakana Block. The Katakana Phonetic Extensions Block. The Kanbun Block. Korean. The Hangul Jamo Block. The Hangul Compatibility Jamo Block. The Hangul Syllables Area. Half-width and Full-width Characters. The Half-width and Full-width Forms Block. Vertical Text Layout. Ruby. The Interlinear Annotation Characters. Yi. The Yi Syllables Block. The Yi Radicals Block.
11. Scripts from Other Parts of the World. Mongolian. The Mongolian Block. Ethiopic. The Ethiopic Block. Cherokee. The Cherokee Block. Canadian Aboriginal Syllables. The Unified Canadian Aboriginal Syllabics Block. Historical Scripts. Runic. Ogham. Old Italic. Gothic. Deseret.
12. Numbers, Punctuation, Symbols, and Specials. Numbers. Western Positional Notation. Alphabetic Numerals. Roman Numerals. Han Characters as Numerals. Other Numeration Systems. Numeric Presentation Forms. National and Nominal Digit Shapes. Punctuation. Script-Specific Punctuation. The General Punctuation Block. The CJK Symbols and Punctuation Block. Spaces. Dashes and Hyphens. Quotation Marks, Apostrophes, and Similar-Looking Characters. Paired Punctuation. Dot Leaders. Bullets and Dots. Special Characters. Line and Paragraph Separators. Segment and Page Separators. Control Characters. Characters That Control Word Wrapping. Characters That Control Glyph Selection. The Grapheme Joiner. Bidirectional Formatting Characters. Deprecated Characters. Interlinear Annotation. The Object Replacement Character. The General Substitution Character. Tagging Characters. Noncharacters. Symbols Used with Numbers. Numeric Punctuation. Currency Symbols. Unit Markers. Math Symbols. Mathematical Alphanumeric Symbols. Other Symbols and Miscellaneous Characters. Musical Notation. Braille. Other Symbols. Presentation Forms. Miscellaneous Characters.

III. UNICODE IN ACTION: IMPLEMENTING AND USING THE UNICODE STANDARD.
13. Techniques and Data Structures for Handling Unicode Text. Useful Data Structures. Testing for Membership in a Class. The Inversion List. Performing Set Operations on Inversion Lists. Mapping Single Characters to Other Values. Inversion Maps. The Compact Array. Two-Level Compact Arrays. Mapping Single Characters to Multiple Values. Exception Tables. Mapping Multiple Characters to Other Values. Exception Tables and Key Closure. Tries as Exception Tables. Tries as the Main Lookup Table. Single Versus Multiple Tables.
14. Conversions and Transformations. Converting Between Unicode Encoding Forms. Converting Between UTF-16 and UTF-32. Converting Between UTF-8 and UTF-32. Converting Between UTF-8 and UTF-16. Implementing Unicode Compression. Unicode Normalization. Canonical Decomposition. Compatibility Decomposition. Canonical Composition. Optimizing Unicode Normalization. Testing Unicode Normalization. Converting Between Unicode and Other Standards. Getting Conversion Information. Converting Between Unicode and Single-Byte Encodings. Converting Between Unicode and Multibyte Encodings. Other Types of Conversions. Handling Exceptional Conditions. Dealing with Differences in Encoding Philosophy. Choosing a Converter. Line-Break Conversion. Case Mapping and Case Folding. Case Mapping on a Single Character. Case Mapping on a String. Case Folding. Transliteration.
15. Searching and Sorting. The Basics of Language-Sensitive String Comparison. Multilevel Comparisons. Ignorable Characters. French Accent Sorting. Contracting Character Sequences. Expanding Characters. Context-Sensitive Weighting. Putting It All Together. Other Processes and Equivalences. Language-Sensitive Comparison on Unicode Text. Unicode Normalization. Reordering. A General Implementation Strategy. The Unicode Collation Algorithm. The Default UCA Sort Order. Alternate Weighting. Optimizations and Enhancements. Language-Insensitive String Comparison. Sorting. Collation Strength and Secondary Keys. Exposing Sort Keys. Minimizing Sort Key Length. Searching. The Boyer-Moore Algorithm. Using the Boyer-Moore Algorithm with Unicode. “Whole Word” Searches. Using Unicode with Regular Expressions.
16. Rendering and Editing. Line Breaking. Line-Breaking Properties. Implementing Boundary Analysis with Pair Tables. Implementing Boundary Analysis with State Machines. Performing Boundary Analysis Using a Dictionary. A Few More Thoughts on Boundary Analysis. Performing Line Breaking. Line Layout. Glyph Selection and Positioning. Font Technologies. Poor Man's Glyph Selection. Glyph Selection and Placement in AAT. Glyph Selection and Placement in OpenType. Special-Purpose Rendering Technology. Compound and Virtual Fonts. Special Text-Editing Considerations. Optimizing for Editing Performance. Accepting Text Input. Handling Arrow Keys. Handling Discontiguous Selection. Handling Multiple-Click Selection.
17. Unicode and Other Technologies. Unicode and the Internet. The W3C Character Model. XML. HTML and HTTP. URLs and Domain Names. Mail and Usenet. Unicode and Programming Languages. The Unicode Identifier Guidelines. Java. C and C++. JavaScript and JScript. Visual Basic. Perl. ICU. Unicode and Operating Systems. Microsoft Windows. MacOS. Varieties of UNIX. Conclusion.

Glossary. Bibliography. Index.




Product Details
  • ISBN-13: 9780201700527
  • Publisher: Pearson Education (US)
  • Publisher Imprint: Addison Wesley
  • Edition: 1
  • Language: English
  • Returnable: Y
  • Spine Width: 42 mm
  • Weight: 1300 g
  • ISBN-10: 0201700522
  • Publisher Date: 27 Sep 2002
  • Binding: Paperback
  • Height: 185 mm
  • No of Pages: 896
  • Series Title: English
  • Sub Title: A Practical Programmer's Guide to the Encoding Standard
  • Width: 187 mm



