Awesome Code Points
      
    
    
      This is a curated list of characters in Unicode, that have interesting
      (and maybe not widely known) features or are awesome in some other way.
    
    Table of Contents
    
      - Standalone code points
 
      - 
        Code points that affect others
        
          - 
            Breaking and Gluing other characters
          
 
        
       
      - 
        Record holders and extremes
      
 
      - 
        For funsies
        
          - Games
 
        
       
      - Contributing
 
      - License
 
    
    Standalone Code Points
    
      - 
        
          The code points of the Unicode blocks
          Box Drawing (U+2500
          to U+257F) and
          Block Elements
          (U+2580 to U+259F) cover most of your monospace command-line
          visualization needs.
        
          ╭───────╮
  │Unicode│
  │rules! │
  ╰┬─────┬╯
       
      - 
        U+2E2E REVERSED QUESTION
        MARK - the “irony mark” to express irony/sarcasm. A useful character⸮
      
 
      - 
        U+D800 to
        U+DFFF - surrogate code
        points. They are only reserved to ease
        UTF-16 encoding.
      
 
      - 
        U+FEFF ZERO WIDTH NO-BREAK
        SPACE - it’s name suggests, that it can be used like U+2060 WORD JOINER.
        And in fact the latter was introduced to inherit its semantics. This is
        because U+FEFF had become a special beacon called the
        byte order mark, that was placed on the beginning of some UTF-8 files. In complying
        software (including many text editors) this character is stripped from
        the start of a file and handled as metadata. In non-complying software
        (like the PHP interpreter) this leads to all sorts of fun behaviour.
      
 
      - 
        U+FFFD REPLACEMENT CHARACTER
        - when a character cannot be displayed (e.g., decoding an erroneous
        UTF-8 sequency), this code point steps into the breach.
      
 
      - 
        U+1D455 is missing. It
        would be an italic small “h”. It was not encoded, because it would be
        identical to the Planck constant ℎ (U+210E).
      
 
      - 
        
          U+FF03 FULLWIDTH NUMBER
          SIGN - it is the “Japanese Hashtag” #. Sites like
          Twitter accept it as equivalent to the regular # (U+0023).
        
       
    
    Code Points that Affect Others
    
      - 
        
          U+202D and
          U+202E - change the text
          direction. Relevant XKCD:
        
        
       
      - 
        U+FE0E VARIATION SELECTOR-15
        - force black-&-white emoji. If this code point follows an
        emoji, an explicit monochrome rendering of the emoji is requested (if
        the client supports it).
      
 
      - 
        U+FE0F VARIATION SELECTOR-16
        - force colorful emoji. If this code point follows an emoji, an explicit
        colorful rendering of the emoji is requested (if the client supports
        it).
      
 
      - 
        
          Diacritics and combining marks: There is a
          host of characters,
          that add to the characters before. Those are called Combining Marks.
          Unicode provides a
          handy FAQ on
          the details, but in a nutshell: If you add one after a character, it
          is placed on top of that previous one. So, a + ̊ = å. This
          may lead to all kinds of funny problems, because for some
          combinations there are pre-composed characters. Our little
          å here can also be encoded as U+00E5. You might note,
          that while this has a length of one character, the combination of
          a and combining ring has a length of two characters.
        
        Of course, one can also do fun things with those characters like
        this answer on
        StackOverflow.
       
      - 
        
          The
          Regional Indicator Symbols
          U+1F1E6 to U+1F1FF resemble the 26 latin characters. They are used to
          create flag emoji. Since the Unicode consortium didn’t feel like
          getting on board with international politics, the solution to flags is
          to combine these 26 characters to the respective ISO code for a
          country. Examples:
        
        
          
            
          
          
            
              | USA | 
              US | 
              U+1F1FA + U+1F1F8 | 
              🇺🇸 | 
            
            
              | Germany | 
              DE | 
              U+1F1E9 + U+1F1EA | 
              🇩🇪 | 
            
            
              | China | 
              CN | 
              U+1F1E8 + U+1F1F3 | 
              🇨🇳 | 
            
          
        
       
      - 
        
          Skin color of emoji: There are five code points, that control the skin
          color of emoji,
          U+1F3FB to U+1F3FF. They are called “Emoji Modifier Fitzpatrick Type” 1 to 6, with 1
          the palest and 6 the darkest. If one of these characters follows an
          emoji, that emoji is meant to be rendered in the appropriate skin
          color of
          the Fitzpatrick scale. If no such modifier is added, the skin tone should be unnatural,
          e. g., bright yellow. Fun fact: Since the Fitzpatrick modifiers are
          normal code points, emoji with such skin colors have the length 2,
          which Twitter users noticed first. Here is a comparison chart
          directly from the specification:
        
        
          
            
          
          
            
              | U+1F3FB | 
              EMOJI MODIFIER FITZPATRICK TYPE-1-2 | 
              
                 
                 
               | 
            
            
              | U+1F3FC | 
              EMOJI MODIFIER FITZPATRICK TYPE-3 | 
              
                 
                 
               | 
            
            
              | U+1F3FD | 
              EMOJI MODIFIER FITZPATRICK TYPE-4 | 
              
                 
                 
               | 
            
            
              | U+1F3FE | 
              EMOJI MODIFIER FITZPATRICK TYPE-5 | 
              
                 
                 
               | 
            
            
              | U+1F3FF | 
              EMOJI MODIFIER FITZPATRICK TYPE-6 | 
              
                 
                 
               | 
            
          
        
       
    
    
      Breaking and Gluing other characters
    
    
      - 
        U+00A0 NO-BREAK SPACE -
        force adjacent characters to stick together. Well known as
        
  in HTML.
       
      - 
        U+00AD SOFT HYPHEN - (in
        HTML: 
­) like ZERO WIDTH SPACE, but show a hyphen
        if (and only if) a break occurs.
       
      - 
        U+200B ZERO WIDTH SPACE -
        the inverse to U+00A0: create no space, but allow word breaking.
      
 
      - 
        U+200D ZERO WIDTH JOINER -
        force adjacent characters to be joined together (e.g., arabic characters
        or supported emoji). Apple uses this to compose some emoji like
        different families.
      
 
      - 
        U+2060 WORD JOINER - the
        same as U+00A0, but completely invisible. Good for writing
        
@font-face on Twitter.
       
    
    
      For better comparison of which code point has which effect, consult this
      table:
    
    
      
        
      
      
        
          | create space | 
          ✓ | 
          ✗ | 
          ✗ | 
          ✗ | 
          ✗ | 
        
        
          | allow breaking | 
          ✗ | 
          ✓ | 
          ✓ | 
          ✗ | 
          ✗ | 
        
        
          | possible change | 
          ✗ | 
          ✓ | 
          ✗ | 
          ✓ | 
          ✗ | 
        
      
    
    
      Smashing Magazine featured
      a comprehensive article
      on the different types of whitespace.
    
    Record Holders and Extremes
    
      - 
        U+0000 <control> -
        first code point.
      
 
      - 
        U+10FFFF
        (non-character) - last code point. The whole rest of its plane
        apart from U+10FFFE, the code points in the 0x10000-0x10FFFD range, are
        private use characters, guaranteed to be never filled by a future
        Unicode standard.
      
 
      - 
        U+1F402 OX - shortest name.
      
 
      - 
        U+1FBA8 BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE LEFT AND
        MIDDLE RIGHT TO LOWER CENTRE and U+1FBA9 BOX DRAWINGS LIGHT DIAGONAL
        UPPER CENTRE TO MIDDLE RIGHT AND MIDDLE LEFT TO LOWER CENTRE - longest
        name: 88 characters each.
      
 
      - 
        U+FDFA ARABIC LIGATURE
        SALLALLAHOU ALAYHE WASALLAM - longest decomposition form: 18 characters.
      
 
      - 
        U+5146 and
        U+16B61 - code points that
        represent the highest “single-digit” number. In both cases that’s
        1,000,000,000,000, a trillion.
      
 
      - 
        U+0F33 TIBETAN DIGIT HALF
        ZERO - code point that represents the lowest “single-digit”
        number and at the same time the only negative one, -½.
      
 
      - 
        The trophy for most useless code points goes to
        U+0080,
        U+0081 and
        U+0099. These so-called C1
        control characters are more or less unspecified. They got into Unicode,
        because they were present in the very first version of what should later
        become ISO 10646, the ISO-standardized version of Unicode. They
        were meant to be part of an upgrade to ISO 2022, that never came to be.
      
 
      - 
        A close second place in this regard goes to the CJK unified ideographs
        妛,
        挧,
        暃,
        椦,
        槞,
        蟐,
        袮,
        閠,
        駲,
        墸,
        壥, and
        彁. These so-called
        “ghost characters”
        came to Unicode via the Japanese JIS standard, where they were added,
        because they were mis-read or misinterpreted from other signs, when JIS
        was compiled from original printed text sources.
      
 
      - 
        U+006F LATIN SMALL LETTER O
        - leads the list of characters with confusable shapes. Of all the
        possible mappings in the
        list of confusable characters, the small “o” leads with a whopping 73 entries of similar looking
        glyphs, followed by
        U+006C LATIN SMALL LETTER L
        with 70 entries.
      
 
      - 
        U+1F4C0 DVD - only code
        point name without any vowel (source)
      
 
    
    For Funsies
    
      - 
        U+1680 OGHAM SPACE MARK - a
        space that looks like a dash. Great to bring programmers close to
        madness: 
1 +  2 === 3.
       
      - 
        U+037E GREEK QUESTION MARK -
        a look-alike to the semicolon. Also a fun way to annoy developers.
      
 
      - 
        U+1DD2 COMBINING US ABOVE -
        this is the most romantic code point.
      
 
      - 
        U+F8FF PRIVATE USE CODEPOINT
        - this private use code point is rendered as Apple logo on many Apple
        devices.
      
 
      - 
        U+1F574 MAN IN BUSINESS
        SUIT LEVITATING - A rather curious character, that only made it into
        Unicode for its appearance in the Webdings font (for reasons of
        backwards compatibility).
      
 
      - 
        U+1F596 RAISED HAND WITH
        PART BETWEEN MIDDLE AND RING FINGERS - the Vulcan salute. Live long and
        prosper! 🖖
      
 
      - 
        U+1F918 SIGN OF THE HORNS -
        Rock on! 🤘
      
 
      - 
        U+2800 BRAILLE PATTERN BLANK
        - A Braille pattern that has zero of its six or eight dots filled in.
        According to the standard: “* while this character is imaged as a
        fixed-width blank in many fonts, it does not act as a space” Essentially
        it is rendered as white-space, but since it is designated as
        not white-space it isn’t matched by white-space-validating
        regular expressions. This can be used to bypass all kinds of validation
        that disallows or trims white-space.
      
 
    
    Games
    
      For plain-text gaming, Unicode is well equipped with several complete
      sets:
    
    
    Contributing Your Code Points
    See the contribution guide for details.
    License
    
      
    
    
      To the extent possible under law,
      the contributors
      have waived all copyright and related or neighboring rights to this work.
      See the license file for details.