Berkeley CSUA MOTD:2011:December:23 Friday <Thursday>
2011/12/23-2012/2/6 [Computer/Rants] UID:54271 Activity:nil
        Oh noes! What Would Bill Gates Do?
           Microsoft to Transition Corporate IT to Google Apps
2011/12/23-2012/2/6 [Computer/SW/Languages/Python] UID:54272 Activity:nil
12/23   In Python, why is it that '好' == '\xe5\xa5\xbd' but
        u'好' != '\xe5\xa5\xbd'? I'm really baffled. What
        is the encoding of '\xe5\xa5\xbd'?
        \_ '好' means '\xe5\xa5\xbd' (assuming your terminal or source file is
           UTF-8), which is just a string of bytes; it has length 3.  Python
           doesn't know what encoding it's in.  u'好' means u'\u597d', which
           is a string of Unicode characters; it has length 1, and Python
           recognizes it as a single Chinese character.  However, it doesn't
           have any particular encoding!  You have to encode it as a byte
           string before you can output it, and you can choose whatever
           encoding you want.  u'好'.encode('utf-8') returns '\xe5\xa5\xbd',
           so those three bytes are the UTF-8 encoding of 好.  When you
           compare u'好' to the byte string, Python 2 tries to decode the
           bytes as ASCII, fails, and treats the two as unequal.
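           \_ A minimal sketch of the same distinction, written for Python 3
              rather than the Python 2 used above.  Assumption: Python 2's
              '...' corresponds to Python 3's b'...' (bytes), and Python 2's
              u'...' corresponds to Python 3's '...' (text).  The variable
              names are only for illustration.

```python
# Python 3 sketch; the thread itself uses Python 2 literals.
# Assumption: Python 2 '...' ~ Python 3 b'...', Python 2 u'...' ~ Python 3 '...'.

text = '\u597d'          # one Unicode character: 好
raw = b'\xe5\xa5\xbd'    # three bytes with no encoding attached

text_length = len(text)  # counts code points
raw_length = len(raw)    # counts bytes

# Converting between the two always requires naming an encoding.
encoded = text.encode('utf-8')   # text -> bytes
decoded = raw.decode('utf-8')    # bytes -> text

# The same bytes read under a different encoding give different text,
# which is why a bare byte string has no inherent meaning.
misread = raw.decode('latin-1')

print(text_length, raw_length)
print(encoded == raw, decoded == text)
print(len(misread))
```

              In Python 3, text and bytes never compare equal, so you have
              to pick an encoding explicitly before comparing them.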
           \_ Wow, thanks. I always thought Unicode == UTF-8; boy, was I
              wrong. This is all very confusing.
              \_ dear dumbass:
                 \_ If all you've used is UTF-8, you'd have no reason to
                    suspect there are other Unicode encodings (and really,
                    if UTF-8 had been designed first, there probably wouldn't
                    be).  Not knowing about them doesn't make you dumb.
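                     \_ For the curious, a rough Python 3 sketch of the same
                        character under three Unicode encodings.  The
                        big-endian codec names are my choice here, used so
                        that no byte order mark gets prepended; nothing in
                        the thread specifies them.

```python
# Python 3 sketch: one character, several Unicode encodings.
# Assumption: the explicit big-endian codecs are used to avoid a byte order mark.

ch = '\u597d'  # 好

utf8 = ch.encode('utf-8')       # variable width, 1 to 4 bytes per character
utf16 = ch.encode('utf-16-be')  # 2 or 4 bytes per character
utf32 = ch.encode('utf-32-be')  # always 4 bytes per character

for name, data in [('utf-8', utf8), ('utf-16-be', utf16), ('utf-32-be', utf32)]:
    # Each byte string decodes back to the same single character.
    assert data.decode(name) == ch
    print(name, len(data), data.hex())
```

                        Same code point every time, only the bytes differ.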