Berkeley CSUA MOTD:Entry 22374

2001/9/10-11 [Computer/SW/Languages/Java] UID:22374 Activity:high
9/10    When is == equivalent to equals() for Objects, and when isn't it?
        \_ == is reference equality (true iff both operands refer to the
           same object instance).  You can override equals() to do anything
           you want, in principle; by convention, it is used to test whether
           the objects are semantically equal.  The default implementation
           of equals() (in class Object) simply does reference comparison
           with ==.  One common class that overrides equals() is String; it
           is not very intuitive when two Strings are == or !=.  See 3.10.5 of
       http://java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html
           Compare Strings with equals() unless *both* Strings are intern()ed.
           Compare Objects with equals() if the method could be overridden;
           it doesn't matter otherwise.  If this all is too complicated,
           that's because Java is broken.  -- misha
           \_ yeah, perl is much better    -- troll
              \_ Fix your garbage collection. -- troll2
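        \_ A minimal sketch of the == vs. equals() distinction described
           above.  Point and Plain are hypothetical classes made up for
           illustration (they are not from this thread): Point overrides
           equals(), Plain inherits the default from Object.

```java
public class EqualsDemo {
    // Hypothetical value class: overrides equals() to compare fields.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;               // same instance
            if (!(o instanceof Point)) return false;  // also rejects null
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        // Objects that are equal must report the same hash code.
        @Override
        public int hashCode() { return 31 * x + y; }
    }

    // Hypothetical class that does NOT override equals(): it inherits
    // Object.equals(), which is plain reference comparison.
    static final class Plain {
        final int v;
        Plain(int v) { this.v = v; }
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2), b = new Point(1, 2);
        System.out.println(a == b);       // false: two distinct instances
        System.out.println(a.equals(b));  // true: same field values

        Plain p = new Plain(1), q = new Plain(1);
        System.out.println(p.equals(q));  // false: default equals() is ==
        System.out.println(p.equals(p));  // true: same instance
    }
}
```

           So == and equals() agree for any class that leaves equals()
           alone, and can disagree as soon as a class overrides it.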
        \_ String s1 = new String("foo"); String s2 = new String("foo");
           (s1.equals(s2))      // true, because of String's equals() impl
           \_ Correct.
           (s1 == s2)           // false, because they don't refer to the
                                   same object
           (thanks for the correction)
           \_ This is correct when s1 and s2 are created at runtime, as
              with new String("foo") above.  If s1 and s2 are instead
              assigned literals (String s1 = "foo"; String s2 = "foo";),
              the strings are interned, so (s1 == s2) is true.  But
              (s1 == "food".substring(0, 3)) is false because the
              substring call happens at runtime.
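        \_ The literal vs. runtime cases above can be checked with a short
           sketch.  The class name InternDemo and the helper sameObject()
           are made up for illustration; only String, substring() and
           intern() come from the JDK.

```java
public class InternDemo {
    // Hypothetical helper: reference comparison, i.e. what == does.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        String lit1 = "foo";
        String lit2 = "foo";                  // same compile-time literal
        String heap1 = new String("foo");     // fresh object at runtime
        String heap2 = new String("foo");     // another fresh object
        String sub = "food".substring(0, 3);  // computed at runtime

        System.out.println(sameObject(lit1, lit2));          // true: literals are interned
        System.out.println(sameObject(heap1, heap2));        // false: distinct instances
        System.out.println(heap1.equals(heap2));             // true: same characters
        System.out.println(sameObject(lit1, sub));           // false: sub is not interned
        System.out.println(sameObject(lit1, sub.intern()));  // true: intern() returns the pooled copy
    }
}
```

           This is why == on Strings is only safe when *both* sides are
           known to be interned; equals() works in every case.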
