Entry 21813 (Berkeley CSUA MOTD)

2001/7/16-17 [Computer/HW/CPU] UID:21813 Activity:high
7/16    I tried the mips3-sgi-irix6.2 and the mips4-sgi-irix6.2 versions of
        SETI@home on the same IRIX64 machine, and the latter only gives about
        5% higher speed over the former.  Does anyone know why a 64-bit version
        of such a data-intensive program as SETI@home doesn't give much higher
        performance than its 32-bit counterpart?

        BTW, my 250MHz R10000 IRIX seems to be able to complete a work unit in
        about the same time as my 1.2GHz P4 NT.  Wow!
        \_ Why would it? A 64-bit architecture only saves you a few
           instructions when you load the register file with 64-bit
           values. That's about it. So even if SETI were using 64-bit
           values it wouldn't run much faster. SIMD extensions like AltiVec
           and MMX can extract some sub-word parallelism, but plain MIPS
           has no SIMD instructions. 64-bit versions of IRIX also have a
           newer ABI (n64) which could theoretically be more efficient (with
           register allocation rules and such), but I'm sure it's not
           significant.
           \_ But wasn't the selling point of 32-bit apps/CPUs over 16-bit
              apps/CPUs in the industry years ago that "your data-intensive
              applications will run much faster because it can process
              thirty-two bits of data at a time instead of sixteen bits"?
              Doesn't the same apply to 32-bit --> 64-bit?  I understand that
              there are certain instructions that don't benefit from a bigger
              data size (like testing a boolean), but I was still expecting a
              difference bigger than 5%.
              \_ Probably because 16-bit programs had to mimic a 32-bit
                 computation by breaking it up into two 16-bit parts and
                 merging the results together. Plus, in the x86 world you
                 were limited to 2^16 bytes per segment, which meant that
                 programs larger than 64K had to use segment:offset
                 addressing, which isn't very efficient itself.
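                 A rough sketch of both points, in Python for readability
                 (the function names are made up for illustration; real
                 16-bit code would do this with add/adc instruction pairs
                 and segment registers, not function calls). The first
                 function adds two 32-bit values using only 16-bit-wide
                 pieces plus a carry; the second computes an 8086 real-mode
                 physical address from a segment:offset pair.

```python
MASK16 = 0xFFFF  # width of a 16-bit register


def add32_with_16bit_ops(a, b):
    """Add two 32-bit values using only 16-bit halves (hypothetical helper)."""
    a_lo, a_hi = a & MASK16, (a >> 16) & MASK16
    b_lo, b_hi = b & MASK16, (b >> 16) & MASK16

    lo = a_lo + b_lo                    # first 16-bit add
    carry = lo >> 16                    # carry out of the low half
    hi = (a_hi + b_hi + carry) & MASK16 # second add, with carry in

    # Merge the two halves back into one 32-bit result (wraps on overflow).
    return (hi << 16) | (lo & MASK16)


def real_mode_address(segment, offset):
    """8086 real mode: physical = segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    print(hex(add32_with_16bit_ops(0x0001FFFF, 0x00000001)))
    print(hex(real_mode_address(0x1234, 0x0010)))
```

                 Every 32-bit add costs two adds plus carry handling on a
                 16-bit machine, so going to 32 bits removed real work. A
                 program that already fits its data in 32-bit registers
                 has no equivalent overhead for 64 bits to remove.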