Berkeley CSUA MOTD:Entry 38823
Berkeley CSUA MOTD
 
WIKI | FAQ | Tech FAQ
http://csua.com/feed/
2024/11/23 [General] UID:1000 Activity:popular
11/23   

2005/7/26-28 [Computer/SW/Database] UID:38823 Activity:nil
7/26    Hello SQL experts, I have two tables, A and B. A has 1/2 million
        rows and B has 2 million rows. The relationship is A.id=B.A_id.
        Whenever I have something like:
        SELECT A.id, AVG(B.val) FROM A,B WHERE A.date<'2008-01-01' AND
           A.id=B.A_id GROUP BY A.id;
        It takes about 20-30 seconds. Anyone know why? I've already
        indexed all the ids.
        \_ Do you have an index on A.date?
           \_ Yes I do. I found out something new. I did a
              SELECT COUNT(*) FROM ... <rest is the same here>
              and I got 63 million rows. I'm guessing something
              is really screwy here, do you have a clue?
              How about INNER JOIN, is that faster? How do I use it?
              \_ The syntax you gave is an inner join by default in most
                 databases.  In mysql or pgsql at least, try adding "explain"
                 to the beginning of your select.  This will tell you which
                 keys it's trying to use, how many rows each step gets
                 filtered down to, etc.  --dbushong
        \_ Hello I'm the op. I've reduced my problem to the following.
           Say I have 4 tables, A, B, C, and D. When I do a
           SELECT COUNT(*) and I join A and B, it is pretty fast.
           When I join A, B, and C, it is twice as slow. And when
           I join all of them, it is FOUR times as slow. I've made
           sure that all the joint columns are INDEXed. Why is this
           happening?
           \_ You're specifying 3 join conditions, right?  You have non-unique
              indexes on the foreign keys and unique primary keys?
              \_ Thanks Dave. Basically, column A.id is unique, column B.id
                 is not unique. A.id maps to B.id. Similarly, C.id is not
                 not unique, but also maps to B.id. Is this the reason?
                 I need a one to many mapping and I don't know how to get
                 around it.
          \_ if you're using mysql, mysql can only use one index per table.
             if you're joining on multiple columns, that could be your problem
2024/11/23 [General] UID:1000 Activity:popular
11/23   

You may also be interested in these entries...
2013/4/9-5/18 [Computer/SW/Languages/C_Cplusplus, Computer/SW/Apps, Computer/SW/Languages/Perl] UID:54650 Activity:nil
4/04    Is there a good way to diff 2 files that consist of columns of
        floating point numbers, such that it only tells me if there's a
        difference if the numbers on a given line differ by at least a given
        ratio?  Say, 1%?
        \_ Use Excel.
           1. Open foo.txt in Excel.  It should convert all numbers to cells in
	...
2011/12/29-2012/2/6 [Computer/SW/Database] UID:54274 Activity:nil
12/29   Is it worthwhile to use ext4 on VMs? Is Journaling necessary on VMs?
         \_ what about DBs?  I read somewhere ext3 was better for DB voumes (mysql)
	...
2010/11/21-2011/1/13 [Computer/SW/Languages/Web] UID:53988 Activity:moderate
11/21   Lifehacker's recommending Dreamhost as a personal web hosting service.
        Apart from csua, who do you guys use? --erikred
        \_ What do you want to use it for? Do you need CGI or PHP?  My
           brother worked for Dreamhost and said they are unethical. In
           fact, he sued them. This refers to their treatment of customers
           and employees both. I don't know who  or what "Lifehacker" is,
	...
2010/8/12-9/7 [Computer/SW/Languages/Perl] UID:53922 Activity:nil
8/12    Ruby coders, do you mostly DIY your stuff or use the ruby libs out
        there?   How is their quality compared to other libs you have used
        for other langs?  Thx.
        \_ I use Ruby for hobby stuff, etc.  I use libraries for system stuff
           (web access, process, etc.) but that's about it.  Perl libraries are
           much better/more complete.  I assume because of the maturity and
	...
2010/1/22-30 [Computer/HW/Laptop, Computer/SW/OS/OsX] UID:53655 Activity:high
1/22    looking to buy a new development laptop
        needs ssdrive, >6 hr possible batt life, and runs linux reasonably
        Anyone have a recommendation? Thx.
        \_ thinkpad t23 w ssdrive and battery inplace of drive bay
        \_ Ever wondered what RICHARD STALLMAN uses for a laptop?  Well,
           wonder no more!
	...
2009/10/27-11/3 [Computer/HW/Drives] UID:53474 Activity:nil
10/27   I just read an article that Facebook had moved their database
        to all SSD to speed throughput, but now I can't find it. Has
        anyone else seen this? Any experience with doing this? -ausman
        \_ I hope you're not running mission critical data:
           http://ask.slashdot.org/story/09/10/27/1559248/Reliability-of-PC-Flash-SSDs?from=rss
        \_ Do you have any idea how much storage space is used by Facebook,
	...
2009/9/23-10/5 [Computer/SW/Database] UID:53392 Activity:nil
9/23    I never took CS188, is there a good book that's an intro to formal
        database theory, normalization, etc.?  I've got experience with SQL
        (MySQL & MSSQL), and understand tables, etc.
        \_ You mean CS186?
           \_ Oops, yah.  188 is AI or something?
              \_ That's right.
	...