Berkeley CSUA MOTD:Entry 37752

2005/5/18 [Computer/SW/Unix, Academia/Berkeley/CSUA/Motd] UID:37752 Activity:low
5/18    Hey kchang!  You posted a request for people to stop adding and
        deleting stuff from the motd.  But your request didn't show up in your
        archive.  Why not?
        \_ Because the request got deleted in between archive intervals.
           You can read the technical FAQ; the archiver is unfortunately
           not comprehensive. Also, I don't normally do this, but I thought
           it best to take the sicko ASCII art out of the archiver.
           It is listed as "Entry has been invalidated." For more info:
           http://csua.com/?entry=faq1
           http://csua.com/?entry=faq2

You may also be interested in these entries...
2012/8/29-11/7 [Computer/SW/Security] UID:54467 Activity:nil
8/29    There was once a CSUA web page which runs an SSH client for logging
        on to soda.  Does that page still exist?  Can someone remind me of the
        URL please?  Thx.
        \_ what do you mean? instruction on how to ssh into soda?
           \_ No I think he means the ssh applet, which, iirc, was an applet
              that implemented an ssh v1 client.  I think this page went away
	...
2012/3/29-6/4 [Computer/HW/Memory, Computer/HW/CPU, Computer/HW/Drives] UID:54351 Activity:nil
3/29    A friend wants a PC (no mac). She doesn't want Dell. Is there a
        good place that can custom build for you (SSD, large RAM, cheap video
        card--no game)?
        \_ As a side note: back in my Cal days more than two decades ago when
           having a 387SX made me the only person with floating-point hardware,
           most machines were custom built.
	...
2012/1/27-3/26 [Computer/SW/Unix] UID:54299 Activity:nil
1/27    Interesting list of useful unix tools. Shout out to
        cowsay even!
        http://www.stumbleupon.com/su/3428AB/kkovacs.eu/cool-but-obscure-unix-tools
        \_ This is nice.  Thanks.
	...
2011/10/26-12/6 [Computer/SW/Unix] UID:54202 Activity:nil
10/24  What's an easy way to see if say column 3 of a file matches a list of
       expressions in a file? Basically I want to combine "grep -f <file>"
       to store the patterns and awk's $3 ~ /(AAA|BBB|CCC)/ ... I realize
       I can do this with "egrep -f " and use regexp instead of strings, but
       was wondering if there was some magic way to do this.
       \_ UNIX has no magic. Make a shell script to produce the awk or egrep
	...
2011/3/12-4/20 [Consumer/CellPhone, Computer/HW/Laptop] UID:54057 Activity:nil
3/12    I am curious what others think of tablets like iPad. They don't seem
        useful to me, but I use my computer for more than web browsing,
        Facebook, and Twitter. Why would I buy one instead of a laptop?
        They seem like a disabled laptop to me, but at a higher price.
        \_ You are most likely a coder.  iPad is not for coders.  They are
           what you get your non-technical friends.  Or musicians.  Look at
	...
2011/2/6-19 [Computer/Networking] UID:54028 Activity:nil
2/5     hmm.
$netstat -at | grep LISTEN
tcp        0      0 *:43300                 *:*                     LISTEN
        \_ this is an sshd
tcp        0      0 *:49416                 *:*                     LISTEN
tcp        0      0 *:36201                 *:*                     LISTEN
	...
2010/6/8-30 [Computer/Companies/Yahoo] UID:53853 Activity:nil
6/8     Newly wed husband and wife found from old picture that they have
        actually crossed path 30yrs ago: http://www.csua.org/u/qwv
        My question is how do stories like this find its way to news media?  Do
        people just go "hey something very interesting happens in our lives.
        Let's call up a news agency or two to tell the world about it."?
        \_ "Your video will begin after a word from our sponsors."
	...
2010/3/8-30 [Computer/SW/Unix] UID:53745 Activity:nil
3/8     I have a mod_rewrite question that I think should be straight-
        forward but I think I'm not getting something.
        I have a virtual server with some root, say /home/user/public_html/
        and in there I have two subdirs, say /app1/ and /app2/
        and i want the following:
        http://mysite/app1   -->   /home/user/public_html/app1
	...
2010/3/10-30 [Computer/SW/Mail] UID:53751 Activity:nil
3/10    What email program do people in Cal CS use nowadays?  In my school days
        people used /usr/bin/mail, then RMail in emacs, then VMail in emacs.
        After my days people used Elm, Pine, Mutt (I forgot which order).  In
        my first two jobs we could tell the seniority of fellow engineers based
        on which email program they use at work, because everyone used what
        they used to use in their school years.  In my last two jobs though,
	...
2013/10/24-2014/2/5 [Academia/Berkeley/CSUA/Motd, Computer/SW] UID:54746 Activity:nil
9/26    I remember there was web version of the motd with search function
        (originally due to kchang ?).  The last time I used it it was hosted
        on the csua website but I can't remember its url (onset of dementia?)
        now. Can somebody plz post it, tnx.
        \_ http://csua.com
           \_ for some reason I couldn't log in since Sept and the archiver
	...
2012/9/5-11/7 [Academia/Berkeley/CSUA, Academia/Berkeley/CSUA/Motd] UID:54472 Activity:nil
9/4     It looks like there are some issues with wallall at the moment. Any
        plans for it getting fixed? I can run wall, but wallall just gives an
        error.
        \_ Asking questions on the motd will not get any attention from
           any undergrad. You should email politburo or perhaps csua. -ausman
        \_ Asking questions on the motd will not get attention from any
	...
2012/4/23-6/4 [Academia/Berkeley/CSUA/Motd] UID:54359 Activity:nil
4/19    Motd updater thingy seems to be broken, does anyone know why?
        If not, I will take a look later in the day. -ausman
        \_ /etc/motd.public is not getting copied into /etc/motd for a while.
           \_ Now it works and no one knows why. Strange. -ausman
	...
2012/2/6-3/26 [Academia/Berkeley/CSUA, Academia/Berkeley/CSUA/Motd] UID:54301 Activity:nil
2/6     Um, what happened to http://www.csua.berkeley.edu/~myname ?
        "The requested URL /~myname/ was not found on this server."
        \_ Try emailing root or politburo. I don't think that the
           undergrads use this machine anymore. -ausman
        \_ Ausman is mostly right. LDAP went down due to an expired cert and
           took down most of the rest of our stuff. It's probably a thing with
	...
2012/2/24-3/26 [Academia/Berkeley/CSUA/Motd] UID:54313 Activity:nil
2/24    What newsreader should I use on soda?
        \_ USENIX? You serious? Everyone switched to RSS.
           \_ I think you mean usenet not usenix.  usenet was generally much
              better than blogs / rss (cf. comp.lang.c, comp.lang.perl,
              the usenet oracle, alt.* with digg, slashdot, etc.)
           link:reader.google.com is the best
	...
Cache (1779 bytes)
csua.com/?entry=faq1 -> compilers.cs.ucla.edu/%7Ekchang/motd/?entry=faq1
Comparison of entries is not a simple Unix diff; rather, it is independent of white spaces and words. If one entry is at least 85% similar to another entry, the later entry is taken while the responses of both are merged.

How come entries on Kais Motd seem longer than on the other primitive RCS/CVS motd archivers? So even if some responses are changed and/or censored, Kais Motd does its best to merge in new responses. Case in point, look at the following highly censored post that lasted for 6 days. So trying to time a highly controversial troll to be archived is futile because it'll most likely be deleted by other people before it's polled. In addition, even if it's polled and archived, other Kais Motd superusers (not me) will most likely delete it. Polling is done anywhere between 1-4 times a day on a PURE RANDOM schedule. So if your entry got deleted, then it didn't get lucky enough to be polled and saved at the right time. If you want a comprehensive motd, there are other RCSed entries (motd,v) you can look at.

To go a step further, how would a computer categorize the following sentence: "How many Republicans does it take to screw a lightbulb?" Well, it's easy for human beings to recognize that it is a prelude to a joke, but I know of no algorithm that could reliably recognize this highly context-sensitive sentence. If you'd like to help, you can volunteer to be a Kais Motd superuser and manually categorize and/or set rules.

Couldn't I break the system by totally messing up the motd entries? In contrast, my UCLA account has about 3000X more space than my CSUA account, 4X the processor speed per processor, and has 4 processors. Also, I can run power-hungry lexical/context matching algorithms (exponential time) if I seriously want to, for a whole night, without getting squished.
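
The 85% rule above can be sketched roughly as follows. This is only an illustration in Python: the FAQ does not say how similarity is actually measured (and the archiver itself is written in Perl), so the word-sequence ratio from difflib, and the names normalize, similarity, and merge_entries, are assumptions rather than the archiver's real method.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.85  # "at least 85% similar", per the FAQ


def normalize(text):
    """Split into lowercase words so whitespace differences are ignored."""
    return text.lower().split()


def similarity(a, b):
    """Ratio in [0, 1] over word sequences (an assumed measure, not the FAQ's)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def merge_entries(earlier, later):
    """Each entry is a dict with 'body' (str) and 'responses' (list of str).

    If the bodies are similar enough, keep the later body together with the
    responses of both entries (order preserved, duplicates dropped).
    Otherwise return None, meaning the entries are treated as distinct.
    """
    if similarity(earlier["body"], later["body"]) < SIMILARITY_THRESHOLD:
        return None
    merged = list(earlier["responses"])
    for response in later["responses"]:
        if response not in merged:
            merged.append(response)
    return {"body": later["body"], "responses": merged}
```

Comparing word lists instead of raw characters is what makes the check independent of spacing and line wrapping; a real archiver would probably also want to match responses fuzzily rather than by exact string equality.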
Cache (5565 bytes)
csua.com/?entry=faq2
Phase 3 is optimization (analysis on intermediate format). OK, maybe not exactly compiler work, but it's cool to think of it that way. pl" is Measure Of Similarity and does a comprehensive comparison between one entry and another. Comparison of entries is *not* a Unix file diff; rather, it is independent of white spaces and words. If an entry is at least 85% similar to another entry, the later entry is taken while the responses of both are merged.

I've used PostgreSQL a lot before and I just don't need transactions and other fancy features. Just go to a bookstore and compare the number of MySQL books vs. Informix/Sybase books, the difference is astonishing (20 vs. If AI is in your field of study and you're good with Perl, please email me!

At any rate, the automatic categorization process is more of an art than a science, so imperfection is expected. In short, my algorithm is simple: every path has a rule, and every path inherits its parent's rule. The rules are applied to every single motd entry, and points are assigned. Then it sorts those results and finds the most specific path. If something happens to have too many categories, then the algorithm finds the "least upper bound" path, which is the common parent. So it traverses hierarchies both up and down to try to find a range of 1-3 categories. Some entries will still have too many categories even after applying the rules, and are put into Uncategorized/Multicategory.

How would you personally categorize an entry that contains the following: "George Bush should use Linux, convert to Christianity, and ride bike." If it's hard for humans to categorize that, then it's even harder for computers. Anyways, as a computer scientist, I'm always worried about the run time. If you're familiar with fast AND accurate adaptive algorithms, please email me. Well, it's easy for human beings to recognize that it is a prelude to a joke, but I know of no algorithm that could reliably recognize this highly context-sensitive sentence.
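
The path-rule categorization described above might look something like the sketch below. Everything specific in it is an assumption made for illustration: the RULES table, the keyword-counting point system, and the function names are hypothetical, and the real rules are not shown in the FAQ. Only the overall shape (inherited rules, points, most specific path, least upper bound, and the 1-3 category limit) comes from the text.

```python
MAX_CATEGORIES = 3  # the FAQ aims for a range of 1-3 categories

# Hypothetical rules: each category path maps to its own keywords.
RULES = {
    "Computer": {"computer"},
    "Computer/SW": {"software"},
    "Computer/SW/Unix": {"unix", "grep", "awk"},
    "Computer/SW/Mail": {"mutt", "pine"},
    "Computer/HW": {"hardware"},
    "Computer/HW/CPU": {"cpu"},
    "Computer/HW/Drives": {"ssd"},
    "Politics": {"election"},
}


def ancestors(path):
    """All proper ancestors of a path, e.g. A/B/C -> [A, A/B]."""
    parts = path.split("/")
    return ["/".join(parts[:i]) for i in range(1, len(parts))]


def effective_keywords(path):
    """A path's own keywords plus those inherited from every ancestor."""
    words = set(RULES.get(path, ()))
    for parent in ancestors(path):
        words |= RULES.get(parent, set())
    return words


def common_parent(paths):
    """Least upper bound: the longest path prefix shared by all paths."""
    shared = []
    for parts in zip(*(p.split("/") for p in paths)):
        if len(set(parts)) != 1:
            break
        shared.append(parts[0])
    return "/".join(shared)


def categorize(entry):
    # Naive tokenization: whitespace only, so "unix," would not match "unix".
    words = set(entry.lower().split())
    points = {}
    for path, own in RULES.items():
        if words & own:  # a path is a candidate only if its own rule fires
            points[path] = len(words & effective_keywords(path))
    # Most specific paths: drop any candidate that has a candidate descendant.
    specific = [p for p in points
                if not any(q.startswith(p + "/") for q in points)]
    specific.sort(key=lambda p: (-points[p], p))
    if len(specific) <= MAX_CATEGORIES:
        return specific
    # Too many categories: walk up to the common parent, if there is one.
    parent = common_parent(specific)
    return [parent] if parent else ["Uncategorized/Multicategory"]
```

With these made-up rules, categorize("grep mutt cpu ssd") collapses four leaf categories into their common parent, while an entry that also mentions "election" has no common parent and falls into Uncategorized/Multicategory.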
If you have such an algorithm, tell me and we'll work on it. In summary, it's hard to make something accurate AND FAST. If you'd like to help, you can volunteer to be a Kais Motd superuser and manually categorize and/or set rules.

From 4/18/04-now, 10-12 times a day, on a pure random schedule. So trying to time a highly controversial troll to be archived is futile because it'll most likely be deleted by other people before it's polled. In addition, even if it's polled and archived, other Kais Motd superusers (not me) will most likely delete it.

Unix grep is O, does not normalize data, and has a tremendous I/O bottleneck. On the other hand, MySQL FULLTEXT search is cached in memory, with normalized data, and search is only lg n.

Can I see your source code? I need someone who's not only good at programming but also strong algorithmically. You can't just "hash" entries in MD5 or others because hashes are dependent on a few keywords (for example, deleting just one character changes the entire hash). So to check for duplicates, you compare 1 entry to all others, then another entry to all others, and so on and so forth, and eventually you'll need to make (summation of i where i=0 to n) comparisons, or (n*(n+1))/2 = O(n^2) comparisons! A month of motd (300-400 entries) takes about 1 minute to check for duplicates.

  year   # entries         n^2   pl (minutes)    sec
  1/12         400      160000   1 (tested)      x
  1/2         2500     6250000   70 (tested)     0.1
  1           5000    25000000   218             0.2 (projected)
  2          10000   100000000   873             0.4 (projected)
  3          15000   225000000   1963            0.6 (projected)
  4          20000   400000000   3490            0.8 (projected)
  5          25000   625000000   5453            1.0 (projected)

"# entries" is the estimated number of entries by year 1 thru 5, based on past history from August 2003 to March 2004. Assuming the growth is constant, the number of entries will grow linearly to 25000 by the 5th year. pl will take 5453 minutes, or about 3.8 days, to check for duplicates! Essentially the program will not be able to keep up with the rate of growth.
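
The arithmetic behind the duplicate-check projection can be reproduced with a short Python sketch. The two measured data points (400 entries in about 1 minute, 2500 entries in 70 minutes) and the projected entry counts come from the FAQ. The scaling model is an inference, not something the FAQ states: the projected minutes happen to match what you get by averaging the minutes-per-n^2 rate of the two tested rows, so that is what the code assumes.

```python
# Measured duplicate-check times from the FAQ: (entries, minutes).
TESTED = [(400, 1), (2500, 70)]
# Entry counts the FAQ projects for years 1 through 5.
PROJECTED_ENTRIES = [5000, 10000, 15000, 20000, 25000]


def pairwise_comparisons(n):
    """Sum of i for i = 0..n, i.e. n(n+1)/2, which is O(n^2)."""
    return n * (n + 1) // 2


def minutes_per_unit():
    """Average minutes per n^2 unit over the tested rows (an inferred model)."""
    rates = [minutes / (n * n) for n, minutes in TESTED]
    return sum(rates) / len(rates)


def projected_minutes(n):
    """Estimated duplicate-check time for n entries under the inferred model."""
    return minutes_per_unit() * n * n


for year, n in enumerate(PROJECTED_ENTRIES, start=1):
    minutes = projected_minutes(n)
    print(f"year {year}: {n} entries, about {minutes:.0f} minutes "
          f"({minutes / 1440:.1f} days)")
```

Doubling the number of entries roughly quadruples the checking time, which is why trimming n or tuning the comparison only delays the problem rather than removing it.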
The time required to search will grow from 200 ms by year 1 to 1 whole second by the 5th year, which is still quite acceptable. Now do you see why it's critical to find an algorithm that does this faster? Keep in mind these numbers are based on a P4 quad Linux machine with 4 gigs of RAM. We can find ad hoc solutions like minimizing n, better hashing, and/or others, but it'll essentially still be O(n^2). Finding better/faster/more accurate ways to check for similarity, at least for this project, is a fun challenge. If you're up to the job and would like to help, email me.

What are other things I need to know before checking in the code? I don't want overly special Perl syntax that can really mess up the aesthetics of the code. If you think you're cool and can write super obfuscated Perl code, you'd better not check in the code. My code has a lot of comments, and you should follow the conventions even if you think they suck. "Closest to use" local var declarations, caps for globals (avoid this), long intuitive variable names (use emacs for name completion if you're lazy), and consistent auto indentation. Speaking of indentation, there are only a few white spaces allowed in the source code. No line feed, DOS style carriage returns, and especially **TABS**. If you check in tabs, I will personally give you a 2 hour lecture on why they're bad.

I can't really code well and I can't really help you out, but I'd like to understand your code anyways. pl, you can make minimal changes and maximal damages to the database. I'm not particularly keen on exposing all the loopholes through which people can really pollute the database with minimal work.