Berkeley CSUA MOTD:Entry faq1
2024/11/22 [General] UID:1000 Activity:popular
11/22   

Berkeley Blog FAQ:
==================
What is Berkeley Blog?
\_ Blog is short for WebLog, or weBLOG. A typical blog combines text, 
   images, and links to other web sites and other blogs. Berkeley Blog
   has existed since the 70s on Berkeley's UNIX system, decades before
   the internet became widely available.

What is Berkeley CSUA?
\_ CSUA is the Computer Science Undergraduate Association. It is
   where Berkeley Blog is hosted. CSUA is a student/alumni social
   club located in UC Berkeley's famous Soda Hall, where world-famous
   science professors, researchers, and students work [and
   occasionally live]. The main web site, http://csua.berkeley.edu,
   is run by student volunteers and is sponsored by the engineering
   department. The site you're looking at now, http://csua.com, is
   run by a completely different organization: it is managed and
   privately funded by a small number of Berkeley alumni.

How does the Berkeley Blog work?
\_ It works today the same way it worked in the 70s-- by allowing any
   UNIX user on CSUA to read/write a publicly writable text file. The
   file is normally /etc/motd, which CSUA modified to be /etc/motd.public.
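
   As a rough illustration of what "publicly writable" means here, the
   sketch below checks the permission bit that lets any user write to a
   file, and appends one entry to it. This is not CSUA's own tooling:
   the helper names is_world_writable and append_entry are hypothetical,
   and the path you pass in is up to you.

```python
import os
import stat

def is_world_writable(path):
    """Return True if the 'other users' write bit is set on `path`."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IWOTH)

def append_entry(path, text):
    """Append one entry to the shared file, ending it with a newline.

    Hypothetical helper: real users simply open the file in an editor.
    """
    with open(path, "a", encoding="utf-8") as f:
        f.write(text.rstrip("\n") + "\n")
```

   In practice nobody needs a script for this; the point is only that
   the whole "blog" rests on one file whose mode allows writes by anyone.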
   
What is MOTD?
\_ MOTD stands for Message of the Day. On the Berkeley CSUA machine, any 
   user can write to a publicly writable file called motd (/etc/motd.public).
   Occasionally, 10-15 students and alumni are editing motd on the
   machine, which has at times caused interesting, humorous, and 
   unexpected social interactions among the users. Motd is completely 
   unmoderated, and since modification to it is mostly anonymous, it is
   a popular platform for people to vent, to rant, and to make extremely 
   politically incorrect statements.

   In short: MOTD = Berkeley Blog = 1970s concept of blog

How do I post on Berkeley blog?
\_ First, you need to be an active Berkeley student. Go to 337 Soda Hall
   on the north campus and they'll create an account for you. When you 
   have an account you can use your favorite editor (vi, emacs, pico) to 
   edit /etc/motd.public. Have fun!

Why are motd posts hosted on http://csua.com instead of http://csua.berkeley.edu?
\_ Because the student-run machine is overloaded. In addition, having the
   site managed by experienced alumni ensures higher fault tolerance and 
   some redundancy. The student-run machine has been managed by
   newbie sysadmins who are still trying to learn UNIX, and thus the machine
   has been hacked and defaced over and over again.
   
     One of the core missions of the alumni chapter of CSUA 
   (http://csua.com) is to help students and alumni interact with 
   each other, as has been done for decades. Alumni members contribute 
   to CSUA by holding infosessions, giving technical/professional advice, 
   donating equipment and funds, and more. In most cases they benefit by 
   recruiting talented [and cheap] students fresh from college. 

What are some of your hidden features?
\_ Try lynx -dump 'http://csua.com/?text=1day'
   There are *many* others, most of which require a username/password.
   Go to the main page, all the way to the bottom, then click the
   little dot "." to see more. To gain full access, please make a donation
   to CSUA. Go to the bottom of the page to find out how.
 
How did you get all the data?
\_ Thank mehlhaff and Google. All the entries prior to 1998 are from
   Google's newsgroup archive. Entries from 1998 to February 11, 2004 are
   from mehlhaff's incremental RCS motd,v file. He did an *excellent* job 
   archiving the old entries. I took those files and wrote a script that 
   randomly picked 8 entries from each day and "inserted" (parse, compare, 
   merge, insert) the entries into the database. Each year's worth of entries 
   takes about 1-3 hours to insert. Then it takes several hours to auto 
   fetch additional URL pages, and several hours to auto categorize them.
   Starting from February 11, 2004, I used my own motd extraction program
   to archive the entries.
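
   The "randomly picked 8 entries from each day" step could look roughly
   like the sketch below. It is an illustration only, not the original
   script: the function name sample_entries_per_day, the (date, text)
   pair shape, and the seed parameter are all assumptions.

```python
import random
from collections import defaultdict

def sample_entries_per_day(entries, per_day=8, seed=None):
    """Pick up to `per_day` random entries for each date.

    `entries` is an iterable of (date_string, entry_text) pairs; that
    shape is assumed for this sketch, not taken from the real archive.
    """
    rng = random.Random(seed)
    by_day = defaultdict(list)
    for day, text in entries:
        by_day[day].append(text)

    sampled = {}
    for day, texts in by_day.items():
        # random.sample raises ValueError if k exceeds the population,
        # so cap k at however many entries the day actually has.
        k = min(per_day, len(texts))
        sampled[day] = rng.sample(texts, k)
    return sampled
```

   Sampling without replacement means a day with fewer than 8 entries
   keeps all of them, while a busy day loses everything past the cap.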

How come some date ranges/entries are completely missing?
\_ Read above. Thank mehlhaff.

How come entries seem longer than the other primitive, RCS/CVS motd 
archivers?
\_ Because Berkeley Motd checks for duplication of entries and duplication
   of responses of those entries, then *merges* them. It is not line
   and carriage return sensitive. It is not unix file diff. It looks at
   similarities per paragraph.
     So even if some responses are changed and/or censored, Berkeley Motd 
   does its best merging new responses. Case in point, look at the following
   highly censored post that lasted for 6 days. At any given day the
   response is only 0-35 lines, but 6 days of repeated responses and
   censoring results in over 100 lines:
   http://csua.com/?entry=12295
     Of course there will be entries that are very similar but were
   entered differently (less than 2.5%). If you see them, please email
   one of the Berkeley Motd superusers. Please don't email me.
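
   A per-paragraph merge of this kind can be sketched as follows. This
   is not the actual Berkeley Motd code: it swaps in Python's standard
   difflib.SequenceMatcher as the similarity measure, assumes paragraphs
   are separated by blank lines, and uses a 0.8 threshold picked purely
   for illustration. The names merge_snapshots and split_paragraphs are
   hypothetical.

```python
import difflib
import re

def normalize(paragraph):
    """Collapse all whitespace so line wrapping does not affect matching."""
    return " ".join(paragraph.split())

def split_paragraphs(text):
    """Split on blank lines (an assumption) and drop empty chunks."""
    return [normalize(p) for p in re.split(r"\n\s*\n", text) if p.strip()]

def merge_snapshots(old_text, new_text, threshold=0.8):
    """Merge two snapshots of one entry, paragraph by paragraph.

    A paragraph from the new snapshot is appended only when it is not
    similar enough to any paragraph already kept.
    """
    merged = split_paragraphs(old_text)
    for para in split_paragraphs(new_text):
        is_duplicate = any(
            difflib.SequenceMatcher(None, para, seen).ratio() >= threshold
            for seen in merged
        )
        if not is_duplicate:
            merged.append(para)
    return merged
```

   Because nothing is ever dropped from the merged list, a response that
   was later deleted or censored survives, which is why merged entries
   grow longer than any single day's snapshot.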

How come some entries are missing?
\_ Read above. Polling is done on a purely random schedule.
   PURELY RANDOM schedule. So if your entry is missing, it got deleted
   before it was lucky enough to be polled and saved.
   If you want a **comprehensive** motd, there are other RCSed entries 
   (motd,v) you can look at.
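
   To see why random polling loses short-lived entries, here is a small
   sketch. It is not the real poller: the exponential gap distribution,
   the time units, and the names random_poll_times and was_captured are
   assumptions made for the example.

```python
import random

def random_poll_times(duration, mean_gap, seed=None):
    """Return increasing poll times in [0, duration) with random gaps.

    Gaps are drawn from an exponential distribution with mean
    `mean_gap`; that choice is an assumption for this sketch.
    """
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(1.0 / mean_gap)
    while t < duration:
        times.append(t)
        t += rng.expovariate(1.0 / mean_gap)
    return times

def was_captured(posted_at, deleted_at, poll_times):
    """An entry is saved only if some poll lands while it still exists."""
    return any(posted_at <= t < deleted_at for t in poll_times)
```

   An entry that lives for much less than the average gap between polls
   will usually fall between two polls and never reach the database.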

Do you edit/modify entries?
\_ If they exist, then no modifications were made. If they're 
   censored, then other Berkeley Motd superusers did it. I don't have 
   a lot of time to read most of these entries. I pretty much read 
   all the technical entries in Computers/SW/Languages, 
   Computers/HW/CPU, etc, while skipping most of the other trolls.

How did you categorize the entries?
\_ I borrowed Google's Pigeon Technology(tm). It is an amazing technology!
   As a Berkeley Motd user, you're familiar with the speed and accuracy of a 
   Berkeley Motd search. How exactly does it manage to find the right results 
   for every query as quickly as it does? The heart of the search technology 
   is PigeonCategorize, a system for ranking entries developed by Google 
   founders Larry Page and Sergey Brin.   
   

   Building upon the breakthrough work, low cost pigeon clusters (PCs) could 
   be used to compute the relative value of web pages faster than human 
   editors or machine-based algorithms. And while Berkeley Motd has only 1
   brilliant Computer Scientist working to improve every aspect of Berkeley Motd, 
   PigeonCategorize continues to provide the basis for all of the web search 
   tools 24x7x365.
    
    

Couldn't I break the system by totally messing up the motd entries?
\_ You sure could. Please do try, preferably using a motd munging
   script, cronned and fork-bombed, that uses a lot of cpu time.

What other features are you adding?
\_ -Letting other Berkeley Motd superusers add/change categories and entries.
   -Letting other Berkeley Motd superusers add/change categorization rules.
   -Adaptive algorithm for categorization
   -Faster and/or more accurate entry duplication finding algorithm.

Do you have suggestions to active CSUA members who post on motd?
\_ Yes, it's stupid to post stuff on /csua/pub/jobs/*. It can't be
   easily accessed from Berkeley Motd. Be a good boy/girl and post a 
   URL instead.

Your web site has been around for a while, how come I never knew about it?
\_ Because you're not 3733+ enuf.