Berkeley CSUA MOTD:Entry faq1
2024/11/22 [General] UID:1000 Activity:popular
Berkeley Blog FAQ:
==================
What is Berkeley Blog?
\_ Blog is short for WebLog, or weBLOG. A typical blog combines text,
images, and links to other web sites and other blogs. Berkeley Blog
has existed since the 70s on Berkeley's UNIX system, decades before
the internet even became widely available.
What is Berkeley CSUA?
\_ CSUA is the Computer Science Undergraduate Association. It is
where Berkeley Blog is hosted. CSUA is a student/alumni social
club located in UC Berkeley's famous Soda Hall, where world-famous
science professors, researchers, and students work [and
occasionally live]. The main web site, http://csua.berkeley.edu,
is run by student volunteers and sponsored by the engineering
department. The site you're looking at now, http://csua.com, belongs to
a completely different organization that is alumni-run and is
managed and privately funded by a small number of Berkeley alumni.
How does the Berkeley Blog work?
\_ It works today the same way it worked in the 70s-- by allowing any
UNIX user on CSUA to read/write a publicly writable text file. The
file is normally /etc/motd, which CSUA modified to be /etc/motd.public.
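The mechanism above can be sketched in a few lines of Python. This is only an
illustration: the helper names (is_world_writable, append_reply), the
indentation of continuation lines, and the use of an advisory flock are all
assumptions made here, not something CSUA is known to do. In practice people
simply open the file in an editor.

```python
import fcntl
import os
import stat


def is_world_writable(path):
    """Return True if the 'other' write permission bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IWOTH)


def append_reply(path, text):
    """Append text to path as a motd-style reply (first line prefixed with \\_ )."""
    lines = text.strip().splitlines()
    if not lines:
        return  # nothing to post
    reply = "\\_ " + lines[0] + "\n"
    # Continuation lines are indented under the first line (assumed layout).
    reply += "".join("   " + line.strip() + "\n" for line in lines[1:])
    with open(path, "a") as f:
        # Advisory lock: it only coordinates writers that also call flock,
        # so someone editing the file in vi is not held back by it.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            f.write(reply)
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Try it against a scratch file rather than the real motd, e.g.
append_reply("/tmp/motd.test", "hello world").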
What is MOTD?
\_ MOTD stands for Message of the Day. On the Berkeley CSUA machine, any
user can write to a publicly writable file called motd (/etc/motd.public).
Occasionally, 10-15 students and alumni are editing motd
on the machine, which has at times caused interesting, humorous, and
unexpected social interactions among the users. Motd is completely
unmoderated, and since modification to it is mostly anonymous, it is
a popular platform for people to vent, to rant, and to make extremely
politically incorrect statements.
In short: MOTD = Berkeley Blog = 1970s concept of blog
How do I post on Berkeley Blog?
\_ First, you need to be an active Berkeley student. Go to 337 Soda Hall
on the north campus and they'll create an account for you. When you
have an account you can use your favorite editor (vi, emacs, pico) to
edit /etc/motd.public. Have fun!
Why are motd posts hosted on http://csua.com instead of http://csua.berkeley.edu?
\_ Because the student-run machine is overloaded. In addition, having the
site managed by experienced alumni ensures higher fault tolerance and
some redundancy. The student-run machine has been managed by
newbie sysadmins who are still trying to learn UNIX, and thus the machine
has been hacked and defaced over and over again.
One of the core missions of the alumni chapter of CSUA
(http://csua.com) is to help students and alumni interact with
each other, as they have done for decades. Alumni members contribute
to CSUA by holding infosessions, giving technical/professional advice,
donating equipment/funds, and more. In most cases they benefit by
recruiting talented [and cheap] students fresh from college.
What are some of your hidden features?
\_ Try lynx -dump 'http://csua.com/?text=1day'
There are *many* others; most of them require a username/password.
Go to the main page, scroll all the way to the bottom, then click the
little dot "." to see more. To gain full access, please make a donation
to CSUA. Go to the bottom of the page to find out how.
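If you would rather script it than use lynx, here is a rough Python sketch.
The only query parameters it relies on are the ones that appear in this FAQ
(text=1day and entry=NNN). The function names motd_url and fetch are made up
for illustration, and fetch returns the raw response rather than rendered
text the way lynx -dump does.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://csua.com/"


def motd_url(**params):
    """Build a query URL such as http://csua.com/?text=1day."""
    return BASE_URL + "?" + urlencode(params)


def fetch(url, timeout=10):
    """Return the raw response body as text (no HTML rendering)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

For example, fetch(motd_url(text="1day")) requests the same URL as the lynx
command above.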
How did you get all the data?
\_ Thank mehlhaff and Google. All the entries prior to 1998 are from
Google's newsgroup archive. Entries from 1998 to February 11, 2004 are
from mehlhaff's incremental RCS motd,v file. He did an *excellent* job
archiving the old entries. I took those files and wrote a script that
randomly picked 8 entries from each day and "inserted" (parse, compare,
merge, insert) the entries into the database. Each year's worth of entries
takes about 1-3 hours to insert. Then it takes several hours to auto
fetch additional URL pages, and several hours to auto categorize them.
Starting from February 11, 2004, I have used my own motd extraction program
to archive the entries.
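The "randomly picked 8 entries from each day" step might look roughly like
the Python below. This is a sketch, not the actual import script: the
(day, text) input format, the function name sample_per_day, and the optional
seed are assumptions, and the real parse/compare/merge/insert steps are not
shown.

```python
import random
from collections import defaultdict


def sample_per_day(entries, per_day=8, seed=None):
    """Group (day, text) pairs by day and keep at most per_day random texts each."""
    rng = random.Random(seed)
    by_day = defaultdict(list)
    for day, text in entries:
        by_day[day].append(text)
    # Days with fewer than per_day entries are kept whole.
    return {
        day: rng.sample(texts, min(per_day, len(texts)))
        for day, texts in by_day.items()
    }
```

Sampling this way is also why the archive cannot be complete: anything not
drawn for a given day never reaches the database.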
How come some date ranges/entries are completely missing?
\_ Read above. Thank mehlhaff.
How come entries seem longer than in the other, more primitive RCS/CVS motd
archivers?
\_ Because Berkeley Motd checks for duplication of entries and duplication
of responses to those entries, then *merges* them. It is not sensitive
to line breaks or carriage returns. It is not a UNIX file diff. It looks at
similarities per paragraph.
So even if some responses are changed and/or censored, Berkeley Motd
does its best to merge new responses. Case in point: look at the following
highly censored post that lasted for 6 days. On any given day the
response is only 0-35 lines, but 6 days of repeated responses and
censoring result in over 100 lines:
http://csua.com/?entry=12295
Of course there will be entries that are very similar but were
entered differently (less than 2.5%). If you see them, please email
one of the Berkeley Motd superusers. Please don't email me.
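To make the per-paragraph merge idea concrete, here is a minimal Python
sketch. It is not the actual Berkeley Motd algorithm: it swaps in difflib's
SequenceMatcher as the similarity measure, splits paragraphs on blank lines,
and uses an arbitrary 0.8 threshold. All three are assumptions chosen only
for illustration, as are the function names.

```python
import difflib


def normalize(paragraph):
    """Collapse whitespace and case so line breaks and indentation do not matter."""
    return " ".join(paragraph.split()).lower()


def split_paragraphs(text):
    """Split on blank lines and drop empty chunks (assumed paragraph boundary)."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def merge_snapshots(old, new, threshold=0.8):
    """Merge two snapshots of one entry, paragraph by paragraph.

    Paragraphs from old are always kept, so a response that was censored in
    new survives. A paragraph from new is added only if it is not a
    near-duplicate of one already present.
    """
    merged = split_paragraphs(old)
    for para in split_paragraphs(new):
        norm = normalize(para)
        best_index, best_score = None, 0.0
        for i, existing in enumerate(merged):
            matcher = difflib.SequenceMatcher(
                None, normalize(existing), norm, autojunk=False
            )
            score = matcher.ratio()
            if score > best_score:
                best_index, best_score = i, score
        if best_index is not None and best_score >= threshold:
            # Near-duplicate: keep whichever version carries more text.
            if len(norm) > len(normalize(merged[best_index])):
                merged[best_index] = para
        else:
            merged.append(para)
    return "\n\n".join(merged)
```

Folding each day's snapshot into the running result this way keeps every
response that ever appeared, which is how a post that never exceeds a few
dozen lines on any one day can end up much longer in the archive.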
How come some entries are missing?
\_ Read above. Polling is done on a pure random schedule.
PURE RANDOM schedule. So if your entry got deleted, it wasn't
lucky enough to be polled and saved at the right time.
If you want a **comprehensive** motd, there are other RCSed entries
(motd,v) you can look at.
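Why random polling loses entries can be shown with a toy model in Python.
The exponential gaps between polls, the mean_gap_s parameter, and both
function names are assumptions for illustration. The FAQ only says the
schedule is random.

```python
import random


def poll_times(duration_s, mean_gap_s, seed=None):
    """Return sorted random poll times in [0, duration_s) with exponential gaps."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while True:
        t += rng.expovariate(1.0 / mean_gap_s)
        if t >= duration_s:
            return times
        times.append(t)


def was_captured(posted_at, deleted_at, polls):
    """An entry is saved only if some poll lands while it is on the motd."""
    return any(posted_at <= p < deleted_at for p in polls)
```

With polls at 10 and 50, an entry that lived from 20 to 40 is never seen,
while one that lived from 5 to 15 is saved. The shorter an entry survives on
the motd, the more likely it falls between two polls.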
Do you edit/modify entries?
\_ If they exist, then no modifications were made. If they're
censored, then other Berkeley Motd superusers did it. I don't have
a lot of time to read most of these entries. I pretty much read
all the technical entries in Computers/SW/Languages,
Computers/HW/CPU, etc, while skipping most of the other trolls.
How did you categorize the entries?
\_ I borrowed Google's Pigeon Technology(tm). It is an amazing technology!
As a Berkeley Motd user, you're familiar with the speed and accuracy of a
Berkeley Motd search. How exactly does it manage to find the right results
for every query as quickly as it does? The heart of the search technology
is PigeonCategorize, a system for ranking entries developed by Google
founders Larry Page and Sergey Brin.
Building upon the breakthrough work, low cost pigeon clusters (PCs) could
be used to compute the relative value of web pages faster than human
editors or machine-based algorithms. And while Berkeley Motd has only 1
brilliant Computer Scientist working to improve every aspect of Berkeley Motd,
PigeonCategorize continues to provide the basis for all of the web search
tools 24x7x365.
Couldn't I break the system by totally messing up the motd entries?
\_ You sure could. Please do try, preferably using a motd munging
script, cron'ed and fork-bombed, that uses a lot of cpu time.
What other features are you adding?
\_ -Letting other Berkeley Motd superusers add/change categories and entries.
-Letting other Berkeley Motd superusers add/change categorization rules.
-Adaptive algorithm for categorization
-Faster and/or more accurate entry duplication finding algorithm.
Do you have suggestions to active CSUA members who post on motd?
\_ Yes, it's stupid to post stuff on /csua/pub/jobs/*. It can't be
easily accessed from Berkeley Motd. Be a good boy/girl and post a
URL instead.
Your web site has been around for a while, how come I never knew about it?
\_ Because you're not 3733+ enuf.