Berkeley CSUA MOTD:Entry 43976
2025/05/24 [General] UID:1000 Activity:popular
5/24    

2006/8/11-14 [Computer/SW/Editors/Emacs] UID:43976 Activity:nil
8/11    How do I open a unicode file containing English text in Emacs 21.3.1
        such that I won't see tons of "^@" characters?  Thanks.
        \_ The emacs portion of the Unicode-HOWTO may be useful:
           http://www.ibiblio.org/pub/Linux/docs/HOWTO/Unicode-HOWTO
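The "^@" characters are how Emacs shows NUL bytes. A likely cause, assuming the file was saved as "Unicode" by a Windows tool, is that it is UTF-16: every English letter then takes two bytes, one of which is NUL. A minimal Python sketch of converting such data to UTF-8 before opening it (the function name and sample bytes are illustrative, not taken from the thread):

```python
import codecs


def utf16_to_utf8(raw: bytes) -> bytes:
    """Decode UTF-16 bytes and re-encode the text as UTF-8."""
    # The "utf-16" codec consumes a leading byte-order mark if there is one;
    # without a BOM it assumes the platform's native byte order.
    text = raw.decode("utf-16")
    return text.encode("utf-8")


# Assumed sample: "Hello" as UTF-16LE with a byte-order mark.
sample = codecs.BOM_UTF16_LE + "Hello".encode("utf-16-le")

print(sample)                 # every second byte is NUL, which Emacs shows as ^@
print(utf16_to_utf8(sample))  # plain ASCII-compatible bytes
```

If the file turns out not to be UTF-16, the decode step will fail or produce garbage, so check the first two bytes for a byte-order mark first.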

You may also be interested in these entries...
2013/2/19-3/26 [Computer/SW/OS/OsX] UID:54611 Activity:nil
2/19    I program a lot by sshing to a Linux cluster.  So I'm used to using
        Xemacs to code.  This works fine from a Linux or Windows workstation,
        but sometimes I have to use a Mac.  On Mac, the meta is usually
        bound to option, but that often doesn't work over ssh for some reason.
        This makes using emacs a real pain.  Any suggestions on how to fix it?
        (Other than "use vi")
	...
2007/11/9-12 [Computer/SW/Editors/Emacs] UID:48590 Activity:nil 75%like:48591
11/9    What's the best Xemacs font for coding?
        \- the Entrella Font
          \_ no no, the ETANRETLA font!
	...
2007/11/9-11 [Computer/SW/Editors/Emacs] UID:48591 Activity:nil 75%like:48590
11/9    What's the best standard Xemacs font?
	...
2006/3/17-18 [Computer/SW/Editors/Vi, Computer/SW/Editors/Emacs] UID:42280 Activity:moderate
3/16    I'm looking to troll the motd. What are some good topics guaranteed
        to get results?
        \_ sex!
        \_ Easy.  Anything of interest to more than 1 person that isn't
           resolvable.  Next!
        \_ http://www.trollwars.com  -John
	...
2005/10/27-28 [Computer/SW/Editors, Computer/SW/Editors/Emacs] UID:40292 Activity:low
10/27   My new job is completely Windoze based...anyone know of a windows
        version of TextWrangler?  I'm looking for a free text editor with
        extended capabilities.  Thanks.  -scottyg
        \_ I'm completely in love with Crimson Editor.
         \_Thanks, I downloaded SciTE and it seems to rock.  I've heard good
           things about Crimson Editor though. -scottyg
	...
2005/3/5-8 [Computer/SW/Editors/Emacs, Computer/Networking] UID:36537 Activity:kinda low
3/5     I have a problem with X. Let's say I open up xemacs. If I don't do
        anything to it after 15 minutes, connection would die and I'd have
        to restart it. How do I make it more persistent? ok thx
        \_ you're connecting through a NAT gateway, aren't you... and X11
           forwarding through ssh?  Turn on connection keep-alives
           \_ NAT yes, X11 forward no (raw forward). Where is the option?
	...
2004/6/1-2 [Computer/SW/Editors/Emacs] UID:30524 Activity:low
6/1     Argh!  I accidentally hit double-bucky-kill-XEmacs.  Is there some
        magical incantation to make it open up the 20-some-odd files I was
        editing?  (I don't need to recover lost data).
        \_ How about this? Find the last .saves-xxxxx file in your home dir,
           run "grep -v '#' ~/.saves-xxxxx", then somehow pass the resulting
           lines as arguments to the elisp function (find-file ......).
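The "somehow" step above can be sketched in Python. This assumes the .saves file lists each visited file on its own line, followed by a line with the matching #autosave# name, which is what the grep suggestion implies; the function name and sample text are hypothetical.

```python
from typing import List


def find_file_forms(saves_text: str) -> List[str]:
    """Turn the non-autosave lines of a .saves file into (find-file ...) forms."""
    forms = []
    for line in saves_text.splitlines():
        line = line.strip()
        # Same filter as grep -v '#': drop blank lines and autosave names.
        if not line or "#" in line:
            continue
        # Escape backslashes and double quotes for an elisp string literal.
        escaped = line.replace("\\", "\\\\").replace('"', '\\"')
        forms.append(f'(find-file "{escaped}")')
    return forms


# Assumed sample contents of a ~/.saves-xxxxx file.
sample = "/home/u/a.c\n/home/u/#a.c#\n\n/home/u/b.txt\n/home/u/#b.txt#\n"
print("\n".join(find_file_forms(sample)))
```

The printed forms can be pasted into the *scratch* buffer and evaluated. Like the grep filter, this also drops any real file whose name contains "#".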
	...
2004/4/29 [Computer/SW/Editors/Emacs, Computer/Networking] UID:13470 Activity:nil
4/29    Does anyone know why ^K (delete line) works so slowly in xemacs
        over X-forwarding?  It takes like, 5 seconds a line over my DSL
        connection.  What's the deal?
        \_ It needs to re-transmit the whole screen so as to redraw?
	...
2004/3/17-18 [Computer/SW/Languages/Java, Academia/Berkeley/CSUA/Motd] UID:12730 Activity:nil
3/17    How can I change the indentation level in xemacs when writing C
        and Java?  I looked online, and through the help files but
        couldn't find anything.  (It's currently set at 4, I want 2)
                -jrleek
        \_ I thought the motd censor liked technical posts.  Why hasn't
           he answered this one?
	...
2003/12/30-31 [Computer/SW/Editors/Emacs] UID:11619 Activity:nil
12/30   Does anyone know what happened to Windows port of Emacs 21.x?
        I can only find downloads for 20.7 now on gnu.
        \_ look harder: http://ftp.gnu.org/gnu/windows/emacs/21.3
           \_ There are only README files in there.
              \_ tried reading any of them? you might have noticed that
                 http://ftp.gnu.org was hacked a while back, and they haven't
	...
2003/11/28 [Computer/SW/Unix] UID:11258 Activity:nil
11/27   I am trying to install custom editing modes on a default solaris xemacs
        installation on which I have no root access.  Using the menu doesn't
        help, as it tries to install packages at a location I can't write to.
        How do I instruct xemacs to use an alternate path to install additional
        packages?
        \_ (setq load-path (cons "~/path/to/emacs/pkg" load-path))
	...
2003/8/20-21 [Computer/SW/Editors/Emacs] UID:29413 Activity:high
8/20    Georgy (she has a CSUA acct) got /.'d...
        \_ she was also on the front page of USA Today, but of course that's
           nothing to slashdot.
        \_ she has my view on text editors:
        I'm so glad you asked!!
        Both. vi for quick editing, emacs (NOT xemacs) for coding projects.
	...
Cache (8192 bytes)
www.ibiblio.org/pub/Linux/docs/HOWTO/Unicode-HOWTO
People in different countries use different characters to represent the words of their native languages. Nowadays most applications, including email systems and web browsers, are 8-bit clean, i.e. they can operate on and display text correctly provided that it is represented in an 8-bit character set, like ISO-8859-1. There are far more than 256 characters in the world (think of Cyrillic, Hebrew, Arabic, Chinese, Japanese, Korean and Thai), and new characters are being invented now and then. With 8-bit character sets, it is impossible to store text with characters from different character sets in the same document. For example, I can cite Russian papers in a German or French publication if I use TeX, xdvi and PostScript, but I cannot do it in plain text. As long as every document has its own character set, and recognition of the character set is not automatic, manual user intervention is inevitable. cn/ I had to tell Netscape that the web page is coded in GB2312.
ISO has issued a new standard ISO-8859-15, which is mostly like ISO-8859-1 except that it removes some rarely used characters (among them the old currency sign, which is replaced with the Euro sign). If users adopt this standard, they have documents in different character sets on their disk, and they start having to think about it daily. But computers should make things simpler, not more complicated. The solution of this problem is the adoption of a world-wide usable character set. The use of 1 byte to represent 1 character is, however, an accident of history, caused by the fact that computer development started in Europe and the US where 96 characters were found to be sufficient for a long time.
There are basically four ways to encode Unicode characters in bytes:
UTF-8: 128 characters are encoded using 1 byte (the ASCII characters). 1920 characters are encoded using 2 bytes and 63488 characters using 3 bytes. The other 2147418112 characters (not assigned yet) can be encoded using 4, 5 or 6 bytes.
UCS-2: This encoding can only represent the first 65536 Unicode characters.
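The UTF-8 length rule can be checked with a few lines of Python. This is only an illustrative sketch; the helper name and the sample code points are my own choices. Note that the HOWTO describes the original UTF-8 definition with sequences of up to 6 bytes, whereas current UTF-8 (RFC 3629) stops at U+10FFFF and never needs more than 4 bytes.

```python
def utf8_length(code_point: int) -> int:
    """Number of bytes UTF-8 uses to encode one Unicode code point."""
    return len(chr(code_point).encode("utf-8"))


# ASCII letter, Latin-1 letter, Cyrillic letter, CJK ideograph, emoji.
for cp in (0x41, 0xE9, 0x416, 0x4E2D, 0x1F600):
    print(f"U+{cp:04X} -> {utf8_length(cp)} byte(s)")
```

The boundaries fall at U+007F (last 1-byte character), U+07FF (last 2-byte character) and U+FFFF (last 3-byte character).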
UTF-16: This is an extension of UCS-2 which can represent 1112064 Unicode characters. The first 65536 Unicode characters are represented as two bytes, the other ones as four bytes.
The space requirements for encoding a text, compared to encodings currently in use (8 bit per character for European languages, more for Chinese/Japanese/Korean), are as follows. This has an influence on disk storage space and network download speed (when no form of compression is used).
UTF-8: No change for US ASCII, just a few percent more for ISO-8859-1, 50% more for Chinese/Japanese/Korean, 100% more for Greek and Cyrillic.
Given the penalty for US and European documents caused by UCS-2, UTF-16, and UCS-4, it seems unlikely that these encodings have a potential for wide-scale use. The Microsoft Win32 API has supported the UCS-2 encoding since 1995 (at least), yet this encoding has not been widely adopted for documents - SJIS remains prevalent in Japan. UTF-8 on the other hand has the potential for wide-scale use, since it doesn't penalize US and European users, and since many text processing programs don't need to be changed for UTF-8 support. In the following, we will describe how to change your Linux system so it uses UTF-8 as text encoding.
The problem with it is that you end up with two versions of your program: one which understands UCS-2 text but no 8-bit encodings, and one which understands only old 8-bit encodings. Moreover, there is an endianness issue with UCS-2 and UCS-4. edu/in-notes/iana/assignments/character-sets says about ISO-10646-UCS-2: "this needs to specify network byte order: the standard does not specify". And RFC 2152 is even clearer: "ISO/IEC 10646-1:1993 specifies that when characters the UCS-2 form are serialized as octets, that the most significant octet appear first."
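Both the space figures and the byte-order issue can be illustrated with Python's standard codecs. This is a sketch under my own choice of sample strings and legacy encodings (ISO-8859-7 for Greek, GB2312 for Chinese); the helper name is hypothetical.

```python
import codecs


def size_ratio(text: str, legacy: str) -> float:
    """Size of text in UTF-8 divided by its size in a legacy encoding."""
    return len(text.encode("utf-8")) / len(text.encode(legacy))


print(size_ratio("hello", "ascii"))                 # US ASCII: no change
print(size_ratio("\u03b1\u03b2\u03b3", "iso8859_7"))  # Greek: twice the size
print(size_ratio("\u4e2d\u6587", "gb2312"))           # Chinese: 3 bytes instead of 2

# The same character in the two 16-bit byte orders.
big = "A".encode("utf-16-be")     # most significant octet first
little = "A".encode("utf-16-le")  # least significant octet first
print(big, little)

# A byte-order mark lets a decoder pick the right order by itself.
print((codecs.BOM_UTF16_BE + big).decode("utf-16"))
print((codecs.BOM_UTF16_LE + little).decode("utf-16"))
```

UTF-8 has no such ambiguity because it is defined as a sequence of single bytes.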
Microsoft, by contrast, recommends in its C/C++ development tools using machine-dependent endianness (i.e. little endian on ix86 processors) and either a byte-order mark at the beginning of the document, or some statistical heuristics. The UTF-8 approach on the other hand keeps `char*' as the standard C string type. As a result, your program will handle US ASCII text, independently of any environment variables, and will handle both ISO-8859-1 and UTF-8 encoded text provided the LANG environment variable is set accordingly. html
2 Display setup
We assume you have already adapted your Linux console and X11 configuration to your keyboard and locale. This is explained in the Danish/International HOWTO, and in the other national HOWTOs: Finnish, French, German, Italian, Polish, Slovenian, Spanish, Cyrillic, Hebrew, Chinese, Thai, Esperanto. Doing so will only cause problems when you switch to Unicode. When you call `unicode_start', the console's screen output is interpreted as UTF-8. Also, the keyboard is put into Unicode mode (see "man kbd_mode"). You will want to display characters from different scripts on the same screen. psf) which covers Latin, Cyrillic, Hebrew, Arabic scripts. It covers ISO 8859 parts 1,2,3,4,5,6,8,9,10 all at once. To work around the constraint that a VGA font can only cover 512 characters simultaneously, he provides a rich Unicode font (2279 characters, covering Latin, Greek, Cyrillic, Hebrew, Armenian, IPA, math symbols, arrows, and more) in the typical 8x16 size and a script which permits extracting any 512 characters as a console font. diff from Edmund Thomas Grimley Evans and Stanislav Voronyi. org> has implemented a UTF-8 console terminal emulator. It uses Unicode fonts and relies on the Linux frame buffer device. Even if they are not Unicode fonts, they will help in displaying Unicode documents: at least Netscape Communicator 4 and Java will make use of foreign fonts when available.
The following programs are useful when installing fonts:
. "mkfontdir directory" prepares a font directory for use by the X server, needs to be executed after installing fonts in a directory.
. "xset -q | sed -e '1,/^Font Path:/d' | sed -e '2,$d' -e 's/^ //'" displays the X server's current font path.
. "xset fp+ directory" adds a directory to the X server's current font path. To add a directory permanently, add a "FontPath" line to your /etc/XF86Config file, in section "Files".
. "xset fp rehash" needs to be executed after calling mkfontdir on a directory that is already contained in the X server's current font path.
. "xfontsel" allows you to browse the installed fonts by selecting various font properties.
. "xlsfonts -fn fontpattern" lists all fonts matching a font pattern. In particular, "xlsfonts -ll -fn font" lists the font properties CHARSET_REGISTRY and CHARSET_ENCODING, which together determine the font's encoding.
The following fonts are freely available (not a complete list):
. The ones contained in XFree86, sometimes packaged in separate packages. For example, SuSE has only normal 75dpi fonts in the base `xf86' package. The other fonts are in the packages `xfnt100', `xfntbig', `xfntcyr', `xfntscl'. gz As already mentioned, they are useful even if you prefer XEmacs to GNU Emacs or don't use any Emacs at all. However, this approach is more complicated, because instead of working with `Font' and `XFontStruct', the programmer has to deal with `XFontSet', and also because not all fonts in the font set need to have the same dimensions.
. Markus Kuhn has assembled fixed-width 75dpi fonts with Unicode encoding covering Latin, Greek, Cyrillic, Armenian, Georgian, Hebrew scripts and many symbols. They cover ISO 8859 parts 1,2,3,4,5,7,8,9,10,13,14,15,16 all at once. These fonts are required for running xterm in utf-8 mode.
. Markus Kuhn has also assembled double-width fixed 75dpi fonts with Unicode encoding covering Chinese, Japanese and Korean.
. Roman Czyborra has assembled an 8x16 / 16x16 75dpi font with Unicode encoding covering a huge part of Unicode. It is not fixed-width: 8 pixels wide for European characters, 16 pixels wide for Chinese characters.
gz /usr/X11R6/lib/X11/fonts/misc
# cd /usr/X11R6/lib/X11/fonts/misc
# mkfontdir
# xset fp rehash
. Primoz Peterlin has assembled an ETL font family covering Latin, Greek, Cyrillic, Armenian, Georgian, Hebrew scripts.
. Mark Leisher has assembled a proportional, 17 pixel high (12 point), font, called ClearlyU, covering Latin, Greek...
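For readers who find the "xset -q | sed ..." pipeline from the list of font programs hard to read, here is a rough Python equivalent. It assumes, as the sed expression does, that the font path is printed on the line right after a "Font Path:" heading; the function name and the abbreviated sample output are hypothetical.

```python
def font_path_from_xset(output: str) -> str:
    """Return the line that follows the 'Font Path:' heading in xset -q output."""
    lines = output.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("Font Path:"):
            # The path list sits on the next line; unlike the sed version,
            # this strips all leading whitespace, not just one space.
            return lines[i + 1].strip() if i + 1 < len(lines) else ""
    return ""


# Assumed, heavily shortened sample of what "xset -q" might print.
sample = (
    "Keyboard Control:\n"
    "  auto repeat:  on\n"
    "Font Path:\n"
    "  /usr/X11R6/lib/X11/fonts/misc,/usr/X11R6/lib/X11/fonts/75dpi\n"
    "Bug Mode: compatibility mode is disabled\n"
)
print(font_path_from_xset(sample).split(","))
```

On a live system the text could come from subprocess.run(["xset", "-q"], capture_output=True, text=True).stdout instead of the sample string.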