EMACS 18.59 by Thomas Bellman

Originally Howard Gayle wrote a set of patches for GNU Emacs
18.55 for displaying, sorting and converting 8-bit characters.
However, Emacs 18.55 contains some bugs, and I wanted to apply
the patches to 18.57.  This wasn't straightforward, since there
were lots of internal differences between 18.55 and 18.57.
However, I finally succeeded in applying them.  At least I
thought I had...
When Emacs 18.58 was released soon after, I applied my patches to
that version.  I soon found out that I had a bug, causing Emacs
to sporadically, but repeatably, abort and dump core.  After many
months, Linus Tolke (Linus@Lysator.LiU.Se) got tired of this, and
found the bug.  Now the time has come for Emacs 18.59!

Below follows what Howard Gayle has written about his patches for
Emacs 18.55.  You should read that before installing these
patches.  Just substitute 18.59 when he speaks about 18.55.  If
you have only the diffs, then you need to recompile all the .elc
files in the lisp directory, but if you have the entire patched
18.59, I have recompiled them for you (using a newer and
presumably better byte-compiler).  I have also included an extra
elisp file, iso-chars.el, for displaying ISO 8859-1 characters,
since I wanted an alternative to Howard Gayle's variants.  Sorry,
it is not documented, but it exists...

Included in this version is a patch by Niclas Wiberg
(nicwi@isy.liu.se) that allows insertion of eight-bit characters
under X-windows, while retaining the use of the Meta key.  It
works by inserting a C-q before any character with the high bit
set.  Thus it is not quite as general as I would like, but since
some people liked it, and it wasn't any worse than before, I
include it.
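
As a rough illustration of the idea (a Python sketch, not the
code from the patch; the name quote_high_bit is made up here),
every byte with the high bit set gets a C-q (octal 021) in front
of it, so that it is inserted literally instead of being taken
as a Meta key:

```python
# Hypothetical sketch of the C-q quoting idea; not taken from the patch.
C_Q = 0x11  # the control character C-q (quoted-insert)


def quote_high_bit(keys: bytes) -> bytes:
    """Return keys with C-q inserted before every eight-bit character."""
    out = bytearray()
    for b in keys:
        if b & 0x80:         # high bit set: an eight-bit character
            out.append(C_Q)  # quote it so it is inserted literally
        out.append(b)
    return bytes(out)


# o with umlaut is 0xF6 in ISO 8859-1
print(quote_high_bit(b"sm\xf6r"))  # prints b'sm\x11\xf6r'
```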

Install the patches by copying all files into the respective
directories in the Emacs distribution.  Then apply the diffs in
the file 'DIFFS'.

You will also have to rebuild the info files emacs* in the info
directory, if you only have the diffs.

There is one known incompatibility with the original Emacs.  The
standard Emacs interprets the regexp "[abc-]" as being equal to
"[abc---]", while Howard Gayle's patches make Emacs give the
error "Invalid regexp: Premature end of regular expression" when
seeing such a regexp.  I know of only one elisp package that uses
a regexp of this form, and that is supersite.el.  If you are
using supersite, then the easy way out is to change that (single)
regexp.

If you find any bugs when using these routines, or if you find
any bugs not in the standard 18.59, I would like to know about
them.  I might take the time to try to fix them, but I make no
promises.

Share and enjoy!

--
Thomas Bellman, Lysator Academic Computer Club
University of Linkoping, Sweden
Bellman@Lysator.LiU.Se


------------------------------------------------------------------------


      SUMMARY

I have modified GNU Emacs version 18.55 to handle many 8-bit
character sets, including the ISO 8859 character sets.  For each
character, it is possible to customize the byte(s) sent to the
terminal to display that character.  X11R4 is also supported, to
an extent.  Case determination, case changing, and sorting can
all be customized.  Input facilities are primitive.


      DISCOURAGEMENT

Emacs version 19 will support 8-bit character sets.  That
support is based on my modifications, but there will probably be
some differences between the 8-bit character set support in this
modified version 18.55 and the support in version 19.
Therefore, if you can wait for version 19 I urge you to do so.
Richard Stallman says he does not know when version 19 will be
available.

This is alpha-test software.  It has known bugs.  I'm posting it
to alt.sources to emphasize that, and to avoid having it
archived.  If you don't know your way around GNU Emacs, please
don't try to install it.  I don't have time to provide support.
(But please send bug reports anyway.)

Input support is primitive.  X windows support is for X11 only,
and is incomplete.


      CHARACTER SETS SUPPORTED

So I haven't scared you off yet.  OK, you were warned.  My
modifications allow GNU Emacs to handle any character set
provided that each character is represented by exactly one 8-bit
byte, and the codes for space, newline, and horizontal tab are
the same as in ASCII.  Now for some definitions.


      DEFINITIONS

A glyf is something that takes up exactly one position on the
display of a terminal, terminal emulator, or window system.  For
example, 'a' is a glyf, as is a yellow, blinking, underlined '7'
on a red background.  It may be necessary to transmit many bytes
to a terminal to display one glyf.  A rope is a sequence of
glyfs.  (The name is an analogy to string, which is a sequence
of characters.) For example, the glyf '^' followed by the glyf
'C' forms a rope of length 2.  Glyfs are represented as unsigned
16-bit integers.  Ropes are represented as vectors of glyfs.
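
The two definitions can be modelled in a few lines.  This is
only a Python sketch of the data layout described above, not the
C code in the patches; make_rope is a made-up name, and using
the character code as the glyf value is an assumption made for
the example.

```python
from array import array


def make_rope(text: str) -> array:
    """Return a rope: a vector of unsigned 16-bit glyfs.

    Assumption for this sketch: the glyf for a plain character is
    simply its character code.
    """
    return array("H", (ord(c) for c in text))  # "H" = unsigned short


rope = make_rope("^C")  # the glyf '^' followed by the glyf 'C'
print(len(rope))        # prints 2
```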


      CHAR TABLES

There's a new lisp object: char tables.  A char table specifies,
for each 8-bit character, the rope to use to display that
character.  Char tables are associated with windows, not
buffers, so one buffer can be displayed in several different
windows with several different char tables.
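
A char table can be pictured as a 256-entry vector of ropes.
The Python sketch below is an illustration only: the function
names are made up, ropes are written as strings for
readability, and the default (control characters as ^X,
everything else unprintable in octal) is an assumption based on
the usual Emacs display.  It shows one buffer displayed through
two different tables.

```python
def default_char_table():
    """256 entries; entry N is the rope used to display character N."""
    table = []
    for code in range(256):
        if 0x20 <= code < 0x7F:
            table.append(chr(code))               # printable ASCII
        elif code < 0x20:
            table.append("^" + chr(code + 0x40))  # control character
        else:
            table.append("\\%03o" % code)         # anything else, in octal
    return table


def display(buffer: bytes, table) -> str:
    """Concatenate the ropes for every character in the buffer."""
    return "".join(table[b] for b in buffer)


# Table for an ASCII terminal: A with grave accent shown as {`A}
ascii_table = default_char_table()
ascii_table[0xC0] = "{`A}"

# Table for a display with full ISO 8859/1: send 160..255 directly
latin1_table = default_char_table()
for code in range(0xA0, 0x100):
    latin1_table[code] = chr(code)

buffer = b"\xc0 la carte\x03"
print(display(buffer, ascii_table))   # {`A} la carte^C
print(display(buffer, latin1_table))  # A-grave shown directly
```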


      CASE TABLES

Another new lisp object, the case table, specifies for each
8-bit character its case: upper, lower, or none.
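
For ISO 8859/1 such a table could look like the following
Python sketch.  It is an illustration only; the names are made
up and the representation in the patches may differ.

```python
UPPER, LOWER, NONE = "upper", "lower", "none"


def latin1_case_table():
    """256 entries; entry N is the case of ISO 8859/1 character N."""
    table = [NONE] * 256
    for code in range(256):
        # 0xD7 is the multiplication sign, 0xF7 the division sign
        if ord("A") <= code <= ord("Z") or (0xC0 <= code <= 0xDE and code != 0xD7):
            table[code] = UPPER
        elif ord("a") <= code <= ord("z") or (0xDF <= code <= 0xFF and code != 0xF7):
            table[code] = LOWER
    return table


case_table = latin1_case_table()
print(case_table[0xC5], case_table[0xE5], case_table[ord("7")])
```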


      SORT TABLES

Another new lisp object, the sort table, specifies for each
8-bit character its sorting position.  Sort tables are also used
for searching.  Special sort tables can be set up, for example,
to ignore diacritical marks when searching.
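
Both uses can be sketched in Python.  This is an illustration
under simplifying assumptions, not the tables shipped with the
patches: the function names are made up, only a and o with
diacritics are reordered for Swedish, and only three accented
letters are folded for searching.

```python
def identity_sort_table():
    return list(range(256))


def swedish_sort_table():
    """Swedish puts a-ring before a-umlaut; ISO 8859/1 has them reversed."""
    table = identity_sort_table()
    table[0xE5], table[0xE4] = 0xE4, 0xE5  # a-ring before a-umlaut
    table[0xC5], table[0xC4] = 0xC4, 0xC5  # same for the capitals
    return table


def fold_diacritics_table():
    """Give some accented letters the position of their base letter."""
    table = identity_sort_table()
    for accented, base in ((0xE9, "e"), (0xE8, "e"), (0xFC, "u")):
        table[accented] = ord(base)
    return table


def sort_words(words, table):
    return sorted(words, key=lambda w: [table[b] for b in w])


def search(haystack: bytes, needle: bytes, table) -> int:
    """Index of the first match when comparing sort positions, or -1."""
    h = bytes(table[b] for b in haystack)
    n = bytes(table[b] for b in needle)
    return h.find(n)


words = [b"\xf6l", b"\xe4r", b"\xe5r", b"zon"]
print(sort_words(words, swedish_sort_table()))
print(search(b"caf\xe9", b"cafe", fold_diacritics_table()))
```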


      TRANS TABLES

Finally, trans tables are lisp objects that map each 8-bit
character into some other character.  They are used for case
conversion, and can also be used for character set conversion.
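
The same idea exists in Python as bytes.translate, which takes
a 256-byte table.  The sketch below builds an upcase table for
ISO 8859/1; it only illustrates the concept and is not the
table from the patches (latin1_upcase_table is a made-up name).

```python
def latin1_upcase_table() -> bytes:
    """256-byte table mapping each character to its upper-case form."""
    table = bytearray(range(256))
    for code in range(ord("a"), ord("z") + 1):
        table[code] = code - 0x20
    for code in range(0xE0, 0xFF):   # 0xFF (y-umlaut) has no capital
        if code != 0xF7:             # 0xF7 is the division sign
            table[code] = code - 0x20
    return bytes(table)


UPCASE = latin1_upcase_table()
print(b"sm\xf6rg\xe5s".translate(UPCASE))  # prints b'SM\xd6RG\xc5S'
```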


      ISO 8859/1 SUPPORT

I include support for displaying ISO 8859/1 characters.  On
ASCII terminals they display as various ropes, e.g. A with grave
accent displays as {`A}.  If your terminal can display some of
the characters correctly, e.g. by using shift-out and shift-in,
then you can write a lisp/term file to do that.  I include as an
example lisp/term/fa4440a.el for the Facit 4440 Twist terminal
with a Swedish PROM.  If your terminal (emulator) provides full
ISO 8859/1, you can just send 8-bit characters to it directly.
See the code in lisp/term/x-win.el starting with "(if (fboundp
'get-glyf)" for an example.


      SWEDISH SUPPORT

I include support for Swedish as an example of language
support.  This includes a swedish mode analogous to text mode,
and sort tables for Swedish alphabetical order.


      INPUT

Input is kludgy.  The file lisp/iso8859-1-insert.el defines
little functions to insert each non-ASCII ISO 8859/1 character.
These are put into the global keymap under C-x 8, which is
supposed to be mnemonic for 8859.  So e.g. "C-x 8 ` A" runs
insert-A-grave.  This is OK for infrequently used characters,
but for those you use often I suggest you use programmable keys
on your terminal, if possible.  For example, Swedish uses o with
umlaut a lot, so I have one of the programmable keys on my
terminal set up to transmit "C-q 3 6 6".  Using C-q also means
this works with e.g. incremental search, not just for
inserting.

Here's what I do on my Facit 4440 Twist:
   1) Press Setup
   2) Press 5 to enter Setup B mode
   3) Press F4 C-q 3 4 5 C-Return
      Press F5 C-q 3 4 4 C-Return
      Press F6 C-q 3 6 6 C-Return
      Press F7 C-q 3 5 1 C-Return
      Press F8 C-q 3 7 4 C-Return
      Press Shift-F4 C-q 3 0 5 C-Return
      Press Shift-F5 C-q 3 0 4 C-Return
      Press Shift-F6 C-q 3 2 6 C-Return
      Press Shift-F7 C-q 3 1 1 C-Return
      Press Shift-F8 C-q 3 3 4 C-Return
   4) Press S to save everything in nonvolatile memory.
This puts a with ring on function key 4, a with umlaut on F5, o
with umlaut on F6, e with acute accent on F7, and u with umlaut
on F8.  The shifted keys give the corresponding capital letters.
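
The three digits after C-q are the octal code of the character
in ISO 8859/1.  A small Python check of the codes used above
(octal_to_char is a made-up helper name):

```python
def octal_to_char(digits: str) -> str:
    """Return the ISO 8859/1 character with the given octal code."""
    return bytes([int(digits, 8)]).decode("latin-1")


KEYS = {
    "F4": "345", "F5": "344", "F6": "366", "F7": "351", "F8": "374",
    "Shift-F4": "305", "Shift-F5": "304", "Shift-F6": "326",
    "Shift-F7": "311", "Shift-F8": "334",
}

for key, digits in KEYS.items():
    print(key, digits, hex(int(digits, 8)), octal_to_char(digits))
```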


      X WINDOWS SUPPORT

Only X11 is supported, not X10.  I've only tried this on X11R4.
Eventually, the idea is for each glyf, which is really just an
unsigned 16-bit integer, to be treated as two bytes.  The low
order byte selects one face code in a font, for example 'g'.
The high order byte selects a graphic context (GC).  But for
now, there's only one GC.
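
The intended split can be sketched as follows.  This is Python
for illustration, not the code in src/x11term.c, and the
function names are made up.

```python
def make_glyf(gc: int, code: int) -> int:
    """Pack a graphic context number and a font code into one glyf."""
    if not (0 <= gc <= 0xFF and 0 <= code <= 0xFF):
        raise ValueError("gc and code must each fit in one byte")
    return (gc << 8) | code


def split_glyf(glyf: int):
    """Return (gc, code): the high-order and low-order byte of a glyf."""
    if not 0 <= glyf <= 0xFFFF:
        raise ValueError("a glyf is an unsigned 16-bit integer")
    return glyf >> 8, glyf & 0xFF


# With only one GC, every glyf has a high-order byte of 0.
print(split_glyf(make_glyf(0, ord("g"))))  # prints (0, 103)
```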

For input of frequently-used characters I just hacked
stringFuncVal in src/x11term.c.  You may wish to do the same.

Many of the X11R4 fonts advertised as ISO 8859/1 don't really
contain all the characters; 7x14 does, so that's what I use for
now.  Here's another font to try:

>From: jw@sics.se (Johan Widen)
>Newsgroups: comp.windows.x
>Subject: eightbit version of the 'fixed' font available
>Message-ID: <1990Mar9.164011.1775@sics.se>
>Date: 9 Mar 90 16:40:11 GMT
>Distribution: comp
>Organization: Swedish Institute of Computer Science, Kista
>
>An eightbit version of the X11R4 'fixed' font (also known as 6x13) is available
>for anonymous ftp from
>	sics.se (192.16.123.90)
>in the compressed tar file
>	archive/fixed.bdf.Z
>
>The glyphs below 128 are unchanged. The ISO-8859-1 characters from 160 to 255
>have been added.
>
>I'm interested in any improvements/fixes that you make to this font.
>
>--
>Johan Widen
>SICS, PO Box 1263, S-164 28 KISTA, SWEDEN	Internet: jw@sics.se
>Tel: +46 8 752 15 32	Ttx: 812 61 54 SICS S	Fax: +46 8 751 72 30


      OTHER APPLICATIONS

These modifications have other uses than supporting 8-bit
character sets.  The file lisp/emphasis.el uses the high bit to
indicate emphasis, e.g. underlining, of 7-bit ASCII.  A hook in
lisp/man.el then displays italicized text in manual entries with
emphasis if possible.
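
A rough Python sketch of the high-bit idea follows.  It assumes
the manual entry arrives with nroff-style underlining, where an
italic X is written as underscore, backspace, X; whether
lisp/emphasis.el works from exactly this input is an
assumption, and the function name is made up.

```python
def nroff_to_emphasis(data: bytes) -> bytes:
    """Replace each "_ BS X" triple by X with the high bit set."""
    out = bytearray()
    i = 0
    while i < len(data):
        is_underlined = (
            data[i:i + 2] == b"_\b"
            and i + 2 < len(data)
            and data[i + 2] < 0x80      # only 7-bit ASCII can be marked
        )
        if is_underlined:
            out.append(data[i + 2] | 0x80)  # high bit means "emphasized"
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


print(nroff_to_emphasis(b"_\bl_\bs -l"))  # prints b'\xec\xf3 -l'
```

A char table for the window can then map each code from 128 to
255 to whatever the terminal needs to show the underlying
character underlined.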

The file lisp/rot13.el contains a disgusting hack that displays
a buffer in another window, but with a rot13 char table.  I
really use this when reading rec.humor.funny with Gnews.
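
The point of doing rot13 with a char table is that the buffer
itself is never modified; only the display changes.  A Python
sketch of the idea (not the code in lisp/rot13.el; the names
are made up and ropes are written as strings):

```python
def rot13_char_table():
    """256 entries; letters are displayed rotated by 13 places."""
    table = [chr(code) for code in range(256)]
    for base in (ord("A"), ord("a")):
        for i in range(26):
            table[base + i] = chr(base + (i + 13) % 26)
    return table


def display(buffer: bytes, table) -> str:
    return "".join(table[b] for b in buffer)


buffer = b"Uryyb, jbeyq!"
print(display(buffer, rot13_char_table()))  # prints Hello, world!
print(buffer)                               # the buffer is unchanged
```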

If you don't like unprintable characters to be displayed in
octal, you can change to hex or whatever.


      RELATED SOFTWARE

My cz system lets you print ISO 8859/1 text on PostScript
printers.  It interfaces to GNU Emacs.  To get it, get these
articles from your nearest comp.sources.misc archive:

cz          comp.sources.misc volume  8 issues 65-75, 77-78 ( 1 Oct 1989)
                                        issue  97           (28 Oct 1989)
libhoward   comp.sources.misc volume  8 issues 80-87        ( 1 Oct 1989)
                                        issue  96           (28 Oct 1989)


      BUGS

It should be possible to format texinfo files into info files by
doing this (e.g. for cl.texinfo):
   % cd man; emacs -batch -funcall batch-texinfo-format cl.texinfo
   texinfo formatting /usr/local/free/gnu-emacs/18.55i/man/cl.texinfo...
   Formatting Info file...
   Making tags table for Info file...
   >> Error: (void-variable This)
   >>  point at
   >>  Info file: cl,    -*-Text-*-
   >>  produced by texinfo-format-buffer
   >>  from file: cl.texinfo
   >>  Copyright (C
But that gives the error shown.  However, this works:
   % emacs -batch -load info -funcall batch-texinfo-format cl.texinfo
To the first person who supplies me with a fix for this bug, I
offer a color portrait of the Swedish Royal Family, with a
genuine Swedish postage stamp on the other side.


      INSTALLATION

Start with a copy of GNU Emacs 18.55 as distributed.  Parts 1
through 4 are shar archives; unshar them.

Two of the lisp files have high-order bits set.  They are
encoded with Brad Templeton's abe system, which was posted to
comp.sources.misc on 4 June 1989 as volume 7, issues 1 and 2,
archive name abe.  To extract them, you must have the dabe
command.  Do:
   % cd lisp
   % dabe el.abe
   % cd ..

Parts 5 through 12 are context diffs.  Parts 11 and 12 are
together the diffs to man/emacs.tex; they must be concatenated.
Apply the diffs with patch.

Now install Emacs as usual.  When byte-recompiling the elisp
code, it may be necessary to load case-table.el, char-table.el,
sort-table.el, and trans-table.el first.  Be sure to
byte-compile all the new .el files you intend to use.  Here's
the complete list:
   case-table.el
   char-table-vt100.el
   char-table.el
   emphasis.el
   iso8859-1-ascii.el
   iso8859-1-insert.el
   iso8859-1-swedish.el
   iso8859-1.el
   rot13.el
   sort-table.el
   swedish.el
   trans-table.el
   term/id100.el
   term/fa4440a.el
   term/fa4440b.el

You'll probably want to load some character set and language
support from lisp/site-init.el.  For example, ours starts like
this:
   (load "iso8859-1")
   (garbage-collect)
   (load "iso8859-1-insert")
   (garbage-collect)
   (load "swedish")
   (garbage-collect)


      CHANGES

Here's a brief summary of what I changed in each file.  In src:
abbrev.c: expand-abbrev: Use casetab.h macros.
   Use HYPHEN.
alloc.c:
   GC case, char, sort, and trans tabs.
buffer.c:
   reset_buffer_local_variables: Initialize case_table_v, etc.
   Drop selective_display_ellipses.
buffer.h:
   Add case_table_p, etc. & buffer_char_table.  Drop ctl_arrow.
casefiddle.c: casify_object & casify_region: Use casetab.h macros.
config.h-dist: Add 30000 to PURESIZE.
cmds.c: Use chartab.h macros.
data.c: Add arg_out_of_range.
dired.c: Use standard_downcase_table_p instead of downcase_table.
dispextern.h: Change char to glyf_t.
dispnew.c: Use chartab.h macros.  Change char to glyf_t.  Check for X 
   windows in chartab.c now.
editfns.c: Use casetab.h & chartab.h macros.
emacs.c: Call init_case_table_once, init_char_table_once,
   syms_of_case_table, and syms_of_char_table.
fileio.c: #include casetab.h
fns.c: Add string-lessp*.
indent.c:
   Use chartab.h macros.
   Use char table to compute lengths instead of hard code.
   Drop selective_display_ellipses.
keyboard.c: Use ROPE_LEN to check if direct insertion OK.
lisp.h:
   Move case macros to casetab.h.
   Add Lisp_Chartab and related definitions.
minibuf.c: Use casetab.h macros.
process.c: Use transtab.h macros.
print.c: Print out char tables.
regex.c: Drop translate.
regex.h: Use sort table when compiling pattern.
scroll.c: lisp.h must be included before dispextern.h.
search.c: Remove downcase_table & compute_trt_inverse.
   syms_of_search: Remove initialization of downcase_table.
   Use NEWLINE.
term.c: char -> glyf.
termchar.h: Replace vector DCICcost by function.
termhooks.h: {insert,write,delete}_chars_hook ->
   {insert,write,delete}_glyfs_hook
window.c:
   Add window-char-table & set-window-char-table.
   Save char tables for saved windows.
window.h: Add window_char_table.
xdisp.c:
   Use chartab.h macros.  char->glyf.
   Drop selective_display_ellipses.
x11term.c: char->glyf
ymakefile: Add new files and include dependencies.

In lisp:
keypad.el: Add backtab code.  Comments.
man.el: Add manual-entry-hook.  Default to default-manual-entry-hook,
   which removes underlining and overstriking.
mlconvert.el: Changing control-code display is different.
rmail.el: Run rmail-get-new-mail-hook after getting new mail.
sendmail.el: Run mail-send-hook just before sending mail.
sort.el: string< -> string-lessp*
text-mode.el: (provide 'text-mode)
term/x-win.el: direct-map high-order ISO 8859 bits

In etc:
NEWS
makedoc.com

In man:
emacs.tex


      EMAIL

Here's how I read and send email in ISO 8859/1 while still
living in a 7-bit (ISO 646) world.  I run Chip Salzenberg's
deliver program.  My .deliver file looks like this:
   cat $HEADER $BODY | 78seus | deliver -n "$1"
   echo DROP
(OK, I'm lying.  My real .deliver file also saves a copy of
incoming messages.  Also, it has absolute path names to 78seus
and deliver, because they're not in /usr/bin.  But you get the
idea.)

The 78seus filter is part of my cz system (see above).  It
converts mixed English and Swedish to ISO 8859/1.  Cz also has
one for Danish, plus a paper on how to make your own.

I then read mail with GNU Emacs rmail mode, as usual.

When sending mail I write it in ISO 8859/1 in Emacs sendmail
mode.  Just before sending it, sendmail runs mail-send-hook,
which is set in lisp/swedish.el to call the function
8859-to-swascii-buffer.  This function maps the ISO 8859/1 to
ISO 646.
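
The outgoing direction can be sketched in Python.  The table
below covers only the six Swedish letters, using their
positions in the Swedish national variant of ISO 646; the
actual mapping in lisp/swedish.el is not shown in this file and
may cover more characters, so treat this as an assumption.

```python
# ISO 8859/1 code -> Swedish ISO 646 code, for the six Swedish letters.
TO_SWASCII = bytes.maketrans(
    b"\xe5\xe4\xf6\xc5\xc4\xd6",  # a-ring a-uml o-uml A-ring A-uml O-uml
    b"}{|][\\",
)


def to_swascii(text: bytes) -> bytes:
    """Map the Swedish letters to their 7-bit replacements.

    Other 8-bit characters are left alone in this sketch.
    """
    return text.translate(TO_SWASCII)


print(to_swascii(b"sm\xf6rg\xe5s"))  # prints b'sm|rg}s'
```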

Deliver was posted to comp.sources.unix on 16 October 1989 as
volume 20, issues 23 through 26, archive name deliver2.0.  These
are the patches I know about:
   1 comp.sources.unix volume 20 issue 27 (16 Oct 1989)
   2 comp.sources.bugs,comp.mail.misc 15 Dec 1989
   3 comp.sources.bugs,comp.mail.misc 15 Dec 1989
   4 comp.sources.bugs,comp.mail.misc 15 Dec 1989
   5 comp.sources.bugs,comp.mail.misc 19 Dec 1989
   6 comp.sources.bugs,comp.mail.misc 19 Feb 1990
   7 comp.sources.bugs,comp.mail.misc  7 Mar 1990
   8 comp.sources.bugs,comp.mail.misc  7 Mar 1990
   9 comp.sources.bugs,comp.mail.misc  7 Mar 1990
--
Howard Gayle
TN/ETX/TT/HL
Ericsson Telecom AB
S-126 25 Stockholm
Sweden
howard@ericsson.se
uunet!ericsson.se!howard
Phone: +46 8 719 5565
FAX  : +46 8 719 8439