[Starkit] system encoding in tclkit
Sergey Vlasov
vsu at altlinux.ru
Sat Apr 16 18:32:59 CEST 2005
Hello!
Does anyone know how to make tclkit find and use the proper system
encoding in a non-English locale?
I have tried tclkit-linux-x86-8.4.9 with LANG=ru_RU.KOI8-R, and also
the 8.4.9 Win32 version under Windows 98, and both have the same
problem: even if I rebuild tclkit with all encoding files (as
described at http://www.equi4.com/tkunicode.html), I still get:
$ tclkit
% encoding system
iso8859-1
(the Win32 version gives cp1252, which is also bad - the real system
encoding in that case is cp1251).
When a Tk application is launched in such an environment (I tried
Notebook - http://notebook.wjduquette.com/), it has major problems:
all keyboard input is assumed to be in the broken system encoding,
therefore I get iso8859-1 accented letters instead of Cyrillic
characters. Obviously, this makes starkits unusable.
I tried the newer tclkit version (8.5.a2) on Linux
(http://www.equi4.com/pub/tk/8.5a2/tclkit-linux-x86.gz), and with
LANG=ru_RU.KOI8-R it does not start at all:
$ tclkit-linux-x86-8.5a2
system encoding "
zsh: abort tclkit-linux-x86-8.5a2
If I execute "encoding system koi8-r" (after either adding the
appropriate encoding files to tclkit, or copying them as described in
http://wiki.tcl.tk/10382), then it seems to work (I tried to insert
this statement into notebook2.1.1.vfs/lib/app-notebook/notebook.tcl
after copying the encoding files, and the hacked Notebook then
handles Cyrillic characters properly). But obviously hardcoding the
encoding name is not acceptable - the encoding should be determined
automatically, like the "real" Tcl does it.
Looking at tclUnixInit.c:TclpSetInitialEncodings(), I see that Tcl
tries several methods to detect the system encoding and uses the first
encoding for which Tcl_SetSystemEncoding() succeeds. However, at this
point the encoding files stored inside tclkit are not yet available
(because vfs is not initialized), therefore all calls to
Tcl_SetSystemEncoding() fail, and the system encoding is left set to
"identity". Then the tclkit bootstrap code "fixes" this:
    # fix system encoding, if it wasn't properly set up (200207.004 bug)
    if {[encoding system] eq "identity"} {
        switch $::tcl_platform(platform) {
            windows   { encoding system cp1252 }
            macintosh { encoding system macRoman }
            default   { encoding system iso8859-1 }
        }
    }
Encoding initialization on Windows is simpler (no guesswork
needed, just wsprintfA(buf, "cp%d", GetACP())), but has the same
problem: the Tcl_SetSystemEncoding() call fails, and the system encoding
is left set to "identity" to be "fixed" later by the tclkit bootstrap
code.
One thing that is particularly bad is that setting the system
encoding later does not really fix everything. The first call to
TclpSetInitialEncodings() is supposed to convert the Tcl library path
from the native encoding to UTF-8, which obviously requires valid
encoding definitions. Also, running a starkit located in a directory
whose name contains Cyrillic characters fails under Windows. It looks
like the filename encoded in cp1251 is assumed to be encoded in cp1252
and converted to Unicode, and that Unicode is then converted to the
system encoding by Windows functions, which gives garbage because
the corresponding Unicode characters do not exist in cp1251. The
Linux version does not have this problem.
Some possible solutions which I could imagine:
1) Find some way to make encoding files stored in tclkit accessible
earlier, so that Tcl_SetSystemEncoding() called from
TclpSetInitialEncodings() would succeed. Not sure if this is
really possible, and it requires putting all encoding files into
tclkit, which apparently is frowned upon.
2) Hack Tcl to build all those encodings into the executable (seems
even worse than the first suggestion).
3) Store the list of all encoding names tried by
TclpSetInitialEncodings() somewhere, and try them in the same order
at some later time, when the required encoding files are available.
This will not fix the problems with invalid filename encodings, but
would be better than nothing.
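
Option 3 could be sketched at the script level roughly as follows, for
the Unix case only. This is an illustration under assumptions, not the
actual tclkit bootstrap code: the proc names candidateEncodings and
fixSystemEncoding are hypothetical, the locale is re-read from LC_ALL,
LC_CTYPE and LANG instead of from a list saved by
TclpSetInitialEncodings(), and the name mapping is much cruder than the
locale table in tclUnixInit.c.

```tcl
# Hypothetical helper: derive candidate Tcl encoding names from a POSIX
# locale string such as "ru_RU.KOI8-R". A simplification, not a copy of
# what TclpSetInitialEncodings() does.
proc candidateEncodings {locale} {
    set candidates {}
    # Take the codeset part after the dot, dropping any "@modifier".
    if {[regexp {\.([^@]+)} $locale -> codeset]} {
        set name [string tolower $codeset]
        lappend candidates $name
        # Tcl names the ISO encodings without the first hyphen
        # (e.g. "iso8859-5" rather than "iso-8859-5").
        if {[regexp {^iso-(.*)$} $name -> rest]} {
            lappend candidates iso$rest
        }
    }
    return $candidates
}

# Hypothetical helper: retry encoding detection once the encoding files
# are readable. Returns the system encoding in effect afterwards.
proc fixSystemEncoding {} {
    global env
    # Only act when the early C-level detection failed.
    if {[encoding system] ne "identity"} {
        return [encoding system]
    }
    foreach var {LC_ALL LC_CTYPE LANG} {
        if {![info exists env($var)] || $env($var) eq ""} continue
        foreach name [candidateEncodings $env($var)] {
            # "encoding system" raises an error for an unknown name.
            if {![catch {encoding system $name}]} {
                return $name
            }
        }
        break   ;# only the highest-priority variable that is set counts
    }
    return [encoding system]
}
```

fixSystemEncoding would have to run after the VFS is mounted and before
the existing "identity" fallback, so that the hardcoded defaults are
used only when no candidate encoding can be loaded. As noted above, this
does nothing for filenames that were already converted with the wrong
encoding.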
Maybe someone could suggest something better? This problem really
needs to be fixed - it makes tclkit practically unusable for
non-English users.
--
Sergey Vlasov