Tcl (over-) flow

Two months ago, I knew almost nothing about Tcl - a powerful and widely available scripting language. Now, I'm not sure what I know, nor what there is to know about Tcl... even though it is clear that the world-wide web is filled with resources related to Tcl.

This is a brief personal summary of how I stumbled across a tool (well... someone hinted at Tcl and helped me a lot, to be honest), decided to find out more to evaluate its usefulness in my situation, and fell into a state of profound confusion - followed by a proposal on how to turn what I see as a major obstacle into an opportunity.

Context

A lot of my work is related to a long-term project which is an attempt to design, implement, and refine a new way of storing and manipulating data. Although partly research, this project is also made available on a commercial basis, as a C++ class library called MetaKit (see https://www.equi4.com/metakit/). The library is eh... different at a fairly fundamental level, and as a consequence, it doesn't really do that much... yet - I hope. This is the unfortunate consequence of reconstructing a building from the foundations up.

So one of the things I need is a way to create high-level tools on top of it, such as data browsers, import and export facilities, generic data editors, an SQL interface, and a development environment, to name just a few. Some of the key requirements are being lightweight and highly portable, which rules out technologies like ODBC, OLE-DB, and ActiveX.

Tcl

Ah, but there is Tcl - a scripting language which is easy to learn (for simple things), very much cross-platform, has an amazingly powerful user interface toolkit called Tk, is mature yet actively evolving, seems to be very robust, and is widely available and used - no doubt also because it is freely available. It "even" seems to be weak in data storage and manipulation - this could be an opportunity in more ways than one... bingo! I became very interested in Tcl, almost on sight.

The exploration starts

There is a site at Sun. No wait, that seems to be out of fashion. There is a new site of a new company, called Scriptics. Hmmm... they have gone commercial and are planning to do great things. Nice, but where do I turn now ???

Ah, there is a usenet newsgroup. What a relief, a discussion area with many interesting discussions. Much nicer than the shareware scene I just left (filled with too many Bill G. wannabe emulators...). Let's subscribe, and track this place for a few weeks. Too bad it does take a few weeks to get some feel for group dynamics, though...

I want to get started now !!!

Ok, NeoSoft seems to have a lot of software. Let's try to find the Tcl/Tk code and try it out. Hmmm... that site is not easy to figure out for newbies. Where is Tcl? Ok, back to the SunScript site... they seem to have all the latest and greatest stable releases.

Excellent, the Tcl/Tk Win95 installation is a native installer. I'll get that and figure out how to get stuff for my Linux and Mac boxes later. Yes... it works. Let's try something. Eh... wait, what's the syntax exactly? Hmmm... I have John Ousterhout's 7.3/3.6 book, do these examples still work? After all, it's now called Tcl/Tk 8.0... Ah, ok - it works. Wow, impressive, so it's compatible too!

This is too easy...

Yes, it was. I immediately started looking around at extensions, since my first interest was in hooking my own C++ classes into Tcl. On Windows, Microsoft C++ support required different libs for linking (the installer seems to use Borland C++), and getting a DLL linked, placed in the correct area, and loaded from a Tcl script did take a considerable amount of head-scratching. In fact, I decided to experiment with SWIG after running into a problem (and having almost no clue how to proceed), and was amazed at how quickly it created a fully working extension. Though SWIG was adding a pointer mechanism I did not want, it was an extremely useful learning tool. The working example it created was the main reason I continued these experiments. In fact, SWIG demonstrated how easy it would be to hook this stuff up to Python and Perl as well. SWIG is an incredible tool, even if you end up not using it...

Ok, I'm hooked

After this point, there was no longer any serious doubt about whether to use Tcl/Tk, only about how to get to grips with it. I had not yet written a single meaningful script with it.

One thing is obvious. Tcl/Tk is powerful. And highly extensible. There are lots of extensions scattered around the net. It would be foolish to try to develop applications without at least exploring the capabilities of such extensions.

And there are HTML viewers, HTTPD servers, and so much more. Gosh, server connections look trivial. Hey, there is a way to embed Tcl in browsers, and there are server-side scripting facilities. And look, wow, the text widget is fascinating - it is incredibly flexible! There is a WebTk editor which illustrates how HTML text can be not only displayed but also edited; this could be the "styled text" component I need for one of my future projects!

And there are mega-widgets, tree views of different types, and, and, and... gasp.

Ok, let's just try a few things. BLT, hmm... 2.3 doesn't work with Tcl/Tk 8.0 - yet I want to try it. Ok, let's first spend some time on a Linux setup. Oops, my Linux box was a networked machine until now (it has an old, small screen, and moved to the basement for lack of room). Tk? Hmm... needs X. X? Hmm... I need something to be an X server on my Windows 95 main workstation. Well... to make a long story short, it took a lot more downloading and tweaking to get things like BLT to run (on Linux, good enough for an evaluation, evidently). Still more tweaking to get Tcl/Tk on the Mac.

But it's all there. And it works. There were no hurdles which became show-stoppers. This is very encouraging.

Hmm, there is something called [incr Tcl], and there is TclX, and TIX, and the Plus Patches. Lots of things to try. I'm learning a lot (and investing a lot of time along the way). TableTk... hey, just saw a new 2.0 announcement, let's try it - it would be nice to be able to use something like this when I plan for a generic database viewer and editor. Many more days and nights go by...

To wrap it all up: yes, Tcl/Tk is awesome. Some extensions (not all, this world is not perfect) offer a fantastic amount of ready-made functionality. Some seem to be a bit old, even though they appear to still work just fine. Others seem to be supported only on Unix, leaving me with an uncertain decision on whether it could be used more widely. That makes it a bit harder to decide whether to even invest any time just evaluating them. But hey, you get a lot more than what you pay for (in the monetary sense at least, the time investment is another issue...).

It's not all peaches...

One of the things that happens all the time when exploring new stuff like this on the web is the "Aha, oops, where?" syndrome. You see a lot of things, understand only a tiny fraction while trying to focus on one new bit of information, and then later on, when things fall into place, you remember seeing something else that now makes a lot more sense... so you want to go back and have a look at it. This could be an explanation, a new concept, an example, a utility, a Tcl script or extension, anything really. But the problem is: where did I see this before? Although a browser has a "Back" button and saves its history, it does not usually work across sessions, i.e. over a period of several days.

It is time to enter reference mode ...

This is the moment when searching starts to become important. Very important. Infuriatingly important. Now is the time to go into FAQs and lists of links. See what others have to say, hope that my problem (of finding something) is a common one. But it never is. Everyone has a specific set of interests and focuses on different issues. Everyone has their own level of expertise: what is educational to some is cause for others to skim at high speed. I know a lot about some things, yet next to nothing about others. When I'm in reference mode, it is by sheer necessity, and I'm always extremely impatient to get out of it again. It's that unpleasant... there is joy in learning, and in creating, even in fixing, but never in going back and repeating.

In reference mode, Tcl is a mess. Don't get me wrong, there are probably few areas which are not a mess - this is not a criticism of Tcl per se, much less of the people who invest so much time in selflessly explaining and guiding others.

There are lists of links. About what? How do I know what the focus of interest is, if I know nothing about the person who created the list? Does that mean I have to also first read about the writer? Where would it end? And then, how old is the list? Who warns me if the information is grossly out of date, what if the list has been abandoned? Of course, one can usually tell by the number of broken links - but this seems like a very indirect and inefficient way to find out. And this page the link points to - how up-to-date is it? Does it apply to the situation I'm in right now? It may be old, yet perfectly valid. But it might also be relatively new yet still not apply anymore. There are many, many dozens of lists. When I see a link, I have no way of knowing whether the information it points to is small or a huge collection of valuable information. And then there is formatting - everyone has their own personal style preferences. Fine. So do I. But when I have to visit dozens of sites, it becomes a nuisance no matter how close some pages are to my own aesthetic values.

Too many sites have no search engine access. AltaVista and the other main web-indexers are not an option, since they would return far too many results on a query such as "Tcl AND editing". So, in reference mode, you could be one link away from what you are looking for and never notice. Because you would only find that one page back through the same path, by visual recognition.

And the net is so slooowww ... technically, it's all a miracle. Being able to jump from one end of the world to the other in a fraction of a second. But only sometimes. First of all, you have to be connected to the internet. Many people are, permanently. Many more are not, and pay dearly - per minute of connect time. And in wasted time - by waiting for a modem to go through its little dance and CONNECT ... Given my situation, I cannot complain - an on-demand dial-in through ISDN (1-3 seconds) with a router hooked up to my in-house ethernet. Yet even this is indirect enough that "connecting" is still a conscious activity - and an ever-recurring inconvenience. Imagine writing a Tcl script, and needing some more info about a specific Tcl built-in command, or worse - having to find the command first. It would not be workable to have the documentation reside on the web, even with the fastest access. Not today. Not tomorrow.

The hoarding scenario

One option is to grab everything you - might - need, and copy it to your local hard disk. Works great. Disks are dirt cheap and blindingly fast. Disk manufacturers are having a ball, and it doesn't look like their role is about to end. NC? Look around - see anyone using them today? Oh yes, some people drag notebooks around - but the disks in there are definitely not becoming smaller...

Time passes

Have you heard? There is a new release of X? It has Y and Z, you really should look at it! Solves all your problems.

Wait, I need to know about these trends - let's get on a couple of mailing lists. Announcements sure are important. Usenet is ok, but announcements by email are more effective - they come to you when it is time. No need to track anything and filter out the noise.

Hmmm... did you say email? Yeah, I know that. Isn't that the place where all these spammers live? No thanks.

Ok. I'm on the lists. Installed an anti-spam filter. Great. It seems to filter out most of the junk. It's workable - good.

Time passes

Once in a while, announcements come in. New X? Hmm... I'm not using that now. Let's store the message, I might need it one day.

Time passes

Ah, where did I put that message about X? I need it now, and can't find it! Let's see, where is the software, what is its homepage? Hmmm... moved, a broken link. Ok, back to reference mode - old style. Search on-line, scan the variety of docs out there. Ok, got it.

Now, let's see. I have an old version on my disk. Where did it install? Can I safely delete it? How do I upgrade? Wait, it needs Y, do I have that, is it the new release? What else is affected if I alter this?

And so on... no need to spell this out in further detail. All I know is: there must be many, many, many people going through this process. Some of it is no doubt inevitable. The price of progress - or rather, the price of change. Yet it all seems like such a grossly inefficient way of spending time, especially considering the number of people who are probably going through the same process at one time or another.

This is no doubt one of the driving forces behind the NC and the "zero-administration" PC.

Can we do better?

On a grand scale? Probably not - not yet anyway... But on a smaller scale, yes, I think it is possible to create relatively simple tools to deal with some of the issues described here.

What needs to be addressed is, ahem, is... [insert pause for maximum impact here] ... the impedance mismatch between the ephemeral aspect of email and usenet on the one hand, and the static content of web pages and files on the other.

Wait... don't run away. It's very simple. Email and usenet messages are flows of information. They move from one person to one or more recipients. And the recipients then have the - impossible, IMO - task of deciding what to do with that information. Read it - now? later? - and/or save it. Web pages are sometimes so large and filled with information that we create local copies to have them around as quickly as possible when needed. Scripts and software work the same way, we copy them locally to use ("run") them. But the moment information is copied, it creates a maintenance problem. Is it valid? How do I know whether it is? Can I remove it? How do I refresh it? Who tracks its status?

I don't like the popular terms push and pull - they do not quite cover the issues described here. A much better distinction is made by the terms data and meta-data. When data is copied, it is critically important to keep track of its meta-data - something which even the simplest file copy command does not do. The meta-data in this case is the origin of the file.

What I propose is a scheme whereby local copies of data are stored with information about where they came from.
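To make this concrete, here is a minimal sketch in Tcl of copying a file while recording its origin. The proc names and the ".origin" sidecar-file convention are assumptions made up for this illustration, not part of any existing tool.

```tcl
# Sketch only: copyWithOrigin, originOf and the ".origin" sidecar file
# are hypothetical names chosen for this illustration.

# Copy a file and record where it came from in a sidecar file next to it.
proc copyWithOrigin {src dst origin} {
    file copy -force -- $src $dst
    # Meta-data is kept as a flat key/value list: origin and fetch time.
    set meta [list origin $origin fetched [clock seconds]]
    set f [open $dst.origin w]
    puts $f $meta
    close $f
    return $meta
}

# Read back the recorded meta-data for a local copy.
# Returns an empty list when no meta-data was stored.
proc originOf {path} {
    if {![file exists $path.origin]} {
        return {}
    }
    set f [open $path.origin r]
    set meta [string trim [read $f]]
    close $f
    return $meta
}
```

The result of originOf can be loaded with "array set info [originOf $path]", after which $info(origin) answers the question "where did this come from?" for any local copy made this way.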

Let's put this into perspective. This is not a proposal to change the world. Not right away, anyway :)

The first part of the idea is to create a tool which manages lists of information. The most practical use would be to maintain annotated lists of URLs and that other highly condensed package of knowledge - the FAQ. Each list has an owner/maintainer and a simple structure consisting of perhaps a few typed fields. The minimum would probably be a title and some sort of free-text contents. A more formalized list would be split into fields such as name, contact info, comments, URLs, version info, FTP links, etc. All depending a lot on the intended use, evidently. The editing tool which manages such a list also tracks modification times of each entry, and contains the logic to propagate only changes over the network (with synchronization and robust recovery).
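A rough sketch of such a list, with per-entry modification times and a way to pick out only the changes, might look as follows. The field names (title, text, mtime, deleted) and the proc names are assumptions for illustration only, and the code assumes Tcl 8.5 or later for the dict command.

```tcl
# Sketch only: field and proc names are hypothetical.
# Requires Tcl 8.5 or later (dict command).

# Add or update an entry, stamping it with a modification time.
# The time is passed in explicitly so the example is reproducible.
proc setEntry {listVar id title text now} {
    upvar 1 $listVar entries
    dict set entries $id \
        [dict create title $title text $text mtime $now deleted 0]
}

# Mark an entry as removed. The marker is kept in the list so that
# the removal itself can be propagated as a change.
proc removeEntry {listVar id now} {
    upvar 1 $listVar entries
    dict set entries $id deleted 1
    dict set entries $id mtime $now
}

# Return only the entries modified after the given time.
proc changesSince {entries since} {
    return [dict filter $entries script {id entry} {
        expr {[dict get $entry mtime] > $since}
    }]
}
```

A maintainer's tool would remember the time of the last successful update and send only "changesSince $entries $lastUpdate" over the network. Synchronization and recovery are left out of this sketch.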

The second piece of the puzzle is a server which accepts any number of lists from any source. To make it useful to anyone, that server should be able to serve the contents of a selected list as HTML pages. This will allow anyone with a browser to view the information maintained by each of the list owners. The rendering of the lists as HTML could be set to match the list maintainer's preferences or adjusted to create a more uniform style across all the different lists. In fact, both versions could be available - for the visitor to select. One step further, the server could be extended to accept a mailing-list type of subscription to send out notifications of changes, for those viewers who want to stick with the traditional web-page + email announcement approach. This server could be extended in many ways - how about colorizing links depending on the last modification date of the page they refer to, or adding the size of that page, or even a little marker to indicate whether that page contains many links itself? Lots of hints which can make the decision to follow a particular link easier. I don't want to go everywhere, I want quick hints to help me decide before I go there!
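The HTML rendering part could be sketched roughly like this. The entry layout (a dict per entry with title, text and mtime fields) and the proc names are assumptions for illustration, the date shown next to each title is just one possible "hint", and the code assumes Tcl 8.5 or later.

```tcl
# Sketch only: proc names and the entry layout are hypothetical.
# Requires Tcl 8.5 or later (dict command).

# Escape the characters that have a special meaning in HTML.
proc htmlEscape {s} {
    return [string map {& &amp; < &lt; > &gt; \" &quot;} $s]
}

# Render a list as a simple HTML definition list, showing the date
# of the last change next to each title as a hint for the visitor.
proc renderList {name entries} {
    set html "<h1>[htmlEscape $name]</h1>\n<dl>\n"
    dict for {id entry} $entries {
        set day [clock format [dict get $entry mtime] \
            -format %Y-%m-%d -gmt 1]
        append html "<dt>[htmlEscape [dict get $entry title]] ($day)</dt>\n"
        append html "<dd>[htmlEscape [dict get $entry text]]</dd>\n"
    }
    append html "</dl>\n"
    return $html
}
```

Because the rendering is a separate step from the stored list, the same entries could be run through a maintainer-specific or a uniform version of renderList, as the visitor prefers.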

The base-line scenario is not very different from what happens today. A list owner makes changes to the list (this can be done off-line), and then clicks on an "update" button to send the changes out to the server. This is a simple and very efficient operation, because only differences are propagated to the server. A refinement would be to manage the list on the server as an option. So far, this is just a server showing its lists to anyone who cares to look. Perhaps a little nicer in presentation (or more consistent), perhaps a little better maintained, but nothing to get really excited about...

Make it local and make it "live"

But the third piece of the puzzle is what makes it all different. A client which is able to keep a local copy of a selected set of lists on a user's machine. Whenever they feel like it, they can click on 'refresh' to make sure all the lists they see are updated. Again, very efficiently - with a minimum of fuss, and little time wasted. An optional refinement: make it automatic and periodic, perhaps even at the frequency specified by the list maintainer. Another option: use specially formatted email as delivery mechanism, using the standard mailing lists (or even a usenet newsgroup) to distribute changes. Third refinement: allow the user to create a list for all emails coming from a specified standard mailing list. One feature which will be very important is to make the lists searchable - FAST - and preferably as full-text or by keyword.
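On the client side, the two essential operations are merging a batch of changes into the local copy and searching that copy. Here is a minimal sketch, again assuming Tcl 8.5 or later, a dict per entry with title and text fields, and a "deleted" flag to signal removals. All of these names are assumptions for illustration.

```tcl
# Sketch only: proc names and the entry layout are hypothetical.
# Requires Tcl 8.5 or later (dict command).

# Merge a batch of changed entries into the local copy of a list.
# Entries flagged as deleted are dropped, all others are added or replaced.
proc applyChanges {localVar changes} {
    upvar 1 $localVar local
    dict for {id entry} $changes {
        if {[dict exists $entry deleted] && [dict get $entry deleted]} {
            dict unset local $id
        } else {
            dict set local $id $entry
        }
    }
}

# Case-insensitive keyword search over the title and text of each entry.
# Returns the ids of the matching entries.
proc searchList {entries keyword} {
    set needle [string tolower $keyword]
    set hits {}
    dict for {id entry} $entries {
        set haystack [string tolower \
            "[dict get $entry title] [dict get $entry text]"]
        if {[string first $needle $haystack] >= 0} {
            lappend hits $id
        }
    }
    return $hits
}
```

The local copy is expected to start out as an empty dict ("set local [dict create]") before the first refresh. A linear scan like this is only a starting point; a real full-text search over many lists would need some form of index to be fast.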

What is this? A one-way mailing list with memory. Usenet newsgroups with moderators. A newsletter distribution with back issues.

Big deal? No. It should be possible to create all this in pure Tcl. Nicely platform independent. The server could scale up to use some database when the size of this increases beyond a certain point.

But this is not just a mailing list or a moderated newsgroup. The memory is important. The fact that a single person determines its contents is important. And the fact that a single person can add - but also remove - entries is important. The way I see it, the structure of the list can be altered and propagated just like the contents of any entry. The list owner/maintainer controls precisely what each copy looks like. Just like a web page with links or a FAQ. But the information is locally available and instantly usable. Wait... did anyone say "runnable"? Yes, this same mechanism could also be used to maintain a software package... Tcl scripts, but in due time any sort of information. And it can scale in many ways... replicated servers, alternate types of clients, plug-ins, you name it...

And Tcl has everything in place to make it happen. Right now.

Imagine a Tcl-based solution, and then think about how it could go places. I don't see why it couldn't be used in many, many more situations where a community is driven by a core of dedicated authors (both writers and programmers). Wouldn't it be ironic to see this project serve a Python and Perl community as well? Then again, why not - language barriers are there to be stepped over, both in real-life and in virtual communities.

Now... is anyone else interested in helping create this mechanism and the tools/scripts it requires?

I can hardly wait to have it available...

© May 1998