Wednesday, August 31, 2005

a conversion is needed

Using email headers to implement classifier headers is simply the wrong way to go.

When displaying lists of emails, for example, it simply takes too long. And now the idea of searching (by classifier headers) has come up both in discussions with Norm and at work, and searching would be very very slow indeed with the current implementation.

The good thing is that TKCS properties are fully indexed. I simply need to change the implementation... and convert existing content.

Right now there are only two true descriptor headers, content-type and content-transfer-encoding. So it is going to be quite easy to identify which are the classifier headers, which makes the content conversion quite doable. And if we dump a snapshot of the Ark (at a loss of all history), then the conversion starts to get easy.

What I need to do then is to insert a new release before 1.7 which focuses on the conversion and other administrative tools. Then when we move into 1.8, the Ark will have regained some of its lost agility.

Descriptor headers will then simply become headers, and classifier headers (which include virtually all existing headers) then become tags.
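
A minimal sketch of that split, assuming headers are available as a simple name-to-value mapping. The function name and the sample values are hypothetical; only the two descriptor header names come from the text above.

```python
# Hypothetical sketch: separate the two descriptor headers named above
# from everything else, which would become tags after the conversion.
DESCRIPTOR_HEADERS = {"content-type", "content-transfer-encoding"}

def split_headers(all_headers):
    """Return (headers, tags) from a dict of header name -> value."""
    headers = {}
    tags = {}
    for name, value in all_headers.items():
        key = name.lower()  # email header names are case-insensitive
        if key in DESCRIPTOR_HEADERS:
            headers[key] = value
        else:
            tags[key] = value
    return headers, tags

headers, tags = split_headers({
    "Content-Type": "text/plain",
    "Subject": "Ark Content",
    "From": "norm@example.org",
})
```

Here `headers` keeps only content-type, while subject and from end up in `tags`.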

Tuesday, August 30, 2005

Including content in the release

In 1.7 we will have an enhanced dump facility:
"Enhance dump to be able to exclude selected cabinets."

The point is to be able to include a cabinet, CompStrm, as part of each release. This will provide a means of releasing commonly used UnitDescriptors. This can easily be done once the process of migrating data from one release to another can exclude any Cabinets being replaced.

In talking with Thomas today, I realized that some of the content from www.compstrm.org, specifically CswCmds and CswTerms, should likely also be included in a release. That way these Topics would be in close agreement with the released code.

(OK, so I'm a bit slow sometimes.)

Sunday, August 28, 2005

DescriptorUnit content

I've been thinking about how to distribute common DescriptorUnit content in a release. This has led to a series of work items.

1. Change dump to be cabinet based, excluding any deleted cabinets.

This will allow cabinets to be posted and the original deleted, a means of removing history and keeping the Ark to a reasonable size. (Though posting and deleting a cabinet are both slow operations.)

2. Enhance dump to be able to exclude selected cabinets.

By this means, a cabinet included in the release can replace an older version of the same cabinet.

3. Make the DescriptorUnit link WikiName based, rather than a hard link. And limit valid links to being either a reference to a Topic in the same cabinet or a reference to a Topic in a special DescriptorUnit cabinet.

Now common DescriptorUnit content can be included in the release. Also, when posting between cabinets, any referenced DescriptorUnits which are local to the original cabinet are either resolved in the new cabinet or become invalid.
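
Work items 1 and 2 amount to a filter over cabinet names. A small sketch, assuming cabinets can be identified by name; the function and the cabinet names other than CompStrm are made up for illustration and are not the actual dump code.

```python
# Hypothetical sketch of a cabinet-based dump selection: deleted cabinets
# are always skipped, and selected cabinets can be excluded on request.
def select_cabinets(cabinets, deleted=(), excluded=()):
    """Return the cabinet names that a dump would include, in order."""
    skip = set(deleted) | set(excluded)
    return [name for name in cabinets if name not in skip]

# Exclude CompStrm so that the copy shipped with the release replaces it.
to_dump = select_cabinets(
    ["CompStrm", "Notes", "OldProject"],
    deleted=["OldProject"],
    excluded=["CompStrm"],
)
```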

User impact of descriptors

As I write this, I'm installing 1.7 alpha on www.compstrm.org. Now as you navigate the Ark, you can see when the number of applicable commands changes--for any of 6 different command sets. So the user gets at least a small hint that, after a cls command for example, additional commands are now available. (I suspect that breaking commands into smaller command sets will help a lot, too.)

My plan then is to add a seventh command set, application, for descriptor-defined commands. This way the user can at least look and see what application-specific commands are available.

Now, what is the impact of descriptors from a user's perspective? Descriptors can add/drop/change commands and alter the default display as well. This tells us where the value proposition is for 1.7.

Saturday, August 27, 2005

5 categories of commands

Under http://www.compstrm.org/wiki/cc/ReadMe/CswCmds I have now arranged the commands into five categories:
  1. Classifier,
  2. Content,
  3. Descriptor,
  4. Navigation and
  5. Structure.

(Of course, some commands are under more than one category.)

Now I'm thinking of adding a command navigation line above the third line of the display template, allowing you to pick a command category. That would bring up a page describing the commands in the category. Drilling down further might bring up a full command description and a form for running the command.

This would probably work by adding additional templates.

The general idea here is that applications would be a simple extension on this, giving the user visibility into the application-specific commands applicable to various Topics.

What is the utility? How is the user impacted?

I'm doing a lot of thinking about descriptors.

There is plenty to do, fer sure! Lots of code I could be writing. But I want to make sure I'm going in the right direction. I'm getting into a new area here and I need to be able to determine when things are right and when they are just wrong.

OK, so I already did a LOT of descriptor work at the TKCS level. But there was no GUI--it was a pure command-line interface.

In the Ark, there is some variation, but there's only the Ark, Cabinets, Folders, etc. and they all work roughly the same way. Now we introduce type. How do I, as a user, gain an appreciation of the differences between different types of Pages?

The least change would be errors. I set a header, and I'm told that I cannot use that header name, or that the value is incorrect. The other extreme is to choose from a menu of activities, then a menu of options and menus of values.

The right solution might be to allow the user to move from one extreme to the other, depending on how familiar a user was with a particular type or activity.

Hmm. Let's focus on one critical part--selecting or specifying an activity. A knowledgeable user would just type the command name. On the other end? Perhaps a many-layered menu? And again, how do we drive out to the user the unique aspects of the current Topic? Perhaps when displaying a Topic of a given type, we can give a link to a help page for that type of Topic. (Having a type-specific documentation link on a Topic display sounds like a good idea to me.)

OK, let's look at creation of a LEnt. An LSec might allow only a single type, or a limited set of types of LEnts. Each type of LEnt would then have various required and optional headers, valid value ranges, etc. This seems quite reasonable.

ClassifierHeadersDescriptor?

I've been reworking the specs for 1.8, and updating the CswTerms and Cmds.

No, I haven't detailed out all of 1.8--I intend to learn a bit as I go. But I have made a good start.

Catch the latest at http://www.compstrm.org/wiki/uuid/BPRGo+qvONVL+QxkzCFLww3o/_

Thursday, August 25, 2005

descriptors

In 1.7 we're about to take a big step up, adding descriptors. Descriptors are used to describe a Topic, its behavior and its parts. This is in contrast to classifiers, which deal with the relationships between a Topic and other Topics. Rolonicly speaking, descriptors deal with a role as a whole, while classifiers deal with a role as a part of a larger whole.

Now we introduce type, which is a descriptor that defines much of the behavior of a Topic/header/descriptor. This is in contrast to usage (Ark, Cabinet, Drawer, Folder, etc.), which defines how a Topic interconnects with other Topics.

Descriptors and Classifiers, type and usage, these are all metadata. But in Rolonics, we always take care to distinguish these two kinds of metadata: Descriptors deal with a Topic/Role as a whole; Classifiers deal with a Topic/Role as a part of a greater whole.

One constraint we will follow in the development of descriptors is that Cabinets are relatively independent. The complex of typing data we develop then can only be used by Topics in the same Cabinet. Of course, we're free to post typing data from one cabinet to another, as we are with any other Topic.

Tuesday, August 23, 2005

Switching 1.7/1.8, simplifying

I've decided to delay the changes displays, as well as simplify them.

A changes command will now be attached to a calendar day, and will list the journal for the day.

Now with the start of 1.7, I'll be doing a lot of long-awaited refactoring.

Monday, August 22, 2005

A first set of slides

I'm using Star Office, so I exported the slides as a .pdf file. See http://compstrm.sourceforge.net/ArkIntro.pdf

It's only 9 slides, so it's just a quick overview. Even so, I probably put in too much material.

The time has come to talk of many things.

For now, email is good enough. And yes, there's still a lot of room for improvement. Perhaps next year in version 2? I'm also thinking that custom displays would be nice, but I would not want to work on that until after 1.7, in any case.

To complete 1.6, I still need to do Cabinet-level dumps. As a major user, Norm needs this and I would find it helpful as well. But I've got a 3-day weekend coming up, Monday being a Sun India holiday, and I'm sure I can complete it then.

Meanwhile, aside from needing a code break, I really want to get going on some slides--at least an initial set, with more slides later. And this looks like the perfect time to do that.

Sunday, August 21, 2005

Email, it's a wrap, and it's great!

Email is a great way to add content to the Ark.

At the office, a lot of content comes via email, and I share my imap account with an Ark. I've set up a folder, Ark, and any email I want added to the Ark I just move to that folder--it's posted under the current day. I've also got a mail filter which grabs any email with the words "Ark Content" in the subject and plops it into the Ark folder.

Now with routing in place, anyone can send me an email with a subject in the form "Ark Content {WikiPathname} ..." and it ends up filed in the desired place in the Ark.

Saturday, August 20, 2005

still pounding on email

Went through over 150 emails today, testing the new email processing code. As a side benefit, I'll note that I had NO threading problems. 1 update or many queries is working very well now, thank you. I get a wealth of emails at work and all my discarded emails make for great test data.

I've now got email header retention to the barest minimum. Also, the code is a whole lot cleaner. And that's good, 'cause I still have a few things I want to do, like promoting text/plain content.

I still very much need to change gears and write some slides. Perhaps in another week, but not too much longer than that, I can get started.

Tuesday, August 16, 2005

changing priorities

There is ALWAYS more code to write! But I am thinking now that code is not the highest priority.

Email is now in beta. That makes it a whole lot easier to capture content at the office. And content is what attracts users. And users find high-level documentation helpful when getting started.

So my thoughts are turning to doing some slides, finally.

Yes, email routing is important, and cabinet dumps are a necessary tool. I'll get to them. But for the moment, I'm going to take some time off from being a code junky.

Monday, August 15, 2005

email--what's left?

First, I'd like to do some stress testing. Because of the Ark's slow update, reading email represents a considerable load. I'd also like to see some user experience with email before declaring it beta.

After that, it's subject-based routing. I've talked with several users and potential users, and this is a key feature. Now I'm not talking about filtering, but something more direct--like the pathname of the Topic where the EmailLSec is to be created. (In the case of subject routing, a copy would NOT be added to the calendar.) But there are some details to be worked out:
  1. Let's start the subject line with a keyword, like "route", followed by the destination pathname.
  2. If there is an error, the email is posted to the calendar with an error header.
  3. The destination Topic must already exist, at least to start. Automatic Topic creation is a nice feature, but I'd like to place some constraints on it--like not automatically creating a Cabinet!
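
The three rules above can be sketched as a small routing function. This is only an illustration under stated assumptions: the function name, the return shape, the pathname format, and the error wording are all hypothetical, and the set of existing Topics stands in for a real lookup in the Ark.

```python
# Hypothetical sketch of subject-based routing following the three rules:
# keyword plus pathname, errors fall back to the calendar, and the
# destination Topic must already exist.
def route_email(subject, existing_topics, keyword="route"):
    """Return (destination, error).

    destination is None when the email should be posted to the calendar;
    error is a message for the error header, or None if nothing went wrong.
    """
    words = subject.split()
    if not words or words[0].lower() != keyword:
        return None, None  # not a routed email: ordinary calendar posting
    if len(words) < 2:
        return None, "missing destination pathname"
    path = words[1]
    if path not in existing_topics:
        return None, "destination Topic does not exist: " + path
    return path, None

topics = {"ReadMe/CswCmds"}
ok = route_email("route ReadMe/CswCmds weekly notes", topics)
bad = route_email("route Nowhere/Page", topics)
```

In this sketch `ok` carries the destination pathname and no error, while `bad` has no destination and an error message, so the email would land in the calendar with an error header.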

Saturday, August 13, 2005

progress on email

I've changed pop and imap now so that less time is spent under the update lock.

I'm also now processing emails successfully, including multi-part and nested emails.

I still need to update the display to handle html content as well as binary files.

Thursday, August 11, 2005

Things are looking up

With the release of 1.6-0532b, the Ark is behaving reasonably again.

Now, finally, I can focus on the fine details of displaying emails properly.

I'll also note that when reading email, the database locks up for longer than necessary... Gotta change that!

This release is beta, except for email. Internal email data formats are subject to change.

falling forward

With the bugs found in the latest release of 1.5, I plan to fall forward to 1.6, making it beta except for the use of email, which is still strictly alpha, and deprecating 1.5.

:-(

slow progress on email

I'm finding many new and exciting ways to destroy database integrity. (I had forgotten how much fun email is to process.)

So don't expect much anytime soon.

Wednesday, August 10, 2005

pulling in new users

Found a great way to get new users onto the Ark today at work.

Just put content in the Ark and then pass out a URL via email. (This was for a new project I started today.)

Now, I can't wait until I can archive project emails in the Ark. Everything in one place. That's when payback begins. :-)

email headers--just too much

My first hack at email headers was to build a list of headers to be excluded from most displays. That didn't work very well. There seems to be an endless number of email headers in use. And all those headers are slowing things down, besides looking quite ugly.

So I figure I'll create an "AllHeaders" LSec for each mail, with all the email headers there as TextContent, not as headers, keeping only from and subject on the top-level email LSec.

I may also need to keep a few more headers, like content-type, but these I can exclude from most displays.
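
A rough sketch of that header split using Python's standard `email` package. The function name and sample message are hypothetical, and the returned text block merely stands in for the TextContent of an "AllHeaders" LSec; this is not the Ark's own code.

```python
from email import message_from_string

# Hypothetical sketch: keep only from and subject as real headers and
# flatten every header into one text block for an "AllHeaders" section.
KEPT = ("from", "subject")

def split_email_headers(raw):
    """Return (kept_headers, all_headers_text) for a raw email string."""
    msg = message_from_string(raw)
    # Message lookups by header name are case-insensitive.
    kept = {name: msg[name] for name in KEPT if msg[name] is not None}
    all_headers_text = "\n".join(
        f"{name}: {value}" for name, value in msg.items()
    )
    return kept, all_headers_text

raw = "From: a@example.org\nSubject: Hi\nX-Mailer: Test\n\nbody\n"
kept, all_text = split_email_headers(raw)
```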

Once headers are under control I can then start looking at displaying the email content in a better form.

Shouldn't take too long. I'm also thinking that soon after finalizing header processing, 1.6 can go beta.

handling multi-part email

Well, I've built up a list of headers not to display, and use it in the topics and default displays.

I've also managed to create an lsec for each mime-part.

The problem at this point is dealing with the displays of the message content. This is going to be a long road, starting with encoded text. I figure for html, I need to start using frames, too.

And then there's the attachments.

Tuesday, August 09, 2005

some progress on reading mail

I am delighted to report that Data (the Ark) is now able to read email from both imap and pop3 servers, posting them to $cd (current day) in the calendar.

Works great, but for two issues:
  1. Too many headers are created! For now I plan to suppress the display of selected headers. Later this should be configurable.
  2. Crash and burn on multi-part mime email. I need to create separate LSecs for each part.
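
For the second issue, one way to get a separate entry per part is to walk the MIME tree and collect the leaf parts. The sketch below uses Python's standard `email` package purely as an illustration; the function name and sample message are hypothetical, and each returned pair stands in for what would become its own LSec.

```python
from email import message_from_string

# Hypothetical sketch: list the leaf parts of a (possibly nested)
# multi-part message, one (content type, payload bytes) pair per part.
def list_parts(raw):
    msg = message_from_string(raw)
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold other parts, not content
        parts.append((part.get_content_type(), part.get_payload(decode=True)))
    return parts

SAMPLE = (
    "From: a@example.org\n"
    "Subject: demo\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XX"\n'
    "\n"
    "--XX\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "--XX\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>hi</p>\n"
    "--XX--\n"
)
parts = list_parts(SAMPLE)
```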

I don't think this is going to take too long. But then we need a way to view attachments. :-(

reading email

I've reorganized the schedule a bit, inserting a new 1.6 to support reading email.

One of the problems I've faced in using the Ark is in adding content. In one application, all the content comes by email, and it's just too difficult to use. So I've only used the Ark for taking notes and documentation. Time to change that.

Meanwhile, I've closed out 1.5, and moved Cabinet-level dumps to the new 1.6.

http://www.compstrm.org/wiki/cc/CswIntentions

Monday, August 08, 2005

Adding (signon) events to the calendar

Things may have been "secured", but until you can see the hack attempts, you have no idea what is going on.

The calendar is looking like a good place to log events, including signon events. The idea is to create an LSec for each type of event in a day, and create LEnts for each event.

Well, at least we'll be seeing something in the calendar! ;-)

Sunday, August 07, 2005

Pretty Good Security

The CompStrm Wiki is now reasonably well secured, without recourse to encryption or SSL.

Avoiding SSL in particular means that it's a whole lot easier for anyone to bring up their own server.

Also, having reasonably good security together with access controls will be a big value-add.

I'll note that the weakest point is registration--it's just hard to communicate a password safely without encryption.

Single signon is one of the neater features, and is itself well secured. The catch here is that you must use the same username/password in all the Arks, and their clocks must be within a few minutes of each other. (Different timezones are not a problem.)

Friday, August 05, 2005

Securing CompStrm

I seem to have my security hat on lately. Been thinking about biting the javascript and implementing a sha challenge into the logon page. That would be a whole lot nicer than passing the password in the clear each time. (I'd still pass it in the clear during registration.)

Anyway, today I fixed the security holes in my use of cookies (unreleased):

  1. Cookies have been enhanced. Rather than carrying the internal form of the password (a sha hash of the password + a constant), cookies now carry the expiration timestamp and a sha hash of the internal password + the expiration timestamp.
  2. When validating a cookie, the timestamp from the cookie is first checked to see if it has expired.
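
The cookie scheme above can be sketched as follows. Several details are assumptions, since the post does not give them: the SHA variant (SHA-256 here), the "expires:digest" cookie layout, the value of the constant, and all function names.

```python
import hashlib
import hmac

# Hypothetical sketch of the cookie scheme described above.
def internal_password(password, constant="site-constant"):
    """Internal form of the password: a hash of password + a constant."""
    return hashlib.sha256((password + constant).encode("utf-8")).hexdigest()

def make_cookie(internal_pw, expires):
    """Cookie = expiration timestamp plus hash(internal password + timestamp)."""
    digest = hashlib.sha256(f"{internal_pw}{expires}".encode("utf-8")).hexdigest()
    return f"{expires}:{digest}"

def check_cookie(cookie, internal_pw, now):
    """Check expiry first, then compare the hash against the expected value."""
    expires_text, _, digest = cookie.partition(":")
    try:
        expires = int(expires_text)
    except ValueError:
        return False
    if now >= expires:
        return False  # expired cookies are rejected before any hashing
    expected = hashlib.sha256(f"{internal_pw}{expires}".encode("utf-8")).hexdigest()
    return hmac.compare_digest(digest, expected)

pw = internal_password("secret")
cookie = make_cookie(pw, 1000)
```

Because the timestamp is part of the hashed input, changing the expiration in the cookie invalidates the hash, and the cookie no longer exposes the internal form of the password directly.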

Thursday, August 04, 2005

rcc--Remote cc cmd

The rcc command will work the same as the cc command, except that it navigates to a Topic in a remote Ark.

This command takes the same arguments as the cc command, except for an additional first argument which gives the hostname or hostname:port of the remote Ark.

Typical usage would be as a link embedded in text content:
{CommandHelphttp:/wiki/rcc/www.compstrm.org/ReadMe/CswCmd}

And yes, the rcc command will make use of single signon.
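
As an illustration of the argument handling only, here is a sketch that peels the host (and optional port) off the front and leaves the rest for cc-style processing. The function name and the default port of 80 are assumptions, not part of the design above.

```python
# Hypothetical sketch: split rcc arguments into the remote host, the
# port, and the remaining arguments that cc would normally receive.
def parse_rcc(args, default_port=80):
    if not args:
        raise ValueError("rcc needs a hostname or hostname:port argument")
    host, sep, port_text = args[0].partition(":")
    port = int(port_text) if sep else default_port
    return host, port, args[1:]

target = parse_rcc(["www.compstrm.org", "ReadMe", "CswCmd"])
local = parse_rcc(["localhost:8080", "ReadMe"])
```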

Wednesday, August 03, 2005

navigating a list

Quite often, I want to navigate down a list, viewing each item in turn. This could be done by adding next and prior links to each item, but that is not an agile approach, as it mixes structure and content.

The Ark already has a concept of lists. Indeed, a Topic may be added to more than one list. All we really need are the navigational commands, next and prior.
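
A tiny sketch of what next and prior might compute, assuming a list is an ordered sequence of Topic names. Since a Topic can belong to several lists, the list being navigated is passed in explicitly. All names here are hypothetical.

```python
# Hypothetical sketch of next/prior navigation over an ordered list.
def neighbor(items, current, step):
    """Return the item `step` positions away from `current`, or None at
    either end. Raises ValueError if `current` is not in `items`."""
    index = items.index(current) + step
    if 0 <= index < len(items):
        return items[index]
    return None

def next_item(items, current):
    return neighbor(items, current, 1)

def prior_item(items, current):
    return neighbor(items, current, -1)

slides = ["Intro", "Cabinets", "Email"]
```

Keeping the ordering in the list, rather than as links stored on each item, is what keeps structure and content separate.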

Tuesday, August 02, 2005

single signon

Right now we have a very primitive security system:
  • There is no SSL, no certificates. (Makes it much easier to put up a server.)
  • There is no encryption. (No restrictions on running in France or anywhere else.)
  • Persistent cookies are optional and hold the user name and a sha hash of the password.
  • There is no delete user or change password.

With such a system, it should be trivial to implement single signon:

  1. The user clicks on a link to the server being used (the link contains a request for another server).
  2. The server creates a redirect with the user name and a token for the other server. (The token consists of the time and a sha hash of the time and the user password.)
  3. The browser redirects the user to the other server, passing the username, token, and request.
  4. The other server does a session logon (cookieless) and returns the requested page. If the user did not exist or the login fails, the login/registration page is displayed.
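
Steps 2 through 4 can be sketched like this. The query parameter names, the SHA variant (SHA-256), the function names, and the 300-second clock tolerance are all assumptions made for illustration; the post only says the token consists of the time and a sha hash of the time and the user password. The user table maps a user name to the stored form of the password, which both servers are assumed to share.

```python
import hashlib
import hmac
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical sketch of the single signon redirect and its validation.
def _token_digest(issued, password_hash):
    return hashlib.sha256(f"{issued}{password_hash}".encode("utf-8")).hexdigest()

def build_redirect(other_host, request_path, user, password_hash, now):
    """Step 2: redirect URL carrying the user name, time, and hash."""
    query = urlencode({
        "user": user,
        "time": now,
        "hash": _token_digest(now, password_hash),
    })
    return f"http://{other_host}{request_path}?{query}"

def accept_redirect(url, known_users, now, max_skew=300):
    """Step 4: return the user name if the token checks out, else None
    (meaning the login/registration page should be shown)."""
    params = parse_qs(urlsplit(url).query)
    try:
        user = params["user"][0]
        issued = int(params["time"][0])
        digest = params["hash"][0]
    except (KeyError, ValueError):
        return None
    password_hash = known_users.get(user)
    if password_hash is None or abs(now - issued) > max_skew:
        return None
    if not hmac.compare_digest(digest, _token_digest(issued, password_hash)):
        return None
    return user

users = {"bill": "abc123"}
url = build_redirect("ark2.example.org", "/wiki/cc/ReadMe", "bill", "abc123", 1000)
```

Including the time in the token is what limits how long a captured redirect stays usable, which is also why the servers' clocks need to roughly agree.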

The key here is for the user to have the same name/password on all servers. Then it becomes pretty seamless.