Wiki Dream

Friday, May 06, 2005

Dynamics of Peer Production

I've become fascinated over the past few months with what Yochai Benkler calls Commons-Based Peer Production (CBPP), which he proposes as a new means of production that complements both the market and the firm. The key idea behind CBPP websites is that they are designed to facilitate self-organizing communities that create (and use) information goods in an intellectual commons.

Prominent examples of such CBPP websites are the wikiwikiweb, wikipedia, slashdot, kuro5hin, clickworkers, sourceforge, citeseer, blogspot, flickr, groklaw, and many more. These sites all allow any netizen to join the community and to add content to the common information repository.

What is interesting about these sites is that they also have engineered-in mechanisms that allow the community to cooperate in enhancing the quality of the information. Thus, not only does the community create content, but it also passes judgement on the content and keeps a document history that records every change to the information. In some cases, the meta-level discussions about the content are also fully recorded (e.g. in mailing list archives at sourceforge or in the discussion pages of wikipedia).

The software behind these sites is, in many cases, being released into the intellectual commons and is being used to create new sites. For example, the software used to build the wikipedia is mediawiki. It is being used to build hundreds of other CBPP sites, such as the sourcewatch site for exposing bias and deception in news stories. Likewise, the sourceforge infrastructure is available commercially for supporting open-source-style models within a firm or educational institution. Blogging software is available for download and modification at several sites, including, for example, blojsom.

I have a suspicion that there are design principles that can be applied to CBPP sites that can explain why some sites are successful while others flounder. The best way to induce these principles is probably to embark on an in-depth study of the most successful such endeavors.

CBPP also challenges the foundational notions of intellectual authority that we have developed in our culture. For example, the wikipedia provides an example of an information source that is by design completely untrustworthy at any particular moment, because anyone could insert anything into any article at any time!

After only five years, the wikipedia website offers a wide range of articles in many languages, and many of the articles are comparable in quality to those in commercial encyclopedias, except during the intervals between the introduction of incorrect information and its correction! As an experienced user of wikipedia, you are simultaneously skeptical of the particulars of the content of any article, and yet fairly confident that much of what you see in an article will be corroborated by other sources. The nature of this information source encourages critical thinking about information sources.

Friday, January 28, 2005

Servlets and constant-time computing

On the way into work today I was thinking about the requirements for servlets, that is, programs that are invoked by visiting a web address, and that, after calculating for a while, generate a webpage which is returned to the user. Interactive webpages such as the post-a-blog-entry pages here at blogger.com are examples of servlets.

I've been using JScheme to write servlets for several years now. The language is simple and concise, and it allows me to rapidly develop fairly complex servlets. When writing servlets, you generally want the servlet to do very little computation and to return a webpage to the user very quickly. In fact, most users will probably back up or visit another website if the server does not respond within 10-15 seconds.
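One way to make that response-time limit concrete is to put a deadline around the handler itself. The following is only a minimal sketch in Python (not JScheme, and not how my own servlets are written); the names run_with_deadline and BUDGET_SECONDS and the 10-second figure are assumptions chosen for illustration.

```python
import concurrent.futures

# Assumed budget: users are said to give up after roughly 10-15 seconds.
BUDGET_SECONDS = 10.0

# A small shared pool of worker threads that run the page handlers.
_executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def run_with_deadline(handler, request, budget=BUDGET_SECONDS):
    """Run handler(request); return (status, body).

    If the handler has not finished within `budget` seconds, stop waiting
    and return a short error page instead. The worker thread is not killed;
    it keeps running in the background until the handler returns.
    """
    future = _executor.submit(handler, request)
    try:
        return 200, future.result(timeout=budget)
    except concurrent.futures.TimeoutError:
        return 503, "Sorry, this page took too long to generate."
```

Note that this only limits how long the user waits. It does not bound the work the server does, which is why a bound computed ahead of time would be more useful.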

For the website developer, it would be very helpful to have an accurate upper bound on the amount of time each of your servlets would take. If the servlet interacts with a database or reads/writes from the file system, the time estimate would need to take into account the size of the database tables and the files being read/written. It would also need to bound all loops and recursion, and come up with an estimate of running time and space that could be converted to seconds and bytes (depending on the speed of the computer, the type of compiler, etc.).
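To give a feel for what such an estimate might look like, here is a rough Python sketch of a worst-case time calculation. Everything in it is hypothetical: the CostModel and Workload names, the idea that cost is linear in rows, bytes, and loop steps, and the per-unit timings, which would have to be measured on the actual machine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    """Assumed per-unit costs for one machine, in seconds (must be measured)."""
    seconds_per_row: float    # scanning one database row
    seconds_per_byte: float   # reading or writing one byte of a file
    seconds_per_step: float   # one iteration of a loop or recursive call


@dataclass(frozen=True)
class Workload:
    """Declared upper bounds on what a single servlet invocation may touch."""
    max_table_rows: int
    max_file_bytes: int
    max_loop_steps: int


def worst_case_seconds(work, model):
    """Linear upper bound on running time: each bound times its unit cost."""
    return (work.max_table_rows * model.seconds_per_row
            + work.max_file_bytes * model.seconds_per_byte
            + work.max_loop_steps * model.seconds_per_step)


def fits_budget(work, model, budget_seconds=10.0):
    """True if the worst-case estimate stays within the response-time budget."""
    return worst_case_seconds(work, model) <= budget_seconds


if __name__ == "__main__":
    model = CostModel(seconds_per_row=1e-5, seconds_per_byte=1e-8,
                      seconds_per_step=1e-7)
    work = Workload(max_table_rows=100_000, max_file_bytes=10_000_000,
                    max_loop_steps=1_000_000)
    print(worst_case_seconds(work, model), fits_budget(work, model))
```

With these made-up numbers the estimate comes to roughly 1.2 seconds. The hard part, of course, is deriving the bounds in Workload automatically from the servlet's source rather than declaring them by hand.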

The curious aspect of this type of program is that, ideally, you want a small, fixed upper bound on the runtime of every servlet you write (e.g. well under 10 seconds). I don't know of any programming environments that will automatically try to construct such estimates for you, but it looks to me like there may be a niche for such systems, and it would be fun to build one!

Thursday, January 27, 2005

Ken Anderson: In memoriam

My friend and colleague, Ken Anderson, died last week of a heart attack while attending an (anti-)spam conference. Ken and I were the main developers of jscheme along with Peter Norvig.

Ken was a great guy, in addition to being a really talented software developer. The world was a richer place for his presence. He will be sorely missed by his family, friends, and colleagues.

Here is a webpage that Geoffrey Knauth put up about Ken.

Dreaming of Wikis

We're mapping out this website for the CS33b: Internet and Society class at Brandeis University, and this is my first entry. My long-term goal is to develop a programming language that is powerful enough for professionals to use as their main tool, yet simple enough that non-scientists can reach a high level of competence in a matter of weeks. My current approach is based on the JScheme language. The Intro to Computers course at Brandeis uses this approach.