Monday, July 26, 2004

Technorati and Software Development

Heavy users of Technorati may have noticed today that the service is currently suffering from major issues, and not just of the garden variety we've all become accustomed to, either. The way in which Technorati has chosen to unveil its latest changes flies in the face of what I consider to be a fundamental law of web development: never use the same platform for developing and providing services if you can at all help it, in order to avoid causing service outages if any of the changes you implement screw things up.

I don't know what Technorati's cash flow situation is like, but I very much doubt that it can be so bad that the firm can't afford to acquire a few old machines to use for beta testing purposes, rather than rolling out breaking changes without warning and then scrambling to fix things (if they are even scrambling, that is). The sheer unreliability of the service is getting to be so irritating that I'm feeling more and more of a mind to put together a home-grown alternative of my own, by which I mean not an aggregator like the excellent Bloglines, but a spidering/indexing service which confines itself to weblogs alone, and is able to return both the number of links to a blog and the context in which a given link occurs.

One additional feature I'd like in my implementation, should I get around to it, is to do away with the whole link-aging thing used by Technorati: why should I care that a given link is more than X days old, as long as I'm able to search and sort links by age?
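To make the idea a little more concrete, here is a minimal sketch of the kind of index such a service might keep. Everything in it is hypothetical: the `Link` and `LinkIndex` names, the fields, and the in-memory storage are illustrative assumptions, not a description of how Technorati or any real service works. It simply records each inbound link with its surrounding text and discovery date, never expires anything, and leaves sorting by age to the caller.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Link:
    """One inbound link found by the (hypothetical) spider."""
    source: str   # URL of the post that contains the link
    target: str   # URL of the weblog being linked to
    context: str  # text surrounding the link in the source post
    seen: date    # date on which the spider found the link


class LinkIndex:
    """In-memory index of inbound links, keyed by target URL."""

    def __init__(self):
        self._inbound = defaultdict(list)

    def add(self, link):
        self._inbound[link.target].append(link)

    def count(self, target):
        # .get() avoids creating empty entries for unknown targets
        return len(self._inbound.get(target, []))

    def links_to(self, target, newest_first=True):
        # No link is ever aged out; the caller chooses the ordering.
        return sorted(
            self._inbound.get(target, []),
            key=lambda link: link.seen,
            reverse=newest_first,
        )
```

Calling `links_to(url, newest_first=False)` returns the oldest links first, which is all that "sorting by age" really requires; a real service would need persistent storage and a crawler on top of this.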

Saturday, July 24, 2004

Recursion and Dynamic Programming

Eric Lippert put up an interesting post on dynamic programming some days back which I've only just gotten round to discovering. Here's a topic that's very much an interest of mine, as sequence alignment techniques like the Needleman-Wunsch and Smith-Waterman algorithms make heavy use of it.
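For anyone who hasn't seen the connection, here is a rough sketch of the Needleman-Wunsch recurrence computed bottom-up, which is dynamic programming in its most familiar form. It is not taken from Lippert's post; the function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen purely for illustration, and it returns only the optimal score rather than the alignment itself.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of strings a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per symbol
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap inserted in b
                score[i][j - 1] + gap,       # gap inserted in a
            )
    return score[-1][-1]
```

Each cell is computed once from three already-filled neighbours, so the whole table costs time proportional to `len(a) * len(b)`, where a naive recursion would revisit the same subproblems exponentially often.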

Thursday, July 22, 2004


Microsoft's Internet Explorer team now has a blog of its own. It will be interesting to see just what comes of it.

Thursday, July 15, 2004

Mozilla Code Review and Super-Reviewers

A useful list of who owns what, and which modules require super-review before check-in. Also recommended for would-be Mozilla developers are the SeaMonkey Code Reviewer's Guide and the SeaMonkey Engineering Bible.

SQL Server Data Mining FAQ

Available at the SQL Server Developer Center, "dedicated to discuss[ing] issues on the Microsoft Analysis Services data mining functionality available with the Microsoft SQL Server product."

New Issue of Phrack Released

Issue 62 of the notorious hacker journal Phrack recently hit the streets - or, to be more precise, the Web. As is to be expected, there's lots of interesting stuff in it, including guidance on how to circumvent third-party buffer overflow protection on Windows, an article on kernel-mode backdoors (i.e., rootkits) in NT-based Windows systems like Windows 2000, Windows XP and Windows Server 2003, and a tutorial on using process injection to bypass software firewalls on Windows - the sample provided works against Zone Alarm 4 (free and pro versions), Sygate Pro 5.5, BlackIce 3.6, and even Tiny Firewall 5.0 (though this last one did require a bit more effort). The lesson to take from the last of the three articles mentioned is a simple one - if security is important to you, don't expect a software firewall to do the job, or at least not one that's also running on a Windows machine.

I wonder if Microsoft's security people are Phrack users. They ought to be, and if they are I'm betting this latest article ought to keep them awake for a few nights. These Phrack guys are good at what they do.

Project Looking Glass

The home page of the Java 3D window manager project.

Project Looking Glass is based on Java technology and explores bringing a richer user experience to the desktop and applications via 3D windowing and visualization capabilities. It is an open development project based on and evolved from Sun Microsystems' advanced technology project. It will support running unmodified existing applications in a 3D space, as well as APIs for 3D window manager and application development. At the moment, existing application integration is supported for Linux platforms.

The project intends to break two boundaries -- the 2D-ness of the current desktop environment and the way the desktop environment evolves. Project Looking Glass is in its infancy. We need to explore lots of ideas and possibilities. We're releasing the Project Looking Glass code to the whole community to explore every aspect of the technology rather than restricting access to a privileged few. We believe this open development is an excellent model to pursue this exciting and vast opportunity. So, your involvement is eagerly anticipated.
Talking about visualization is all well and good, but I've yet to see anything about Looking Glass that suggests it'll bring any real benefits to users beyond the sheer "gee whiz" element, and much the same could be said for the 3D windowing features shown off in Longhorn at WinHEC earlier this year. If Looking Glass and Avalon are to be more than opportunities to show off some 3D eye candy, they'll either have to provide radically new methods of interaction that aren't easily implementable using the traditional "PC as virtual desktop" paradigm, or (more likely) it'll be through their making it possible for others to do so. All that is likely to come from merely turning the desktop into a 3D space is that users will find interesting new ways to misplace items and become disoriented.

Free Online Books for ACM Subscribers

The following is from a message that's just arrived in my inbox.

ACM's Professional Development Centre proudly announces its new online books service, powered by Books24x7(r). Now all student and professional ACM members can take advantage of free, unlimited access to 395 online volumes, selected from the highly regarded ITPro Collection hosted by Books24x7. Members can look for books to help with their PD Centre online courses or just read up on new subjects! Members can also review citation information on books via direct links to The Guide for Computing Literature from the books list.

Topic areas covered include C++, C#, Java, ASP, SQL, PHP, Linux, .NET, Visual Basic, Data structures, Data mining, Networking, Security, and Web Design.

These complete, unabridged books can be searched (keywords, author, title, ISBN, publisher), bookmarked, or read from cover to cover. A personalized bookshelf allows for quick retrieval of a favorite book while bookmarks allow easy return to specific places in a book. Members can view as many books as they like, as often as they like. In addition, an ACM specially priced upgrade ($249) to the full ITPro Collection with over 3,000 volumes will be available in the near future.
Very nice. It's too bad reading books on a PC monitor is still such a pain.

Visual Studio 2005 - Product Line Overview

At last, Microsoft finally gets around to providing some information about product differentiation in its upcoming Visual Studio 2005 release.

I like most of the options offered under the Visual Studio Professional Edition column, and the absence of Visual SourceSafe (which I've never liked) is something I can live with, as is the lack of support for unit testing. What does get on my nerves is that one should have to buy Visual Studio Team System to get support for code profiling and static analysis, which I find simply ridiculous. For goodness' sake, the freely available GCC compiler suite has had support for code profiling forever, and yet here's Microsoft asking developers to hand over $2500 (or likely far more) for a feature which one ought to take as a given in a decent development suite.

I can't wait to see what eye-watering numbers Microsoft decides to attach to the Professional and Team System editions come 2005. A full copy of Visual Studio .NET Professional Edition is listed at $1,079, while even the "Competitive Upgrade" goes for $489; by way of contrast, an MSDN Professional subscription that provides the same development suite as well as all of Microsoft's operating systems goes for $1,189. Why would anyone go for the standalone package in the face of such pricing? I suppose that's the whole idea, of course ...

Wednesday, July 14, 2004

PHP 5 Released

A Slashdot story informs us of the release of PHP 5. Following is a list of key features.

  • The Zend Engine II with a new object model and dozens of new features.
  • XML support has been completely redone in PHP 5, all extensions are now focused around the excellent libxml2 library.
  • A new SimpleXML extension for easily accessing and manipulating XML as PHP objects. It can also interface with the DOM extension and vice-versa.
  • A brand new built-in SOAP extension for interoperability with Web Services.
  • A new MySQL extension named MySQLi for developers using MySQL 4.1 and later. This new extension includes an object-oriented interface in addition to a traditional interface; as well as support for many of MySQL's new features, such as prepared statements.
  • SQLite has been bundled with PHP. For more information on SQLite, please visit their website.
  • Streams have been greatly improved, including the ability to access low-level socket operations on streams.

Tuesday, July 13, 2004

Bioinformatics and Comparative Genomics

A pretty decent, if short, primer on the subjects.

Monday, July 12, 2004

Windows XP Service Pack 2 Delayed Again

Or so says Slashdot, at any rate. I really can't believe that Microsoft has done this yet again - when is this service pack ever going to see the light of day?

Sunday, July 11, 2004

VTK - Visualization ToolKit

An extremely nice, freely available, open source visualization toolkit with support for a broad range of algorithms. It's strange that it isn't better known.

Software Express for Solaris

Seems like Microsoft isn't the only company getting on the "Express" bandwagon: development builds of Solaris are now available for download as (hmm, guess what?) "Software Express for Solaris". Unfortunately, there's a hitch, and it isn't a minor one either: one has to already own a Solaris licence to use the product, and the lowest outlay for one of those is $289 ($99 for a single-cpu license + $95 for Software w/ DVD-ROM Media + $95 for Administrator's CD/DVD Media and Installation Docs).

Frankly, Sun's policy doesn't make the least bit of sense to me. Linux is freely available and eating into the company's market share from below, while 6-month trial copies of Windows Server 2003 can be freely downloaded from Microsoft's website. Exactly how many would-be Solaris administrators and evangelists does Sun expect to fork out $300 simply for the privilege of trying out its operating systems? I thought the name of the game was to flood the market with people who are well-versed in caring for your products, thereby reducing the risk and TCO of buying your stuff for potential customers, but Sun's management evidently has other ideas.

PS: Actually, I spoke too soon. Sun does indeed have a program providing free copies of Solaris 9 for evaluation. It's still not as attractive as Microsoft's offering though, as the evaluation period is only 60 days.

Saturday, July 10, 2004

Classification of Finite Simple Groups

This month's Notices has an interesting article by Michael Aschbacher on the state of the classification that was supposedly finished off in 1980. Surprisingly enough (or perhaps not so surprisingly), it turns out that there was still quite a bit of work left to do after all; a great deal of effort was put into simplifying the proof to a level requiring no more than an acquaintance with elementary graduate textbooks like Rotman's Introduction to the Theory of Groups, while increased scrutiny saw all sorts of gaps appear in the proof, some of which stubbornly refused to be closed until recently.

Dictionary of Algorithms and Data Structures

An NIST-hosted site with a cornucopia of entries on all sorts of useful algorithms and data structures. The entries are generally on the brief side, but it's a useful resource to know about even so.

SQL Server 2005 Express Edition Overview

A nice, in-depth article on the functionality and management of SQL Server 2005, including comparisons with older technologies like Microsoft Jet and MSDE. This product has the potential to knock the stuffing out of a lot of the Windows market for products like MySQL, if only because it has all those high-end features one expects from a real RDBMS (e.g. views, triggers, stored procedures, foreign key support) at a price that can't be beat - gratis. MSDE also had all those features, but its 5-concurrent-user limitations made it extremely unattractive by comparison with MySQL.

I'm really impressed by the thought that's gone into making sure SQL Server 2005 Express is secure out of the box. For instance, the "sa" account is disabled by default if Windows Authentication is in use, and should the account be enabled, it will require a cryptographically strong password from the administrator; neither Named Pipes nor TCP/IP is enabled in a default installation.

There is one thing worth noting about SQL Server 2005 Express, however, which is that memory requirements have gone up since the old SQL Server 2000 days. The minimum amount of RAM suggested for a host machine is 256MB, with at least 512MB of memory recommended.

Friday, July 09, 2004

Implementing SVG on OS X

A clueful post about why implementing SVG support on OS X isn't quite as hard as Dave Hyatt makes it out to be. The work's already half done, as the KDE Project has been working on KSVG for some time now. Even the claim that "the SVG standard is too large and confusing" is a copout - ever heard of "SVG Tiny?" There's just no excuse for adding proprietary extensions to HTML to support Dashboard, as Apple seems intent on doing.

Database Abstraction Layers Considered Harmful

Jeremy Zawodny has a thought-provoking piece up about why "Database Abstraction Layers Must Die!" I don't really agree with it (and I don't think it's quite as much against the conventional wisdom as its title makes it out to be), but it's definitely something to give some serious thought to.


A nice article on B-Trees (or Balanced Trees, for the pedantic), useful for those who've been out of school a little too long, and even for those who've never been to school at all. "Why are B-Trees important?" one may ask, and the answer's straightforward enough - they can drastically reduce the number of accesses to slow secondary storage required to get at any given record.

The importance of understanding fundamental data structures can't be emphasized enough, and not understanding B-trees and doubly-linked lists ought to be a crime for anyone working with databases; for instance, SQL Server 2000's indexes are implemented as B-trees.
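A back-of-the-envelope calculation shows where the saving comes from. The sketch below is an idealized estimate, not a B-tree implementation: it assumes every node is full, that each level visited costs one disk access, and the figures used (a million records, 100 children per node) are made up for illustration. The `levels_needed` helper is likewise hypothetical.

```python
def levels_needed(num_records, fanout):
    """Smallest number of levels such that fanout ** levels >= num_records.

    Under the idealized assumption of full nodes, this is roughly the
    number of nodes that must be read to reach any one record.
    """
    levels, capacity = 0, 1
    while capacity < num_records:
        capacity *= fanout
        levels += 1
    return levels


# Compare a binary tree (2 children per node) with a wide B-tree node
binary_reads = levels_needed(1_000_000, 2)    # 20 levels
btree_reads = levels_needed(1_000_000, 100)   # 3 levels
```

Twenty reads versus three matters little in memory, but when each read is a trip to disk the difference is large, which is the reason database indexes favour wide, shallow trees.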

Fujitsu Bankrolling PostgreSQL Development

As a native version of PostgreSQL isn't yet freely available for Windows, I haven't been paying much attention to ongoing developments with that rather impressive open-source RDBMS. As such, it struck me as a big surprise to learn that Japan's Fujitsu Corporation has decided to back further work on the project, and it looks to be a long-term commitment too.

Fujitsu this week announced an expanded collaboration with Microsoft on servers for mainframe computing, but the Japanese hardware giant is also investing in open source, paving the way for a handful of new PostgreSQL functions that will benefit all of the open source database's users.

The Japanese company, folding Windows as well as Linux and other open source into its mix of strategy, will support the BSD-based PostgreSQL database with code contributions and underwriting development that will be a part of version 7.5 of the database, PostgreSQL core team member Josh Berkus said. It is expected to be available before the end of the year.

Berkus said Fujitsu, which brought in $45 billion last year, is the largest company to contribute directly to PostgreSQL to date, adding that the PostgreSQL community expects its relationship with Fujitsu to continue for "at least the next few years."

Fujitsu beats feature freeze

While Berkus referred to a July 1 freeze on features for the next version of the database, he reported that three new features in PostgreSQL -- Tablespaces, Nested Transactions, and Java support -- that are being underwritten by Fujitsu in partnership with Tokyo-based SRA will be included in version 7.5.

"Much of this new functionality will be present in the forthcoming release of PostgreSQL, which is shaping up as the most significant new release of the software since version 7.0 almost four years ago," Berkus said, referring to full point-in-time recovery and two-phase commit, data integrity and scalability improvements, native Windows edition, and solutions for high availability, clustering, and replication currently being developed for different user requirements.
The support for two-phase commits, nested transactions and the like is very cool, but from an adoption POV, by far the most significant aspect of this announcement is the bit about a native Windows edition. If the PostgreSQL developers can deliver on this promise (a big if, seeing as they made the same promise for version 7.3), it will likely lead to an explosion in PostgreSQL usage and visibility. MySQL is fast, but that's about all one can say for it; PostgreSQL is a real RDBMS.

Wednesday, July 07, 2004

Data Features in Visual Studio 2005

Robert Green, Program Manager for the Visual Basic team, gives a rundown of Visual Studio's new DB features in this video.

Tuesday, July 06, 2004

OS X Tiger Screenshots

A nice bunch of pictures of the upcoming release, highlighting new features like Automator, Dashboard, Spotlight and VoiceOver, the new spoken command interface.

Looking at these pictures, if there's one thing that one is made aware of, it's that Apple's graphic design people are by far the best in the business. None of the currently available Microsoft operating systems even come close to OS X for sheer graphic elegance, and it's fair to say that Windows XP, with its garish default "Luna" skin, actually fares worse than Windows 2000 and Windows Server 2003 in the comparison stakes. When will Microsoft learn that "easy to use" and "childlike" aren't one and the same?

PowerPC Emulation on IA32 Hardware

File this one under "Cool But Impractical Technologies" - PearPC is a PowerPC emulator which can run at anything from 400 to a mere 40 times slower than the real thing! Yes folks, now you too can use your 3GHz Pentium 4 boxes to emulate a PowerPC running at 77MHz!

Anyone serious about working with the PowerPC platform would likely be better off just getting a secondhand G3 iMac off Ebay or something. As a theoretical exercise this really isn't that awe-inspiring, as any Turing machine should be able to emulate another, given enough memory; still, there's a certain amusement factor to knowing that someone went to all that effort to actually implement such a thing.

Monday, July 05, 2004

PARC Publications on User Interface Research

A real treasure trove of information. Looking at how long ago many of these papers were published, it's amazing that desktop user interfaces have seen so little progress since the advent of the Apple Lisa in 1983, and what's even funnier is that the Lisa itself was based on work done way back in the 1970s.

An idea suggests itself here - might one postulate what might immodestly be called "Abiola's Law?" The dictum I have in mind states that desktop interface design will lag the cutting edge research by at least a decade; whether or not it has ahead of it a career as illustrious as "Moore's Law" is uncertain, but as retrodictions go, I'd say it's pretty decent.

TouchGraph GoogleBrowser

Although I have the distinct feeling that I've blogged about this tool before on my other blog, I don't see any harm in mentioning it here again. Information visualization constitutes the single most neglected aspect of information retrieval as far as I'm concerned, and as amusing a toy as the TouchGraph GoogleBrowser may be, it is possible to do far, far better; for example, one can look at Tamara Munzner's old Hyperbolic Viewer approach as one possible avenue of improvement.

If there's one aspect of Longhorn that I think worth getting excited about, it isn't so much what WinFS will make possible, as what the easy availability of 3D graphics primitives without going through the DirectX or OpenGL APIs will do for information visualization tools. Couple the arrival of a much more approachable 3D API with the relentless pace at which the demands of the gaming market have made ubiquitous ever more capable 3D graphics cards, and for the first time it looks as if all those advanced 3D interfaces cooked up ages ago, like Xerox PARC's Information Visualizer*, will finally be feasible on the average desktop. Humans, being primates, are largely visual creatures, and it strikes me as absurd that so little use has been made of this commonplace observation.

*More information is available to ACM Digital Library subscribers in a paper which can be downloaded from this link.

Google as Auditing Tool

This page lists Google searches that reveal potentially extremely embarrassing and financially damaging information. Following are a few examples:

  1. allinurl:auth_user_file.txt
  2. intitle:index.of config.php
  3. filetype:ini ws_ftp pwd
  4. inurl:"wvdial.conf" intext:"password"
There are even more outrageous examples available from the aforementioned page, including searches that bring up the personal financial details of quite a number of people - just the sort of information phishers live for. It's amazing how little so many site administrators seem to either know or care about security issues.

Interesting Browser Statistics

According to this page at least, Mozilla-based browsers are starting to make serious inroads into IE marketshare. What I'm curious about is where exactly these numbers came from, and how they were collected.

Structural Analysis of Proteins

In discussing the use of Java 3D for protein visualization, a blogger at Sun touches on a topic I've been fascinated by for quite a few years now - the structural analysis of proteins, and, in particular, predicting how a protein will fold, starting from just raw sequence data. This has to be one of the most important and yet most challenging tasks in computational biology today.

Using Google's Advanced Search Operators

Here's a guide with more tips on using poorly documented Google features to get better search results. I've long been familiar with the "filetype:", "link:", "site:" and "related:" operators, but the "allin[xyz]:" operators are new even to me.

Breaking Changes in System.Xml from v1.1 to v2.0 of .NET

Dare Obasanjo details changes to System.Xml in Beta 1 of .NET Framework v2.0 that will cause compatibility headaches for unwary developers.

Sunday, July 04, 2004

A Witty Comment on Security Disclosure

In response to yet another browser security hole found by Secunia, a Slashdot commenter made the following clever retort:

The best place to get a response when reporting a security bug is on Bugtraq. :)
Funny but all too true, as the experiences of this other commenter illustrate.

Lorenzo Colitti and I found the same hole several weeks ago, independently of Mark Laurence. I reported it to Mozilla on June 11 and to Microsoft and Opera on June 16. I got different results from each browser maker:

Mozilla (246448)
Fixed on June 14. Firefox 0.9 released with the fix June 14. Mozilla 1.7 released with the fix June 17.
Opera (145283)
No response.
On June 21, I received an e-mail containing the following: "... is by design. To prevent this behavior, set the 'Navigate sub-frames across different domains' zone option to Prompt or disable in the Internet zone. We are trying to get this fixed in Longhorn ... on getting this blocking on by default in XP SP2 but blocking these types of navigations is an app compatibility issue on many sites." I usually don't get any response from Microsoft when I report security holes to them; I think I only got a response this time because I used my employer's premier support contract with Microsoft.

Another cross-browser security hole I found (162020) got similar responses from each browser maker: fixed in Mozilla 1.7 and Firefox 0.9; no response from Opera; confusing statement from Microsoft mentioning XP SP2. 162020 is an arbitrary code execution hole.

Although they're all quick to complain about "irresponsible" disclosures of vulnerabilities by individuals who are supposedly unwilling to give them the time to come up with patches, the ugly reality is that most commercial software vendors prefer to ignore such problems unless there's some media pressure preventing them from doing so. There's nothing quite like a Reuters report alerting the entire world to some embarrassing new security hole to get the attention of indifferent corporate entities.

One more thing worth noting: the way in which the Bugzilla database enables such rapid turnaround in bug-fixing time really ought to serve as an inspiration to other organizations. It's good to see Microsoft providing a product feedback center for Visual Studio 2005, but what's needed now is something along the same lines devoted to security problems; one shouldn't have to trawl through the Bugtraq archives to see what, if anything, has been said about a potential problem, only to then be forced to report it by broadcasting it to the entire world in order to get a meaningful response. If Microsoft had a security database where registered outsiders could log and track whatever issues they'd discovered, the impetus for "disclosure by public ambush" would be reduced considerably.

SQL Server Express Manager

Microsoft's Eric Feng, Program Manager for SQL Server Express, provides a rundown of SQL Server Express Manager, or XM for short. XM is described as a soon to be released lightweight tool for use in managing both local and remote instances of SQL Server.

Converting Firefox 0.8 Extensions

A helpful page with information on how to convert Firefox 0.8 extensions to work with the new API introduced with Firefox 0.9. Although there's an overview of a few of the tags in the "install.rdf" file, here's the link to Ben Goodger's description of the file's format. Goodger's document should be considered the authoritative guide as to what is permissible/recommended and what isn't.

Selectively Deleting Saved Form Information in Firefox

Firefox's "Autosave" of form information is usually a wonderful thing, but there are times when it really does get on one's nerves. This can especially be the case when one's made a typo, after which one then has to deal with seeing the erroneous information offered as an alternative forever henceforth.

Short of deleting all saved form information, there's been no easy way to correct such irritating mistakes, but courtesy of this page on the Mozilla Forum, here's a tip that is useful in selectively getting rid of such entries.

If you're using 0.9, you should be able to delete individual entries on form elements (what the original poster mentioned) within web pages with Shift-Delete.
I can verify that this also works with the 0.9.1 release of Firefox, and as someone else notes on the page, it works with the URL dropdown list as well.

Friday, July 02, 2004

No POSIX Support in Windows XP

I hadn't realized that the OS/2 and POSIX subsystems included with both Windows NT and Windows 2000 had been pulled from Windows XP; going by this document, they're also absent from Windows Server 2003. OS/2 support appears to have simply been discontinued, but the POSIX subsystem functionality at least can be obtained by installing the freely available Windows Services for UNIX (the actual download is on this page).

Thursday, July 01, 2004

Debugging in Visual Studio 2005

Microsoft's Andy Pennell discusses new debugger features in the Whidbey release of Visual Studio, Beta 1 of which should soon be available on MSDN Universal.

Dashboard is not a Konfabulator Clone

Via MacMinute, I came across this interesting article by John Gruber about the differences between Apple's Dashboard and Arlo Rose's Konfabulator project. It confirms what I'd been suspecting myself - that what Apple's trying to do is a lot more ambitious than Konfabulator's objectives, and the superficial similarity between the two has been seized on by the technically illiterate who are always in search of a nice "Goliath cheats David" story.