I think the really interesting characteristic of large collections of tagged data is how hard it is to build appropriate indexes for the data "on the fly" as access patterns emerge.

This is especially interesting when compared with relational databases, where we have many techniques for picking out what "the best SQL" or "the best index construction" would be.

In a large collection of tagged data that is directly accessible, applications would come and go, and, in theory at least, so should indexing techniques, since it would be impractically difficult to provide high-performance access to very large amounts of data *without* something equivalent to indexing in an RDB.

Adaptive indexing? I'm not sure what to call this.
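To make the idea a little more concrete, here is a toy sketch of what I have in mind. Everything in it is my own invention for illustration (the `AdaptiveTagStore` class, its `threshold` parameter, the `drop_cold_indexes` method); it is not how any real system does this. The store answers tag queries by full scan at first, builds an index for a tag only once that tag has been queried often enough, and can throw indexes away again when the applications that needed them go away.

```python
from collections import Counter


class AdaptiveTagStore:
    """Hypothetical toy store: indexes appear and disappear with access patterns."""

    def __init__(self, threshold=3):
        self.items = []                # list of (item, set_of_tags)
        self.query_counts = Counter()  # how often each tag has been queried
        self.indexes = {}              # tag -> list of positions in self.items
        self.threshold = threshold     # queries needed before a tag gets an index

    def add(self, item, tags):
        pos = len(self.items)
        tags = set(tags)
        self.items.append((item, tags))
        # Keep any index that already exists up to date.
        for tag in tags:
            if tag in self.indexes:
                self.indexes[tag].append(pos)

    def find(self, tag):
        self.query_counts[tag] += 1
        if tag in self.indexes:
            # Fast path: the tag is "hot" and already indexed.
            return [self.items[i][0] for i in self.indexes[tag]]
        # Slow path: full scan over every item.
        hits = [i for i, (_, tags) in enumerate(self.items) if tag in tags]
        # Once the tag has been queried often enough, keep the scan result as an index.
        if self.query_counts[tag] >= self.threshold:
            self.indexes[tag] = hits
        return [self.items[i][0] for i in hits]

    def drop_cold_indexes(self, keep):
        """Drop every index except those for the `keep` most-queried tags."""
        hot = {tag for tag, _ in self.query_counts.most_common(keep)}
        for tag in list(self.indexes):
            if tag not in hot:
                del self.indexes[tag]
```

The hard part, which this sketch ignores completely, is doing the same thing at a scale where a full scan is not an option even once.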

Thank you for the informative blog posts by the way!
