Schema-less data management
The “schema-less”, lean approach to information integration advocated by NETMARK is being well received. Papers on NETMARK (“Semi-structured Data Management ……”; IDEAS 2005 etc.) are available at http://www.ics.uci.edu/~ashish/publications.htm
I presented the ideas last month in the industrial session at SIGMOD, and then at IDEAS 2005, and received a rather good response to ideas that I thought might seem somewhat iconoclastic at first sight.
Schema-less data management advocates using schemas only if, and only to the extent that, one needs to. Structure is often implicit in business documents, and that structure suffices for a large range of querying and search needs. Relaxing schema requirements can then greatly simplify (and optimize) the underlying data storage, and also reduce the work required to incorporate new documents and information sources. This is exactly what we have demonstrated and described in the above papers.
In summary, I think the approach provides an excellent bargain. You focus on simple querying (by “context” and “content”), which really is adequate and powerful for most enterprise data. In exchange you are freed from schema creation and management (simply drag and drop your raw documents into NETMARK folders), and you have a database engine that is a solid 40-50 times faster than other XML-over-relational systems!
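To make the idea of querying by “context” and “content” a little more concrete, here is a minimal Python sketch. It is not NETMARK code and does not reflect NETMARK's actual API or storage design; the names `Node`, `split_document` and `query`, the sample documents, and the convention that a line ending in a colon is a heading are all assumptions made purely for illustration. The point is only that the headings already present in a document can serve as the “context”, with no schema defined up front.

```python
from dataclasses import dataclass


@dataclass
class Node:
    doc: str      # name of the source document
    context: str  # heading text taken from the document itself
    content: str  # text found under that heading


def split_document(name, text):
    """Split raw text into (context, content) nodes.

    Assumption for this sketch: a line ending in ':' is a heading.
    Text that appears before the first heading is ignored.
    """
    nodes = []
    context, body = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            # A new heading closes the previous section, if any.
            if context is not None:
                nodes.append(Node(name, context, " ".join(body)))
            context, body = line[:-1], []
        elif context is not None:
            body.append(line)
    if context is not None:
        nodes.append(Node(name, context, " ".join(body)))
    return nodes


def query(nodes, context=None, content=None):
    """Return nodes matching a heading keyword and/or a body keyword."""
    hits = []
    for node in nodes:
        if context and context.lower() not in node.context.lower():
            continue
        if content and content.lower() not in node.content.lower():
            continue
        hits.append(node)
    return hits


# "Drop" two raw documents into one store; no schema is declared anywhere.
store = []
store += split_document(
    "proposal.txt",
    "Budget:\nTravel costs 5000 dollars\nSchedule:\nKickoff in March\n",
)
store += split_document(
    "report.txt",
    "Summary:\nProject on track\nBudget:\nHardware costs 12000 dollars\n",
)

budget_hits = query(store, context="budget")
hardware_hits = query(store, context="budget", content="hardware")
```

Here `budget_hits` collects the “Budget” sections from both documents, while `hardware_hits` narrows that to sections whose text also mentions hardware. Adding a third document with entirely different headings would need no change to the code, which is the trade-off described above: simpler queries in exchange for no schema management.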
