PPP: The LNMIIT

October 3, 2008

We’ve heard the phrase “public-private partnership” rather often in the context of
(higher) education in India. Check out the LNMIIT, which seems to be an actual example
of such an entity.

Search, NAY Medical Informatics: The Semantic-Web “Killer-App”

April 4, 2008

Why do we need a “semantic web”? (Web) search appears to be the one compelling application so far … however, the user has not honestly seen anything useful and scalable yet. Powerset - hurry up!!

However, it seems that medical and clinical informatics is one community that is really adopting (and wisely so) semantic-web technologies to address its knowledge sharing, search, retrieval, etc. needs in a specialized domain. This may well be the “killer-app” for semantic technologies.

Our Article on EII

January 22, 2008

Our EII article is out in this month’s edition of DM Review Magazine.

This is a good overview of the “schema-less” data integration approach we have been advocating from the NETMARK work at NASA Ames.

MySQL Acquisition

SUN acquired MySQL last week for $1B.
What does this bode for open-source DBs, and for open-source software in general?

XAR Extraction System

August 18, 2007

This is an information extraction system for relation (slot-filling) extraction from free text. Details at http://www.ics.uci.edu/~ashish/xar/

We hope to be able to make it available to the community very soon.

Interesting Companies

January 1, 2007

A couple of (unrelated) interesting companies I would like to follow in ’07. One is Powerset, which is out to address natural language/semantic search.

Another is moka5, which is touting the “LivePC” concept. I haven’t tried their product yet (I intend to), but from what I understand you can capture your PC environment on, literally, a jumpdrive and access that very environment as long as you have access to some PC (XP).

Schema-less data management

August 5, 2005

The “schema-less” and lean approach to information integration, as advocated by NETMARK, is being well received. Papers on NETMARK (“Semi-structured Data Management ……”; IDEAS 2005 etc.) are available off http://www.ics.uci.edu/~ashish/publications.htm
I presented the ideas last month at SIGMOD in the industrial session, and then at IDEAS 2005, and received a rather good response to what I thought might appear at first sight to be somewhat iconoclastic ideas.

Schema-less data management advocates using schemas only if, and to the extent that, one needs to. Structure is often implicit in business documents, and that structure suffices for a large range of querying and search needs. Relaxing schema requirements can then greatly simplify (and optimize) the underlying data storage and also lessen/simplify the work required to incorporate new documents/information sources. This is exactly what we have demonstrated and described in the above papers.

In summary, I think the approach provides an excellent bargain. You focus on simple querying (by “context” and “content”), which really is adequate and powerful for most enterprise data; in exchange, you are freed from schema creation and management (simply drag and drop your raw documents into NETMARK folders), and you have a database engine that is a solid 40-50 times faster than other XML-over-relational systems!
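
To make the “context” and “content” idea a little more concrete, here is a toy sketch in Python. It is not NETMARK code and says nothing about how NETMARK is actually implemented; the `Node`, `ingest` and `search` names, the sample documents, and the rule that a line ending in a colon is a heading are all assumptions made up for this illustration. The point is only that the structure already implicit in a document (headings and the text under them) can be queried without declaring a schema first.

```python
from dataclasses import dataclass


@dataclass
class Node:
    doc: str      # name of the source document (hypothetical)
    context: str  # heading text the content sits under
    content: str  # text found beneath that heading


def ingest(doc_name, text):
    """Split a raw document into (context, content) nodes.

    Assumption for this sketch: a line ending with ':' is a heading, and the
    lines that follow, up to the next heading, are its content.
    No schema is declared anywhere.
    """
    nodes, heading, body = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            # A new heading closes the previous section, if any.
            if heading is not None:
                nodes.append(Node(doc_name, heading, " ".join(body)))
            heading, body = line[:-1], []
        elif heading is not None:
            body.append(line)
    if heading is not None:
        nodes.append(Node(doc_name, heading, " ".join(body)))
    return nodes


def search(nodes, context=None, content=None):
    """Return nodes whose heading and/or text contain the given keywords."""
    hits = []
    for n in nodes:
        if context and context.lower() not in n.context.lower():
            continue
        if content and content.lower() not in n.content.lower():
            continue
        hits.append(n)
    return hits


# "Drag and drop" two differently organized documents into one store.
store = []
store += ingest("proposal.txt", "Budget:\nTravel 2000 USD\nSchedule:\nKickoff in March\n")
store += ingest("report.txt", "Summary:\nOn track\nBudget:\nHardware 5000 USD\n")

budget_sections = search(store, context="budget")   # query by context
hardware_hits = search(store, content="hardware")   # query by content
```

Adding a third document with entirely different headings would need no change to the store or to `search`, which is the bargain described above in miniature.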