The final talk was by Shivakumar Vaithyanathan, a colleague of mine at Almaden Research Center. He has been working in IR for a long time, and a large fraction of what I know in this field I have learnt from him. I have talked about his work in the past: a system called Avatar, which extracts precise semantics from a corpus (such as email) -- getting people, phone numbers, people's phone numbers, and a hundred other such concepts near 100% right, near 100% of the time. This talk was about how he and his colleagues built the system, the experiences they had, and how at least one other IR group, from the University of Wisconsin, has reached similar conclusions. It was a fascinating journey, and I am going to get it wrong, so I urge readers to see his slides here.
The only credit I can take for this piece of work is having asked Shiv to build a general-purpose information extraction system but to apply it to the task of email. The result is phenomenal. The CTO of an F100 company who has used it is extremely pleased, and thousands of internal IBM users cannot live without it. You can give it a spin yourself -- I am sure Shiv and co. would appreciate the feedback. It is called IBM OmniFind Personal EMail Search (IOPES -- clearly some IBM naming guru was involved :), and it can be downloaded from here -- it works with Lotus Notes and Outlook.