Ultimode Systems
Data Mining Consultancy
Berkeley, CA
www.ultimode.com
The segmentation problem arises in many applications in data mining, A.I. and statistics. In this talk, we consider segmenting simple time series, which involves determining how many distinct intervals a time series contains and when they occur. For example, when examining economic time series it would be useful to identify periods of growth, recession, depression, etc. We apply Minimum Message Length (MML) to the segmentation problem, without going into the mathematical details. We also consider a range of other approaches to segmentation, including a Bayes Factors approach, Minimum Description Length (MDL) and classical statistical approaches. We find the segmentation problem interesting because it highlights significant differences between MML, MDL and Bayes Factors. Simulations comparing these approaches indicated that: a) MML gave significantly different and superior results to the Bayes Factors approach, and b) while MDL messages were shorter than MML messages, the MML results were again superior to those of MDL. Finally, we apply the segmentation method to real-world time series data.
Date: Thurs., February 12; Time: 4:15-5:30PM; Place: Gates 100