Seminar on Computational Learning and Adaptation



 
Fuzzy Data Analysis
- Learning Fuzzy Models from Data -
 
Michael Berthold
Berkeley Initiative in Soft Computing
University of California at Berkeley
Berkeley, CA 94720, USA
berthold@cs.berkeley.edu
 
 

Automatic data analysis is attracting increasing attention, especially in areas where large amounts of data are gathered automatically and manual analysis is no longer feasible. Applications in which data is recorded online, with no possibility of continuous analysis, also call for automatic approaches. Examples include applications as diverse as the automatic monitoring of patients in medicine (which requires an understanding of the underlying behavior), the optimization of industrial processes, and the extraction of expert knowledge from observations of experts' behavior. Techniques from diverse disciplines have recently been developed or rediscovered, resulting in a growing set of tools for analyzing data sets automatically. Most of these tools, however, require the user to have detailed knowledge of the underlying algorithms in order to exploit their full potential. To let the user explore the data unrestricted by a specific tool's limitations, it is necessary to provide quick, easy-to-use ways of gaining first insights. In addition, the extracted knowledge has to be presented in an understandable manner, enabling interaction and refinement of the focus of analysis.

In this talk I will give an overview of the steps required to perform successful data analysis and will describe an example of an easy-to-use methodology for building interpretable models based on fuzzy rules. The resulting rules constrain only a small number of attributes, which makes their interpretation possible even in high-dimensional feature spaces. In addition, the user can define granulations of the input and output variables, which makes it possible to focus the analysis on specific aspects of interest. I will conclude with a demonstration of how these models can be used to point out potential outliers, that is, data examples that have low relevance or could interfere with model generation.
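As a rough illustration of the ideas in this abstract (not the method presented in the talk), the sketch below shows fuzzy rules that each constrain only a few attributes, and flags examples that no rule covers well as potential outliers. The trapezoidal membership shape, the minimum as the rule conjunction, the threshold of 0.2, and all names, attributes, and values are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Assumed membership shape: trapezoid (a, b, c, d) with a <= b <= c <= d.
Trapezoid = Tuple[float, float, float, float]


def trapezoid_membership(x: float, t: Trapezoid) -> float:
    """Degree in [0, 1] to which x belongs to the trapezoidal fuzzy set t."""
    a, b, c, d = t
    if b <= x <= c:          # core: full membership
        return 1.0
    if x <= a or x >= d:     # outside the support: no membership
        return 0.0
    if x < b:                # rising edge
        return (x - a) / (b - a)
    return (d - x) / (d - c)  # falling edge


@dataclass
class FuzzyRule:
    label: str
    # Only the attributes this rule constrains; all others are ignored,
    # which keeps the rule readable in high-dimensional feature spaces.
    constraints: Dict[str, Trapezoid]

    def activation(self, example: Dict[str, float]) -> float:
        # Assumed conjunction: minimum over the constrained attributes.
        return min(
            (trapezoid_membership(example[name], t)
             for name, t in self.constraints.items()),
            default=1.0,
        )


def flag_potential_outliers(
    examples: List[Dict[str, float]],
    rules: List[FuzzyRule],
    threshold: float = 0.2,  # hypothetical cut-off
) -> List[int]:
    """Indices of examples whose best rule activation is below the threshold."""
    return [
        i for i, ex in enumerate(examples)
        if max((r.activation(ex) for r in rules), default=0.0) < threshold
    ]


# Hypothetical rules and data, invented purely for illustration.
rules = [
    FuzzyRule("normal", {"heart_rate": (50, 60, 90, 100)}),
    FuzzyRule("elevated", {"heart_rate": (90, 100, 130, 140),
                           "temperature": (37.0, 37.5, 39.0, 39.5)}),
]
examples = [
    {"heart_rate": 72, "temperature": 36.8, "age": 40},
    {"heart_rate": 180, "temperature": 36.9, "age": 35},
]
outliers = flag_potential_outliers(examples, rules)
```

Here the second example is covered by neither rule, so its index is the one returned. Note that the "age" attribute is never consulted, because no rule constrains it.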
 


Date: Thurs., April 29
Time: 4:15-5:30PM
Place: Cordura 100
