Research Themes
The description below partly overlaps with
the one given on the Knowledge Base Lab page.
I wrote this article about five years ago,
so its contents, particularly the first section
on Data Mining, should be revised.



Data Mining


In the past decade, many efforts have been made to develop methods for discovering valuable knowledge from the huge data sets available in our networked world. Because the amount of data is so large, we need to disregard the useless aspects of the data before applying a data mining algorithm. After this kind of preprocessing, which we call data abstraction, the mining process can focus on the remaining aspects of the data that suit a given mining task. We have already developed two kinds of data abstraction and shown their effectiveness. One is for categorical and symbolic data and makes use of a machine-readable dictionary. It selects the best abstraction, that is, a grouping of categorical values, so that the mean information loss due to the grouping is minimized. The other is for numerical data. It computes the best family of intervals: one that uses fewer intervals while staying within an allowable degree of information loss. In both cases, the data set becomes more compact without losing the aspects of the data that matter for the mining task.
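
The numerical abstraction described above can be illustrated with a minimal sketch. It is not the method developed in the lab: the function names, the use of mutual information between intervals and a class attribute as the measure of information, and the greedy bottom-up merging (which does not guarantee the best family of intervals) are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(groups):
    """I(interval; class) in bits; groups is a list of Counters of class labels."""
    total = sum(sum(g.values()) for g in groups)
    class_totals = Counter()
    for g in groups:
        class_totals.update(g)
    info = 0.0
    for g in groups:
        n_g = sum(g.values())
        for label, n in g.items():
            # p(g, c) * log2(p(g, c) / (p(g) * p(c)))
            info += (n / total) * log2(n * total / (n_g * class_totals[label]))
    return info


def abstract_intervals(values, labels, max_loss):
    """Greedily merge adjacent intervals while the total loss stays <= max_loss.

    Hypothetical helper: the loss is the drop in mutual information (bits)
    relative to the unabstracted data, an assumed stand-in for the
    "allowable degree of information loss" in the text.
    """
    # Start from one interval per distinct value, in sorted order.
    by_value = {}
    for v, c in zip(values, labels):
        by_value.setdefault(v, Counter())[c] += 1
    bounds = [(v, v) for v in sorted(by_value)]
    groups = [by_value[v] for v, _ in bounds]
    baseline = mutual_information(groups)

    while len(groups) > 1:
        best = None
        for i in range(len(groups) - 1):
            # Candidate: merge interval i with its right neighbour.
            merged = groups[:i] + [groups[i] + groups[i + 1]] + groups[i + 2:]
            loss = baseline - mutual_information(merged)
            if best is None or loss < best[0]:
                best = (loss, i, merged)
        loss, i, merged = best
        if loss > max_loss + 1e-12:  # small tolerance for rounding error
            break
        groups = merged
        bounds[i:i + 2] = [(bounds[i][0], bounds[i + 1][1])]
    return bounds
```

For example, with the values 1, 2, 3 labelled "a" and 10, 11, 12 labelled "b", calling abstract_intervals with max_loss set to 0 merges the six singleton intervals into the two intervals (1, 3) and (10, 12), since no information about the class is lost by doing so.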



Story Databases


Internet users normally access Web sites through index terms. Although index terms are said to be sufficient in practice to specify the sites users are looking for, we almost always find many irrelevant sites among the retrieved ones. To improve the precision of Web search engines, we take conceptual relationships between terms into account, consider an ordered set of conceptually structured terms, and regard that ordered set as a story description extracted from each Web site. In this way, we intend to view the Web as a huge story database consisting of Web sites that tell us stories from various viewpoints. We have already begun to lay a foundation for story databases. Specifically, we have introduced a notion of maximal analogy to extract relevant story descriptions from the natural language texts of Web sites, and we are now developing a new text summarization technique, which is needed to reduce the computational cost of extracting maximal analogies.



Motion Analogy

Various kinds of motion data are now collected by means of motion capture and stored in motion databases. Each motion in a database can be reused to form new motions with a motion editor. However, the editing task is generally complex and hard for non-expert users. To reduce the cost of synthesizing new motions from those stored in a motion database, we have already presented a synthesis method that identifies a reusable segment of a source motion in the database and assimilates that segment into the target motion in question. Because the target motion synthesized in this way is analogous to the source motion, we call the synthesis process a motion analogy. The current synthesis algorithm assumes that the source and the target obey the same human figure model, but we plan to extend it to handle source and target figure models that differ. We would then be able to create a motion with a command such as "play just like a bird".