Multi-dimensional data is an important type of data because it covers almost every kind of dataset. Most such datasets can be formatted as regular tables or datasheets, where the columns are called dimensions. Researchers have a lot of experience with these datasets, but in the age of big data, the growing number of dimensions (columns) has made them hard to handle.
Researchers have created a large number of methods to reduce dimensionality so that people can find insights in an understandable way. The basic idea is to compute the importance of different dimensions and preserve the principal ones, as in principal component analysis (PCA). If the dimensionality is reduced to 2 or 3, we can understand the data, because it is easy to show the reduced data in 2D or 3D space. The problem is that the details lost along the way may play a big role in our understanding. How to recover this missing information, and how to evaluate the result of dimensionality reduction, has become a key problem. In other words, giving good explanations of these results is the final aim we want to reach.
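To make the PCA idea concrete, here is a minimal sketch in plain Python. It is not the method from this research, only an illustration of reducing a table to 2 dimensions and measuring how much variance survives. The synthetic 4-column dataset, the function names, and the use of power iteration with deflation are all assumptions made for this example.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean so every dimension has mean zero."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def matvec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def top_eigenpair(m, iterations=500, seed=0):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric PSD matrix."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in range(len(m))]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = matvec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # matrix maps v to zero; nothing left to find
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(m, v)))
    return eigenvalue, v


def pca(rows, k=2):
    """Project rows onto the k directions of largest variance.

    Returns the reduced points, the k directions, and the fraction of
    total variance that the reduced data retains.
    """
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    total_variance = sum(cov[i][i] for i in range(d))
    components, variances = [], []
    for _ in range(k):
        lam, v = top_eigenpair(cov)
        components.append(v)
        variances.append(lam)
        # Deflation: remove the direction just found so the next
        # power iteration converges to the next-largest one.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    projected = [[sum(x * c for x, c in zip(r, comp)) for comp in components]
                 for r in centered]
    return projected, components, sum(variances) / total_variance


# Hypothetical 4-column dataset: column 3 is nearly a copy of column 1,
# and column 4 is small noise, so two directions carry most of the variance.
rng = random.Random(42)
rows = []
for _ in range(200):
    a = rng.gauss(0, 3)
    b = rng.gauss(0, 2)
    rows.append([a, b, 0.5 * a + rng.gauss(0, 0.1), rng.gauss(0, 0.1)])

projected, components, retained = pca(rows, k=2)
print("first reduced point:", projected[0])
print("fraction of variance retained: %.3f" % retained)
```

The retained-variance fraction is one simple global measure of what the reduction keeps. It says nothing about which local details were lost around a particular point, which is the gap that explanations of the reduced result are meant to fill.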
With extra interactions, some methods give people global explanations by adding axes to the 2D or 3D space. But they offer few local explanations, and the existing methods that do provide local explanations all depend heavily on interaction.
Thus, we introduced an image-based method to compute local explanations. It lets users get local explanations and completes the computation in a single pass, without extra interaction. We designed three models to compute the local information and built six methods to test. We tested every method on more than 20 datasets and gave detailed illustrations of the explanations we obtained. Using the intrinsic structure already known for these datasets, we verified that our explanations are reasonable.
This research is still ongoing. Future work will focus on the influence of different parameters and on evaluation methods for our explanations. With the multiple good explanations we compute, we aim to give users additional understanding that guides them to choose the best dimensionality reduction method for their dataset.