A New Comparison Metric for Computational Topology
Shariati, Pejmon
2021
Data scientists, mathematicians, and engineers face a common yet important set of questions whenever they work with data: What should I do with this data? How should I visualize it? How can I gain the most descriptive information from it? Broadly speaking, one could use machine learning, statistics, or modeling to better understand the data. Within machine learning specifically, related methods of interest include PCA (principal component analysis), SVM (support vector machines), neural networks, and TDA (topological data analysis). This paper focuses on TDA, the study of the "shape of data." Persistent homology is a popular technique whose ultimate goal is to track the appearance and disappearance of homology classes (i.e., holes and connected components). Our attention, however, is directed towards an algorithm called "Mapper," developed by Gurjeet Singh, Facundo Memoli, and Gunnar Carlsson for topological data analysis projects. Mapper extracts simple descriptions of high-dimensional data sets; in other words, it is an efficient method for summarizing data. Our goal is to extend the use of Mapper from summarizing individual data sets to comparing different data sets with each other. We will devise a method to accurately measure the difference between Mapper objects. The method converts Mapper objects into metric measure spaces and then compares them using their weighted path-induced distance matrices and the Gromov-Wasserstein metric. We will then use this method to perform experiments on synthetic data in the form of familiar shapes such as circles and lines. Once we are confident in the robustness and stability of our method, we will apply it to an example of real-world data to see whether we can gain any descriptive information.
From here, we theorize about the possible future benefits of our method on other real-world data.
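The comparison pipeline described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, and everything in it is an assumption made for illustration: a Mapper object is reduced to a graph with cluster sizes and unit edge weights, the node measure is taken proportional to cluster size, and the function names (path_distance_matrix, node_measure, gw_objective, and so on) are hypothetical. The sketch also only evaluates the squared-loss Gromov-Wasserstein objective for a fixed coupling, which gives an upper bound on the minimum; a dedicated optimal transport solver would be needed to minimize over all couplings, and normalization conventions for the metric vary.

```python
from itertools import product
from math import inf


def path_distance_matrix(n, weighted_edges):
    """All-pairs shortest-path distances (Floyd-Warshall) on an undirected graph."""
    d = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for i, j, w in weighted_edges:
        d[i][j] = d[j][i] = min(d[i][j], w)
    # product() varies its first component slowest, so k is the outer loop.
    for k, i, j in product(range(n), repeat=3):
        if d[i][k] + d[k][j] < d[i][j]:
            d[i][j] = d[i][k] + d[k][j]
    return d


def node_measure(cluster_sizes):
    """Probability measure on nodes, proportional to cluster size (assumption)."""
    total = sum(cluster_sizes)
    return [s / total for s in cluster_sizes]


def product_coupling(mu, nu):
    """Independent coupling of mu and nu; always feasible, rarely optimal."""
    return [[a * b for b in nu] for a in mu]


def diagonal_coupling(mu):
    """Coupling that matches each node of a space with itself."""
    return [[mu[i] if i == j else 0.0 for j in range(len(mu))]
            for i in range(len(mu))]


def gw_objective(d1, d2, coupling):
    """Squared-loss Gromov-Wasserstein distortion of one given coupling."""
    n, m = len(d1), len(d2)
    cost = 0.0
    for i, k in product(range(n), repeat=2):
        for j, l in product(range(m), repeat=2):
            cost += (d1[i][k] - d2[j][l]) ** 2 * coupling[i][j] * coupling[k][l]
    return cost


# Toy Mapper graphs with four equally sized clusters and unit edge weights:
# a path (as might summarize a line) and a cycle (as might summarize a circle).
line_edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
circle_edges = line_edges + [(3, 0, 1.0)]

line_d = path_distance_matrix(4, line_edges)
circle_d = path_distance_matrix(4, circle_edges)
mu = node_measure([10, 10, 10, 10])

same_shape = gw_objective(line_d, line_d, diagonal_coupling(mu))
upper_bound = gw_objective(line_d, circle_d, product_coupling(mu, mu))
```

Comparing a space with itself under the diagonal coupling gives zero distortion, while the line and circle graphs give a strictly positive value under the product coupling. Only a minimization over all couplings would turn the latter into the actual Gromov-Wasserstein value.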
Advisors: Professor Moon Duchin and Professor Thomas Weighill
ID: 9593v843r