Top-k similar graph matching using TraM in biological networks.

Amin, Mohammad Shafkat; Finley, Russell L; Jamil, Hasan M

Many emerging database applications entail sophisticated graph-based query manipulation, predominantly evident in large-scale scientific applications. To access the information embedded in graphs, efficient graph matching tools and algorithms have become of prime importance. Although the prohibitively expensive time complexity associated with exact subgraph isomorphism techniques has limited its efficacy in the application domain, approximate yet efficient graph matching techniques have received much attention due to their pragmatic applicability. Since public domain databases are noisy and incomplete in nature, inexact graph matching techniques have proven to be more promising in terms of inferring knowledge from numerous structural data repositories. In this paper, we propose a novel technique called TraM for approximate graph matching that off-loads a significant amount of its processing on to the database making the approach viable for large graphs. Moreover, the vector space embedding of the graphs and efficient filtration of the search space enables computation of approximate graph similarity at a throw-away cost.

Privacy-preserving matching of similar patients.

The identification of similar entities represented by records in different databases has drawn considerable attention in many application areas, including in the health domain. One important type of entity matching application that is vital for quality healthcare analytics is the identification of similar patients, known as similar patient matching. A key component of identifying similar records is the calculation of similarity of the values in attributes (fields) between these records. Due to increasing privacy and confidentiality concerns, using the actual attribute values of patient records to identify similar records across different organizations is becoming non-trivial because the attributes in such records often contain highly sensitive information such as personal and medical details of patients. Therefore, the matching needs to be based on masked (encoded) values while being effective and efficient to allow matching of large databases. Bloom filter encoding has widely been used as an efficient masking technique for privacy-preserving matching of string and categorical values. However, no work on Bloom filter-based masking of numerical data, such as integer (e.g. […] body mass index), and modulus (numbers wrap around upon reaching a certain value, e.g. date and time), which are commonly required in the health domain, has been presented in the literature. We propose a framework with novel methods for masking numerical data using Bloom filters, thereby facilitating the calculation of similarities between records. We conduct an empirical study on publicly available real-world datasets which shows that our framework provides efficient masking and achieves similar matching accuracy compared to the matching of actual unencoded patient records.
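
To make the idea of Bloom filter masking of numerical values more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the framework proposed in the paper: it assumes integer values, hashes each value together with a few neighbouring values into a bit set, and compares two encodings with the Dice coefficient. The function names `bloom_encode` and `dice_similarity` and all parameters (`num_bits`, `num_hashes`, `step`, `neighbours`) are hypothetical choices made for this example.

```python
import hashlib


def bloom_encode(value, num_bits=1024, num_hashes=4, step=1, neighbours=5):
    """Encode an integer as the set of 1-bit positions of a Bloom filter.

    Assumption for this sketch: the value and its neighbours
    (value - neighbours*step ... value + neighbours*step) are all hashed
    into the filter, so numerically close values share many bits.
    """
    bits = set()
    for i in range(-neighbours, neighbours + 1):
        token = str(value + i * step)
        for seed in range(num_hashes):
            # Derive several hash functions from SHA-256 by prefixing a seed.
            digest = hashlib.sha256(f"{seed}:{token}".encode()).hexdigest()
            bits.add(int(digest, 16) % num_bits)
    return bits


def dice_similarity(bits_a, bits_b):
    """Dice coefficient of two Bloom filters given as sets of bit positions."""
    if not bits_a and not bits_b:
        return 1.0
    return 2 * len(bits_a & bits_b) / (len(bits_a) + len(bits_b))


if __name__ == "__main__":
    age_a, age_b, age_c = 30, 31, 60
    print(dice_similarity(bloom_encode(age_a), bloom_encode(age_b)))  # close values
    print(dice_similarity(bloom_encode(age_a), bloom_encode(age_c)))  # distant values
```

In this sketch, two values that differ by one step share most of their hashed neighbours and therefore most of their bits, while values further apart than the neighbourhood overlap only through chance hash collisions. A real deployment would typically use keyed hashing with a secret shared between the parties rather than plain SHA-256.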