An Efficient and Effective Model for Preserving Sensitive Data in Location-Based Graphs Using Data Generalization and Data Suppression in Conjunction with Data Sliding Windows and R-Trees
- Posted
- Server: Preprints.org
- DOI: 10.20944/preprints202508.2125.v1
Location-based services (LBS) determine a user’s position and deliver tailored experiences. They are generally used for navigation, tracking, mapping, and timing, and they are available on smartphones, tablets, and computers and in applications such as Facebook, Twitter, TikTok, and YouTube. The data collected by these services can also be provided to data analysts for business purposes, such as improving marketing strategies, organizational policies, and customer services, which raises privacy concerns. To reduce these concerns when location-based data is provided to an analyst or released for use outside the collecting organization, several privacy preservation models have been proposed, such as k-Anonymity, l-Diversity, t-Closeness, LKC-Privacy, differential privacy, and location-based privacy preservation models. To the best of our knowledge, however, these models still have vulnerabilities that must be addressed when location-based data is released: (1) inference of sensitive locations (e.g., specialized hospitals, pawnshops, prisons, and safe houses); (2) duplicate trajectory paths (i.e., even when a user’s visited path duplicates other users’ paths, it still violates privacy if it contains a sensitive location); and (3) unique locations (e.g., a home, condominium, or office). Moreover, these models have data utility issues and data transformation complexity that must be improved. To address these vulnerabilities, a new privacy preservation model, (ξ, ϵ)-Privacy, is proposed in this work.
The model is based on data generalization and data suppression in conjunction with data sliding windows and R-Trees, so that the released location-based data is protected against privacy violations arising from the inference of sensitive locations, from duplicate trajectory paths, and from unique locations. It is also highly efficient and effective in data maintenance. Extensive experiments confirm the efficiency and effectiveness of the proposed model.
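To make the named building blocks concrete, the following is a minimal Python sketch of suppression and generalization applied per time window. It is not the (ξ, ϵ)-Privacy algorithm, whose details are not given here. The location names, the `REGION_OF` hierarchy, the `min_users` threshold, and the use of non-overlapping windows are all assumptions made for illustration. R-Tree indexing is omitted.

```python
from collections import defaultdict

# Hypothetical inputs: none of these names or values come from the paper.
SENSITIVE = {"oncology_clinic"}          # locations that are always suppressed
REGION_OF = {                            # assumed one-level generalization hierarchy
    "home_17": "district_A",
    "cafe_9": "district_A",
    "office_3": "district_B",
}


def sanitize_window(visits, min_users=2):
    """Sanitize the (user, location) visits that fall inside one window.

    Suppression: visits to sensitive locations are dropped.
    Generalization: a location seen for fewer than `min_users` distinct
    users in this window is replaced by its coarser region.
    """
    users_at = defaultdict(set)
    for user, loc in visits:
        users_at[loc].add(user)

    released = []
    for user, loc in visits:
        if loc in SENSITIVE:
            continue  # suppress the visit entirely
        if len(users_at[loc]) < min_users:
            loc = REGION_OF.get(loc, "unknown_region")  # generalize unique location
        released.append((user, loc))
    return released


def sanitize_stream(records, window_size):
    """Group (timestamp, user, location) records into fixed-width,
    non-overlapping windows (a simplification of a sliding window)
    and sanitize each window independently."""
    windows = defaultdict(list)
    for ts, user, loc in records:
        windows[ts // window_size].append((user, loc))
    return {w: sanitize_window(v) for w, v in sorted(windows.items())}


records = [
    (0, "u1", "cafe_9"),
    (1, "u2", "cafe_9"),
    (2, "u1", "oncology_clinic"),
    (3, "u2", "home_17"),
    (10, "u1", "home_17"),
]
result = sanitize_stream(records, window_size=5)
```

In this toy run, the shared café visits are released unchanged, the clinic visit is suppressed, and each home visit is generalized to its district because only one user appears there within the window.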