A Novel Hybrid Approach of Suppression and Randomization for Privacy Preserving Data Mining
Keywords: Prime-Anonymization, Computation time, Privacy preservation, Randomization, Suppression.
Abstract - In an era of rapid technological advancement, extracting knowledge from large amounts of data is an important task. Data mining is applied to data stored on a centralized server to obtain useful information that supports the decision making of multiple organizations. When multiple organizations pool their data for mutual gain, individuals' private data becomes vulnerable to disclosure. Different approaches, such as generalization, perturbation, cryptography and randomization, are used to protect the confidentiality of an individual's private data. Each of these methods has its own pros and cons; anonymization, for example, can cause a large loss of information. Data used in data mining contains many attributes that hold an individual's confidential data, and many attributes can reveal private information about an individual when they are associated with one another. These attributes are called quasi identifiers (QID). Individually these attributes do not breach security, but in combination they can expose private data. Thus, an approach is required to prevent the disclosure of private data through quasi identifiers. Our proposed method, which combines suppression and randomization, presents a solution to this problem. The method preserves data privacy with zero information loss when the actual values are regained. The proposed work is carried out on a local centralized server, and the outcomes are compared with those of the anonymization process to demonstrate the improved results.
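The quasi-identifier problem and the idea of a fully reversible masking can be illustrated with a small sketch. It is not the algorithm proposed in this paper, whose details are not given in the abstract. It assumes a toy table with invented values, a hypothetical QID set (`zip`, `age`, `gender`), and a naive stand-in for suppression plus randomization in which every QID cell is replaced by a random token and the original values are kept in a separate lookup table. The function names `smallest_group`, `mask` and `restore` are likewise hypothetical.

```python
import secrets
from collections import Counter

# Toy records; every value here is invented for illustration only.
RECORDS = [
    {"zip": "11001", "age": 34, "gender": "F", "disease": "flu"},
    {"zip": "11001", "age": 41, "gender": "M", "disease": "asthma"},
    {"zip": "11002", "age": 34, "gender": "M", "disease": "diabetes"},
    {"zip": "11002", "age": 41, "gender": "F", "disease": "flu"},
]
QID = ("zip", "age", "gender")  # assumed quasi-identifier attributes


def smallest_group(records, attrs):
    """Size of the smallest group of records that share the same values for attrs."""
    counts = Counter(tuple(r[a] for a in attrs) for r in records)
    return min(counts.values())


def mask(records, qid):
    """Replace every QID cell with a fresh random token.

    Returns the masked rows and a lookup table (token -> original value)
    that must be stored separately from the released data.
    """
    table = {}
    masked = []
    for r in records:
        row = dict(r)
        for a in qid:
            token = secrets.token_hex(4)
            while token in table:  # keep tokens unique so the lookup stays reversible
                token = secrets.token_hex(4)
            table[token] = r[a]
            row[a] = token
        masked.append(row)
    return masked, table


def restore(masked, qid, table):
    """Recover the original QID values exactly from the lookup table."""
    restored = []
    for r in masked:
        row = dict(r)
        for a in qid:
            row[a] = table[row[a]]
        restored.append(row)
    return restored
```

In this toy table every single attribute is shared by two records, so `smallest_group(RECORDS, ("zip",))` is 2, while `smallest_group(RECORDS, QID)` is 1: the combination singles out each individual. After `mask(RECORDS, QID)`, the QID columns contain only random tokens, and `restore` with the lookup table returns the original records unchanged, which is the sense in which such a scheme loses no information when the actual values are regained.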