Data Leakage Detection 1 Final

By:
Vishal Patil, Paresh Rawat, Pratik Nikam, Satish Patil
Under the guidance of Prof. Rucha Samant
Agenda
PROBLEM DEFINITION
INTRODUCTION
ISSUES
SCOPE
ANALYSIS
DESIGN
IMPLEMENTATIONS
Problem Definition
To detect whether data has been leaked by agents.
To prevent data leakage.
Introduction
In the course of doing business, sometimes sensitive data must be handed over to supposedly trusted third parties. Our goal is to detect when the distributor's sensitive data has been leaked by agents, and if possible to identify the agent that leaked the data.
Existing System
Proposed System
In this system, data leakage is detected by generating fake objects.

Data leakage prevention and detection of guilty agents is handled by e-mail filtering.
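The fake-object idea can be illustrated with a minimal sketch. It assumes each agent receives the real data plus one fabricated record that only that agent holds, so a leaked set containing that record points back to the agent. The function names and the fake e-mail address format are hypothetical and are not taken from the project.

```python
import uuid

def create_fake_object():
    """Return a realistic-looking but fabricated record (hypothetical format)."""
    return f"customer-{uuid.uuid4().hex[:8]}@example.com"

def distribute(real_objects, agent_ids):
    """Give every agent the real objects plus one fake object only that agent holds."""
    allocations, fake_owner = {}, {}
    for agent_id in agent_ids:
        fake = create_fake_object()
        fake_owner[fake] = agent_id          # remember who received this fake object
        allocations[agent_id] = set(real_objects) | {fake}
    return allocations, fake_owner

def suspects(leaked_objects, fake_owner):
    """Return the agents whose private fake objects appear in the leaked set."""
    return {fake_owner[obj] for obj in leaked_objects if obj in fake_owner}
```

For example, if the set given to agent "U2" is later found outside the company, `suspects` returns only "U2", while a leak that contains no fake object returns an empty set.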
Types of employees that put your company at risk

The security illiterate
The gadget nerds
The unlawful residents
The malicious/disgruntled employees
Analysis
Problem Setup and Notation
Distributor (D): the system that distributes data to agents.
Valuable Data (T): the set of sensitive data that the system sends to the agents.
Agents (U): the set of agents to whom the system sends sensitive data.
A client request is either a sample request or an explicit request.
Explicit Data Requests

In this algorithm, the agent receives every data object that satisfies the condition of the agent's data request. The working of Explicit Data Request is as follows:
1. The distributor has data T = {t1, t2}.
2. Agent requests (R): R1 = {t1, t2}, R2 = {t1}. R1 gets both t1 and t2; R2 gets t1. Therefore the value of the sum objective over R1 and R2 is 2/2 + 1/2 = 1.5.
3. Select an agent using the randomized selection function: SelectAgent {R1, ..., R2}.
4. The e-optimal solution runs in O(n + n^2 B) = O(n^2 B), where n = number of agents and B = number of fake objects.
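The sum objective can be checked with a small sketch. It assumes the objective is the sum, over all agents, of the overlap between that agent's data and every other agent's data, divided by the size of the agent's own set. The slide does not spell out the formula, so this definition is an assumption; under it, the two-agent example evaluates to the same total of 1.5.

```python
def sum_objective(requests):
    """Sum over agents of (overlap with all other agents) / (size of own set)."""
    total = 0.0
    for i, r_i in enumerate(requests):
        # Count objects shared with each other agent's set.
        overlap = sum(len(r_i & r_j) for j, r_j in enumerate(requests) if j != i)
        total += overlap / len(r_i)
    return total

R1 = {"t1", "t2"}
R2 = {"t1"}
print(sum_objective([R1, R2]))  # 1.5
```

A lower value means the agents' sets overlap less, which makes it easier to tell agents apart when a leak is found.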
Sample Data Requests
With sample data requests, agents are not interested in particular objects. In this algorithm, the agent receives only a subset of the data objects that can be given. The Sample Data Request algorithm otherwise works the same way as the Explicit Data Request algorithm.
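The difference between the two request types can be sketched as follows. This is an illustration under assumptions: a sample request is modelled as "any m objects from T" chosen at random, and an explicit request as "all objects of T that satisfy a condition". The function names and parameters are hypothetical.

```python
import random

def sample_request(T, m, rng=None):
    """Sample request: return any m objects from T, chosen at random."""
    rng = rng or random.Random()
    # sorted() turns the set into a sequence, which random.sample requires.
    return set(rng.sample(sorted(T), m))

def explicit_request(T, condition):
    """Explicit request: return every object in T that satisfies the condition."""
    return {t for t in T if condition(t)}

T = {"t1", "t2", "t3"}
print(sample_request(T, 2))                      # two objects from T, chosen at random
print(explicit_request(T, lambda t: t != "t3"))  # the objects t1 and t2
```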
ARCHITECTURE DIAGRAM:
[Figure: Agents requesting secured data from the data distributor.]
[Figure: Data distributor sending the secured data to the agents.]
[Figure: Agent tries to leak the sensitive data over the Internet.]
The system has the following:
Data Allocation
-- approach same as watermarking
-- less sensitive
-- add fake objects in some cases
Fake Object
-- are real-looking objects
-- should not affect the data
-- limit on fake object insertion (e-mail inbox)
-- CREATEFAKEOBJECT(Ri, Fi, CONDi)
Optimization
-- one constraint and one objective
-- maximize the probability difference
Data Distributor
E-mail Filtering
Algorithm:
1. Identify the data.
2. Remove spam words and stop words.
3. Remove or replace synonyms.
4. Calculate the priority of each word depending on the sensitivity of the data.
5. Compare the data with the predefined company data sets.
6. Filter the data if it contains the company's important data sets.
The agent sends an e-mail:
-- Attached data is not sensitive data: e-mail sent successfully.
-- Attached data is sensitive data: e-mail not sent, as the data it contains is sensitive.
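The filtering steps and the send/block decision can be sketched as below. The word lists, sensitivity weights and threshold are made-up placeholders standing in for the predefined company data sets. This is a minimal illustration of the steps, not the project's ASP.NET implementation.

```python
import re

# All of the following values are assumptions used only for illustration.
STOP_WORDS = {"the", "a", "an", "is", "of", "to", "and", "for", "please", "find"}
SYNONYMS = {"wages": "salary", "pay": "salary", "secret": "confidential"}
SENSITIVITY = {"salary": 3, "confidential": 5, "password": 5}  # company data set
THRESHOLD = 5  # minimum total score at which an e-mail is blocked

def normalize(text):
    """Steps 1-3: extract words, drop stop words, map synonyms to one term."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    words = [w for w in words if w not in STOP_WORDS]
    return [SYNONYMS.get(w, w) for w in words]

def sensitivity_score(text):
    """Steps 4-5: sum the priority of every word found in the company data set."""
    return sum(SENSITIVITY.get(w, 0) for w in normalize(text))

def filter_email(attachment_text):
    """Step 6: block the e-mail when the attachment scores as sensitive."""
    if sensitivity_score(attachment_text) >= THRESHOLD:
        return "E-mail not sent as the data it contains is sensitive"
    return "E-mail sent successfully"

print(filter_email("Please find the lunch menu"))  # E-mail sent successfully
print(filter_email("Secret wages for staff"))      # E-mail not sent as the data it contains is sensitive
```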
O/S: Windows XP
Language: ASP.NET, C#
Database: SQL Server 2005
System: Pentium IV 2.4 GHz
Hard Disk: 40 GB
Monitor: 15" VGA colour
Mouse: Logitech
Keyboard: 110 keys enhanced
RAM: 256 MB
Ideally, there would be no need to hand over sensitive data to agents who may unknowingly or maliciously leak it.

However, in many cases we must work with agents that may not be 100 percent trusted, and we may not be certain whether a leaked object came from an agent or from some other source. In spite of these difficulties, it is possible to assess the likelihood that an agent is responsible for a leak, based on the overlap of the agent's data with the leaked data. The algorithms we have presented implement a variety of data distribution strategies that can improve the distributor's chances of identifying a leaker.
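One way to turn the overlap into a likelihood is sketched below. It assumes the guilt model from Papadimitriou and Garcia-Molina's "Data Leakage Detection" paper, in which each leaked object either was obtained independently with probability p or was leaked by one of the agents holding it, each equally likely. The value p = 0.5 and the function name are assumptions chosen for illustration.

```python
def guilt_probability(agent, allocations, leaked, p=0.5):
    """Estimate the probability that `agent` leaked at least one object in `leaked`.

    allocations maps each agent to the set of objects it received;
    p is the assumed probability that an object was obtained without any agent.
    """
    prob_innocent = 1.0
    for t in leaked & allocations[agent]:
        # Number of agents that hold object t.
        holders = sum(1 for objs in allocations.values() if t in objs)
        # Probability that this agent did NOT leak t.
        prob_innocent *= 1.0 - (1.0 - p) / holders
    return 1.0 - prob_innocent

allocations = {"U1": {"t1", "t2"}, "U2": {"t1"}}
leaked = {"t1", "t2"}
print(guilt_probability("U1", allocations, leaked))  # 0.625
print(guilt_probability("U2", allocations, leaked))  # 0.25
```

U1 holds both leaked objects, one of them exclusively, so it comes out as the more likely leaker. This is why allocations with less overlap between agents make a leaker easier to identify.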
