

Automatic Protocol Blocker for Privacy-Preserving Public Auditing in Cloud Computing (2012)

Cloud computing is the long-dreamed vision of computing as a utility, in which users can remotely store their data in the cloud and enjoy on-demand, high-quality applications and services from a shared pool of configurable computing resources. Data outsourcing relieves users of the burden of local data storage and maintenance. However, because users no longer have physical possession of the possibly large amount of outsourced data, data integrity protection in cloud computing is a very challenging and potentially formidable task, especially for users with constrained computing resources and capabilities. Enabling public auditability for cloud data storage security is therefore of critical importance, so that users can resort to an external audit party to check the integrity of outsourced data when needed. To securely introduce an effective third party auditor (TPA), two fundamental requirements have to be met: 1) the TPA should be able to audit the cloud data storage efficiently without demanding a local copy of the data, and should introduce no additional online burden to the cloud user; 2) the third party auditing process should bring in no new vulnerabilities to user data privacy. In this paper we extend the previous system with an automatic blocker for privacy-preserving public auditing of data storage security in cloud computing. We utilize the public-key-based homomorphic authenticator and uniquely integrate it with the random mask technique and an automatic blocker to achieve a privacy-preserving public auditing system for cloud data storage security that keeps all the above requirements in mind. Extensive security and performance analysis shows that the proposed schemes are provably secure and highly efficient.

Efficient audit service outsourcing for data integrity in clouds (2012)

Cloud-based outsourced storage relieves the client's burden of storage management and maintenance by providing a comparably low-cost, scalable, location-independent platform. However, the fact that clients no longer have physical possession of their data means that they face a potentially formidable risk of missing or corrupted data. To avoid such security risks, audit services are critical for ensuring the integrity and availability of outsourced data and for achieving digital forensics and credibility in cloud computing. Provable data possession (PDP), a cryptographic technique for verifying the integrity of data at an untrusted server without retrieving it, can be used to realize audit services. In this paper, profiting from the interactive zero-knowledge proof system, we address the construction of an interactive PDP protocol that prevents fraudulence by the prover (the soundness property) and leakage of the verified data (the zero-knowledge property). We prove that our construction holds these properties based on the computational Diffie-Hellman assumption and the rewindable black-box knowledge extractor. We also propose an efficient mechanism of probabilistic queries and periodic verification that reduces the audit cost per verification and detects abnormal situations in a timely manner. In addition, we present an efficient method for selecting an optimal parameter value to minimize the computational overhead of cloud audit services. Our experimental results demonstrate the effectiveness of our approach.
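
The probabilistic-query idea can be made concrete with the standard sampling estimate used throughout the PDP literature: if t of n blocks are corrupted and the verifier challenges c blocks chosen uniformly at random, the detection probability is P = 1 - C(n-t, c)/C(n, c). A minimal sketch (function and parameter names are ours, and this shows only the sampling component, not the paper's full cost model):

from math import comb

def detection_probability(n, t, c):
    # P(at least one corrupted block is sampled) when t of n blocks are
    # corrupted and c blocks are challenged uniformly at random.
    if c > n - t:
        return 1.0
    return 1.0 - comb(n - t, c) / comb(n, c)

def min_challenge_size(n, t, target=0.99):
    # Smallest challenge size reaching the target detection probability.
    for c in range(1, n + 1):
        if detection_probability(n, t, c) >= target:
            return c
    return n

# 10,000 blocks with 1% corrupted: ~460 challenged blocks give 99% detection,
# so the audit cost is nearly independent of the total file size.
print(min_challenge_size(10_000, 100))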

Enabling Secure and Efficient Ranked Keyword Search over Outsourced Cloud Data (2012)

Cloud computing economically enables the paradigm of data service outsourcing. However, to protect data privacy, sensitive cloud data has to be encrypted before being outsourced to the commercial public cloud, which makes effective data utilization a very challenging task. Although traditional searchable encryption techniques allow users to securely search over encrypted data through keywords, they support only Boolean search and are not yet sufficient to meet the effective data utilization needs inherently demanded by the large number of users and the huge amount of data files in the cloud. In this paper, we define and solve the problem of secure ranked keyword search over encrypted cloud data. Ranked search greatly enhances system usability by returning results ranked by relevance instead of undifferentiated results, and it further ensures file retrieval accuracy. Specifically, we explore the statistical measure approach, i.e., the relevance score, from information retrieval to build a secure searchable index, and we develop a one-to-many order-preserving mapping technique to properly protect that sensitive score information. The resulting design facilitates efficient server-side ranking without losing keyword privacy. Thorough analysis shows that our proposed solution enjoys an as-strong-as-possible security guarantee compared to previous searchable encryption schemes, while correctly realizing the goal of ranked keyword search. Extensive experimental results demonstrate the efficiency of the proposed solution.
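
The relevance scoring and the one-to-many order-preserving idea can be illustrated with a toy sketch; the scoring follows a common TF-IDF variant from information retrieval, while the mapping below is a deliberately simplistic stand-in for the paper's construction (bucket widths and jitter are arbitrary choices of ours):

import math, random

def relevance_score(tf, n_docs, df, doc_len):
    # One common TF x IDF variant, length-normalized.
    return (1 + math.log(tf)) * math.log(1 + n_docs / df) / doc_len

def opm_encode(score, bucket=1000, jitter=100):
    # Toy one-to-many order-preserving mapping: each plaintext score owns a
    # disjoint range of width `bucket`; a random offset inside the range
    # flattens score frequencies. Real schemes pick ranges far more carefully.
    base = int(score * 10_000) * bucket
    return base + random.randrange(jitter)

scores = [relevance_score(3, 1000, 50, 120), relevance_score(1, 1000, 50, 120)]
encoded = [opm_encode(s) for s in scores]
assert (encoded[0] > encoded[1]) == (scores[0] > scores[1])  # order preserved

The design point this illustrates: because each score maps to many possible ciphertext values, the server can still rank results, but equal plaintext scores rarely produce equal ciphertexts, which blunts frequency analysis.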

HOMOMORPHIC AUTHENTICATION WITH RANDOM MASKING TECHNIQUE ENSURING PRIVACY & SECURITY IN CLOUD COMPUTING (2012)

Cloud computing may be defined as the delivery of computing as a service rather than a product. It is an Internet-based form of computing that enables the sharing of services. Many users place their data in the cloud. However, because users no longer have physical possession of the possibly large amount of outsourced data, data integrity protection in cloud computing is a very challenging and potentially formidable task, especially for users with constrained computing resources and capabilities. The correctness and security of data are therefore prime concerns. This article studies the problem of ensuring the integrity and security of data storage in cloud computing. Security in the cloud is achieved by signing each data block before sending it to the cloud. Signing is performed using the Boneh-Lynn-Shacham (BLS) algorithm, which is more secure than comparable algorithms. To ensure the correctness of data, we consider an external auditor, called a third party auditor (TPA), who verifies the integrity of the data stored in the cloud on behalf of the cloud user. By utilizing a public-key-based homomorphic authenticator with random masking, privacy-preserving public auditing can be achieved. The technique of bilinear aggregate signatures is used to achieve batch auditing, which reduces the computation overhead. Extensive security and performance analysis shows that the proposed schemes are provably secure and highly efficient.

Preserving Integrity of Data and Public Auditing for Data Storage Security in Cloud Computing (2012)

Cloud computing is the long-dreamed vision of computing as a utility, in which users can remotely store their data in the cloud and enjoy on-demand, high-quality applications and services from a shared pool of configurable computing resources. Data outsourcing relieves users of the burden of local data storage and maintenance. However, because users no longer have physical possession of the possibly large amount of outsourced data, data integrity protection in cloud computing is a very challenging and potentially formidable task, especially for users with constrained computing resources and capabilities. Enabling public auditability for cloud data storage security is therefore of critical importance, so that users can resort to an external audit party to check the integrity of outsourced data when needed. To securely introduce an effective third party auditor (TPA), two fundamental requirements have to be met: 1) the TPA should be able to audit the cloud data storage efficiently without demanding a local copy of the data, and should introduce no additional online burden to the cloud user; 2) the third party auditing process should bring in no new vulnerabilities to user data privacy. In this paper, we utilize and uniquely combine the public-key-based homomorphic authenticator with random masking to achieve a privacy-preserving public cloud data auditing system that meets all of the above requirements. To support efficient handling of multiple auditing tasks, we further explore the technique of bilinear aggregate signatures to extend our main result to a multi-user setting, where the TPA can perform multiple auditing tasks simultaneously. Extensive security and performance analysis shows that the proposed schemes are provably secure and highly efficient.
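
A minimal sketch of the random-masking idea shared by these auditing schemes. It replaces the actual BLS/bilinear-pairing construction with designated-verifier arithmetic in a tiny prime-order subgroup (the auditor here holds the tagging key alpha, the parameters are toy-sized, and nothing below is the real protocol); the point is that the random value r blinds the aggregate of the challenged blocks, so the auditor can verify consistency without learning that aggregate:

import hashlib, random

# Toy parameters: safe prime p = 2q + 1 and a generator g of the order-q
# subgroup. Real schemes use bilinear pairing groups and public verification.
p, q, g = 1019, 509, 4

def h(i):  # hash a block index into the exponent group
    return int.from_bytes(hashlib.sha256(str(i).encode()).digest(), "big") % q

# Owner: tag each block m_i with secret alpha (a linear MAC over the blocks).
alpha = random.randrange(1, q)
blocks = {i: random.randrange(q) for i in range(8)}
tags = {i: alpha * (h(i) + m) % q for i, m in blocks.items()}

# Auditor: random challenge (block indices with random coefficients).
chal = {i: random.randrange(1, q) for i in random.sample(sorted(blocks), 4)}

# Server: masked response. mu is one-time-padded by r, so the data aggregate
# sum(v * m) never leaves the server in the clear.
r = random.randrange(q)
R = pow(g, r, p)
gamma = int.from_bytes(hashlib.sha256(str(R).encode()).digest(), "big") % q
mu = (r + gamma * sum(v * blocks[i] for i, v in chal.items())) % q
sigma = gamma * sum(v * tags[i] for i, v in chal.items()) % q

# Auditor: verify without ever seeing the blocks or their aggregate.
g_mu_over_R = pow(g, mu, p) * pow(R, -1, p) % p   # equals g^(gamma*sum(v*m))
lhs = pow(g, sigma, p)
rhs = pow(g, alpha * gamma * sum(v * h(i) for i, v in chal.items()) % q, p) \
      * pow(g_mu_over_R, alpha, p) % p
assert lhs == rhs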

ELCA Evaluation for Keyword Search on Probabilistic XML Data (2011)

As probabilistic data management becomes one of the main research focuses and keyword search turns into a more popular query means, it is natural to consider how to support keyword queries on probabilistic XML data. For keyword queries on deterministic XML documents, ELCA (Exclusive Lowest Common Ancestor) semantics allows more relevant fragments, rooted at the ELCAs, to appear as results, and it is more popular than other keyword query result semantics (such as SLCA). In this paper, we investigate how to evaluate ELCA results for keyword queries on probabilistic XML documents. After defining probabilistic ELCA semantics in terms of possible-world semantics, we propose an approach to compute ELCA probabilities without generating possible worlds. We then develop an efficient stack-based algorithm that can find all probabilistic ELCA results and their ELCA probabilities for a given keyword query on a probabilistic XML document. Finally, we experimentally evaluate the proposed ELCA algorithm and compare it with its SLCA counterpart in terms of result effectiveness, time and space efficiency, and scalability.

Privacy Preserving Data Sharing With Anonymous ID Assignment (2013)

An algorithm for anonymous sharing of private data among N parties is developed. This technique is used iteratively to assign these nodes ID numbers ranging from 1 to N. The assignment is anonymous in that the identities received are unknown to the other members of the group. Resistance to collusion among other members is verified in an information-theoretic sense when private communication channels are used. This assignment of serial numbers allows more complex data to be shared and has applications to other problems in privacy-preserving data mining, collision avoidance in communications, and distributed database access. The required computations are distributed without using a trusted central authority. Existing and new algorithms for assigning anonymous IDs are examined with respect to trade-offs between communication and computational requirements. The new algorithms are built on top of a secure sum data mining operation using Newton's identities and Sturm's theorem. An algorithm for the distributed solution of certain polynomials over finite fields enhances the scalability of the algorithms. Markov chain representations are used to find statistics on the number of iterations required, and computer algebra gives closed-form results for the completion rates.
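
A sketch of the iterative ID-assignment loop built on a secure sum, with additive secret sharing standing in for the paper's Newton's-identities and Sturm's-theorem machinery (slot counts and the retry policy are our simplifications; in the real protocol each party's slot pick never leaves the share exchange):

import random

M = 2**61 - 1  # modulus for additive secret sharing

def secure_sum(vectors):
    # Elementwise sum of private vectors: party k splits its vector into n
    # random shares (one per party); each party publishes only the sum of
    # the shares it received, so no individual vector is ever revealed.
    n, S = len(vectors), len(vectors[0])
    shares = []
    for vec in vectors:
        parts = [[random.randrange(M) for _ in range(S)] for _ in range(n - 1)]
        parts.append([(v - sum(sh[j] for sh in parts)) % M
                      for j, v in enumerate(vec)])
        shares.append(parts)
    published = [[sum(shares[k][i][j] for k in range(n)) % M for j in range(S)]
                 for i in range(n)]
    return [sum(col) % M for col in zip(*published)]

def assign_ids(n, slots=None):
    # Simplified AIDA-style loop: parties pick random slots; slots claimed by
    # exactly one party yield IDs; colliding parties retry in later rounds.
    # (This driver "sees" the picks only because it simulates all parties.)
    slots = slots or 4 * n
    ids, pending, base = {}, list(range(n)), 0
    while pending:
        picks = {k: random.randrange(slots) for k in pending}
        vecs = [[int(picks[k] == j) for j in range(slots)] for k in pending]
        total = secure_sum(vecs)
        unique = sorted(j for j, c in enumerate(total) if c == 1)
        for k in list(pending):
            if picks[k] in unique:
                ids[k] = base + unique.index(picks[k]) + 1
                pending.remove(k)
        base += len(unique)
    return ids

print(assign_ids(5))  # e.g. {0: 3, 1: 1, 2: 5, 3: 2, 4: 4}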

Privacy-Preserving Fine-Grained Access Control in Public Clouds (2012)

Given the many economic benefits of cloud computing, many organizations have been considering moving their information systems to the cloud. However, an important problem in public clouds is how to selectively share data based on fine-grained attribute-based access control policies while at the same time assuring the confidentiality of the data and preserving the privacy of users from the cloud. In this article, we briefly discuss the drawbacks of approaches based on well-known cryptographic techniques in addressing such a problem, and we then present two approaches that address these drawbacks with different trade-offs.

Scalable and Secure Sharing of Personal Health Records in Cloud Computing using Attribute-based Encryption (2012)

Personal health record (PHR) is an emerging patient-centric model of health information exchange, which is often outsourced to be stored at a third party, such as cloud providers. However, there have been wide privacy concerns, as personal health information could be exposed to those third party servers and to unauthorized parties. To assure the patients' control over access to their own PHRs, it is a promising method to encrypt the PHRs before outsourcing. Yet issues such as risks of privacy exposure, scalability in key management, flexible access, and efficient user revocation have remained the most important challenges toward achieving fine-grained, cryptographically enforced data access control. In this paper, we propose a novel patient-centric framework and a suite of mechanisms for data access control to PHRs stored in semi-trusted servers. To achieve fine-grained and scalable data access control for PHRs, we leverage attribute-based encryption (ABE) techniques to encrypt each patient's PHR file. Different from previous works in secure data outsourcing, we focus on the multiple-data-owner scenario and divide the users in the PHR system into multiple security domains, which greatly reduces the key management complexity for owners and users. A high degree of patient privacy is guaranteed simultaneously by exploiting multi-authority ABE. Our scheme also enables dynamic modification of access policies or file attributes, and it supports efficient on-demand user/attribute revocation and break-glass access under emergency scenarios. Extensive analytical and experimental results are presented which show the security, scalability, and efficiency of our proposed scheme.

Secure Mining of Association Rules in Horizontally Distributed Databases

We propose a protocol for secure mining of association rules in horizontally distributed databases. The current leading protocol is that of Kantarcioglu and Clifton [18]. Our protocol, like theirs, is based on the Fast Distributed Mining (FDM) algorithm of Cheung et al. [8], which is an unsecured distributed version of the Apriori algorithm. The main ingredients in our protocol are two novel secure multi-party algorithms: one that computes the union of private subsets that each of the interacting players holds, and another that tests the inclusion of an element held by one player in a subset held by another. Our protocol offers enhanced privacy with respect to the protocol in [18]. In addition, it is simpler and significantly more efficient in terms of communication rounds, communication cost, and computational cost.
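
The private-union ingredient can be illustrated with a toy SRA-style commutative cipher, where encryption is exponentiation modulo a safe prime, so applying every player's key in any order yields the same ciphertext. This is only a sketch of the idea; the paper's actual union protocol is different and offers stronger guarantees:

import hashlib, random

# Tiny demo parameters: safe prime p = 2q + 1. Real deployments would use
# 2048-bit groups; security is irrelevant at this size.
p, q = 2027, 1013

def encode(item):
    # Map an item into the quadratic-residue subgroup of order q.
    h = int.from_bytes(hashlib.sha256(item.encode()).digest(), "big") % p
    return pow(h, 2, p)

def enc(x, k):  # E_k(x) = x^k mod p; E_k1(E_k2(x)) == E_k2(E_k1(x))
    return pow(x, k, p)

# Three players, each with a private itemset and a private exponent key.
sets = [{"ab", "cd"}, {"cd", "ef"}, {"ab", "gh"}]
keys = [random.randrange(1, q) for _ in sets]

# Pass every player's encoded items through every player's key. Once all
# keys are applied, equal ciphertexts correspond to equal items, so the
# union can be deduplicated without linking items to their owners
# (shuffling hides the contribution order).
cipher = []
for s in sets:
    col = [encode(x) for x in s]
    for k in keys:
        col = [enc(c, k) for c in col]
    cipher.extend(col)
random.shuffle(cipher)

# To recover the actual union, players strip their keys in turn (exponents
# are invertible mod q) after duplicates are removed.
dedup = list(set(cipher))
for k in keys:
    inv = pow(k, -1, q)
    dedup = [pow(c, inv, p) for c in dedup]
print(sorted(dedup) == sorted({encode(x) for s in sets for x in s}))  # True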

A Fast Clustering-Based Feature Subset Selection Algorithm for High-Dimensional Data (2013)

Feature selection involves identifying a subset of the most useful features that produces results compatible with those of the original entire set of features. A feature selection algorithm may be evaluated from both the efficiency and the effectiveness points of view. While efficiency concerns the time required to find a subset of features, effectiveness relates to the quality of the subset of features. Based on these criteria, a fast clustering-based feature selection algorithm, FAST, is proposed and experimentally evaluated in this paper. The FAST algorithm works in two steps. In the first step, features are divided into clusters using graph-theoretic clustering methods. In the second step, the most representative feature, the one most strongly related to the target classes, is selected from each cluster to form a subset of features. Because features in different clusters are relatively independent, the clustering-based strategy of FAST has a high probability of producing a subset of useful and independent features. To ensure the efficiency of FAST, we adopt the efficient minimum spanning tree (MST) clustering method. The efficiency and effectiveness of the FAST algorithm are evaluated through an empirical study. Extensive experiments compare FAST with several representative feature selection algorithms, namely FCBF, ReliefF, CFS, Consist, and FOCUS-SF, with respect to four types of well-known classifiers, namely the probability-based Naive Bayes, the tree-based C4.5, the instance-based IB1, and the rule-based RIPPER, before and after feature selection. The results, on 35 publicly available real-world high-dimensional image, microarray, and text datasets, demonstrate that FAST not only produces smaller subsets of features but also improves the performance of the four types of classifiers.
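
A compact sketch of FAST's two steps, assuming already-discretized features (the entropy estimates, the Kruskal MST, and the tie-breaking below are our simplifications of the paper's definitions):

from collections import Counter
from itertools import combinations
from math import log2

def entropy(xs):
    n = len(xs)
    return -sum(c / n * log2(c / n) for c in Counter(xs).values())

def su(x, y):
    # Symmetric uncertainty SU(X,Y) = 2*I(X;Y) / (H(X) + H(Y)).
    hx, hy = entropy(x), entropy(y)
    mi = hx + hy - entropy(list(zip(x, y)))
    return 2 * mi / (hx + hy) if hx + hy else 0.0

def fast_select(features, target):
    # Step 1: MST over features weighted by pairwise SU; cut edges weaker
    # than both endpoints' relevance to the target class.
    # Step 2: keep the most target-relevant feature of each cluster.
    names = list(features)
    rel = {f: su(features[f], target) for f in names}
    parent = {f: f for f in names}
    def find(f):
        while parent[f] != f:
            parent[f] = parent[parent[f]]
            f = parent[f]
        return f
    mst = []
    for w, a, b in sorted((su(features[a], features[b]), a, b)
                          for a, b in combinations(names, 2)):
        if find(a) != find(b):                 # Kruskal over SU weights
            parent[find(a)] = find(b)
            mst.append((w, a, b))
    parent = {f: f for f in names}             # re-cluster after edge cutting
    for w, a, b in mst:
        if w >= min(rel[a], rel[b]):           # keep only strong correlations
            parent[find(a)] = find(b)
    clusters = {}
    for f in names:
        clusters.setdefault(find(f), []).append(f)
    return sorted(max(c, key=rel.get) for c in clusters.values())

# Tiny demo: f2 mirrors f1 (redundant), f3 is noise, target follows f1.
target = [0, 0, 1, 1, 0, 1, 0, 1]
features = {"f1": target[:], "f2": [1 - t for t in target],
            "f3": [0, 1, 0, 1, 1, 0, 0, 0]}
print(fast_select(features, target))  # ['f1'] -- f2, f3 fold into f1's cluster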

CloudMoV: Cloud-based Mobile Social TV (2013)

The rapidly increasing power of personal mobile devices (smartphones, tablets, etc.) is providing much richer content and social interactions to users on the move. This trend, however, is throttled by the limited battery lifetime of mobile devices and unstable wireless connectivity, making the highest possible quality of service experienced by mobile users infeasible. The recent cloud computing technology, with its rich resources to compensate for the limitations of mobile devices and connections, can potentially provide an ideal platform to support the desired mobile services. Tough challenges arise on how to effectively exploit cloud resources to facilitate mobile services, especially those with stringent interaction delay requirements. In this paper, we propose the design of a novel cloud-based mobile social TV system, CloudMoV. The system effectively utilizes both PaaS (Platform-as-a-Service) and IaaS (Infrastructure-as-a-Service) cloud services to offer the living-room experience of video watching to a group of disparate mobile users who can interact socially while sharing the video. To guarantee good streaming quality as experienced by mobile users with time-varying wireless connectivity, we employ a surrogate for each user in the IaaS cloud for video downloading and social exchanges on behalf of the user. The surrogate performs efficient stream transcoding that matches the current connectivity quality of the mobile user. Given battery life as a key performance bottleneck, we advocate the use of burst transmission from the surrogates to the mobile users, and we carefully decide the burst size that leads to high energy efficiency and streaming quality. Social interactions among the users, in the form of spontaneous textual exchanges, are effectively achieved by efficient designs of data storage with BigTable and dynamic handling of large volumes of concurrent messages in a typical PaaS cloud. These various designs for flexible transcoding capabilities, battery efficiency of mobile devices, and spontaneous social interactivity together provide an ideal platform for mobile social TV services. We have implemented CloudMoV on Amazon EC2 and Google App Engine and verified its superior performance through real-world experiments.

m-Privacy for Collaborative Data Publishing (2013)

In this paper, we consider the collaborative data publishing problem of anonymizing horizontally partitioned data held by multiple data providers. We consider a new type of insider attack by colluding data providers who may use their own data records (a subset of the overall data), in addition to external background knowledge, to infer the data records contributed by other data providers. The paper addresses this new threat and makes several contributions. First, we introduce the notion of m-privacy, which guarantees that the anonymized data satisfies a given privacy constraint against any group of up to m colluding data providers. Second, we present heuristic algorithms exploiting the equivalence-group monotonicity of privacy constraints and adaptive ordering techniques for efficiently checking m-privacy given a set of records. Finally, we present a data-provider-aware anonymization algorithm with adaptive m-privacy checking strategies to ensure high utility and m-privacy of the anonymized data with efficiency. Experiments on real-life datasets suggest that our approach achieves better or comparable utility and efficiency than existing and baseline algorithms while providing the m-privacy guarantee.
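
The notion can be checked naively by brute force: a dataset is m-private with respect to a constraint if the constraint still holds after any coalition of up to m providers subtracts its own records. A sketch with k-anonymity as the constraint (the paper's heuristics prune this exponential search using equivalence-group monotonicity, which is omitted here):

from itertools import combinations

def is_k_anonymous(records, quasi_ids, k):
    # Privacy constraint C: every quasi-identifier combination occurs >= k times.
    counts = {}
    for r in records:
        key = tuple(r[a] for a in quasi_ids)
        counts[key] = counts.get(key, 0) + 1
    return all(c >= k for c in counts.values())

def is_m_private(records, quasi_ids, k, m):
    # Naive m-privacy check: the constraint must survive the removal of the
    # records of any coalition of up to m providers, since colluders can
    # subtract their own data as background knowledge.
    providers = {r["provider"] for r in records}
    for size in range(m + 1):
        for coalition in combinations(providers, size):
            remaining = [r for r in records if r["provider"] not in coalition]
            if not is_k_anonymous(remaining, quasi_ids, k):
                return False
    return True

records = [
    {"provider": "A", "zip": "021*", "age": "20-30"},
    {"provider": "B", "zip": "021*", "age": "20-30"},
    {"provider": "B", "zip": "021*", "age": "20-30"},
    {"provider": "C", "zip": "021*", "age": "20-30"},
]
print(is_m_private(records, ("zip", "age"), k=2, m=1))  # True: any single
# colluding provider still leaves >= 2 indistinguishable records behind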

A Load Balancing Model Based on Cloud Partitioning for the Public Cloud (2013)

Load balancing in the cloud computing environment has an important impact on performance. Good load balancing makes cloud computing more efficient and improves user satisfaction. This article introduces a better load balance model for the public cloud based on the cloud partitioning concept, with a switch mechanism to choose different strategies for different situations. The algorithm applies game theory to the load balancing strategy to improve efficiency in the public cloud environment.
Key words: load balancing model; public cloud; cloud partition; game theory
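
A toy sketch of the partition-and-switch idea (the thresholds, statuses, and dispatch rule are illustrative; the paper refines the strategy choice with game theory):

IDLE, NORMAL, OVERLOADED = "idle", "normal", "overloaded"

def partition_status(loads, idle_t=0.4, over_t=0.8):
    # Classify a cloud partition by its average node load.
    avg = sum(loads) / len(loads)
    return IDLE if avg < idle_t else NORMAL if avg < over_t else OVERLOADED

def dispatch(job, partitions):
    # Main-controller switch: prefer an idle partition, else the
    # least-loaded normal one; overloaded partitions receive nothing.
    idle = [p for p, loads in partitions.items()
            if partition_status(loads) == IDLE]
    if idle:
        return idle[0]
    normal = [(sum(l) / len(l), p) for p, loads in partitions.items()
              if partition_status(loads) == NORMAL]
    if normal:
        return min(normal)[1]
    raise RuntimeError("all partitions overloaded; job must wait")

partitions = {"east": [0.9, 0.85], "west": [0.5, 0.6], "north": [0.2, 0.3]}
print(dispatch("job-1", partitions))  # north (idle partition preferred)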

AMVS-NDN: Adaptive Mobile Video Streaming and Sharing in Wireless Named Data Networking

Recently, the explosion of mobile traffic (especially video traffic) has become a serious concern for mobile network operators. While video streaming services are becoming crucial for mobile users, their traffic may often exceed the bandwidth capacity of cellular networks. To address the video traffic problem, we consider a future Internet architecture: Named Data Networking (NDN). In this paper, we design and implement a framework for adaptive mobile video streaming and sharing in the NDN architecture (AMVS-NDN), considering that most mobile stations have multiple wireless interfaces (e.g., 3G and WiFi). To demonstrate the benefit of NDN, AMVS-NDN has two key functionalities: (1) a mobile station (MS) seeks to use either 3G/4G or WiFi links opportunistically, and (2) MSs can share content directly by exploiting local WiFi connectivity. We implement AMVS-NDN over CCNx and perform tests in a real testbed consisting of a WiMAX base station and Android phones. Testing with time-varying link conditions in mobile environments reveals that AMVS-NDN achieves higher video quality and less cellular traffic than other solutions.
Index Terms: Named Data Networking, Adaptive Video Streaming, Mobile Networks, Offloading and Sharing.

CLOUD COMPUTING FOR MOBILE USERS: CAN OFFLOADING COMPUTATION SAVE ENERGY?

The cloud heralds a new era of computing where application services are provided through the Internet. Cloud computing can enhance the computing capability of mobile systems, but is it the ultimate solution for extending such systems' battery lifetimes?

Crowdsourcing Predictors of Behavioral Outcomes (2012)

Generating models from large data sets, and determining which subsets of data to mine, is becoming increasingly automated. However, choosing what data to collect in the first place requires human intuition or experience, usually supplied by a domain expert. This paper describes a new approach to machine science which demonstrates for the first time that non-domain experts can collectively formulate features, and provide values for those features, such that they are predictive of some behavioral outcome of interest. This was accomplished by building a web platform on which human groups interact to both respond to questions likely to help predict a behavioral outcome and pose new questions to their peers. This results in a dynamically growing online survey, but the result of this cooperative behavior also leads to models that can predict users' outcomes based on their responses to the user-generated survey questions. Here we describe two web-based experiments that instantiate this approach: the first site led to models that can predict users' monthly electric energy consumption; the other led to models that can predict users' body mass index. As exponential increases in content are often observed in successful online collaborative communities, the proposed methodology may, in the future, lead to similar exponential rises in discovery and insight into the causal factors of behavioral outcomes.
Index Terms: Crowdsourcing, machine science, surveys, social media, human behavior modeling

Facilitating Document Annotation using Content and Querying Value

A large number of organizations today generate and share textual descriptions of their products, services, and actions. Such collections of textual data contain a significant amount of structured information, which remains buried in the unstructured text. While information extraction algorithms facilitate the extraction of structured relations, they are often expensive and inaccurate, especially when operating on top of text that does not contain any instances of the targeted structured information. We present a novel alternative approach that facilitates the generation of the structured metadata by identifying documents that are likely to contain information of interest, where this information is going to be subsequently useful for querying the database. Our approach relies on the idea that humans are more likely to add the necessary metadata during creation time, if prompted by the interface, or that it is much easier for humans (and/or algorithms) to identify the metadata when such information actually exists in the document, instead of naively prompting users to fill in forms with information that is not available in the document. As a major contribution of this paper, we present algorithms that identify structured attributes that are likely to appear within the document, by jointly utilizing the content of the text and the query workload. Our experimental evaluation shows that our approach generates superior results compared to approaches that rely only on the textual content or only on the query workload to identify attributes of interest.
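
A toy ranking sketch of the core idea: combine evidence that an attribute's value occurs in the document's content with the attribute's usefulness to the query workload. All extractors, weights, and names below are hypothetical stand-ins for the paper's models:

from collections import Counter

def suggest_attributes(doc_text, extractors, query_log, top_n=3):
    # Rank attributes worth annotating: (a) content evidence that the value
    # appears in the text, times (b) how often queries use the attribute.
    workload = Counter(attr for query in query_log for attr in query)
    scores = {}
    for attr, extractor in extractors.items():
        content = 1.0 if extractor(doc_text) else 0.1   # content signal
        querying = 1 + workload[attr]                   # querying value
        scores[attr] = content * querying
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

extractors = {
    "price":  lambda t: "$" in t,
    "weight": lambda t: "kg" in t or "lb" in t,
    "color":  lambda t: any(c in t for c in ("red", "blue", "black")),
}
queries = [("price", "color"), ("price",), ("weight", "price")]
print(suggest_attributes("blue widget, 2 kg, $19.99", extractors, queries))
# ['price', 'weight', 'color'] -- price is present and queried most often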

Privacy-Preserving Public Auditing for Secure Cloud Storage

Using cloud storage, users can remotely store their data and enjoy on-demand high-quality applications and services from a shared pool of configurable computing resources, without the burden of local data storage and maintenance. However, the fact that users no longer have physical possession of the outsourced data makes data integrity protection in cloud computing a formidable task, especially for users with constrained computing resources. Moreover, users should be able to use the cloud storage as if it were local, without worrying about the need to verify its integrity. Thus, enabling public auditability for cloud storage is of critical importance so that users can resort to a third party auditor (TPA) to check the integrity of outsourced data and be worry-free. To securely introduce an effective TPA, the auditing process should bring in no new vulnerabilities toward user data privacy and introduce no additional online burden to the user. In this paper, we propose a secure cloud storage system supporting privacy-preserving public auditing. We further extend our result to enable the TPA to perform audits for multiple users simultaneously and efficiently. Extensive security and performance analysis shows that the proposed schemes are provably secure and highly efficient. Our preliminary experiment conducted on an Amazon EC2 instance further demonstrates the fast performance of the design.

Spatial Approximate String Search

This work deals with approximate string search in large spatial databases. Specifically, we investigate range queries augmented with a string similarity search predicate in both Euclidean space and road networks. We dub this query the spatial approximate string (SAS) query. In Euclidean space, we propose an approximate solution, the MHR-tree, which embeds min-wise signatures into an R-tree. The min-wise signature for an index node u keeps a concise representation of the union of q-grams from strings under the subtree of u. We analyze the pruning functionality of such signatures based on the set resemblance between the query string and the q-grams from the subtrees of index nodes. We also discuss how to estimate the selectivity of a SAS query in Euclidean space, for which we present a novel adaptive algorithm to find balanced partitions using both the spatial and string information stored in the tree. For queries on road networks, we propose a novel exact method, RSASSOL, which significantly outperforms the baseline algorithm in practice. RSASSOL combines q-gram-based inverted lists with reference-node-based pruning. Extensive experiments on large real data sets demonstrate the efficiency and effectiveness of our approaches.
Index Terms: approximate string search, range query, road network, spatial databases
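
The min-wise signatures used for q-gram set resemblance can be sketched compactly (k and the hashing scheme are illustrative). A property the MHR-tree relies on is that the signature of a union of sets is the elementwise minimum of the sets' signatures, so an index node can summarize its whole subtree:

import hashlib

def qgrams(s, q=2):
    return {s[i:i + q] for i in range(len(s) - q + 1)}

def minwise_signature(grams, k=32):
    # For each of k hash functions, keep the minimum hash over the set.
    sig = []
    for i in range(k):
        sig.append(min(int.from_bytes(
            hashlib.sha256(f"{i}:{gram}".encode()).digest()[:8], "big")
            for gram in grams))
    return sig

def resemblance(sig_a, sig_b):
    # Estimates Jaccard |A & B| / |A | B|: each min-hash agrees with
    # probability equal to the true resemblance.
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

a, b = qgrams("approximate"), qgrams("approximately")
est = resemblance(minwise_signature(a), minwise_signature(b))
true = len(a & b) / len(a | b)
print(f"estimated {est:.2f} vs true {true:.2f}")  # estimate sharpens as k grows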

Winds of Change: From Vendor Lock-In to the Meta Cloud

The emergence of yet more cloud offerings from a multitude of service providers calls for a meta cloud to smooth out the edges of the jagged cloud landscape. This meta cloud could solve the vendor lock-in problems that current public and hybrid cloud users face.

Data-Provenance Verification for Secure Hosts (2012)

Malicious software typically resides stealthily on a user's computer and interacts with the user's computing resources. Our goal in this work is to improve the trustworthiness of a host and its system data. Specifically, we provide a new mechanism that ensures the correct origin, or provenance, of critical system information and prevents adversaries from utilizing host resources. We define data-provenance integrity as the security property stating that the source where a piece of data is generated cannot be spoofed or tampered with. We describe a cryptographic provenance verification approach for ensuring system properties and system-data integrity at the kernel level. Its two concrete applications are demonstrated in keystroke integrity verification and malicious traffic detection. Specifically, we first design and implement an efficient cryptographic protocol that enforces keystroke integrity by utilizing an on-chip Trusted Platform Module (TPM). The protocol prevents the forgery of fake key events by malware under reasonable assumptions. Then, we demonstrate our provenance verification approach by realizing a lightweight framework for restricting outbound malware traffic. This traffic-monitoring framework helps identify network activities of stealthy malware and lends itself to a powerful personal firewall for examining all outbound traffic of a host, one that cannot be bypassed.

Anomaly Detection for Discrete Sequences: A Survey

This survey attempts to provide a comprehensive and structured overview of the existing research on the problem of detecting anomalies in discrete sequences. The aim is to provide a global understanding of the sequence anomaly detection problem and of how techniques proposed for different domains relate to each other. Our specific contributions are as follows: We identify three distinct formulations of the anomaly detection problem and review techniques from many disparate and disconnected domains that address each of these formulations. Within each problem formulation, we group techniques into categories based on the nature of the underlying algorithm. For each category, we provide a basic anomaly detection technique and show how the existing techniques are variants of the basic technique. This approach shows how different techniques within a category are related to or different from each other. Our categorization reveals new variants and combinations that have not been investigated before for anomaly detection. We also provide a discussion of the relative strengths and weaknesses of different techniques. We show how techniques developed for one problem formulation can be adapted to solve a different formulation, thereby providing several novel adaptations to solve the different problem formulations. We highlight the applicability of the techniques that handle discrete sequences to other related areas, such as online anomaly detection and time series anomaly detection.

Cloud Computing Security: From Single to Multi-Clouds (2012)

The use of cloud computing has increased rapidly in many organizations. Cloud computing provides many benefits in terms of low cost and accessibility of data. Ensuring the security of cloud computing is a major factor in the cloud computing environment, as users often store sensitive information with cloud storage providers, but these providers may be untrusted. Dealing with single cloud providers is predicted to become less popular with customers due to risks of service availability failure and the possibility of malicious insiders in the single cloud. A movement towards multi-clouds, in other words interclouds or clouds-of-clouds, has emerged recently. This paper surveys recent research related to single- and multi-cloud security and addresses possible solutions. It is found that research into the use of multi-cloud providers to maintain security has received less attention from the research community than the use of single clouds. This work aims to promote the use of multi-clouds due to their ability to reduce security risks that affect the cloud computing user.

Cloud Data Protection for the Masses (2012)

Although cloud computing promises lower costs, rapid scaling, easier maintenance, and service availability anywhere, anytime, a key challenge is how to ensure and build confidence that the cloud can handle user data securely. A recent Microsoft survey found that 58 percent of the public and 86 percent of business leaders are excited about the possibilities of cloud computing. But more than 90 percent of them are worried about the security, availability, and privacy of their data as it rests in the cloud. This tension makes sense: users want to maintain control of their data, but they also want to benefit from the rich services that application developers can provide using that data. So far, the cloud offers little platform-level support or standardization for user data protection beyond data encryption at rest, most likely because doing so is nontrivial. Protecting user data while enabling rich computation requires both specialized expertise and resources that might not be readily available to most application developers. Building in data-protection solutions at the platform layer is an attractive option: the platform can achieve economies of scale by amortizing expertise costs and distributing sophisticated security solutions across different applications and their developers.

Cooperative Provable Data Possession for Integrity Verification in Multi-Cloud Storage

Provable data possession (PDP) is a technique for ensuring the integrity of data in storage outsourcing. In this paper, we address the construction of an efficient PDP scheme for distributed cloud storage that supports the scalability of service and data migration, in which we consider the existence of multiple cloud service providers that cooperatively store and maintain the clients' data. We present a cooperative PDP (CPDP) scheme based on homomorphic verifiable responses and a hash index hierarchy. We prove the security of our scheme based on a multi-prover zero-knowledge proof system, which can satisfy the completeness, knowledge soundness, and zero-knowledge properties. In addition, we articulate performance optimization mechanisms for our scheme, and in particular we present an efficient method for selecting optimal parameter values to minimize the computation costs of clients and storage service providers. Our experiments show that our solution introduces lower computation and communication overheads than non-cooperative approaches.
Index Terms: Storage Security, Provable Data Possession, Interactive Protocol, Zero-knowledge, Multiple Cloud, Cooperative

Costing of Cloud Computing Services: A Total Cost of Ownership Approach

The use of Cloud Computing Services appears to offer significant cost advantages. Start-up companies in particular benefit from these advantages, since they frequently do not operate an internal IT infrastructure. But are the costs associated with Cloud Computing Services really that low? We found that particular cost types and factors are frequently underestimated by practitioners. In this paper we present a Total Cost of Ownership (TCO) approach for Cloud Computing Services. We applied a multi-method approach (systematic literature review, analysis of real Cloud Computing Services, expert interview, case study) for the development and evaluation of the formal mathematical model. We found that our model fits the practical requirements and supports decision-making in Cloud Computing.
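
A worked toy version of such a TCO summation (all prices and cost types are illustrative; the paper's formal model enumerates many more cost factors, from evaluation through decommissioning):

def cloud_tco(months, gb_stored, gb_egress_pm):
    # Toy Total-Cost-of-Ownership sum for a cloud storage service.
    one_off = 2_000.0                   # initial evaluation + migration
    monthly = (
        0.023 * gb_stored               # storage fee per GB-month
        + 0.09 * gb_egress_pm           # outbound transfer per GB
        + 500.0                         # administration and support labor
    )
    return one_off + months * monthly

print(f"36-month TCO: {cloud_tco(36, 5_000, 200):,.0f} EUR")
# 36-month TCO: 24,788 EUR -> recurring fees dominate the one-off costs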

Detecting Anomalous Insiders in Collaborative Information Systems (2012)

Collaborative information systems (CISs) are deployed within a diverse array of environments that manage sensitive information. Current security mechanisms detect insider threats, but they are ill-suited to monitor systems in which users function in dynamic teams. In this paper, we introduce the community anomaly detection system (CADS), an unsupervised learning framework to detect insider threats based on the access logs of collaborative environments. The framework is based on the observation that typical CIS users tend to form community structures based on the subjects accessed (e.g., patients' records viewed by healthcare providers). CADS consists of two components: 1) relational pattern extraction, which derives community structures, and 2) anomaly prediction, which leverages a statistical model to determine when users have sufficiently deviated from communities. We further extend CADS into MetaCADS to account for the semantics of subjects (e.g., patients' diagnoses). To empirically evaluate the framework, we perform an assessment with three months of access logs from a real electronic health record (EHR) system in a large medical center. The results illustrate that our models exhibit significant performance gains over state-of-the-art competitors. When the number of illicit users is low, MetaCADS is the best model, but as the number grows, commonly accessed semantics lead to hiding in a crowd, such that CADS is more prudent.
Index Terms: Privacy, social

Effective Pattern Discovery for Text Mining

Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively use and update discovered patterns is still an open research issue, especially in the domain of text mining. Since most existing text mining methods adopted term-based approaches, they all suffer from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern (or phrase) based approaches should perform better than the term-based ones, but many experiments do not support this hypothesis. This paper presents an innovative and effective pattern discovery technique which includes the processes of pattern deploying and pattern evolving, to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information. Substantial experiments on RCV1 data collection and TREC topics demonstrate that the proposed solution achieves encouraging performance.

Efficient Anonymous Message Submission (2012)

In online surveys, many people are unwilling to provide true answers due to privacy concerns. Thus, anonymity is important for online message collection. Existing solutions let each member blindly shuffle the submitted messages using an IND-CCA2-secure cryptosystem. In the end, all messages are randomly shuffled and no one knows the message order. However, the heavy computational overhead and linear number of communication rounds make this useful only for small groups. In this paper, we propose an efficient anonymous message submission protocol aimed at practical group sizes. Our protocol is based on a simplified secret sharing scheme and a symmetric key cryptosystem. We propose a novel method to let all members secretly aggregate their messages into a message vector such that a member knows nothing about other members' message positions. We provide a theoretical proof showing that our protocol is anonymous under malicious attacks. We then conduct a thorough analysis of our protocol, showing that it is computationally more efficient than existing solutions and requires only a constant number of communication rounds with high probability.
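
A toy round of the aggregation idea: each member hides its message in a private slot of a vector, and the group computes the vector sum with additive secret sharing, so the combined vector reveals the messages but not who submitted which. (The actual protocol also assigns the slots themselves secretly; here the slots are assumed pre-agreed and private.)

import secrets

M = 2**61 - 1  # modulus for additive secret sharing

def submit_round(messages, slots):
    # messages: list of (private_slot, numeric_message) pairs, one per member.
    n = len(messages)
    published = [[0] * slots for _ in range(n)]
    for slot, msg in messages:
        vec = [0] * slots
        vec[slot] = msg
        shares = [[secrets.randbelow(M) for _ in range(slots)]
                  for _ in range(n - 1)]
        # last share chosen so all shares sum to the member's private vector
        shares.append([(vec[j] - sum(s[j] for s in shares)) % M
                       for j in range(slots)])
        for i in range(n):              # share i goes to member i
            published[i] = [(published[i][j] + shares[i][j]) % M
                            for j in range(slots)]
    combined = [sum(col) % M for col in zip(*published)]
    return [v for v in combined if v]   # nonzero slots = submitted messages

# Three members with distinct private slots and numeric-encoded messages.
print(submit_round([(2, 4242), (0, 7777), (5, 1313)], slots=8))
# [7777, 4242, 1313] -- the order reveals slots, not senders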

Fuzzy Multi-Dimensional Search in the Wayfinder File System

With the explosion in the amount of semi-structured data that users access and store, there is a need for complex search tools that retrieve often very heterogeneous data in a simple and efficient way. Existing tools usually index text content, allowing for some IR-style ranking on the textual part of the query, but they consider structure (e.g., file directory) and metadata (e.g., date, file type) only as filtering conditions. We propose a novel multi-dimensional querying approach to semi-structured data search in personal information systems, allowing users to provide fuzzy structure and metadata conditions in addition to traditional keyword conditions. The provided query interface is more comprehensive than content-only search, as it considers three query dimensions (content, structure, metadata) in the search. We have implemented our proposed approach in the Wayfinder file system. In this demo, we use this implementation both to present an overview of the unified scoring framework underlying the fuzzy multi-dimensional querying approach and to demonstrate its potential for improving search results.

Enabling cross-site interactions in social networks (2012)

Online social networks are one of the major technological phenomena of Web 2.0. Hundreds of millions of people post articles, photos, and videos on their profiles and interact with other people, but the sharing and interaction are limited to a single social network site. Although users can share some content in a social network site with people outside the site using a secret content address, appropriate access control mechanisms are still not supported. To overcome this limitation, we propose a cross-site interaction framework, x-mngr, that allows users to interact with users on other social network sites, with a cross-site access control policy that enables users to specify policies allowing or denying access to their shared content across social network sites. We also propose a partial mapping approach, based on a supervised learning mechanism, to map users' identities across social network sites. We implemented our proposed framework through a photo album sharing application that shares users' photos between Facebook and MySpace based on the cross-site access control policy defined by the content owner. Furthermore, we provide mechanisms that enable users to fuse user-mapping decisions provided by their friends or by others in the social network.

Enabling Multi-level Trust in Privacy Preserving Data Mining (2011)

Privacy Preserving Data Mining (PPDM) addresses the problem of developing accurate models about aggregated data without access to precise information in individual data records. A widely studied perturbation-based PPDM approach introduces random perturbation to individual values to preserve privacy before data is published. Previous solutions of this approach are limited by their tacit assumption of a single level of trust in data miners. In this work, we relax this assumption and expand the scope of perturbation-based PPDM to Multi-Level Trust (MLT-PPDM). In our setting, the more trusted a data miner is, the less perturbed the copy of the data it can access. Under this setting, a malicious data miner may have access to differently perturbed copies of the same data through various means and may combine these diverse copies to jointly infer additional information about the original data that the data owner does not intend to release. Preventing such diversity attacks is the key challenge of providing MLT-PPDM services. We address this challenge by properly correlating perturbation across copies at different trust levels. We prove that our solution is robust against diversity attacks with respect to our privacy goal. That is, for data miners who have access to an arbitrary collection of the perturbed copies, our solution prevents them from jointly reconstructing the original data more accurately than the best effort using any individual copy in the collection. Our solution allows a data owner to generate perturbed copies of its data for arbitrary trust levels on demand. This feature offers data owners maximum flexibility.
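
The correlated-perturbation idea can be demonstrated numerically: generate the noise of a less-trusted copy as the more-trusted copy's noise plus an independent increment, and averaging the two copies then yields a worse estimate than the trusted copy alone (variances are illustrative; the paper derives the exact noise covariance needed):

import numpy as np

rng = np.random.default_rng(7)
data = rng.normal(50.0, 10.0, size=100_000)   # the owner's original values

# Nested perturbation: less-trusted noise = trusted noise + extra increment.
noise_hi = rng.normal(0.0, 2.0, data.shape)             # trusted miner: var 4
noise_lo = noise_hi + rng.normal(0.0, 4.0, data.shape)  # untrusted: var 4+16

copy_hi, copy_lo = data + noise_hi, data + noise_lo

# A colluding miner averaging both copies gets noise Z1 + Zextra/2, whose
# variance (4 + 16/4 = 8) exceeds the trusted copy's variance (4), so the
# diversity attack gains nothing over the least-perturbed copy.
avg_noise = (copy_hi + copy_lo) / 2 - data
print(round(avg_noise.var(), 1), round(noise_hi.var(), 1))  # ~8.0  ~4.0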

Ensuring Distributed Accountability for Data Sharing in the Cloud

Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over a network (typically the Internet). It enables highly scalable services to be easily consumed over the Internet on an as-needed basis. A major characteristic of cloud services is that users' data are usually processed remotely on unknown machines that users do not operate. This can become a substantial roadblock to the wide adoption of cloud services. To address this problem, we propose a highly decentralized accountability framework to keep track of the actual usage of users' data in the cloud. The Cloud Information Accountability framework proposed in this work conducts automated logging and distributed auditing of relevant access performed by any entity, carried out at any point of time at any cloud service provider. It has two major elements: the logger and the log harmonizer. The proposed methodology also protects the JAR file by converting it into obfuscated code, which adds an additional layer of security to the infrastructure. Apart from that, we increase the security of users' data through provable data possession for integrity verification.

FADE: Secure Overlay Cloud Storage with File Assured Deletion (2012)

While we can now outsource data backups to third-party cloud storage services so as to reduce data management costs, security concerns arise in terms of ensuring the privacy and integrity of outsourced data. We design FADE, a practical, implementable, and readily deployable cloud storage system that focuses on protecting deleted data with policy-based file assured deletion. FADE is built upon standard cryptographic techniques, such that it encrypts outsourced data files to guarantee their privacy and integrity and, most importantly, assuredly deletes files to make them unrecoverable to anyone (including those who manage the cloud storage) upon revocation of file access policies. In particular, the design of FADE is geared toward the objective that it act as an overlay system that works seamlessly atop today's cloud storage services. To demonstrate this objective, we implement a working prototype of FADE atop Amazon S3, one of today's cloud storage services, and empirically show that FADE provides policy-based file assured deletion with a minimal trade-off in performance overhead. Our work provides insights into how to incorporate value-added security features into current data outsourcing applications.
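
A minimal sketch of policy-based assured deletion using the cryptography package's Fernet recipe: each file's data key is wrapped under the key of its policy, so deleting the policy key makes every file bound to that policy unrecoverable even though the ciphertexts remain in the cloud. (FADE itself keeps policy keys with separate key managers and uses blinded key operations; the local dicts below are stand-ins.)

from cryptography.fernet import Fernet

policy_keys = {"employee-2012": Fernet.generate_key()}  # stand-in key manager
cloud = {}                                              # stand-in cloud store

def upload(name, data, policy):
    data_key = Fernet.generate_key()
    cloud[name] = (
        Fernet(data_key).encrypt(data),                 # encrypted file
        Fernet(policy_keys[policy]).encrypt(data_key),  # wrapped data key
        policy,
    )

def download(name):
    blob, wrapped_key, policy = cloud[name]
    data_key = Fernet(policy_keys[policy]).decrypt(wrapped_key)
    return Fernet(data_key).decrypt(blob)

upload("report.txt", b"quarterly numbers", "employee-2012")
print(download("report.txt"))           # b'quarterly numbers'

del policy_keys["employee-2012"]        # policy revoked: key assuredly deleted
try:
    download("report.txt")
except KeyError:
    print("file is unrecoverable")      # ciphertext remains, but key is gone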

Fast and accurate annotation of short texts with Wikipedia pages

We address the problem of cross-referencing text fragments with Wikipedia pages, in a way that resolves synonymy and polysemy issues accurately and efficiently. We take inspiration from a recent flow of work [3, 10, 12, 14] and extend their scenario from the annotation of long documents to the annotation of short texts, such as snippets of search-engine results, tweets, news items, and blog posts. These short and poorly composed texts pose new challenges in terms of the efficiency and effectiveness of the annotation process, which we address by designing and engineering Tagme, the first system that performs accurate, on-the-fly annotation of such short textual fragments. A large set of experiments shows that Tagme outperforms state-of-the-art algorithms when they are adapted to work on short texts, and it is fast and competitive on long texts.

Fog Computing: Mitigating Insider Data Theft Attacks in the Cloud

Cloud computing promises to significantly change the way we use computers and access and store our personal and business information. With these new computing and communications paradigms arise new data security challenges. Existing data protection mechanisms such as encryption have failed in preventing data theft attacks, especially those perpetrated by an insider to the cloud provider. We propose a different approach for securing data in the cloud using offensive decoy technology. We monitor data access in the cloud and detect abnormal data access patterns. When unauthorized access is suspected and then verified using challenge questions, we launch a disinformation attack by returning large amounts of decoy information to the attacker. This protects against the misuse of the user's real data. Experiments conducted in a local file setting provide evidence that this approach may provide unprecedented levels of user data security in a cloud environment.

Fuzzy Order-of-Magnitude Based Link Analysis for Qualitative Alias Detection

Numerical link-based similarity techniques have proven effective for identifying similar objects on the Internet and in publication domains. However, for cases involving unduly high similarity measures, these methods usually generate inaccurate results. Also, they are often restricted to measuring over single properties only. This paper presents an order-of-magnitude based similarity mechanism that integrates multiple link properties to derive semantically rich similarity descriptions. The approach extends conventional order-of-magnitude reasoning with the theory of fuzzy sets. The inherent computing-with-words ability of this work also allows coherent interpretation and communication within a decision-making group. The proposed approach is applied to supporting the analysis of intelligence data. When evaluated over a difficult terrorism-related dataset, experimental results show that the approach helps to partly resolve the problem of false positives.

Gossip-based Resource Management for Cloud Environments (long version) (2010)

We address the problem of resource management for a large-scale cloud environment that hosts sites. Our solution centers around a middleware architecture, the key element of which is a gossip protocol that meets our design goals: fairness of resource allocation with respect to hosted sites, efficient adaptation to load changes, and scalability in terms of both the number of machines and sites. We formalize the resource allocation problem as that of maximizing the cloud utility under CPU and memory constraints, show that an optimal solution without considering memory constraints is straightforward (but not useful), and provide an efficient heuristic solution for the complete problem instead. We evaluate the performance of the protocol through simulation and find its performance to be well aligned with our design goals.
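
The gossip primitive can be sketched as pairwise proportional load exchange, which drives all machines toward equal relative utilization (CPU only; the paper's protocol additionally handles memory constraints and site placement):

import random

capacity = [4.0, 8.0, 16.0, 8.0]
load     = [6.0, 2.0,  5.0, 9.0]

def gossip_round(load, capacity):
    # A random machine pair exchanges state and splits the combined load
    # proportionally to capacity, equalizing their relative utilizations.
    i, j = random.sample(range(len(load)), 2)
    total, cap = load[i] + load[j], capacity[i] + capacity[j]
    load[i], load[j] = total * capacity[i] / cap, total * capacity[j] / cap

for _ in range(200):
    gossip_round(load, capacity)
print([round(l / c, 3) for l, c in zip(load, capacity)])
# relative utilizations converge toward a common value (~0.611 here)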

How do Facebookers use Friendlists (2012)

Facebook friendlists are used to classify friends into groups and to assist users in controlling access to their information. In this paper, we study the effectiveness of Facebook friendlists from two aspects, friend management and policy patterns, by examining how users build friendlists and to what extent they use them in their policy templates. We collected real Facebook profile information and the photo privacy policies of 222 participants, with their consent, through our Facebook survey application posted on Mechanical Turk. Our data analysis shows that users' customized friendlists are created less frequently and have fewer overlaps than Facebook-created friendlists. Also, users do not place all of their friends into lists. Moreover, friends appearing in more than one friendlist have higher node-betweenness and outgoing-to-incoming-edge-ratio values among all the friends of a particular user. Last but not least, friendlist- and user-based exceptions are used less frequently in policies than allowing all friends, friends of friends, or everyone to view photos.

Impact of Storage Acquisition Intervals on the Cost-Efficiency of the Private vs. Public Storage

The volume of worldwide digital content has increased nine-fold within the last five years, and this immense growth is predicted to continue in the foreseeable future, reaching 8 ZB already by 2015. Traditionally, in order to cope with the growing demand for storage capacity, organizations proactively built and managed their private storage facilities. Recently, with the proliferation of public cloud infrastructure offerings, many organizations instead welcomed the alternative of outsourcing their storage needs to providers of public cloud storage services. The comparative cost-efficiency of these two alternatives depends on a number of factors, among which are, e.g., the prices of public and private storage, the charging and storage acquisition intervals, and the predictability of the demand for storage. In this paper, we study how the cost-efficiency of private vs. public storage depends on the acquisition interval at which the organization re-assesses its storage needs and acquires additional private storage. The analysis in the paper suggests that the shorter the acquisition interval, the more likely it is that the private storage solution is less expensive than the public cloud infrastructure. This phenomenon is also illustrated in the paper numerically, using the storage needs encountered by a university back-up and archiving service as an example. Since the acquisition interval is determined by, among other factors, the organization's ability to foresee the growth of storage demand, the provisioning schedules of storage equipment providers, and the internal practices of the organization, an organization owning a private storage solution may want to control some of these factors in order to attain a shorter acquisition interval and thus make private storage (more) cost-efficient.
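
A toy model of the paper's observation: with a shorter acquisition interval, private capacity tracks demand more closely and less money is spent on idle, pre-provisioned storage. All prices and the linear demand growth are illustrative:

GROWTH_TB_PER_MONTH = 2.0      # demand grows linearly
PUBLIC_PER_TB_MONTH = 25.0     # public cloud: pay only for what is used
PRIVATE_PER_TB_MONTH = 12.0    # private: pay for all *provisioned* capacity

def private_cost(months, acq_interval):
    # Capacity is re-assessed every `acq_interval` months, so provisioned
    # capacity must cover demand until the next acquisition point.
    cost = 0.0
    for m in range(months):
        next_acq = ((m // acq_interval) + 1) * acq_interval
        provisioned = GROWTH_TB_PER_MONTH * next_acq   # bought ahead of need
        cost += provisioned * PRIVATE_PER_TB_MONTH
    return cost

def public_cost(months):
    return sum(GROWTH_TB_PER_MONTH * (m + 1) * PUBLIC_PER_TB_MONTH
               for m in range(months))

print(f"public: {public_cost(36):,.0f}")               # 33,300
for interval in (3, 12, 36):
    print(f"private, {interval:>2}-month interval: "
          f"{private_cost(36, interval):,.0f}")
# 16,848 / 20,736 / 31,104 -- shorter intervals make private storage cheaper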

Multi-Agent Systems in Mobile Ad hoc Networks

A number of technologies are evolving that will help formulate more adaptive and robust network architectures intended to operate in dynamic, mobile environments. One technology area, mobile ad hoc networking (MANET), enables self-organizing, multi-hop, heterogeneous network routing services and organization. Such technology is important in future DoD networking, especially at the forward edge of the battlespace where self-organizing, robust networking is needed. A second technology area, Multi-Agent Systems (MAS), can enable autonomous, team-based problem solving under varying environmental conditions. Previous work in MAS has assumed relatively benign wired-network behavior and inter-agent communication characteristics that may not be well supported in MANET environments. In addition, the resource costs associated with inter-agent communication have a more profound impact in a mobile wireless environment. The combined operation of these technology areas, including cross-layer design considerations, has largely been unexplored to date. This paper describes ongoing research to improve the ability of these technologies to work in concert. An outline of various design and system architecture issues is first presented. We then describe models, agent systems, MANET protocols, and additional components that are being applied in our research. We present an analysis method to measure agent effectiveness and early evaluations of working prototypes within MANET environments. We conclude by outlining some open issues and areas of further work.

MMDS: Multilevel Monitoring and Detection System

The paper presents an agent-based approach for monitoring and detecting different kinds of attacks in wireless networks. The long-term goal of this research is to develop a self-adaptive system that will perform real-time monitoring, analysis, detection, and generation of appropriate responses to intrusive activities. This multi-agent architecture, which supports the necessary agent interactions, uses a fuzzy decision support system to generate rules for different attacks by monitoring parameters at multiple levels. The system is able to operate in a wireless network and to detect and act in response to events in real time, according to its broad decision objectives and security policies.

MORPHOSYS: Efficient Colocation of QoS-Constrained Workloads in the Cloud

In hosting environments such as IaaS clouds, desirable application performance is usually guaranteed through the use of Service Level Agreements (SLAs), which specify minimal fractions of resource capacities that must be allocated for unencumbered use for proper operation. Arbitrary colocation of applications with different SLAs on a single host may result in inefficient utilization of the host's resources. In this paper, we propose that the periodic resource allocation and consumption models often used to characterize real-time workloads be used for a more granular expression of SLAs. Our proposed SLA model has the salient feature that it exposes flexibilities that enable the infrastructure provider to safely transform SLAs from one form to another for the purpose of achieving more efficient colocation. Towards that goal, we present MORPHOSYS: a framework for a service that allows the manipulation of SLAs to enable efficient colocation of arbitrary workloads in a dynamic setting. We present results from extensive trace-driven simulations of colocated Video-on-Demand servers in a cloud setting. These results show that potentially significant reductions in wasted resources (by as much as 60%) are possible using MORPHOSYS.
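For intuition, the classical schedulability test behind periodic (C, T) models (allocate C resource units every T time units) gives a simple safe-colocation check; this is the textbook utilization bound, not MORPHOSYS's SLA-transformation framework:

    # Under earliest-deadline-first scheduling, periodic demands fit on one
    # host if their total utilization does not exceed the host's capacity.
    def fits_on_host(slas):
        """slas: list of (C, T) pairs; True if colocation is safe under EDF."""
        return sum(c / t for c, t in slas) <= 1.0

    print(fits_on_host([(1, 4), (2, 8), (1, 3)]))  # 0.25 + 0.25 + 0.33... -> True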

Self-Protecting Electronic Medical Records Using Attribute-Based Encryption

We provide a design and implementation of self-protecting electronic medical records (EMRs) using attribute-based encryption. Our system allows healthcare organizations to export EMRs to storage locations outside of their trust boundary, including mobile devices, Regional Health Information Organizations (RHIOs), and cloud systems such as Google Health. In contrast to some previous approaches to this problem, our solution is designed to maintain EMR availability even when providers are offline, i.e., where network connectivity is not available (for example, during a natural disaster). To balance the needs of emergency care and patient privacy, our system is designed to provide fine-grained encryption and is able to protect individual items within an EMR, where each encrypted item may have its own access control policy. To validate our architecture, we implemented a prototype system using a new dual-policy attribute-based encryption library that we developed. Our implementation, which includes an iPhone app for storing and managing EMRs offline, allows for flexible and automatic policy generation. An evaluation of our design shows that our ABE library performs well, has acceptable storage requirements, and is practical and usable on modern smartphones.

Privacy-preserving Enforcement of Spatially Aware RBAC(2012)

Several models for incorporating spatial constraints into role-based access control (RBAC) have been proposed, and researchers are now focusing on the challenge of ensuring such policies are enforced correctly. However, existing approaches have a major shortcoming, as they assume the server is trustworthy and require complete disclosure of sensitive location information by the user. In this work, we propose a novel framework and a set of protocols to solve this problem. Specifically, in our scheme, a user provides a service provider with role and location tokens along with a request. The service provider consults with a role authority and a location authority to verify the tokens and evaluate the policy. However, none of the servers learn the requesting user's identity, role, or location. In this paper, we define the protocols and the policy enforcement scheme, and present a formal proof of a number of security properties.

Ranking Model Adaptation for Domain-Specific Search(2012)

With the explosive emergence of vertical search domains, applying the broad-based ranking model directly to different domains is no longer desirable due to domain differences, while building a unique ranking model for each domain is both laborious for labeling data and time-consuming for training models. In this paper, we address these difficulties by proposing a regularization-based algorithm called ranking adaptation SVM (RA-SVM), through which we can adapt an existing ranking model to a new domain, so that the amount of labeled data and the training cost is reduced while the performance is still guaranteed. Our algorithm only requires the prediction from the existing ranking models, rather than their internal representations or the data from auxiliary domains. In addition, we assume that documents similar in the domain-specific feature space should have consistent rankings, and add some constraints to control the margin and slack variables of RA-SVM adaptively. Finally, a ranking adaptability measurement is proposed to quantitatively estimate whether an existing ranking model can be adapted to a new domain. Experiments performed over Letor and two large-scale datasets crawled from a commercial search engine demonstrate the applicability of the proposed ranking adaptation algorithms and the ranking adaptability measurement.
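A schematic of the regularized objective this abstract describes, in illustrative notation (not necessarily the paper's exact formulation): the adapted ranking function f is kept close to the auxiliary model f^a while soft pairwise constraints enforce the new domain's preference pairs (x_i^+ ranked above x_i^-):

\[
\min_{f,\;\xi \ge 0}\;\; \tfrac{1}{2}\,\lVert f - f^{a} \rVert^{2} \;+\; C \sum_{i} \xi_{i}
\qquad \text{s.t.}\quad f(x^{+}_{i}) - f(x^{-}_{i}) \;\ge\; 1 - \xi_{i},
\]

where the distance to f^a can be measured through its predictions on the new domain's documents, consistent with the claim that the old model's internal representation is never needed.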

Reliable Proxy Re-encryption in Unreliable Clouds(2013)

In this paper, we propose an efficient data retrieval scheme using attribute-based encryption. The proposed scheme is best suited for cloud storage systems with a substantial amount of data. It provides rich expressiveness as regards access control and fast searches with simple comparisons of searching entities. The proposed scheme also guarantees data security and end-user privacy during the data retrieval process. A key approach to secure cloud computing is for the data owner to store encrypted data in the cloud and issue decryption keys to authorized users. The cloud storage based information retrieval service is a promising technology that will form a vital market in the near future. Although numerous studies have been proposed on secure data retrieval over encrypted data in cloud services, most of them focus on providing strict security for the data stored in a third-party domain. However, those approaches impose substantial costs centralized on the cloud service provider, which could be a principal hindrance to achieving efficient data retrieval in cloud storage.

Remote Attestation with Domain-based Integrity Model and Policy Analysis

We propose and implement an innovative remote attestation framework called DR@FT for efficiently measuring a target system based on an information flow-based integrity model. With this model, the high-integrity processes of a system are first measured and verified, and these processes are then protected from accesses initiated by low-integrity processes. Towards dynamic systems with frequently changed system states, our framework verifies the latest state changes of a target system instead of considering the entire system information. Our attestation evaluation adopts a graph-based method to represent integrity violations, and the graph-based policy analysis is further augmented with a ranked violation graph to support high-level semantic reasoning about attestation results. As a result, DR@FT provides efficient and effective attestation of a system's integrity status, and offers intuitive reasoning about attestation results for security administrators. Our experimental results demonstrate the feasibility and practicality of DR@FT.

Risk-Aware Mitigation for MANET Routing Attacks

Mobile Ad hoc Networks (MANETs) have been highly vulnerable to attacks due to the dynamic nature of their network infrastructure. Among these attacks, routing attacks have received considerable attention, since they can cause the most devastating damage to a MANET. Even though there exist several intrusion response techniques to mitigate such critical attacks, existing solutions typically attempt to isolate malicious nodes based on binary or naive fuzzy response decisions. However, binary responses may result in unexpected network partitions, causing additional damage to the network infrastructure, and naive fuzzy responses can lead to uncertainty in countering routing attacks in MANETs. In this paper, we propose a risk-aware response mechanism to systematically cope with the identified routing attacks. Our risk-aware approach is based on an extended Dempster-Shafer mathematical theory of evidence introducing a notion of importance factors. In addition, our experiments demonstrate the effectiveness of our approach with the consideration of several performance metrics.
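For reference, the standard Dempster-Shafer combination rule that the extension builds on fuses two basic probability assignments m_1 and m_2 as

\[
(m_{1} \oplus m_{2})(A) \;=\; \frac{\displaystyle\sum_{B \cap C = A} m_{1}(B)\, m_{2}(C)}{1 - K},
\qquad
K \;=\; \sum_{B \cap C = \emptyset} m_{1}(B)\, m_{2}(C),
\]

for every nonempty A, where K measures the conflict between the two bodies of evidence; the paper's notion of importance factors, which weights evidence sources, is its extension of this rule and is not shown here.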

Secure Overlay Cloud Storage with Access Control and Assured Deletion(2012)

We can now outsource data backups off-site to third-party cloud storage services so as to reduce data management costs. However, we must provide security guarantees for the outsourced data, which is now maintained by third parties. We design and implement FADE, a secure overlay cloud storage system that achieves fine-grained, policy-based access control and file assured deletion. It associates outsourced files with file access policies, and assuredly deletes files to make them unrecoverable to anyone upon revocation of file access policies. To achieve such security goals, FADE is built upon a set of cryptographic key operations that are self-maintained by a quorum of key managers that are independent of third-party clouds. In particular, FADE acts as an overlay system that works seamlessly atop today's cloud storage services. We implement a proof-of-concept prototype of FADE atop Amazon S3, one of today's cloud storage services. We conduct extensive empirical studies, and demonstrate that FADE provides security protection for outsourced data, while introducing only minimal performance and monetary cost overhead. Our work provides insights into how to incorporate value-added security features into today's cloud storage services.
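A minimal sketch of the policy-based assured deletion idea described above (illustrative, not FADE's actual protocol): each policy has a control key held by the key managers, each file key is wrapped by its policy's control key, and discarding a control key renders every file under that policy unrecoverable.

    # Sketch assuming the 'cryptography' package; key names are illustrative,
    # and real FADE splits control keys across a quorum of key managers.
    from cryptography.fernet import Fernet

    policy_keys = {"policy-A": Fernet.generate_key()}   # held by key managers

    def store(data: bytes, policy: str):
        file_key = Fernet.generate_key()                # per-file data key
        blob = Fernet(file_key).encrypt(data)           # encrypt the file
        wrapped = Fernet(policy_keys[policy]).encrypt(file_key)  # wrap the key
        return blob, wrapped                            # both go to the cloud

    def retrieve(blob: bytes, wrapped: bytes, policy: str) -> bytes:
        file_key = Fernet(policy_keys[policy]).decrypt(wrapped)
        return Fernet(file_key).decrypt(blob)

    def revoke(policy: str):
        # Assured deletion: without the control key, wrapped file keys
        # (and hence the files) are permanently unrecoverable.
        del policy_keys[policy]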

Sequential Anomaly Detection in the Presence of Noise and Limited Feedback(2012)

This paper describes a methodology for detecting anomalies from sequentially observed and potentially noisy data. The proposed approach consists of two main elements: (1) filtering, or assigning a belief or likelihood to each successive measurement based upon our ability to predict it from previous noisy observations, and (2) hedging, or flagging potential anomalies by comparing the current belief against a time-varying and data-adaptive threshold. The threshold is adjusted based on the available feedback from an end user. Our algorithms, which combine universal prediction with recent work on online convex programming, do not require computing posterior distributions given all current observations and involve simple primal-dual parameter updates. At the heart of the proposed approach lie exponential-family models which can be used in a wide variety of contexts and applications, and which yield methods that achieve sublinear per-round regret against both static and slowly varying product distributions with marginals drawn from the same exponential family. Moreover, the regret against static distributions coincides with the minimax value of the corresponding online strongly convex game. We also prove bounds on the number of mistakes made during the hedging step relative to the best offline choice of the threshold with access to all estimated beliefs and feedback signals. We validate the theory on synthetic data drawn from a time-varying distribution over binary vectors of high dimensionality, as well as on the Enron email dataset.
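A toy sketch of the filter-then-hedge loop described above (illustrative only, not the paper's universal-prediction algorithm): a running mean acts as the predictor supplying a belief for each observation, and the flagging threshold takes a small step whenever end-user feedback arrives.

    def detect(stream, feedback, eta=0.05):
        """stream: iterable of floats; feedback(t, flagged) returns +1 for a
        false alarm, -1 for a miss, 0 otherwise. Returns the flag sequence."""
        mean, tau, flags = 0.0, 1.0, []
        for t, x in enumerate(stream, start=1):
            belief = abs(x - mean)               # crude surprise score
            flagged = belief > tau
            flags.append(flagged)
            tau = max(1e-6, tau + eta * feedback(t, flagged))  # hedging step
            mean += (x - mean) / t               # filtering: update predictor
        return flags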

A Query Formulation Language for the Data Web

We present a query formulation language (called MashQL) in order to easily query and fuse structured data on the web. The main novelty of MashQL is that it allows people with limited IT skills to explore and query one (or multiple) data sources without prior knowledge about the schema, structure, vocabulary, or any technical details of these sources. More importantly, to be robust and cover most cases in practice, we do not assume that a data source has an offline or inline schema. This poses several language-design and performance complexities that we fundamentally tackle. To illustrate the query formulation power of MashQL, and without loss of generality, we chose the Data Web scenario. We also chose querying RDF, as it is the most primitive data model; hence, MashQL can be similarly used for querying relational databases and XML. We present two implementations of MashQL: an online mashup editor and a Firefox add-on. The former illustrates how MashQL can be used to query and mash up the Data Web as simply as filtering and piping web feeds; the Firefox add-on illustrates using the browser as a web composer rather than only a navigator. To end, we evaluate MashQL on querying two datasets, DBLP and DBPedia, and show that our indexing techniques allow instant user interaction.

Scalable and Secure Sharing of Personal Health Records in Cloud Computing using Attribute-based Encryption

Personal health record (PHR) is an emerging patient-centric model of health information exchange, which is often outsourced to be stored at a third party, such as cloud providers. However, there have been wide privacy concerns, as personal health information could be exposed to those third-party servers and to unauthorized parties. To assure the patients' control over access to their own PHRs, it is a promising method to encrypt the PHRs before outsourcing. Yet, issues such as risks of privacy exposure, scalability in key management, flexible access, and efficient user revocation have remained the most important challenges toward achieving fine-grained, cryptographically enforced data access control. In this paper, we propose a novel patient-centric framework and a suite of mechanisms for data access control to PHRs stored in semi-trusted servers. To achieve fine-grained and scalable data access control for PHRs, we leverage attribute-based encryption (ABE) techniques to encrypt each patient's PHR file. Different from previous works in secure data outsourcing, we focus on the multiple data owner scenario, and divide the users in the PHR system into multiple security domains, which greatly reduces the key management complexity for owners and users. A high degree of patient privacy is guaranteed simultaneously by exploiting multi-authority ABE. Our scheme also enables dynamic modification of access policies or file attributes, supports efficient on-demand user/attribute revocation, and allows break-glass access under emergency scenarios. Extensive analytical and experimental results are presented which show the security, scalability, and efficiency of our proposed scheme.

Access control for online social networks' third-party applications

With the development of Web 2.0 technologies, online social networks are able to provide open platforms that enable the seamless sharing of profile data, allowing public developers to interface with and extend the social network services as applications. At the same time, these open interfaces pose serious privacy concerns, as third-party applications are usually given access to the user profiles. Current related research has focused mainly on user-to-user interactions in social networks and tends to ignore third-party applications. In this paper, we present an access control framework to manage third-party applications. Our framework is based on enabling the user to specify the data attributes to be shared with the application and, at the same time, to specify the degree of specificity of the shared attributes. We model applications as finite state machines, and use the required user profile attributes as conditions governing the application execution. We formulate the minimal attribute generalization problem and propose a solution that maps it to the shortest-path problem to find the minimum set of attribute generalizations required to access the application services. We assess the feasibility of our approach by developing a proof-of-concept implementation and by conducting user studies on a widely used social network platform.
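A minimal sketch of the reduction mentioned above, on a hypothetical generalization graph: nodes are disclosure levels of an attribute (exact birth date, birth year, age range, and so on), edge weights encode precision loss, and the cheapest disclosure the application accepts is found by a shortest-path search.

    import heapq

    def cheapest_generalization(edges, start, acceptable):
        """edges: {node: [(neighbor, cost), ...]}; returns (cost, level) of the
        least-costly generalization the application will accept."""
        dist, heap = {start: 0}, [(0, start)]
        while heap:
            d, u = heapq.heappop(heap)
            if u in acceptable:
                return d, u
            for v, w in edges.get(u, []):
                if d + w < dist.get(v, float("inf")):
                    dist[v] = d + w
                    heapq.heappush(heap, (d + w, v))
        return None  # no acceptable generalization is reachable

    edges = {"birth_date": [("birth_year", 1)], "birth_year": [("age_range", 2)]}
    print(cheapest_generalization(edges, "birth_date", {"age_range"}))  # (3, 'age_range')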

Altered Fingerprints: Analysis and Detection(2012)

The widespread deployment of Automated Fingerprint Identification Systems (AFIS) in law enforcement and border control applications has heightened the need for ensuring that these systems are not compromised. While several issues related to fingerprint system security have been investigated, including the use of fake fingerprints for masquerading identity, the problem of fingerprint alteration or obfuscation has received very little attention. Fingerprint obfuscation refers to the deliberate alteration of the fingerprint pattern by an individual for the purpose of masking his identity. Several cases of fingerprint obfuscation have been reported in the press. Fingerprint image quality assessment software (e.g., NFIQ) cannot always detect altered fingerprints since the implicit image quality due to alteration may not change significantly. The main contributions of this paper are: 1) compiling case studies of incidents where individuals were found to have altered their fingerprints for circumventing AFIS, 2) investigating the impact of fingerprint alteration on the accuracy of a commercial fingerprint matcher, 3) classifying the alterations into three major categories and suggesting possible countermeasures, 4) developing a technique to automatically detect altered fingerprints based on analyzing orientation field and minutiae distribution, and 5) evaluating the proposed technique and the NFIQ algorithm on a large database of altered fingerprints provided by a law enforcement agency. Experimental results show the feasibility of the proposed approach in detecting altered fingerprints and highlight the need to further pursue this problem.

An Approach to Detect and Prevent SQL Injection Attacks in Database Using Web Service (2011)

SQL injection is an attack methodology that targets the data residing in a database through the firewall that shields it. The attack takes advantage of poor input validation in code and website administration. SQL injection attacks occur when an attacker is able to insert a series of SQL statements into a query by manipulating user input data to a web-based application; the attacker can take advantage of web application programming security flaws and pass unexpected malicious SQL statements through a web application for execution by the backend database. This paper proposes a novel specification-based methodology for the prevention of SQL injection attacks. The two most important advantages of the new approach over existing analogous mechanisms are that, first, it prevents all forms of SQL injection attacks and, second, the technique does not allow the user to access the database directly on the database server. The innovative technique, a Web Service Oriented XPATH Authentication Technique, detects and prevents SQL injection attacks in the database. It is deployed by generating functions for two filtration models, Active Guard and Service Detector, over application scripts, additionally allowing seamless integration with currently deployed systems.
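For contrast with the web-service filtration technique proposed above, the standard mitigation is to keep user input out of the SQL code channel entirely; the snippet below demonstrates a classic tautology payload against string-built SQL and its parameterized counterpart (illustrative, not the paper's Active Guard/Service Detector mechanism).

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, pwd TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

    name = "alice' OR '1'='1"   # classic tautology payload

    # Vulnerable: the payload rewrites the query's logic.
    rows = conn.execute("SELECT * FROM users WHERE name = '%s'" % name).fetchall()
    print(len(rows))            # 1 -- the injected OR matched every row

    # Safe: the driver binds the payload as a literal value.
    rows = conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()
    print(len(rows))            # 0 -- treated as data, not SQL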

Answering General Time-Sensitive Queries

Time is an important dimension of relevance for a large number of searches, such as over blogs and news archives. So far, research on searching over such collections has largely focused on locating topically similar documents for a query. Unfortunately, topic similarity alone is not always sufficient for document ranking. In this paper, we observe that, for an important class of queries that we call time-sensitive queries, the publication time of the documents in a news archive is important and should be considered in conjunction with the topic similarity to derive the final document ranking. Earlier work has focused on improving retrieval for recency queries that target recent documents. We propose a more general framework for handling time-sensitive queries and automatically identify the important time intervals that are likely to be of interest for a query. Then, we build scoring techniques that seamlessly integrate the temporal aspect into the overall ranking mechanism. We extensively evaluated our techniques using a variety of news article data sets, including TREC data as well as real web data analyzed using Amazon Mechanical Turk. We examined several alternatives for detecting the important time intervals for a query over a news archive and for incorporating this information in the retrieval process. Our techniques are robust and significantly improve result quality for time-sensitive queries compared to state-of-the-art retrieval techniques.
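Schematically, the integration described above can be pictured as a convex combination of topical and temporal relevance (illustrative notation, not the paper's exact scoring function):

\[
\mathit{score}(d, q) \;=\; \lambda \cdot \mathit{sim}(d, q) \;+\; (1 - \lambda)\cdot \Pr\bigl[\,t_{d} \in I(q)\,\bigr],
\]

where sim is the topic-similarity score, t_d is the publication time of document d, I(q) is the set of important time intervals identified for query q, and lambda in [0, 1] balances the two components.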

General Frameworks for Combined Mining: Discovering Informative Knowledge in Complex Data

Enterprise data mining applications, such as mining government service data, often involve multiple large heterogeneous data sources, user preferences, and business impact. Business people expect data mining deliverables to inform direct business decision-making actions. In such situations, a single method or one-step mining is often limited in discovering informative knowledge. It would also be very time- and space-consuming, if not impossible, to join relevant large data sources for mining patterns consisting of multiple aspects of information. It is crucial to develop effective approaches for mining patterns that combine the necessary information from multiple relevant business lines, cater for real business settings, and deliver decision-making actions rather than a single line of patterns. Recent years have seen increasing efforts on mining such patterns, for example, integrating frequent pattern mining with classification to generate frequent-pattern-based classifiers. Rather than presenting a specific algorithm, this paper builds on our existing works and proposes combined mining as a general approach to mining for informative patterns that combine components from either multiple datasets or multiple features, or by multiple methods on demand. We summarize general frameworks, paradigms, and basic processes for multi-feature combined mining, multi-source combined mining, and multi-method combined mining. Several novel types of combined patterns, such as incremental cluster patterns, result from these frameworks and cannot be directly produced by existing methods. Several real-world case studies are briefly described which identify combined patterns for informing government debt prevention and improving government service objectives. They show the flexibility and instantiation capability of combined mining in discovering more informative and actionable patterns in complex data. We also present combined patterns in dynamic charts, a novel pattern presentation method reflecting the evolution and impact change of a cluster of combined patterns and supporting business to take actions on the deliverables for intervention.

Authentication Protocol For Cross Realm SOA-Based Business Processes

Modern distributed applications embed an increasing degree of dynamism, from dynamic supply-chain management, enterprise federations, and virtual collaborations to dynamic service interactions across organizations. Such dynamism leads to new security challenges. Collaborating services may belong to different security realms but often have to be engaged dynamically at run time. If their security realms do not have a direct cross-realm authentication relationship in place, it is technically difficult to enable any secure collaboration between the services. Because organizations and services can join a collaborative process in a highly dynamic and flexible way, it cannot be expected that every two of the collaborating security realms always have a direct cross-realm authentication relationship. A possible solution to this problem is to locate some intermediate realms that serve as an authentication path between the two separate realms that are to collaborate. However, the overhead of generating an authentication path for two distributed realms is not trivial: the process could involve a large number of extra operations for credential conversion and require a long chain of invocations to intermediate services. This problem is addressed by presenting a new cross-realm authentication protocol for dynamic service interactions, based on the notion of multi-party business sessions. The protocol requires neither credential conversion nor the establishment of any authentication path between session members. The main contributions of this work are: (1) using the multi-party session concept to structure dynamic business processes, (2) a simple but effective way to establish trust relationships between the members of a business session, and (3) a set of protocols for multi-party session management.

Efficient audit service outsourcing for data integrity in clouds(2011)

Cloud-based outsourced storage relieves the clients' burden of storage management and maintenance by providing a comparably low-cost, scalable, location-independent platform. However, the fact that clients no longer have physical possession of data indicates that they are facing a potentially formidable risk of missing or corrupted data. To avoid the security risks, audit services are critical to ensure the integrity and availability of outsourced data and to achieve digital forensics and credibility in cloud computing. Provable data possession (PDP), which is a cryptographic technique for verifying the integrity of data without retrieving it from an untrusted server, can be used to realize audit services. In this paper, profiting from the interactive zero-knowledge proof system, we address the construction of an interactive PDP protocol to prevent the fraudulence of the prover (soundness property) and the leakage of verified data (zero-knowledge property). We prove that our construction holds these properties based on the computational Diffie-Hellman assumption and the rewindable black-box knowledge extractor. We also propose an efficient mechanism with respect to probabilistic queries and periodic verification to reduce the audit costs per verification and implement timely abnormal detection. In addition, we present an efficient method for selecting an optimal parameter value to minimize the computational overheads of cloud audit services. Our experimental results demonstrate the effectiveness of our approach.

Data Mining for XML Query-Answering Support

XML has become a de facto standard for storing, sharing, and exchanging information across heterogeneous platforms. XML content is growing day by day at a rapid pace, and enterprises need to query XML databases frequently. As the volume of XML data grows, extracting the required data from an XML database becomes a challenging task, and it is computationally expensive to answer queries without any support. Towards this, in this paper we present a technique based on Tree-based Association Rules (TARs): mined rules that provide the required information on the structure and content of an XML file, with the TARs themselves also stored in XML format. The mined knowledge (TARs) is later used to support XML query answering, which enables quick and accurate answers. We also developed a prototype application to demonstrate the efficiency of the proposed system. The empirical results are very positive, and the query answering is expected to be useful in real-time applications.

Enabling Secure and Efficient Ranked Keyword Search over Outsourced Cloud Data

Cloud computing economically enables the paradigm of data service outsourcing. However, to protect data privacy, sensitive cloud data has to be encrypted before being outsourced to the commercial public cloud, which makes effective data utilization a very challenging task. Although traditional searchable encryption techniques allow users to securely search over encrypted data through keywords, they support only Boolean search and are not yet sufficient to meet the effective data utilization need that is inherently demanded by the large number of users and the huge amount of data files in the cloud. In this paper, we define and solve the problem of secure ranked keyword search over encrypted cloud data. Ranked search greatly enhances system usability by enabling search result relevance ranking instead of sending undifferentiated results, and further ensures file retrieval accuracy. Specifically, we explore the statistical measure approach, i.e., relevance scores, from information retrieval to build a secure searchable index, and develop a one-to-many order-preserving mapping technique to properly protect that sensitive score information. The resulting design is able to facilitate efficient server-side ranking without losing keyword privacy. Thorough analysis shows that our proposed solution enjoys an as-strong-as-possible security guarantee compared to previous searchable encryption schemes, while correctly realizing the goal of ranked keyword search. Extensive experimental results demonstrate the efficiency of the proposed solution.
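For orientation, the "relevance score from information retrieval" that such designs typically start from is a TF x IDF measure of the following shape (a standard form, not necessarily the paper's exact instantiation):

\[
\mathit{Score}(Q, F_{d}) \;=\; \sum_{t \in Q} \frac{1}{|F_{d}|}\,\bigl(1 + \ln f_{d,t}\bigr)\,\ln\Bigl(1 + \frac{N}{f_{t}}\Bigr),
\]

where f_{d,t} is the frequency of term t in file F_d, f_t is the number of files containing t, N is the total number of files, and |F_d| is the file's length; the one-to-many order-preserving mapping then hides these scores while keeping their order usable for server-side ranking.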

Expert Discovery and Interactions in Mixed Service-Oriented Systems

Web-based collaborations and processes have become essential in today's business environments. Such processes typically span interactions between people and services across globally distributed companies. Web services and SOA are the de facto technology for implementing compositions of humans and services. The increasing complexity of compositions and the distribution of people and services require adaptive and context-aware interaction models. To support complex interaction scenarios, we introduce a mixed service-oriented system composed of both human-provided and Software-Based Services (SBSs) interacting to perform joint activities or to solve emerging problems. However, the competencies of people evolve over time, thereby requiring approaches for the automated management of actor skills, reputation, and trust. Discovering the right actor in mixed service-oriented systems is challenging due to the scale and temporary nature of collaborations. We present a novel approach addressing the need for flexible involvement of experts and knowledge workers in distributed collaborations. We argue that the automated inference of trust between members is a key factor for successful collaborations. Instead of following a security perspective on trust, we focus on dynamic trust in collaborative networks. We discuss Human-Provided Services (HPSs) and an approach for managing user preferences and network structures. HPS allows experts to offer their skills and capabilities as services that can be requested on demand. Our main contributions center around a context-sensitive, trust-based algorithm called ExpertHITS, inspired by the concept of hubs and authorities in web-based environments. ExpertHITS takes trust relations and link properties in social networks into account to estimate the reputation of users.
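For reference, the classical hubs-and-authorities iteration that ExpertHITS is said to draw on: with link matrix L (L_{ij} = 1 if node i links to node j), authority and hub scores are updated as

\[
\mathbf{a} \leftarrow L^{\top}\mathbf{h}, \qquad \mathbf{h} \leftarrow L\,\mathbf{a},
\]

with both vectors normalized after each step; ExpertHITS additionally weights these links by trust relations, which is not shown here.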

Heuristics Based Query Processing for Large RDF Graphs Using Cloud Computing

The Semantic Web is an emerging area to augment human reasoning. Various technologies are being developed in this arena, which have been standardized by the World Wide Web Consortium (W3C). One such standard is the Resource Description Framework (RDF). Semantic Web technologies can be utilized to build efficient and scalable systems for Cloud Computing. With the explosion of semantic web technologies, large RDF graphs are commonplace. This poses significant challenges for the storage and retrieval of RDF graphs. Current frameworks do not scale for large RDF graphs and as a result do not address these challenges. In this paper, we describe a framework that we built using Hadoop to store and retrieve large numbers of RDF triples by exploiting the cloud computing paradigm. We describe a scheme to store RDF data in the Hadoop Distributed File System. More than one Hadoop job (the smallest unit of execution in Hadoop) may be needed to answer a query, because a single triple pattern in a query cannot simultaneously take part in more than one join in a single Hadoop job. To determine the jobs, we present an algorithm to generate a query plan, whose worst-case cost is bounded, based on a greedy approach to answer a SPARQL Protocol and RDF Query Language (SPARQL) query. We use Hadoop's MapReduce framework to answer the queries. Our results show that we can store large RDF graphs in Hadoop clusters built with cheap commodity-class hardware. Furthermore, we show that our framework is scalable and efficient and can handle large amounts of RDF data, unlike traditional approaches.
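A toy sketch of the greedy idea described above (illustrative only, not the paper's cost-bounded algorithm): since a triple pattern cannot join on two variables in one Hadoop job, patterns are grouped per job by the join variable that covers the most of them.

    def plan_jobs(patterns):
        """patterns: list of sets of variable names; returns a list of
        (join_variable, patterns_joined) pairs, one per Hadoop job."""
        remaining, jobs = [set(p) for p in patterns], []
        while len(remaining) > 1:
            counts = {}                        # how many patterns share each var
            for p in remaining:
                for v in p:
                    counts[v] = counts.get(v, 0) + 1
            if not counts:
                break
            var = max(counts, key=counts.get)  # greedy: most-shared variable
            joined = [p for p in remaining if var in p]
            jobs.append((var, joined))
            merged = set().union(*joined) - {var}  # leftover vars of the join
            remaining = [p for p in remaining if var not in p] + [merged]
        return jobs

    print(plan_jobs([{"x", "y"}, {"y", "z"}, {"z"}]))  # two jobs: join on y, then z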

Homomorphic Authentication with Random Masking Technique Ensuring Privacy & Security in Cloud Computing

Cloud computing may be defined as the delivery of computing as a service rather than a product. It is an internet-based model of computing that enables the sharing of services, and many users place their data in the cloud. However, the fact that users no longer have physical possession of the possibly large amount of outsourced data makes data integrity protection in cloud computing a very challenging and potentially formidable task, especially for users with constrained computing resources and capabilities, so the correctness and security of data are a prime concern. This article studies the problem of ensuring the integrity and security of data storage in cloud computing. Security in the cloud is achieved by signing each data block before sending it to the cloud. Signing is performed using the Boneh-Lynn-Shacham (BLS) algorithm, which is more secure compared to other algorithms. To ensure the correctness of data, we consider an external auditor, called a third-party auditor (TPA), who verifies, on behalf of the cloud user, the integrity of the data stored in the cloud. By utilizing a public-key-based homomorphic authenticator with random masking, privacy-preserving public auditing can be achieved. The technique of bilinear aggregate signatures is used to achieve batch auditing, which reduces the computation overhead. Extensive security and performance analysis shows the proposed schemes are provably secure and highly efficient.
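For orientation, the BLS signature invoked above works over a bilinear pairing e: with secret key x, public key v = g^x, and a hash-to-curve function H, a data block m is signed and verified as

\[
\sigma \;=\; H(m)^{x}, \qquad e(\sigma,\, g) \;\stackrel{?}{=}\; e\bigl(H(m),\, v\bigr),
\]

and batch auditing exploits signature aggregation: signatures sigma_i from users with public keys v_i can be checked together via

\[
e\Bigl(\prod_{i} \sigma_{i},\; g\Bigr) \;\stackrel{?}{=}\; \prod_{i} e\bigl(H(m_{i}),\, v_{i}\bigr),
\]

which replaces many pairing checks with a single product check and is the source of the batch-auditing savings mentioned above. (These are the textbook BLS equations; the scheme's random-masking layer on top of them is not shown.)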

Improving Color Constancy by Photometric Edge Weighting

Only topic keywords survive for this entry: an edge-based color constancy method, estimation of the illuminant, and the use of image derivatives (photometric edge weighting).

IntentSearch: Capturing User Intention for One-Click Internet Image Search(2012)

Web-scale image search engines (e.g., Google Image Search, Bing Image Search) mostly rely on surrounding text features. It is difficult for them to interpret a user's search intention from query keywords alone, and this leads to ambiguous and noisy search results that are far from satisfactory. It is important to use visual information in order to resolve the ambiguity in text-based image retrieval. In this paper, we propose a novel Internet image search approach. It requires the user to click on only one query image, with minimum effort, and images from a pool retrieved by text-based search are reranked based on both visual and textual content. Our key contribution is to capture the user's search intention from this one-click query image in four steps. 1) The query image is categorized into one of the predefined adaptive weight categories, which reflect the user's search intention at a coarse level. Inside each category, a specific weight schema is used to combine visual features adapted to this kind of image to better rerank the text-based search result. 2) Based on the visual content of the query image selected by the user, and through image clustering, query keywords are expanded to capture user intention. 3) Expanded keywords are used to enlarge the image pool to contain more relevant images. 4) Expanded keywords are also used to expand the query image to multiple positive visual examples from which new query-specific visual and textual similarity metrics are learned to further improve content-based image reranking. All these steps are automatic, with no extra effort from the user. This is critically important for any commercial web-based image search engine, where the user interface has to be extremely simple. Besides this key contribution, a set of visual features that are both effective and efficient in Internet image search are designed. Experimental evaluation shows that our approach significantly improves the precision of top-ranked images as well as the user experience.

A Tutorial on Linear and Differential Cryptanalysis

In this paper, we present a detailed tutorial on linear cryptanalysis and differential cryptanalysis, the two most significant attacks applicable to symmetric-key block ciphers. The intent of the paper is to present a lucid explanation of the attacks, detailing the practical application of the attacks to a cipher in a simple, conceptually revealing manner for the novice cryptanalyst. The tutorial is based on the analysis of a simple, yet realistically structured, basic Substitution-Permutation Network cipher. Understanding the attacks as they apply to this structure is useful, as the Rijndael cipher, recently selected for the Advanced Encryption Standard (AES), has been derived from the basic SPN architecture. As well, experimental data from the attacks is presented as confirmation of the applicability of the concepts as outlined.
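For orientation, the central counting tool in linear cryptanalysis, which such a tutorial develops, is the Piling-Up Lemma: for n independent binary variables X_i with biases epsilon_i = Pr[X_i = 0] - 1/2, the bias of their XOR is

\[
\varepsilon_{1,2,\dots,n} \;=\; 2^{\,n-1} \prod_{i=1}^{n} \varepsilon_{i},
\]

so chaining S-box approximations through a Substitution-Permutation Network multiplies their biases, which is why good ciphers drive every per-round bias toward zero.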

OAuth Web Authorization Protocol

Allowing one Web service to act on our behalf with another has become increasingly important as social Internet services such as blogs, photo sharing, and social networks have become widely popular. OAuth, a new protocol for establishing identity management standards across services, provides an alternative to sharing our usernames and passwords, and exposing ourselves to attacks on our online data and identities.

Secure Speech Communication: A Review

Secure speech communication has been of great importance in civil, commercial, and military communication systems. As speech communication becomes more widely used and even more vulnerable, providing a high level of security becomes a major issue. The main objective of this paper is to increase security and remove redundancy in speech communication systems, within the global context of secure communication; it therefore deals with integrating speech coding with speaker authentication and strong encryption. The paper also gives an overview of the techniques available in speech coding, speaker identification, encryption, and decryption, and summarizes some of the well-known methods used at the various stages of a secure speech communication system.

Preserving Integrity of Data and Public Auditing for Data Storage Security in Cloud Computing

Cloud Computing is the long-dreamed vision of computing as a utility, where users can remotely store their data in the cloud so as to enjoy on-demand high-quality applications and services from a shared pool of configurable computing resources. By data outsourcing, users can be relieved of the burden of local data storage and maintenance. However, the fact that users no longer have physical possession of the possibly large amount of outsourced data makes data integrity protection in Cloud Computing a very challenging and potentially formidable task, especially for users with constrained computing resources and capabilities. Thus, enabling public auditability for cloud data storage security is of critical importance so that users can resort to an external audit party to check the integrity of outsourced data when needed. To securely introduce an effective third-party auditor (TPA), the following two fundamental requirements have to be met: 1) the TPA should be able to efficiently audit the cloud data storage without demanding a local copy of the data, and introduce no additional online burden to the cloud user; 2) the third-party auditing process should bring in no new vulnerabilities towards user data privacy. In this paper, we utilize and uniquely combine the public-key-based homomorphic authenticator with random masking to achieve a privacy-preserving public cloud data auditing system, which meets all the above requirements. To support efficient handling of multiple auditing tasks, we further explore the technique of bilinear aggregate signatures to extend our main result into a multi-user setting, where the TPA can perform multiple auditing tasks simultaneously. Extensive security and performance analysis shows the proposed schemes are provably secure and highly efficient.

Privacy Preserving Delegated Access Control in Public Clouds

Current approaches to enforcing fine-grained access control on confidential data hosted in the cloud are based on fine-grained encryption of the data. Under such approaches, data owners are in charge of encrypting the data before uploading them to the cloud and re-encrypting the data whenever user credentials or authorization policies change. Data owners thus incur high communication and computation costs. A better approach should delegate the enforcement of fine-grained access control to the cloud, so as to minimize the overhead at the data owners while assuring data confidentiality from the cloud. We propose an approach, based on two layers of encryption, that addresses this requirement. Under our approach, the data owner performs a coarse-grained encryption, whereas the cloud performs a fine-grained encryption on top of the owner-encrypted data. A challenging issue is how to decompose access control policies (ACPs) such that the two-layer encryption can be performed. We show that this problem is NP-complete and propose novel optimization algorithms. We utilize an efficient group key management scheme that supports expressive ACPs. Our system assures the confidentiality of the data and preserves the privacy of users from the cloud while delegating most of the access control enforcement to the cloud.

Query Access Assurance in Outsourced Databases

Query execution assurance is an important concept for defeating lazy servers in the database-as-a-service model. We show that extending query execution assurance to outsourced databases with multiple data owners is highly inefficient. To cope with lazy servers in the distributed setting, we propose query access assurance (QAA), which focuses on IO-bound queries. The goal of QAA is to enable clients to verify that the server has honestly accessed all records that are necessary to compute the correct query answer, thus eliminating the incentives for the server to be lazy if the query cost is dominated by the IO cost of accessing these records. We formalize this concept for distributed databases, and present two efficient schemes that achieve QAA with high success probabilities. The first scheme is simple to implement and deploy, but may incur excessive server-to-client communication cost and verification cost at the client side when the query selectivity or the database size increases. The second scheme is more involved, but successfully addresses the limitations of the first scheme. Our design employs a few number-theoretic techniques. Extensive experiments demonstrate the efficiency, effectiveness, and usefulness of our schemes.

Random4: An Application-Specific Randomized Encryption Algorithm to Prevent SQL Injection

Web applications form an integral part of our day-to-day life. The number of attacks on websites and compromises of many individuals' secure data are increasing at an alarming rate. With the advent of social networking and e-commerce, web security attacks such as phishing and spamming have become quite common, and the consequences of these attacks are ruthless. Hence, providing an increased amount of security for users and their data becomes essential. The most important vulnerability, as described in the top 10 web security issues of the Open Web Application Security Project, is the SQL Injection Attack (SQLIA) [3]. This paper focuses on how the advantages of randomization can be employed to prevent SQL injection attacks in web-based applications. SQL injection can be used for unauthorized access to a database to penetrate the application illegally, modify the database, or even remove it. For a hacker to modify a database, details such as field and table names are required, so we propose a solution that prevents such attacks using an encryption algorithm based on randomization. It has better performance and provides increased security in comparison to the existing solutions. Also, cracking the database takes more time when techniques such as dictionary and brute-force attacks are deployed. Our main aim is to provide increased security by developing a tool which prevents illegal access to the database.

Ranking Model Adaptation for Domain-Specific Search(2010)

With the explosive emergence of vertical search domains, applying the broad-based ranking model directly to different domains is no longer desirable due to domain differences, while building a unique ranking model for each domain is both laborious for labeling data and time-consuming for training models. In this paper, we address these difficulties by proposing a regularization-based algorithm called ranking adaptation SVM (RA-SVM), through which we can adapt an existing ranking model to a new domain, so that the amount of labeled data and the training cost is reduced while the performance is still guaranteed. Our algorithm only requires the prediction from the existing ranking models, rather than their internal representations or the data from auxiliary domains. In addition, we assume that documents similar in the domain-specific feature space should have consistent rankings, and add some constraints to control the margin and slack variables of RA-SVM adaptively. Finally, a ranking adaptability measurement is proposed to quantitatively estimate whether an existing ranking model can be adapted to a new domain. Experiments performed over Letor and two large-scale datasets crawled from a commercial search engine demonstrate the applicability of the proposed ranking adaptation algorithms and the ranking adaptability measurement.

Review: Steganography Bit Plane Complexity Segmentation (BPCS) Technique

Steganography is an ancient technique of data hiding, in which secret data is hidden in a vessel image without raising suspicion. Traditional techniques have limited data-hiding capacity and can hide data amounting to only about 15% of the vessel image. This paper covers basic steganography and the characteristics necessary for data hiding and, more importantly, implements a steganographic technique with a hiding capacity of up to 50-60% [8][9]: Bit Plane Complexity Segmentation (BPCS) steganography. The main principle of the BPCS technique is that the binary image is divided into an informative region and a noise-like region, and the secret data is hidden in the noise-like region of the vessel image without any deterioration. In our experiment, we used the BPCS principle of Eiji Kawaguchi and Richard O. Eason and experimented with two images: i) a vessel image of size 512 x 512 and ii) a secret image of size 256 x 256. We performed this experiment on 3 different sets of images and calculated the image-hiding capacity.
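A small sketch of the complexity measure that BPCS relies on: a bit-plane block is treated as noise-like (usable for embedding) when its border complexity alpha, the fraction of adjacent 0/1 transitions out of the maximum 2m(m-1) for an m x m block, exceeds a threshold (commonly around 0.3).

    def complexity(block):
        """block: m x m list of 0/1 ints; returns alpha in [0, 1]."""
        m = len(block)
        changes = sum(block[i][j] != block[i][j + 1]           # horizontal
                      for i in range(m) for j in range(m - 1))
        changes += sum(block[i][j] != block[i + 1][j]          # vertical
                       for j in range(m) for i in range(m - 1))
        return changes / (2 * m * (m - 1))

    checker = [[(i + j) % 2 for j in range(8)] for i in range(8)]
    print(complexity(checker))   # 1.0 -- maximally complex, i.e. noise-like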

ROAuth: Recommendation Based Open Authorization

Many major online platforms, such as Facebook, Google, and Twitter, provide an open Application Programming Interface which allows third-party applications to access user resources. The Open Authorization protocol (OAuth) was introduced as a secure and efficient method for authorizing third-party applications without releasing a user's access credentials. However, OAuth implementations don't provide the necessary fine-grained access control, nor any recommendations as to which access control decisions are most appropriate. We propose an extension to OAuth 2.0 authorization that enables the provisioning of fine-grained authorization recommendations to users when granting permissions to third-party applications. We propose a mechanism that computes permission ratings based on a multi-criteria recommendation model which utilizes previous user decisions and application requests to enhance the privacy of the overall site's user population. We implemented our proposed OAuth extension as a browser extension that allows users to easily configure their privacy settings at application installation time, provides recommendations on requested privacy attributes, and collects data regarding user decisions. Experiments on the collected data indicate that the proposed framework efficiently enhances user awareness and privacy related to third-party application authorizations.

Compressed-Encrypted Domain JPEG2000 Image Watermarking

In digital rights management (DRM) systems, digital media is often distributed by multiple levels of distributors in a compressed and encrypted format. The distributors in the chain face the problem of embedding their watermark in the compressed, encrypted domain for copyright violation detection purposes. In this paper, we propose a robust watermark embedding technique for JPEG2000 compressed and encrypted images. While the proposed technique embeds the watermark in the compressed-encrypted domain, the extraction of the watermark can be done either in the decrypted domain or in the encrypted domain.

Separable Reversible Data Hiding in Encrypted Image(2012)

This work proposes a novel scheme for separable reversible data hiding in encrypted images. In the first phase, a content owner encrypts the original uncompressed image using an encryption key. Then, a data hider may compress the least significant bits of the encrypted image using a data-hiding key to create a sparse space to accommodate some additional data. With an encrypted image containing additional data, if a receiver has the data-hiding key, he can extract the additional data even though he does not know the image content. If the receiver has the encryption key, he can decrypt the received data to obtain an image similar to the original one, but cannot extract the additional data. If the receiver has both the data-hiding key and the encryption key, he can extract the additional data and recover the original content without any error by exploiting the spatial correlation in natural images, when the amount of additional data is not too large.

Towards Secure and Dependable Storage Services in Cloud Computing

Cloud storage enables users to remotely store their data and enjoy on-demand high-quality cloud applications without the burden of local hardware and software management. Though the benefits are clear, such a service also relinquishes users' physical possession of their outsourced data, which inevitably poses new security risks towards the correctness of the data in the cloud. In order to address this new problem and further achieve a secure and dependable cloud storage service, we propose in this paper a flexible distributed storage integrity auditing mechanism, utilizing homomorphic tokens and distributed erasure-coded data. The proposed design allows users to audit the cloud storage with very lightweight communication and computation cost. The auditing result not only ensures a strong cloud storage correctness guarantee, but also simultaneously achieves fast data error localization, i.e., the identification of misbehaving servers. Considering that cloud data are dynamic in nature, the proposed design further supports secure and efficient dynamic operations on outsourced data, including block modification, deletion, and append. Analysis shows the proposed scheme is highly efficient and resilient against Byzantine failure, malicious data modification attack, and even server colluding attacks.
