FIWARE.OpenSpecification.Security.DBAnonymizer.Open RESTful API Specification

Introduction to the DB Anonymizer API

Please check the FI-WARE Open Specification Legal Notice (essential patents license) to understand the rights to use this open specification. Like all other FI-WARE members, SAP has chosen one of the two FI-WARE license schemes for open specifications.

To illustrate this open specification license from our SAP perspective:

  • SAP makes the specifications of this Generic Enabler available under IPR rules that allow exploitation and sustainable usage in both open source and proprietary, closed source products, to maximize adoption.
  • This Open Specification is exploitable for proprietary 3rd party products and is exploitable for open source 3rd party products, including open source licenses that require patent pledges.
  • If the owner (SAP) of this GE spec holds a patent that is essential to create a conforming implementation of the GE spec (i.e. it is impossible to write a conforming implementation without violating the patent) then a license to that patent is deemed granted to the implementation.

DB Anonymizer API Core

The DB Anonymizer API is a RESTful API accessed via HTTP. It uses simple data types or binary files for the information exchange. It offers four main functions to trigger computation activities, and four associated functions to retrieve their results. Normally, each computation function considers two inputs:

  • a dataset, in the form of a DB dump
  • a disclosure or obfuscation policy.

The main API methods are:

  1. evaluatePolicy receives a DB dump (a single table in MySQL) and an obfuscation (or disclosure) policy file, to compute the likelihood (0->impossibility, 1->certainty) that an attacker can reconstruct exactly the table's content, if it is anonymized using the obfuscation policy.
    • getPolicyResult to retrieve the result of the computation.
  2. evaluateColumnRisk receives a DB dump (a single table in MySQL) and computes, for each column, an index that represents the impact on the re-identification risk caused by the disclosure of that column's data.
    • getColumnRisk to retrieve the result of the computation.
  3. evaluateDeepSearch receives a DB dump (a single table in MySQL), an obfuscation policy file, and an upper-bound value for the acceptable re-identification risk associated with a policy, and computes all variations of the original policy whose re-identification risk stays within the specified upper bound.
    • getDeepSearch to retrieve the result of the computation.
  4. anonymizeDataset receives a DB dump (a single table in MySQL) and an obfuscation policy file and it performs the anonymization operation according to the specified policy.
    • getAnonymizeDataset to retrieve the result of the computation.
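
Each pair of methods above follows a submit-then-poll pattern: a computation method returns a RequestID, and the matching get method is called until the result is available (the Faults section states that HTTP 204 means "not ready yet"). The following is a minimal sketch of the polling half. The `fetch` callable, the function name and the retry parameters are assumptions made for illustration; they are not part of the specification.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical transport: takes a RequestID, returns (HTTP status, response body).
Fetch = Callable[[str], Tuple[int, str]]


def wait_for_result(fetch: Fetch, request_id: str,
                    max_attempts: int = 10, delay_s: float = 0.0) -> Optional[str]:
    """Call a get* method until it stops answering 204 (result not ready)."""
    for _ in range(max_attempts):
        status, body = fetch(request_id)
        if status == 200:
            return body
        if status != 204:
            # Any other status is treated as a fault reported by the service.
            raise RuntimeError(f"request {request_id} failed: HTTP {status}: {body}")
        time.sleep(delay_s)
    return None  # still not ready after max_attempts


# Stand-in for a real HTTP call: not ready twice, then a result.
_responses = iter([(204, ""), (204, ""), (200, "0.25")])
result = wait_for_result(lambda rid: next(_responses), "req-1")
```

In a real client, `fetch` would issue the HTTP GET against the deployed service; injecting it keeps the polling logic testable without a network.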

Intended Audience

This specification is intended for software developers and reimplementers of this API. For the former, this document provides a full specification of how to interoperate with a DB Anonymizer service that implements the DB Anonymizer API. For the latter, this document is a full specification of the functions and data types that are part of the DB Anonymizer API and that must be part of any re-implementation effort.


To use this information, the reader should first have a general understanding of the Generic Enabler service (available on the DB Anonymizer Open Specification page).

API Change History

Current version is: Version 3.3.3, 4/2/2014

The most recent changes are described in the table below:

Revision Date Changes Summary
Apr 30, 2012
  • This is the first version of the DB Anonymizer API Guide.
Apr 22, 2013
  • This is the second version of the DB Anonymizer API Guide. It includes two new functionalities:
    • for analysing the per-column disclosure impact of a re-identification risk computation, and
    • for computing all modifications to a disclosure policy for a specific dataset, to find a set of alternative policies that matches an arbitrary upper-bound for re-identification risk.
Feb 4, 2014
  • Added the new proactive dataset anonymization functionality plus a number of minor changes not impacting the previously published methods.

How to Read This Document

All FI-WARE RESTful API specifications will follow the same list of conventions and will support certain common aspects. Please check Common aspects in FI-WARE Open Restful API Specifications.

Throughout this document, the reader is assumed to be familiar with the REST architectural style. The interface was designed to be simple to use, so that software developers interested in the DB Anonymizer functionalities need only a minimal integration effort. Therefore, no special notation or particular constructs were needed in producing this description, apart from the following simple conventions:

  • A bold font is used to represent code or logical entities, e.g., HTTP method (GET, PUT, POST, DELETE).
  • An italic font is used to represent document titles or some other kind of special text, e.g., URI.
  • Variables are written between braces and in italic font, e.g. {id}. The reader can replace a variable with any value.


Additional Resources

You can download the most current version of this document from the FIWARE API specification website at http://wiki.fi-ware.eu/Summary_of_FI-WARE_API_Open_Specifications. For more details about the DB Anonymizer that this API is based upon, please refer to the Open Specification website at http://wiki.fi-ware.eu/Summary_of_FI-WARE_Open_Specifications. Related documents, including an Architectural Description, are available at the same site.

General DB Anonymizer API Information

Resources Summary

[Figure: diagram of the URIs exposed by the API.]

Representation Format

The DB Anonymizer API supports the transmission of binary files and strings via HTML form encoding ("multipart/form-data" or "application/x-www-form-urlencoded"). The request format is specified using the Content-Type header, which is required for operations that have a request body. The response format is plain text ("text/plain") or XML ("text/xml"), for which XML Schema specifications are provided (see this link).

Representation Transport

Resource representations are transmitted between client and server using the HTTP 1.1 protocol, as defined by IETF RFC-2616. Whenever an HTTP request contains a payload, a Content-Type header shall be used to specify the MIME type of the wrapped representation. In addition, both client and server may use as many HTTP headers as they consider necessary.

Resource Identification

The API consumer must indicate the resource identifier when invoking a GET, POST or DELETE operation. For the methods that retrieve a computation result, the DB Anonymizer API combines identification and location in the URL, but it also allows the user to specify the identifier through an HTTP form. Each URL-based invocation provides the URL of the target resource along with the verb and any required input data. That URL unambiguously identifies the resource. For HTTP transport, this uses the mechanisms described by the HTTP protocol specification, as defined by IETF RFC-2616.

Links and References

Reference to Open Specification, DB Anonymizer

Versions

Only one version of the Open Specification is currently supported.

Extensions

The DB Anonymizer GE supports implementation-specific extensions, through the methods specified in Common aspects in FI-WARE Open Restful API Specifications.

Faults

Synchronous Faults

Fault Element | Associated Error Codes | Expected in All Requests? | Return Message
GET /get* (all get methods) | HTTP 204 | NO | Error in retrieving the requested result
GET /get* (all get methods) | HTTP 400 | NO | Error in Request ID
GET /getPolicyResult | HTTP 400 | NO | Error: The DB file is not in ZIP format
GET /get* (all get methods) | HTTP 400 | NO | Error: Problem with input file
GET /getPolicyResult | HTTP 400 | NO | Error: Problem with input DB dump
GET /getPolicyResult | HTTP 400 | NO | Error: fault in policy parsing and/or setting
GET /get* (all get methods) | HTTP 500 | NO | Error: DB communication problem
GET /get* (all get methods) | HTTP 500 | NO | Error: fault in DB setup

Remark: HTTP Status 204 in response to /get* (all get methods) indicates that the computation result is not yet available (consistent with the HTTP Status definition "No Content").
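
The status codes listed in the fault table can be mapped onto a small set of client-side outcomes. The sketch below is one possible way to do that; the `Outcome` names and the decision to retry only on 204 are assumptions of this example, not requirements of the specification.

```python
from enum import Enum


class Outcome(Enum):
    READY = "ready"                # HTTP 200: result is in the response body
    NOT_READY = "not_ready"        # HTTP 204: computation still running
    BAD_REQUEST = "bad_request"    # HTTP 400: request ID, input file or policy problem
    SERVER_FAULT = "server_fault"  # HTTP 500: DB communication or setup problem
    UNEXPECTED = "unexpected"      # any status not listed in the fault table


_KNOWN = {
    200: Outcome.READY,
    204: Outcome.NOT_READY,
    400: Outcome.BAD_REQUEST,
    500: Outcome.SERVER_FAULT,
}


def classify(status: int) -> Outcome:
    """Map an HTTP status returned by a get* method to a client-side outcome."""
    return _KNOWN.get(status, Outcome.UNEXPECTED)


def should_retry(status: int) -> bool:
    """Only 'No Content' signals that asking again later can succeed."""
    return classify(status) is Outcome.NOT_READY
```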

Asynchronous Faults

No asynchronous faults are used by DB Anonymizer.

API Operations

Operations

A WADL specification for these methods can be found on the FI-WARE Catalogue.


Verb | URI | Description
POST | /evaluatePolicy | Starts the re-identification risk computation on the input: a MySQL DB table dump and a disclosure policy.
GET | /getPolicyResult/{RequestID} | Retrieves the result of the re-identification risk computation started by evaluatePolicy.
POST | /evaluateColumnRisk | Starts a per-column estimation of the impact on re-identification risk on the input: a MySQL DB table dump.
GET | /getColumnRisk/{RequestID} | Retrieves the per-column risk indices computed by evaluateColumnRisk.
POST | /evaluateDeepSearch | Starts the computation of all disclosure policies that match a certain upper-bound value for re-identification risk on the input: a MySQL DB table dump, an initial disclosure policy and an upper-bound value for re-identification risk.
GET | /getDeepSearch/{RequestID}/{count}/{offset} | Retrieves the alternative policies computed by evaluateDeepSearch: {count} policies, skipping the first {offset}.
POST | /anonymizeDataset | Anonymizes a dataset according to a disclosure policy. It takes as input a MySQL DB table dump and a disclosure policy.
GET | /getAnonymizeDataset/{RequestID} | Retrieves the anonymized dataset.
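
The URIs of the four get methods can be assembled mechanically from a base URL, the method name and the RequestID. The helper below is a sketch of that; the function name and the host "dba.example.invalid" are placeholders invented for this example, since the specification does not name a deployment URL.

```python
from typing import Optional
from urllib.parse import quote

GET_METHODS = ("getPolicyResult", "getColumnRisk", "getDeepSearch", "getAnonymizeDataset")


def result_url(base_url: str, method: str, request_id: str,
               count: Optional[int] = None, offset: Optional[int] = None) -> str:
    """Build the URL of a get* method from the operations table."""
    if method not in GET_METHODS:
        raise ValueError(f"unknown get method: {method}")
    # Percent-encode the RequestID so it is safe as a single path segment.
    url = f"{base_url.rstrip('/')}/{method}/{quote(request_id, safe='')}"
    if method == "getDeepSearch":
        # getDeepSearch additionally takes {count} and {offset} path segments.
        if count is None or offset is None:
            raise ValueError("getDeepSearch needs both count and offset")
        url += f"/{count}/{offset}"
    return url


# Placeholder host, not a real deployment.
print(result_url("http://dba.example.invalid", "getDeepSearch", "abc123", 10, 20))
# prints http://dba.example.invalid/getDeepSearch/abc123/10/20
```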

NOTE: The inputs of the following methods must be provided in an HTML form ("multipart/form-data" or "application/x-www-form-urlencoded").

Description: evaluatePolicy

  • Correct Response: HTTP 200
  • Input:
    • a zipped MySQL table dump id: "dbDump", containing only a single table called "working_table", together with its elements. Allowed SQL commands: CREATE TABLE, INSERT
    • a disclosure policy file id: "policyFile", compliant with this XML Schema definition, for example:
<Policy>
	<Column>
		<Name>Gender</Name>
		<Type>identifier</Type>
		<Hide>false</Hide>
	</Column>
	<Column>
		<Name>Wine</Name>
		<Type>sensitive</Type>
		<Hide>true</Hide>
	</Column>
</Policy>

This policy uses attribute suppression as its data anonymization technique: columns marked with Hide set to true are suppressed.

  • Return type: it returns a RequestID (string).

A sample of the required inputs is available on the FI-WARE Catalogue at this link.
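
A policy file such as the one shown for "policyFile" can be generated programmatically. The sketch below builds the same element structure (Policy, Column, Name, Type, Hide) with Python's standard library. The function name is hypothetical, and only the Type values that appear in the example ("identifier", "sensitive") are used; the XML Schema remains the authority on which values are allowed.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple


def build_policy(columns: Iterable[Tuple[str, str, bool]]) -> str:
    """Serialize (name, type, hide) tuples into a <Policy> document."""
    policy = ET.Element("Policy")
    for name, column_type, hide in columns:
        column = ET.SubElement(policy, "Column")
        ET.SubElement(column, "Name").text = name
        ET.SubElement(column, "Type").text = column_type
        # The example policy spells booleans in lowercase.
        ET.SubElement(column, "Hide").text = "true" if hide else "false"
    return ET.tostring(policy, encoding="unicode")


policy_xml = build_policy([
    ("Gender", "identifier", False),
    ("Wine", "sensitive", True),
])
```

The resulting string contains the same elements as the example policy, without indentation.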

Description: getPolicyResult

  • Correct Response: HTTP 200.
  • Alternative: HTTP 204 (No Content), when computation result is not ready.
  • Input: a RequestID (string)
  • Return type: the likelihood (0->impossibility, 1->certainty) that an attacker can reconstruct exactly a table's content, that is anonymized using a certain obfuscation policy.
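
A client may want to validate the returned likelihood before using it. The sketch below assumes that the response body is a plain-text decimal number; the specification only says that responses are plain text or XML, so this format is an assumption, and the function name is hypothetical.

```python
def parse_likelihood(body: str) -> float:
    """Parse a getPolicyResult body, assumed to be a plain-text number in [0, 1]."""
    value = float(body.strip())
    # 0 means reconstruction is impossible, 1 means it is certain.
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"likelihood out of range: {value}")
    return value


risk = parse_likelihood(" 0.25125\n")
```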


Description: evaluateColumnRisk

  • Correct Response: HTTP 200
  • Input:
    • a zipped MySQL table dump id: "dbDump", containing only a single table called "working_table", together with its elements. Allowed SQL commands: CREATE TABLE, INSERT
  • Return type: it returns a RequestID (string). If invoked with the "Content-type" HTTP header equal to "application/x-www-form-urlencoded", the result is encoded in an XML message using the ResultID XML Schema.
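
The "dbDump" input can be prepared in memory before upload. The following sketch checks that a dump contains only CREATE TABLE and INSERT statements and then zips it. The statement check is deliberately naive (it splits on ";" and would misjudge values that contain semicolons), and the archive member name "dump.sql" and the sample table contents are assumptions of this example.

```python
import io
import zipfile

ALLOWED_PREFIXES = ("CREATE TABLE", "INSERT")


def check_dump(sql: str) -> None:
    """Reject statements other than CREATE TABLE / INSERT (naive split on ';')."""
    for statement in sql.split(";"):
        text = statement.strip()
        if text and not text.upper().startswith(ALLOWED_PREFIXES):
            raise ValueError(f"statement not allowed: {text[:40]!r}")


def zip_dump(sql: str, member_name: str = "dump.sql") -> bytes:
    """Return the bytes of a ZIP archive holding the SQL dump."""
    check_dump(sql)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(member_name, sql)
    return buffer.getvalue()


# Invented sample data; the table must be called "working_table".
SAMPLE_DUMP = (
    "CREATE TABLE working_table (Gender VARCHAR(10), Wine VARCHAR(20));\n"
    "INSERT INTO working_table VALUES ('F', 'Merlot');\n"
)
zipped = zip_dump(SAMPLE_DUMP)
```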

Description: getColumnRisk

  • Correct Response: HTTP 200.
  • Alternative: HTTP 204 (No Content), when computation result is not ready.
  • Input: a RequestID (string), as part of the URL (URL Param).
  • Return type: an indication of the impact on the re-identification risk, computed for each column of the dataset. The result is embedded into a RiskColumnResult XML document (described by [this] XML Schema).


Description: evaluateDeepSearch

  • Correct Response: HTTP 200
  • Input:
    • a zipped MySQL table dump id: "dbDump", containing only a single table called "working_table", together with its elements. Allowed SQL commands: CREATE TABLE, INSERT
    • a disclosure policy file id: "policyFile", compliant with this XML Schema definition
    • a string id: "maxRisk", containing a floating point number. It represents an upper bound for the re-identification risk of the alternative policies to be returned.
  • Return type: it returns a RequestID (string). If invoked with the "Content-type" HTTP header equal to "application/x-www-form-urlencoded", the result is encoded in an XML message using the ResultID XML Schema.

A sample of the required inputs is available on the FI-WARE Catalogue at this link.
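
The three inputs of evaluateDeepSearch ("dbDump", "policyFile" and "maxRisk") can be packed into a single "multipart/form-data" body. The encoder below is a hand-written sketch using only the standard library. The file names, the "application/octet-stream" part type and the placeholder payloads are assumptions; the specification fixes only the form field ids.

```python
import uuid
from typing import Dict, Tuple


def encode_multipart(fields: Dict[str, str],
                     files: Dict[str, Tuple[str, bytes]]) -> Tuple[str, bytes]:
    """Return (Content-Type header value, request body) for a form upload."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    for name, (filename, content) in files.items():
        header = (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f'Content-Type: application/octet-stream\r\n\r\n'
        ).encode("utf-8")
        parts.append(header + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


content_type, body = encode_multipart(
    {"maxRisk": "0.3"},
    {
        "dbDump": ("dump.zip", b"PK-placeholder-bytes"),  # not a real archive
        "policyFile": ("policy.xml", b"<Policy></Policy>"),
    },
)
```

The returned pair can then be passed to any HTTP client, for example as the `data` and `Content-Type` header of a `urllib.request.Request` with `method="POST"`.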

Description: getDeepSearch

  • Correct Response: HTTP 200.
  • Alternative: HTTP 204 (No Content), when computation result is not ready.
  • Input as part of the URL (URL Param):
    • a RequestID (string)
    • an integer "count", that represents the number of alternative policies to return
    • an integer "offset", that specifies how many policies to skip when creating the return entity. E.g., /getDeepSearch/{RequestID}/10/20 would return 10 policies starting from alternative #20.
  • Return type: a set of anonymization policies, whose re-identification risk is below the specified maxRisk parameter. The result is embedded into a PolicyProposalResult XML document (described by this XML Schema). Example:
<PolicyProposalResult>
    <PolicyProposal>
        <PolicyProposalID>0</PolicyProposalID>
        <ComputedRisk>0.25125</ComputedRisk>
        <Policy>
            [...policy description, in the same format as the input policy...]

        </Policy>
    </PolicyProposal>
    <PolicyProposal>
        [...another policy proposal element...]
    </PolicyProposal>
</PolicyProposalResult>
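
A PolicyProposalResult document like the one above can be read with a standard XML parser. The sketch below extracts the proposal ID, the computed risk and the names of the hidden columns. It assumes the document uses no XML namespaces and that the nested Policy follows the input policy format; the `Proposal` type and the sample document are made up for this example.

```python
import xml.etree.ElementTree as ET
from typing import List, NamedTuple


class Proposal(NamedTuple):
    proposal_id: int
    risk: float
    hidden_columns: List[str]


def parse_proposals(xml_text: str) -> List[Proposal]:
    """Extract every <PolicyProposal> from a PolicyProposalResult document."""
    root = ET.fromstring(xml_text)
    proposals = []
    for node in root.findall("PolicyProposal"):
        hidden = [
            column.findtext("Name", "")
            for column in node.findall("Policy/Column")
            if column.findtext("Hide", "").strip().lower() == "true"
        ]
        proposals.append(Proposal(
            proposal_id=int(node.findtext("PolicyProposalID")),
            risk=float(node.findtext("ComputedRisk")),
            hidden_columns=hidden,
        ))
    return proposals


SAMPLE_RESULT = """<PolicyProposalResult>
  <PolicyProposal>
    <PolicyProposalID>0</PolicyProposalID>
    <ComputedRisk>0.25125</ComputedRisk>
    <Policy>
      <Column><Name>Gender</Name><Type>identifier</Type><Hide>false</Hide></Column>
      <Column><Name>Wine</Name><Type>sensitive</Type><Hide>true</Hide></Column>
    </Policy>
  </PolicyProposal>
</PolicyProposalResult>"""

proposals = parse_proposals(SAMPLE_RESULT)
```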

Description: anonymizeDataset

  • Correct Response: HTTP 200
  • Input:
    • a zipped MySQL table dump id: "dbDump", containing only a single table called "working_table", together with its elements. Allowed SQL commands: CREATE TABLE, INSERT
    • a disclosure policy file id: "policyFile", compliant with this XML Schema definition
  • Return type: it returns a RequestID (string). If invoked with the "Content-type" HTTP header equal to "application/x-www-form-urlencoded", the result is encoded in an XML message using the ResultID XML Schema.

A sample of the required inputs is available on the FI-WARE Catalogue at this link.

Description: getAnonymizeDataset

  • Correct Response: HTTP 200.
  • Alternative: HTTP 204 (No Content), when computation result is not ready.
  • Input as part of the URL (URL Param):
    • a RequestID (string)
  • Return type: a text file (in CSV format) containing the anonymized dataset.
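
The CSV text returned by getAnonymizeDataset can be loaded with a standard CSV reader. The sketch below assumes comma-separated values in the default CSV dialect. The specification does not say whether a header row is present or how suppressed values are rendered, so the sample text (including the "*" marker) is invented purely for illustration.

```python
import csv
import io
from typing import List


def read_anonymized(csv_text: str) -> List[List[str]]:
    """Parse the anonymized dataset into rows, skipping empty lines."""
    return [row for row in csv.reader(io.StringIO(csv_text)) if row]


# Invented sample: the real output format of a suppressed column may differ.
rows = read_anonymized("F,*\nM,*\n")
```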