FIWARE.OpenSpecification.Security.Optional Security Enablers.MalwareDetectionService - FIWARE Forge Wiki

FIWARE.OpenSpecification.Security.Optional Security Enablers.MalwareDetectionService

Name: FIWARE.OpenSpecification.Security.Optional Security Enablers.MalwareDetectionService
Chapter: Security
Catalogue-Link to Implementation: N/A
Owner: Inria, Jean-Yves Marion



This document contains a self-contained open specification of a FIWARE generic enabler. Please also consult the FIWARE Product Vision, the website http://www.fiware.org and similar pages in order to understand the complete context of the FIWARE platform.


  • Copyright © 2012-2013 by Inria

Legal Notice

Please check the following Legal Notice to understand the rights to use these specifications.


Malware are programs designed to exhibit behavior that is unwanted from the legitimate user's point of view. They may be used to disrupt services, organise data leaks, or gain access to unauthorized security levels on a system. Malware is a very common threat: more than 70 million such programs are known (cf. McAfee's report [1]). Furthermore, malware uses increasingly sophisticated techniques, which complicates its detection. A virus such as Duqu was revealed only six months after its deployment (see Symantec's report on Duqu [2]).

One of the main issues in malware detection is that there is no definitive way to characterise a program by its behaviour, either syntactically or semantically. Malware may hide its code by means of many techniques such as encryption, self-rewriting, and so on.

"Morphus" is software capable of (partly) extracting from binary code a morphological signature that corresponds to the behavior of malware. In doing so, it may bypass some standard encryption techniques. The software can be used in many scenarios. The main GE offers access to "Morphus" through a web service.

Target Usage

The malware detection service GE provides a mechanism for determining whether a submitted executable binary file is sane or infected by malware. Depending on your needs, the service can answer either with SANE/INFECTED or with a distance vector to the malware database. The second option, the distance vector, offers a finer analysis of suspicious cases by indicating the composition of the infected binary (it shows the percentage of components from other malware).

Use Case

This is a typical scenario: a user has a suspicious binary file F and would like to estimate how dangerous it is. The user sends a copy to "Morphus", which provides a distance vector to known malware. The report describes which malware occur in F and provides, for each malware M, an estimate of the level of infection of F by M, that is, a distance between the signature of F and the signature of M. Morphus provides two analysis options, static and dynamic. A static analysis is fast but not very precise; a dynamic analysis is finer but less efficient, because the binary must be executed in a monitored environment.

Basic Concepts

Morphus reads input binary files and extracts signatures from them. Signatures are composed of sites, that is, abstract descriptions of the behaviour of the input program. For instance, one of the sites could correspond to the initialisation of an RSA-encrypted channel. Binary files are then matched against malware sites. The signatures depend on selectable options such as dynamic/static analysis or security thresholds. The system currently provides only the static/dynamic choice; the other options are left for further releases. Morphus takes into account a white list database, that is, a list of known and safe signatures. Generally speaking, it contains signatures from basic operating system services.

Three consumers can be implemented from the delivered WSDL file:

  • scan client: returns the string SANE or INFECTED, or an error message when the service cannot extract a morphological graph (e.g. the string GRAPH_TOO_SMALL or TIMEOUT),
  • distance client: returns a distance matrix to a malware database, indicating the distance of the input sample to each malware of the database,
  • malware list client: returns the list of all malware names that can be detected by the service.

A selectable option determines how the binary is analysed (e.g. when the response to a first static analysis is GRAPH_TOO_SMALL, you can select dynamic mode and submit the binary file once more).
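The static-then-dynamic resubmission described above can be sketched as follows. This is a minimal illustration, not the service's client API: `scan` is a hypothetical callable standing in for whatever client you generate from the WSDL, and the mode strings "static" and "dynamic" are assumed spellings of the MODE option.

```python
from typing import Callable, Tuple

# Response string named in this specification.
GRAPH_TOO_SMALL = "GRAPH_TOO_SMALL"


def scan_with_fallback(
    binary: bytes, scan: Callable[[bytes, str], str]
) -> Tuple[str, str]:
    """Scan in static mode first; resubmit in dynamic mode if needed.

    `scan` is a hypothetical wrapper around the scan web service: it
    takes the binary and a mode and returns the service's response string.
    Returns a (mode_used, response) pair.
    """
    response = scan(binary, "static")
    if response == GRAPH_TOO_SMALL:
        # Static analysis could not extract a usable graph: try dynamic.
        return "dynamic", scan(binary, "dynamic")
    return "static", response
```

Starting with static mode keeps the common case fast, since dynamic analysis requires executing the binary in a monitored environment.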

Example Scenario

In an information system, it is very important to check that executable binary files coming from external sources are not infected by malware. For instance, suppose that your company has a mail gateway which must decide which attachments can be delivered. In that case, it is possible to submit all executable binaries to the Malware Detection Service and use it as a filtering process.
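A mail-gateway filter of this kind might look like the sketch below. Everything here is illustrative: `scan` is a hypothetical callable wrapping the scan web service, the magic-byte check covers only PE ("MZ") and ELF files, and treating any non-SANE answer (including error strings such as TIMEOUT) as a reason to block is a conservative policy assumed for the example, not something the specification mandates.

```python
from typing import Callable, Dict, List, Tuple


def looks_executable(data: bytes) -> bool:
    # PE files begin with the bytes "MZ"; ELF files begin with 0x7f "ELF".
    return data.startswith(b"MZ") or data.startswith(b"\x7fELF")


def filter_attachments(
    attachments: Dict[str, bytes], scan: Callable[[bytes], str]
) -> Tuple[List[str], List[str]]:
    """Split attachment names into (delivered, blocked).

    Only attachments that look like executables are sent to `scan`
    (a hypothetical wrapper around the scan web service).
    """
    delivered: List[str] = []
    blocked: List[str] = []
    for name, data in attachments.items():
        if looks_executable(data) and scan(data) != "SANE":
            blocked.append(name)
        else:
            delivered.append(name)
    return delivered, blocked
```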

Main Interactions

The Morphus software is available either as a web-service or through a direct connection to the website.


End-user applications send requests in order to submit a binary file for evaluation to determine if it is sane or infected. Additionally, it is possible to list all recognized malware contained in the database.


Scan a binary file

This action submits a binary file to Morphus for scanning. Morphus answers INFECTED for an infected binary file and SANE otherwise. When a binary file is submitted for analysis, the MODE option can be set to either static or dynamic.

Direct submission through a browser
  • Once authenticated, a normal user can submit a binary file by filling in a form (local binary path, scan action, static or dynamic mode). The result is displayed directly in the browser.
Sequence diagram: binary scan from browser


Web Services client application
  • The client application submits a binary file through the scan web service and waits until Morphus returns the scan result.
Sequence diagram: binary scan from web service application client


Distance of a binary file

For this action, the user submits a binary file to the scanner as above, but in this case, Morphus will reply with the distance vector between the malware of the database and the submitted file.

The string format is DIST:"Submitted binary name"|percent (binary detected sites/database sites), percent (binary detected sites/malware sites): "detected malware name"|another distance vector|...

Example: DIST:"Backdoor.Win32.Hupigon.bhes.exe"|7.53% (125/1660), 14.63% (125/854): "HLLC.Asive"|15.12% (251/1660), 41.97% (251/598): "AutoRun.tl"|23.1% (382/1660), 90.30% (382/423): "KillApp.y"|


  • "Backdoor.Win32.Hupigon.bhes.exe" is the submitted binary file
  • 7.53% (125/1660) is the distance to the database
  • 14.63% (125/854) is the distance to the malware
  • "HLLC.Asive" is some detected malware name
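A client will usually want this string in structured form. The sketch below parses the DIST format using only the layout visible in the example above; the names `Distance` and `parse_dist` are illustrative and not part of the service. It reads the integer site counts rather than the printed percentages, so ratios can be recomputed by the caller if needed.

```python
import re
from typing import List, NamedTuple, Tuple


class Distance(NamedTuple):
    malware: str         # detected malware name
    detected_sites: int  # numerator shared by both percentages
    database_sites: int  # denominator of the first percentage
    malware_sites: int   # denominator of the second percentage


# DIST:"<submitted binary name>"|
_HEADER = re.compile(r'^DIST:\s*"([^"]*)"\|')
# <pct>% (<n>/<db>), <pct>% (<n>/<mal>): "<malware name>"
_VECTOR = re.compile(
    r'[\d.]+%\s*\((\d+)/(\d+)\),\s*[\d.]+%\s*\(\d+/(\d+)\):\s*"([^"]*)"'
)


def parse_dist(response: str) -> Tuple[str, List[Distance]]:
    """Return the submitted binary name and one Distance per vector."""
    header = _HEADER.match(response)
    if header is None:
        raise ValueError("response does not start with a DIST header")
    vectors = [
        Distance(name, int(detected), int(db_sites), int(mal_sites))
        for detected, db_sites, mal_sites, name in (
            m.groups() for m in _VECTOR.finditer(response, header.end())
        )
    ]
    return header.group(1), vectors
```

For instance, an entry with `detected_sites=382` and `malware_sites=423` corresponds to the "90.30% (382/423)" vector in the example.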

Direct submission through a browser
  • Once authenticated, a normal user can submit a binary file by filling in a form (local binary path, distance action, static or dynamic mode). The result is displayed directly in the browser.
Sequence diagram: binary distance evaluation from browser


Web Services client application
  • The client application submits a binary file through the distance web service and waits until Morphus returns a distance vector result.

String format is DIST:"Submitted binary name"|percent (binary detected sites/database sites), percent (binary detected sites/malware sites): detected malware name| another distance vector |...

Sequence diagram: binary distance evaluation from web service application client


List malware database

This action lists the names of the malware in the database.

Example of result: 1337Crypter.a|2005.or|ACVE.am|ACVE.az|AF.20|AFtp.10|AIMJaker.10|AInfBot.co|AInfBot.cq|AInfBot.o|...
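Since the result is a single pipe-separated string, a client can split it into names in one line. A minimal sketch, assuming names never contain the "|" character (the function name is illustrative):

```python
from typing import List


def parse_malware_list(response: str) -> List[str]:
    """Split the pipe-separated response into names, dropping empty fields."""
    return [name for name in response.strip().split("|") if name]
```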

Sequence diagram: consult malware list from browser



Malware Detection System architecture specification

[Figure: MorphusArchitecture.gif]

  • The Malware Detection Service is based on the Application Server and the Enterprise Service Bus from WSO2, an enterprise middleware company (http://wso2.com), which are used to transport binary files into the High Security Lab.
  • The morphological detection engine technology (Morphus) is developed by Inria.

Basic Design Principles

Design Principles

The Malware Detection Service uses the "Axis2" Web service engine to deploy the web service. It therefore inherits the power and versatility of Axis2, which implements most of the WS-* family of specifications.
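Because the service is exposed through a SOAP engine, a client ultimately sends a SOAP envelope carrying the binary. The sketch below builds such an envelope with the Python standard library. It is only an illustration of the general shape: the service namespace, the operation name `scan`, and the element names `mode` and `binary` are hypothetical placeholders, and the real names must be taken from the delivered WSDL file. Encoding the file as base64 text is likewise an assumption.

```python
import base64
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical service namespace; use the one declared in the WSDL.
SERVICE_NS = "http://example.org/morphus"


def build_scan_request(binary: bytes, mode: str = "static") -> bytes:
    """Build a SOAP 1.1 envelope for a hypothetical `scan` operation."""
    if mode not in ("static", "dynamic"):
        raise ValueError("mode must be 'static' or 'dynamic'")
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{SERVICE_NS}}}scan")
    ET.SubElement(operation, f"{{{SERVICE_NS}}}mode").text = mode
    # Binary content is carried as base64 text inside the XML body.
    encoded = base64.b64encode(binary).decode("ascii")
    ET.SubElement(operation, f"{{{SERVICE_NS}}}binary").text = encoded
    return ET.tostring(envelope, encoding="utf-8")
```

In practice a WSDL-driven client library would generate this envelope for you; building it by hand mainly helps when inspecting or debugging the messages exchanged with the service.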

Detailed Specifications

Following is a list of Open Specifications linked to this Generic Enabler. Specifications labeled as "PRELIMINARY" are considered stable but subject to minor changes derived from lessons learned during last interactions of the development of a first reference implementation planned for the current Major Release of FI-WARE. Specifications labeled as "DRAFT" are planned for future Major Releases of FI-WARE but they are provided for the sake of future users.

Open API Specifications


[1] McAfee. Threat Quarterly report. http://www.mcafee.com/us/resources/reports/rp-quarterly-threat-q3-2011.pdf

[2] Symantec. The Precursor to the Next Stuxnet. http://www.symantec.com/connect/w32_duqu_precursor_next_stuxnet

[3] Isabelle Gnaedig, Matthieu Kaczmarek, Daniel Reynaud, Stéphane Wloka. Unconditional self-modifying code elimination with dynamic compiler optimizations, 5th International Malicious and Unwanted Software Conference (Malware 2010).

[4] Guillaume Bonfante, Jean-Yves Marion, Daniel Reynaud. A Computability Perspective on Self-Modifying Programs, 7th IEEE International Conference on Software Engineering and Formal Methods.

[5] Jean-Yves Marion, Daniel Reynaud. Dynamic Binary Instrumentation for Deobfuscation and Unpacking, presentation at the In-Depth Security Conference Europe 2009. inria-00330022, version 1.

[6] Guillaume Bonfante, Matthieu Kaczmarek, Jean-Yves Marion. Architecture of a Morphological Malware Detector, Journal in Computer Virology 5, 3 (2009).

[7] Guillaume Bonfante, Matthieu Kaczmarek, Jean-Yves Marion. Morphological Detection of Malware, 3rd International Malicious and Unwanted Software Conference (Malware 2008).

Terms and definitions

This section comprises a summary of terms and definitions introduced during the previous sections. It intends to establish a vocabulary that will help to carry out discussions internally and with third parties (e.g., Use Case projects in the EU FP7 Future Internet PPP). For a summary of terms and definitions managed at overall FI-WARE level, please refer to FIWARE Global Terms and Definitions.

  • Attack. Any kind of malicious activity that attempts to collect, disrupt, deny, degrade, or destroy information system resources or the information itself
  • Authentication protocol: "Over-the-wire authentication protocols are used to exchange authentication data between the client and server application. Each authentication protocol supports one or more authentication methods. The OATH reference architecture provides for the use of existing protocols, and envisions the use of extended protocols which support new authentication methods as they are defined." (OATH)
  • Access control: is the prevention of unauthorized use of a resource, including the prevention of use of a resource in an unauthorized manner. (ITU-T-X-800_Link). More precisely, access control is the protection of resources against unauthorized access; a process by which use of resources is regulated according to a security policy and is permitted by only authorized system entities according to that policy. RFC 2828
  • Account: A (user) account is “typically a formal business agreement for providing regular dealings and services between principals and business service providers.” OASIS Security Assertion Markup Language (SAML)
  • Authentication (AuthN): We adopted the following definition of authentication from RFC 3588: authentication is “the act of verifying the identity of an entity (subject)”.
TrustInCyberspace adds the term “level of confidence” to this definition:
“Authentication is the process of confirming a system entity’s asserted identity with a specified, or understood, level of confidence.” This definition holds all the parts necessary to examine authentication in a broad sense. First of all, it does not narrow authentication to human users, but refers to a generic “system entity”. See the authentication reference architecture description for a closer look at the different identities that could be authenticated.
Secondly it introduces the often neglected concept of “level of confidence” which applies to each authentication of an identity. No computer program or computer user can definitely prove the identity of another party. There is no authentication method that can be secured against any possible identity-theft attack, be it physical or non-physical. It is only possible to apply one or more tests, which, if passed, have been previously declared to be sufficient to go on further. The problem is to determine which tests are sufficient, and many such are inadequate.
The original Greek word originates from the word 'authentes'='author'. This leads to the general field of claims and trust management, because authentication could also mean to verify the “author” / issuer of any claim.
The confirmation or validation process of authentication is actually done by presenting some kind of proof. This proof is normally derived from some kind of secret held by the principal. In its simplest form, the participant and the authentication authority share the same secret. More advanced concepts rely on challenge/response mechanisms, preventing the secrets from being transmitted. Refer to Authentication Technologies for a detailed list of authentication methods used today.
As stated above, each authentication method assures only some level of trust in the claimed identity, but none could be definite. Therefore it makes sense to distinguish the different authentication methods by an associated assurance level, stating the level of trust in the authentication process.
As this assurance level depends not only on the technical authentication method, but also on the overall computer system and even on the business processes within the organization (provisioning of identities and credentials), there is no ranking of the authentication methods here.
  • Cyber attack. An attack, via cyberspace, targeting an entity (industrial, financial, public...) and using cyberspace for the purpose of disrupting, disabling, destroying, or maliciously controlling a computing environment/infrastructure; or destroying the integrity of the data or stealing controlled information
  • Exploit. A program or technique that takes advantage of a vulnerability in software and that can be used for breaking security, or otherwise attacking a host over the network
  • Federation: The term federation “is used in two senses - The act of establishing a relationship between two entities. An association comprising any number of service providers and identity providers.” OASIS Security Assertion Markup Language (SAML)
“A federation is a collection of realms that have established a producer-consumer relationship whereby one realm can provide authorized access to a resource it manages based on an identity, and possibly associated attributes, that are asserted in another realm.
Federation requires trust such that a relying party can make a well-informed access control decision based on the credibility of identity and attribute data that is vouched for by another realm.” WS-Federation @ IBM
Remark: Federation according to WS-Federation @ IBM is similar to the concept of a Circle of Trust.
  • Forensics for evidence. The use of scientifically derived and proven methods toward the preservation, collection, validation, identification, analysis, interpretation, documentation and presentation of digital evidence derived from digital sources for the purpose of facilitating or furthering the reconstruction of events found to be criminal, or helping to anticipate unauthorized actions shown to be disruptive to planned operations.
  • Identity. In the narrow sense, identity is the persistent identifier of users (user name), things or services by which other parties “remember” them and, hence, are able to store or retrieve specific information about them and are able to control their access to different resources. In the wider sense, identity also covers further attributes of users, things and services; e.g. for users, such information may include personal information such as context, group membership and profile.
  • Identity (Digital): The term identity and its meaning have been discussed controversially in the “identity community” for many years. Until now, there is no commonly agreed definition of that notion. The IdM & AAA reference architecture applies the following three definitions of identity.
The Identity Gang defines the term digital identity as follows:
A digital identity is “a digital representation of a set of Claims made by one party about itself or another digital subject.”
The following comments were added:
A digital identity is just one set of claims about a digital subject. For any given digital subject there will typically be many digital identities.
A digital identity can be created on the fly when a particular identity transaction is desired or persistent in a data store to provide a representation that can be referenced.
A digital identity may contain claims made by multiple claimants.
A digital identity may be signed by a digital identity provider to provide assurance to a relying party.
This definition emphasizes two facts:
Normally, a principal (subject) has multiple digital identities or personas.
Identities are made out of attributes (claims).
Therefore, the scope of identity management in the reference architecture has two viewpoints: on the one hand, it focuses on the identities and personas themselves; on the other hand, it deals with the attributes of these identities and personas.
The Liberty Alliance Project (LAP) defines digital identity as follows:
Digital identity is “the essence of an entity. One’s identity is often described by one’s characteristics, among which may be any number of identifiers. A Principal may wield one or more identities.”
RSA uses the following definition of digital identity:
“Digital identity consists of an identity assertion and the characteristics, sometimes called attributes that are collected or observed through our computerized relationships. It is often as simple as a user name and password.”
The definition of RSA adds one important aspect to the identity discussion: Even the simplest user name and password combinations without any additional attributes or claims constitute an identity.
  • Identity context: is “the surrounding environment and circumstances that determine meaning of digital identities and the policies and protocols that govern their interactions.” (Identity Gang)
  • Identity management (IdM): comprises “the management of identity information both internally and when it is passed from one entity to another.” Open Mobile Alliance (OMA)
  • Identity provider: The Open Mobile Alliance (OMA) defines the term identity provider (IdP) as follows - An identity provider is “a special type of service provider […] that creates, maintains, and manages identity information for principals, and can provide […] assertions to other service providers within an authentication domain (or even a circle of trust).”
The Identity Provider is part of the Identity Management infrastructure.
  • Impact. The adverse effect resulting from a successful threat exercise of vulnerability. Can be described in terms of loss or degradation of any, or a combination of any, of the following three security goals: integrity, availability, and confidentiality.
  • Malware refers to software programs designed to damage or do other unwanted actions on a computer system.
  • Partial identity: a partial identity is a set of attributes of a user. Thus, an identity is composed of all attributes of a user, and a partial identity is a subset of a user's identity. Typically, a user is known to another party only as a partial identity. A partial identity can have a unique identifier. The latter is a strong identifier if it allows for a strong authentication of the user (holder) of the partial identity, such as a cryptographic "identification" protocol
  • Privacy. Dictionary definitions of privacy refer to "the quality or state of being apart from company or observation, seclusion [...] freedom from unauthorized intrusion" (Merriam-Webster online [MerrWebPriv]). In the online world, we rely on a pragmatic definition of privacy, saying that privacy is the state of being free from certain privacy threats.
  • Privacy threats. The fundamental privacy threats are: traceability (the digital traces left during transactions), linkability (profile accumulation based on the digital traces), loss of control (over personal data) and identity theft (impersonation).
  • Risk analysis. The process of identifying security risks, determining their magnitude, and identifying areas needing safeguards. An analysis of an organization's information resources, its existing controls, and its remaining organizational and MIS vulnerabilities. It combines the loss potential for each resource or combination of resources with an estimated rate of occurrence to establish a potential level of damage
  • Security monitoring. Usage of tools to prevent and detect compliance defaults, security events and malicious actions taken by subjects suspected of misusing the information system.
  • Service impact analysis. An analysis of a service’s requirements, processes, and interdependencies used to characterize information system contingency requirements and priorities in the event of a significant disruption.
  • Single sign-on: is “From a Principal’s perspective, single sign-on encompasses the capability to authenticate with some system entity—[…] an Identity Provider - and have that authentication honored by other system entities, [termed] Service Providers […]. Note that upon authenticating with an Identity Provider, the Identity Provider typically establishes and maintains some notion of local session state between itself and the Principal’s user agent. Service Providers may also maintain their own distinct local session state with a Principal’s user agent.” Liberty Alliance Project (LAP)
  • S&D: Security and Dependability
  • Threat. An event, process, or activity being perpetrated by one or more threat agents, which, when realized, has an adverse effect on organization assets, resulting in losses (service delays or denials, disclosure of sensitive information, undesired patch of programs or data, reputation...)
  • USDL and USDL-Sec: The Unified Service Description Language (USDL) is a platform-neutral language for describing services. The security extension of this language is going to be developed by the FI-WARE project.
  • Vulnerability. A weakness or finding that is non-compliant with, or does not adhere to, a requirement, a specification or a standard, or an unprotected area of an otherwise secure system, which leaves the system open to potential attack or other problems.
  • WS-SecurityPolicy: It is an extension to SOAP to apply security to web services. It is a member of the WS-* family of web service specifications and was published by OASIS. The protocol specifies how integrity and confidentiality can be enforced on messages and allows the communication of various security token formats, such as SAML, Kerberos, and X.509. Its main focus is the use of XML Signature and XML Encryption to provide end-to-end security.
  • WS-* is a prefix used to indicate specifications associated with Web Services. Many WS-* standards exist, including WS-Addressing, WS-Discovery, WS-Federation, WS-Policy, WS-Security, and WS-Trust.
  • Web Services Description Language (WSDL) is an XML-based interface description language that is used for describing the functionality offered by a web service.
