
FIWARE.OpenSpecification.Apps.DataVisualizationAndAnalytics


For the Open Specification of the previous FIWARE release (Release 4), see FIWARE.OpenSpecification.Apps.DataVisualizationAndAnalytics R4.

Name: FIWARE.OpenSpecification.Apps.DataVisualizationAndAnalytics
Chapter: Apps
Catalogue-Link to Implementation: SpagoBI
Owner: ENG, Davide Zerbetto



Preface

This document contains a self-contained open specification of a FIWARE generic enabler. To understand the complete context of the FIWARE platform, please also consult the FIWARE Product Vision, the website at http://www.fiware.org and similar pages.

Copyright

  • Copyright © 2017 by ENG

Legal Notice

Please check the following Legal Notice to understand the rights to use these specifications. Note: Engineering provides the software associated with the Data Visualisation GEri (Knowage) as open source: see details here.

Overview

The Data Visualisation GE is the main component for data visualization. It provides several means of representation and offers a web-based end-user environment for running all types of analysis, together with different ways for users to interact with the GE itself.

The supported types of analysis are:

  • reporting: structured reports, exportable in several formats;
  • multidimensional analysis (OLAP): navigation of data along hierarchies and dimensions;
  • charts: single ready-to-use graphical and interactive widgets;
  • interactive cockpits: aggregation of several documents into a single view, for interactive and intuitive use;
  • KPIs: view and browsing of KPI hierarchical models;
  • data mining: models and algorithms to discover hidden information patterns in large volumes of data;
  • free inquiry: graphical query building and creation of customized, data-driven selection forms;
  • ad-hoc reporting: end-user self-made analysis;
  • location intelligence: correlation between geographic and business data;
  • real-time: dashboards and consoles to monitor events in real time and act on data;
  • mobile: based on the common touch-screen interaction paradigm, the combination of BI with mobility features;
  • office automation: publication of personal documents in BI environments;
  • collaborative tools: organized dossiers, with comments and notes;
  • ETL: traditional process of extraction, transformation and loading, including internal migration of data to/from operational systems;
  • external processes: bidirectional interaction with operational and/or external systems;
  • master data: manual management of data.

Data Visualization GE is used by many types of users:

  • end users, who can access pre-built BI environments, use official documents and analyses, or freely build their own;
  • administrators, who manage the BI Server instance, set authorization policies and handle the document life cycle;
  • behavioural model owners, who manage the visibility of data and the users’ roles;
  • developers, who directly release or manage analytical documents.

Basic Concepts

Analytical document

The Data Visualization GE provides analyses as analytical documents, an analytical document being an object that specifies how data should be retrieved from data sources and how it should be displayed to the user (as a chart, a table, a map, a cockpit, ...).

Both developers and end users can create analytical documents; end users (normally non-technical people) work with simplified web tools compared to those provided to developers.

An analytical document can display raw or aggregated data coming from different data sources (RDBMS, Big Data storages, files, REST services, CKAN instances, ...), according to the users' requirements, and it permits user interaction (drill-through, filtering, ordering, custom calculations, depending on the type of analysis).

Dataset

A dataset is an object that contains or is in charge of retrieving some data. For example, a dataset can be defined as:

  • an SQL statement to be executed on a particular RDBMS system;
  • a file;
  • a CKAN resource;
  • an HTTP request to an external REST service;
  • ...

When defining an analytical document, users specify the dataset(s) to be used for data retrieval, so they can completely forget about the data source: in this sense, the dataset can be considered a layer that levels all data sources and hides the underlying complexity. Like analytical documents, datasets can be created by both developers and end users: end users can define datasets by uploading their XLS/CSV files or by querying the enterprise data sources using a high-level model and a high-level query designer.
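The idea of the dataset as a levelling layer can be illustrated with a minimal Python sketch. It is not the GE's actual implementation: the `Dataset`, `SqlDataset` and `CsvDataset` classes and the `total` function are hypothetical names introduced here, and an in-memory SQLite database stands in for an enterprise RDBMS.

```python
import csv
import io
import sqlite3
from abc import ABC, abstractmethod


class Dataset(ABC):
    """Hypothetical common interface: every dataset returns rows as dicts."""

    @abstractmethod
    def rows(self) -> list:
        ...


class SqlDataset(Dataset):
    """Dataset defined as an SQL statement executed on an RDBMS connection."""

    def __init__(self, connection: sqlite3.Connection, statement: str):
        self.connection = connection
        self.statement = statement

    def rows(self) -> list:
        cursor = self.connection.execute(self.statement)
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, record)) for record in cursor.fetchall()]


class CsvDataset(Dataset):
    """Dataset defined as CSV content, such as a file uploaded by an end user."""

    def __init__(self, text: str):
        self.text = text

    def rows(self) -> list:
        return list(csv.DictReader(io.StringIO(self.text)))


def total(dataset: Dataset, column: str) -> float:
    """Stand-in for an analytical document: it only sees the Dataset interface."""
    return sum(float(row[column]) for row in dataset.rows())


# Assumed sample data; SQLite in memory replaces a real data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("north", 10.0), ("south", 5.5)])

sql_dataset = SqlDataset(conn, "SELECT region, amount FROM sales")
csv_dataset = CsvDataset("region,amount\nnorth,10.0\nsouth,5.5\n")
```

Here `total` plays the role of the analytical document: it gives the same result for either dataset without knowing whether the rows came from a database or from a file.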


The behavioral model

The Behavioral Model regulates the visibility on documents and data according to the roles and profiles of the end users. The behavioral model offers many advantages in a BI project, including:

  • reducing the number of analytical documents to be developed and maintained;
  • coding visibility rules only once and applying them to several documents, each with its own analytical logic;
  • ensuring a uniform growth of the project over time;
  • guaranteeing that the visibility rules are respected over time, with no limitation on the number of engines and analytical documents that can be added.

The behavioral model is based on four main concepts:

  • user profile, defining the user’s roles and attributes
  • repository rights, defining the users’ rights in terms of document accessibility
  • analytical drivers, defining which data can be shown to the user within a single document
  • presentation environment settings, defining how users can reach and run their own documents.

In other words, the behavioral model mainly answers the following questions:

  • WHO uses the business intelligence solution (user profile)
  • WHAT is visible to users, in terms of documents and data (repository rights and analytical drivers)
  • HOW users work with their documents (analytical drivers and presentation environment settings).
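The WHO and WHAT questions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `UserProfile` and `Document` classes, the `driver` mapping and both functions are hypothetical and do not reproduce the GE's real data model.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """WHO: the user's roles and attributes."""
    name: str
    roles: set
    attributes: dict = field(default_factory=dict)


@dataclass
class Document:
    """An analytical document plus the roles allowed to see it (repository rights)."""
    label: str
    allowed_roles: set
    # Assumed analytical driver: data column -> profile attribute that filters it.
    driver: dict = field(default_factory=dict)


def visible_documents(user, documents):
    """WHAT (documents): keep documents sharing at least one role with the user."""
    return [doc for doc in documents if doc.allowed_roles & user.roles]


def visible_rows(user, document, rows):
    """WHAT (data): keep rows whose driver columns match the user's attributes."""
    return [
        row for row in rows
        if all(row[col] == user.attributes.get(attr)
               for col, attr in document.driver.items())
    ]


sales = Document("sales_report", {"manager", "analyst"}, {"region": "region"})
payroll = Document("payroll_report", {"hr"})
alice = UserProfile("alice", {"analyst"}, {"region": "north"})
rows = [{"region": "north", "amount": 10}, {"region": "south", "amount": 5}]
```

The visibility rule is written once, in `visible_rows`, and applies to every document that declares a driver, which is the "code once, apply to several documents" advantage listed earlier.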


Analytical engine

An analytical engine is a component of the Data Visualization GE that is in charge of the actual execution and presentation of a particular kind of analysis (report, map, KPI, OLAP analysis, cockpit, ...).

The Data Visualization GE architecture follows a modular approach in which analytical engines are the main components: users can install just the engines they need and skip the others. Moreover, the architecture permits several engines for the same analytical area, with no limitations (for example, more than one engine for reporting, for OLAP analysis, for maps and so on).
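A minimal Python sketch of this modular approach follows. The `Engine` and `EngineRegistry` classes and the engine names are hypothetical, chosen only to show how several engines can coexist for one analytical area while uninstalled areas are simply unavailable.

```python
class Engine:
    """Hypothetical analytical engine: executes and presents one kind of analysis."""

    def __init__(self, name: str):
        self.name = name

    def run(self, document: str) -> str:
        return f"{self.name} rendered {document}"


class EngineRegistry:
    """Hypothetical registry holding only the engines a deployment has installed."""

    def __init__(self):
        self._engines = {}  # analytical area -> engines installed for that area

    def install(self, area: str, engine: Engine) -> None:
        # Several engines may coexist for the same analytical area.
        self._engines.setdefault(area, []).append(engine)

    def execute(self, area: str, document: str, engine_name=None) -> str:
        engines = self._engines.get(area, [])
        if engine_name is not None:
            engines = [e for e in engines if e.name == engine_name]
        if not engines:
            raise LookupError(f"no matching engine installed for area {area!r}")
        # Default choice in this sketch: the first matching engine installed.
        return engines[0].run(document)


registry = EngineRegistry()
registry.install("reporting", Engine("report-engine-a"))
registry.install("reporting", Engine("report-engine-b"))
registry.install("olap", Engine("olap-engine"))
```

Asking for an area with no installed engine, such as `registry.execute("kpi", "targets")`, raises `LookupError`, mirroring a deployment where that engine was skipped.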

Data Visualization Architecture

The Data Visualization GE architecture is functionally organized in three main layers:

  • Delivery layer, which manages all possible usages of the Server by end-users or from external applications
  • Analytical layer, providing all analytical functionalities of the product
  • Data layer, which regulates data loading through many access strategies.


[Figure: Data Visualization GE architecture (DataVizGEArchitecture.png)]


Delivery layer

The Delivery Layer covers all publication requirements that allow business intelligence services to be used by end-users and to be accessible from third-party applications. It can be accessed through different modalities:

  • default web view: the default way of using the Data Visualization GE. A customizable web application is provided, running on a standard application server (e.g. Tomcat, JBoss). The administrator can define the layout and specific views for each end-user type;
  • mobile web view: some types of analysis are also accessible from a mobile device, thanks to the interaction between the Data Visualization GE and the remote client interface, so that users can display their reports, charts or cockpits on their own tablet or smartphone;
  • services: REST services allowing external applications to interact with the Data Visualization GE in order to discover and retrieve the results of analytical documents (typically static documents in XML, HTML, PDF or XLS format) and the content of datasets;
  • JavaScript API: for the integration of enterprise applications, even behind or without the end-user GUI.


Analytical layer

The Analytical Layer is the core of the Data Visualization GE, providing all the analytical and operational capabilities in a secure and profiled mode. Its main components are:

  • Analytical Engines, covering all the BI analytical requirements in terms of analysis (Report, OLAP, Charts, Cockpits, KPIs, etc.) and providing several engines for each type, so as to guarantee the highest flexibility and customer satisfaction;
  • Operational Engines, to interact with OLTP systems by means of ETL or processes that return data to them, and to manage basic BI registries such as master data or lookup domains;
  • Behavioral Model, which regulates the visibility of documents and data according to the end users’ roles. It reduces the number of analytical documents to be developed and managed, guaranteeing the uniform and consistent growth of the solution over time.

By offering multiple engines for the same analytical/operational area and/or multiple instances of the same engine, the Data Visualization GE architecture provides various benefits, such as:

  • less workload on each engine instance, to guarantee high performance;
  • openness to improving or enriching the suite through the integration of new engines and functionalities, minimizing the impact on existing environments;
  • high flexibility and modularity;
  • a high level of scalability over time, with minimum economic, infrastructural and application-level impact;
  • the ability to meet requests from a wide range of users in various sectors, through a complete range of engines;
  • scalability of users and servers with no additional costs.


Data Layer

The Data Layer handles the storage and usage of data and metadata. Data is usually located in a data warehouse (DWH), whose design is outside the scope of the Data Visualization GE and strictly depends on the specific customer’s environment. The Data Visualization GE offers a specific ETL tool for loading data at this level, thus covering the whole BI stack. The Data Visualization GE can access the DWH directly through a standard JDBC connection, or it can use a specific access strategy based on high-level models.

Apart from traditional RDBMS, the Data Visualization GE supports data retrieval from many other sources, such as Big Data storages, external REST services and Open Data portals.

The Data Layer is where the integration with other FIWARE data providers GEs takes place: in particular integration with


Administrative tools and cross services

Besides its analytical, delivery and data access capabilities, the Data Visualization GE provides all the administrative tools to handle the whole instance and many cross-product services to make its functionalities powerful.

The administrative tools support developers, testers and administrators in their daily work, providing various functionalities, such as:

  • map catalogue;
  • management of repository, analytical model, behavioural model and engines;
  • configuration of data sources and data sets;
  • audit & monitoring analysis;
  • management of value domains, configuration settings and metadata.

Main Interactions

The Data Visualization GE is closely related to other FIWARE GEs, especially those concerning the business infrastructure and the data providers. The figure below depicts some of these relationships:

The user interacts mainly with analytical documents, both when executing an existing document and when defining a new one with ad-hoc reporting tools. The analytical document retrieves data from one or more datasets, which in turn retrieve data from data sources, querying data provider GEs or traditional RDBMS systems.

Third-party applications can retrieve data and analysis results through the REST and JavaScript APIs.

Interaction diagrams

The integration scenario with the Open Data GE needs a closer look, since it also involves other GEs from the Apps chapter. The user creates a report on their own using the ad-hoc reporting capabilities provided by the Data Visualisation GE, starting from datasets available in the Open Data GE (data portal). The user interacts with the Data Visualisation GE, which in turn retrieves data and metadata from the Open Data GE; if the user has subscribed to a pay-per-use model, the Open Data GE notifies the Business API Ecosystem GE about the dataset's usage.

User performs ad-hoc reporting
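The ad-hoc reporting interaction can also be read as a short Python sketch. The three classes are hypothetical in-memory stand-ins for the GEs involved, and the method names, dataset identifier and sample values are assumptions made for illustration; no real GE API is shown.

```python
class BusinessApiEcosystem:
    """Stand-in for the Business API Ecosystem GE: records usage notifications."""

    def __init__(self):
        self.usage_events = []

    def notify_usage(self, user, dataset_id):
        self.usage_events.append((user, dataset_id))


class OpenDataPortal:
    """Stand-in for the Open Data GE (data portal)."""

    def __init__(self, accounting, datasets, pay_per_use_users):
        self.accounting = accounting
        self.datasets = datasets
        self.pay_per_use_users = pay_per_use_users

    def fetch(self, user, dataset_id):
        # Usage is notified only for users on a pay-per-use model.
        if user in self.pay_per_use_users:
            self.accounting.notify_usage(user, dataset_id)
        return self.datasets[dataset_id]


class DataVisualisation:
    """Stand-in for the Data Visualisation GE: builds a trivial ad-hoc report."""

    def __init__(self, portal):
        self.portal = portal

    def ad_hoc_report(self, user, dataset_id, column):
        rows = self.portal.fetch(user, dataset_id)
        return {
            "dataset": dataset_id,
            "rows": len(rows),
            "total": sum(row[column] for row in rows),
        }


accounting = BusinessApiEcosystem()
portal = OpenDataPortal(
    accounting,
    datasets={"air-quality": [{"no2": 40}, {"no2": 60}]},
    pay_per_use_users={"bob"},
)
ge = DataVisualisation(portal)
report = ge.ad_hoc_report("bob", "air-quality", "no2")
```

In this sketch the notification is sent by the portal, not by the visualisation component, matching the description in which the Open Data GE informs the Business API Ecosystem GE.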

Apart from ad-hoc reporting, datasets provided by the Open Data GE can also be used for other types of traditional BI analysis (reporting, charts, cockpits, KPIs, ...).

Basic Design Principles

  • API

The Data Visualisation GE exposes its main functionalities as REST services or JavaScript APIs to facilitate integration with other GEs and external components. The exposed REST services are independent of the implementation technology.

  • Extension Point

The Data Visualisation GE also supplies extension points, i.e. specifications that make it possible to implement new components, such as new analytical engines, using out-of-the-box libraries and APIs.


Re-utilised Technologies/Specifications

The Data Visualization GE relies on the following technologies/specifications:

  • HTTP
  • SQL
  • RDBMS
  • REST
  • JSON
  • HTML
  • WFS/WMS

In addition, the SpagoBI GEri reuses the following technologies/languages/APIs:

  • Java
  • JavaScript
  • Servlet API
  • JDBC
  • JPA
  • SMTP
  • SVG
  • R
  • Ant

Apart from these technologies, the Data Visualization GE (and the SpagoBI GEri) is able to interact with major NoSQL solutions such as MongoDB, Cassandra, Hive, Neo4j, OrientDB and other tools belonging to the Hadoop ecosystem.

Detailed Specification

Terms and definitions

This section summarizes the terms and definitions introduced in the previous sections. It is intended to establish a vocabulary that will help in carrying out discussions internally and with third parties. For a summary of terms and definitions managed at overall FIWARE level, please refer to FIWARE Global Terms and Definitions.


  • Reporting: Development of structured reports, exportable in several formats.
  • Multidimensional analysis (OLAP): Navigation of data along hierarchies and dimensions.
  • Charts: Single ready-to-use graphical and interactive widgets.
  • Interactive cockpits: Aggregation of several documents into a single view, for interactive and intuitive use.
  • KPIs: Creation, management, view and browsing of KPI hierarchical models.
  • Data Mining: Models and algorithms to discover hidden information patterns in large volumes of data.
  • Free Inquiry: Graphical query building and creation of customized, data-driven selection forms.
  • Ad-hoc reporting: End-user self-made analysis.
  • Location Intelligence: Correlation between geographic and business data.
  • Real-time: Dashboards and consoles to monitor events in real time and act on data.
  • Mobile: Based on the common touch-screen interaction paradigm, the combination of BI with mobility features.
  • Office automation: Publication of personal documents in BI environments.
  • Collaborative tools: Organized dossiers, with comments and notes.
  • ETL: Traditional process of extraction, transformation and loading, including internal migration of data to/from operational systems.
  • External processes: Bidirectional interaction with operational and/or external systems.
  • Master data: Manual management of data.