
SMART Data Visualization and Exploration

Filipe Clérigo, Ricardo Raminhos, Rui Estevão


VIATECLA SA
clerigo@viatecla.com, rraminhos@viatecla.com, restevao@viatecla.com
Teresa Gonçalves, Pedro Melgueira
Universidade de Évora
tcg@uevora.pt, m11153@alunos.uevora.pt

Summary
The continuous growth in the volume of information/data does not imply a proportional increase in the
related knowledge. In some cases, the increase in information even contributes to a decline in the
quality of that knowledge. Automatic analysis and visual inspection mechanisms (normally in a
supervised format) therefore represent an important added value, especially when they are naturally
integrated in repositories specialized in managing large volumes of content (i.e. CMS, Content
Management Systems).
As some of these repositories are open, they give the organisations that use them a high level of
flexibility, since business data structures can be modelled freely. However, it also means that they are
not restricted to a specific information domain, which poses a great challenge for the way data is
interpreted and visually presented, as its structure is not known beforehand.
Addressing this challenge is the main purpose of the SMART Content Provider prototype. This paper
presents the results obtained in its data visualization and exploration areas, applied to open
information repositories.

The SMART Content Provider (CP) Project


Through the SMART CP [1] project, research on enhancing intelligence in CMS environments was
carried out under three main pillars:
(i) Enhance mechanisms for the aggregation of heterogeneous information (where the structures and
objects are not known beforehand),
(ii) Define and apply Artificial Intelligence algorithms, in particular in the area of pattern
detection on semi-structured information,
(iii) Apply data presentation mechanisms to results/contents, exploring non-conventional formats
and ways of representing information that contribute to a more fluid exploration of knowledge.

The knowledge resulting from this research has been materialized in a prototype of a generic
platform for data visualization and interaction, referred to as SMART Content Provider (CP), a project
developed by VIATECLA [2], supported by Universidade de Évora [3] and GTE Consultores [4], and
co-financed by QREN (Quadro de Referência Estratégico Nacional) [5].
The present paper focuses only on the third element of the project, related to the presentation and
exploration of information components. A general presentation of the project, in terms of its objectives,
architecture and results, can be found in the paper SMART Content Provider [6], whilst a detailed
presentation of the application of AI algorithms is available in the paper Data Clustering for
heterogeneous data [7].

Architecture
Figure 1 shows a global view of the SMART CP platform architecture. A three-colour scheme is used to
characterize the functional blocks that compose the platform and its external interactions:

Orange: blocks completely external to the platform, with which the SMART CP platform interacts to
obtain data/contents;
Green: functional blocks with which the SMART CP platform is integrated, i.e. the native content
management system that supports the platform;
Purple: native blocks of the SMART CP platform.

[Figure 1 diagram. Server layer blocks: Scriptor Server Core (external content manager), third-party
external content manager, MS Excel (external), MS SQL Database (external), Scriptor Server API,
SMART Aggregation, SMART Data Layer, SMART Import, SMART Analyser, JSON Data Formatter, REST API.
Client layer blocks, within the Scriptor Server Backoffice (external content manager): Data Sorting
(SMART Views); Data Visuals and Exploration (SMART Elastic, SMART Magic Board, SMART Graphs,
SMART Navigation); Accountability (SMART Timeline); Workflows (SMART State).]

Figure 1: General diagram of the architecture of the platform SMART CP

The architecture of the SMART CP platform follows a classic client/server paradigm, as presented in
Figure 1. Blocks belonging to the server component are shown at the top of the image, and blocks
relating to client components at the bottom. Because the SMART CP platform uses data/contents present
in content management systems, all client functional groups (i.e. data sorting, data visuals and
exploration, accountability and workflows) are integrated in the content management system backoffice
itself.

State of the Art


Several interesting visual approaches are emerging that support visualization, content manipulation
and exploration in line with the representation objectives of SMART CP. Some of these approaches are
analysed next. Although some of these visual representation theories are purely conceptual, they can
be easily adapted to business analytics and clustering contexts, both primary lines of research for
the SMART CP project.

Figure 2: Examples of data visualization and exploration approaches

Figure 2 (top left) presents a visualization method that applies a hierarchical notion to different
classes of data [8, 9], while also giving a balanced perception of those classes at each level. In this
example the data is presented at two levels only (internal and external). However, further levels can
be added progressively without the diagram becoming excessively confusing.
The representation at the top centre shows a simple but interesting mapping of the number of event
occurrences for each variable onto an area [10, 11]. This makes it possible to observe which variables
are dominant and, most importantly, to compare their orders of magnitude [12, 13].
The top right of the figure shows what is known as a constellation [14, 15]. This concept is used to
represent connections between data, as is the case with graphs, which can be drawn using several
shapes and colours, with three-dimensional effects or on a plane. Nodes of the constellation can carry
extra information that distinguishes them from one another through changes in colour, size or shape.
A great quantity of information can thus be placed on the constellation without it becoming
excessively confusing, and clusters present in the data can also be represented and highlighted.
The bottom left shows a diagram in the form of a circle [16, 17]. The objects to be analysed are placed
around the outside of the circle, while the relations between them are shown inside it. Some visual
constructs can vary in order to help the comprehension and differentiation of the data. As seen in the
example, the most important connections are indicated by the thickness of the connecting lines.
The representations at the bottom centre and bottom right present a simple object distribution
matrix [18, 19]. The first shows unorganised raw data with a high level of entropy. In the second, the
data has been reorganised through clustering techniques and grouped according to its degree of
similarity. The result is a diagram in which groups of data that were previously scattered and
difficult to identify are easy to detect and observe.
Some of these visual concepts have been applied to the SMART CP visualization components, as
presented in the following sections.

SMART components for data exploration and visualization


Regarding the client layer, as mentioned before, all components developed are integrated in the online
CMS platform backoffice. From a conceptual point of view, the seven components are grouped in four
main areas:
1. Data Sorting and Filtering;
2. Data Visuals and Exploration;
3. Accountability;
4. Business Workflows.

The Data Sorting area is materialized through the SMART Views component. This component supports
content sorting and filtering operations in an intelligent and completely generic way (i.e. without
knowing the content structure beforehand). Views can be kept private or made public. The contents
processed by the SMART Views component can be viewed directly as a simple listing, or the results can
later be used as a data source for other visual components (e.g. SMART Graphs, SMART Elastic).
The Data Visuals and Exploration area comprises the components SMART Elastic, SMART Magic Board,
SMART Graphs and SMART Navigation. The first two are presented in detail in the following sections.
The SMART Navigation component presents metrics, and the actions that can be performed, over an
aggregated set of contents, following a dashboard/control panel logic. Through charts, listings and
metrics associated with different colour levels, limit situations that require further attention from
the manager/administrator can be identified graphically.
The SMART Graphs component presents a relatively standard set of charts, in which the user can check
the distribution of results taken from the selected sample. Although this component is not innovative
in itself, its graphical and information exploration aspect was important to implement, as it provides
relevant information mainly to users who are less experienced in content exploration processes.
The Accountability area is represented by the SMART Timeline component, which is presented in detail
in a later section.
Finally, the Workflows area is represented by the SMART State component. The Scriptor Server platform
has an internal workflow engine, which was enriched with this SMART CP graphical component to allow
the generic creation of workflows/business flows. With this generic graphical component (of minimal
technical complexity), administrators can build business workflows specific to their domain instead
of being restricted to pre-designed ones.

SMART Magic Board


The Magic Board graphic component makes it possible to represent and explore contents along multiple
dimensions simultaneously, on a 2D plane in which one or more attributes are shown on the horizontal
axis and one or more attributes on the vertical axis. Attribute values can additionally be mapped to
colour, shape and/or size.
Like the other visual components, it is integrated within the backoffice of the content manager.
However, due to the space/visibility requirements of the exploration area, the component can also be
used in full screen mode.
The component uses data previously aggregated by the platform (through its SMART Aggregation server
component) so that results are presented quickly: all computation and aggregation is carried out when
data is entered or updated in the CMS, rather than on request.
Initially, the user defines which attribute dimensions to explore (Figure 3, left). For example, for a
given object/content that represents an issue in a ticketing tool (e.g. a clarification request,
amendment or bug report), selecting and dragging the attribute "Environment" to the horizontal axis
produces Figure 3 (right). On this screen, and only for the values for which the attribute has results,
contents are placed randomly in the available space, each content being mapped to a circular icon.
Note that in this representation the main concern, at least at first, is to understand how the set of
contents is spread globally, not to analyse an individual content. However, information about a
specific content can be accessed at any time by clicking on its icon, which shows a message with the
name/title of the content. If the user clicks on the icon a second time, a window previewing the
content data opens.

Figure 3: Data initialization on the Magic Board

Further attributes can likewise be selected and placed on the vertical axis. With an attribute already
defined on the horizontal axis, this results in a two-axis plane (Figure 4, left). In the given
example, it is easy to see that the great majority of the issues in the CMS platform are tasks and are
associated with the development environment. Relatively few issues are associated with the
pre-production and production environments in comparison with the other environments. In addition,
issues of type "request" and "bug" are found only in the development environment.
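
The two-axis placement described above can be sketched as a grouping of contents into cells keyed by
the values of the two chosen attributes. This is a minimal illustration, not the Magic Board
implementation: the function name `cross_tab` and the sample issues are hypothetical, and contents are
assumed to be plain dictionaries of attribute values.

```python
from collections import defaultdict


def cross_tab(contents, x_attr, y_attr):
    """Group contents into (x value, y value) cells.

    Contents that lack either attribute are skipped, so only values
    that actually have results produce a cell.
    """
    cells = defaultdict(list)
    for content in contents:
        if x_attr in content and y_attr in content:
            cells[(content[x_attr], content[y_attr])].append(content)
    return dict(cells)


# Hypothetical sample issues, not data from the pilot.
issues = [
    {"Title": "Fix login", "Environment": "Development", "Type": "Task"},
    {"Title": "Crash on save", "Environment": "Development", "Type": "Bug"},
    {"Title": "Update banner", "Environment": "Production", "Type": "Task"},
    {"Title": "Refactor API", "Environment": "Development", "Type": "Task"},
]

cells = cross_tab(issues, "Environment", "Type")
cell_sizes = {cell: len(items) for cell, items in cells.items()}
```

The size of each cell gives the global spread the user looks at first, while the list kept in each
cell still allows an individual content to be opened.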

Finer partitions can be made on the horizontal and/or vertical axis by adding attributes at a second
level. The screen in Figure 4 (right) shows the secondary attribute "Assigned To" added to the
horizontal axis. This attribute also spreads the results very little, as few results are assigned to
the design team. In fact, most results are assigned to VIATECLA.

Figure 4: Attribute addition on a 2D plane (left: a single dimension on the horizontal axis; right: two
dimensions on the horizontal axis)

Figure 5 shows a similar example of two attributes crossed on the horizontal and vertical axes, taken
from a real case of a VIATECLA client that participated in the prototype validation. In this case,
contents were crossed on the attributes "Area" (horizontal axis) and "Assigned To" (vertical axis),
i.e. requests assigned to VIATECLA employees by project area.

Figure 5: Graphic representation of contents on the horizontal dimension "Area" and vertical dimension "Assigned Person"

In addition to expressing attribute dimensions on the horizontal and vertical axes, it is also possible
to represent them through colour, shape and size. This is specified in the "Visuals" area of the
component. For representation through colour, selecting an attribute of enumerated type presents an
option for mapping its values onto a colour range (Figure 6). A similar approach applies to mapping
onto a range of shapes (Figure 7, left) and sizes (Figure 7, right).

Figure 6: Representation of the data dimension through a colour range

Figure 7: Representation of the data dimensions through shape and size ranges
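
As an illustration of the mapping of enumerated values onto a visual range, the sketch below assigns
each distinct value an entry from a palette. It is only a sketch under stated assumptions: the function
`map_values`, the palettes and the sample values are hypothetical, and the automatic, alphabetical
assignment is a simplification, since in the component the mapping is configured in the "Visuals" area.

```python
def map_values(values, palette):
    """Assign each distinct value a palette entry.

    Distinct values are sorted so the assignment is deterministic, and the
    palette is cycled if there are more values than entries.
    """
    distinct = sorted(set(values))
    return {value: palette[i % len(palette)] for i, value in enumerate(distinct)}


# Hypothetical ranges; the ranges offered by the component are not listed in the paper.
COLOURS = ["#1f77b4", "#ff7f0e", "#2ca02c"]
SHAPES = ["circle", "square", "triangle"]

priorities = ["Major", "Minor", "Major", "Critical"]
colour_of = map_values(priorities, COLOURS)
shape_of = map_values(priorities, SHAPES)
```

The same function serves colour, shape and size, because each is just a different range onto which
the enumerated values are projected.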

The configuration section of the SMART Magic Board also includes a filter area that limits the
universe of contents (regardless of the shape and format in which contents are represented).
Similarly to the configuration of the "Visuals" area, this field allows one attribute to be selected
and the values that should (or should not) be considered as a filter to be defined.
Finally, when more than two attributes are to be added on the horizontal or vertical axis, or when the
two selected attributes have a low spread level, which would result in a very large number of
combinations, it is possible to drill down into a specific quadrant of the universe selected by the
user. Considering the example in Figure 8 (left), if the user selects the quadrant in the top left
corner, a drill-down is performed and the universe of results becomes that of the selected quadrant,
at the lower level (Figure 8, right). The attributes previously selected (i.e. "Environment" and
"Type") are fixed, and the user can drag other attributes to the horizontal and vertical axes in order
to further decompose/explore the information present at this level.

Figure 8: Drill-down mechanism application to the data universe
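
The drill-down can be read as fixing the attribute values of the selected quadrant and continuing the
exploration on the reduced universe. The sketch below shows that reading only; `drill_down` and the
sample issues are hypothetical names and data, not part of the platform.

```python
from collections import Counter


def drill_down(contents, fixed):
    """Return the contents that match every fixed (attribute, value) pair."""
    return [
        content
        for content in contents
        if all(content.get(attr) == value for attr, value in fixed.items())
    ]


# Hypothetical sample issues, not data from the pilot.
issues = [
    {"Environment": "Development", "Type": "Task", "Assigned To": "VIATECLA"},
    {"Environment": "Development", "Type": "Task", "Assigned To": "Design"},
    {"Environment": "Development", "Type": "Bug", "Assigned To": "VIATECLA"},
    {"Environment": "Production", "Type": "Task", "Assigned To": "VIATECLA"},
]

# Selecting one quadrant fixes "Environment" and "Type" ...
quadrant = drill_down(issues, {"Environment": "Development", "Type": "Task"})
# ... and a further attribute can then be explored inside the new universe.
by_assignee = Counter(content["Assigned To"] for content in quadrant)
```

Because the result is again a plain list of contents, the same operation can be applied repeatedly to
descend several levels.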

SMART Elastic
The SMART Elastic component allows the creation of dynamic filters that can be applied to contents
whose structure is not known in advance, and lets the user determine how the resulting contents should
appear on screen (i.e. which information fields are shown).
For the configuration itself (Figure 9), the user is asked which attribute fields of the object should
be used as dynamic/elastic filters, which attribute fields should appear on the form that represents a
content result, and which field should be used for sorting/serialization of contents. For this
example, selecting the attributes "Status", "Project", "Area", "Environment", "Priority" and
"Assigned Persons" as filtering fields, and the attribute "Title" as a detail field, results in
Figure 10.
The term "Elastic" arises from the fact that, when a value is selected in one of the filtering
dimensions, the values of the remaining filters and the filtered contents are recalculated in an
elastic way: in the remaining filters, values that may further filter the contents are kept, and all
values that no longer have any associated content are removed.
In this way, Boolean filtering rules can be created with the AND operator, by selecting values in
different filter columns, and with the OR operator, when more than one attribute value is selected in
the same filter column.
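
The filtering rules above (AND between filter columns, OR within a column, and recalculation of the
values still offered by each filter) can be sketched as follows. This is an illustration under
assumptions, not the component's code: the names `matches` and `elastic_filter` and the sample data are
hypothetical, and the choice to compute a column's offered values while ignoring that column's own
selection is an assumption made so that further OR values remain selectable.

```python
def matches(content, selection, skip=None):
    """AND across filter columns, OR among the values chosen within one column."""
    return all(
        content.get(attr) in chosen
        for attr, chosen in selection.items()
        if attr != skip and chosen
    )


def elastic_filter(contents, filter_attrs, selection):
    """Return the filtered contents and the values still offered by each filter."""
    results = [c for c in contents if matches(c, selection)]
    remaining = {}
    for attr in filter_attrs:
        # Assumption: a column ignores its own selection when its values are
        # recalculated; values with no matching content disappear.
        pool = [c for c in contents if matches(c, selection, skip=attr)]
        remaining[attr] = sorted({c[attr] for c in pool if attr in c})
    return results, remaining


# Hypothetical sample issues, not data from the pilot.
issues = [
    {"Title": "A", "Priority": "Major", "Project": "K4T Mobile Apps", "Status": "Open"},
    {"Title": "B", "Priority": "Major", "Project": "Website", "Status": "NotAnIssue"},
    {"Title": "C", "Priority": "Minor", "Project": "Website", "Status": "Open"},
    {"Title": "D", "Priority": "Minor", "Project": "K4T Mobile Apps", "Status": "Closed"},
]
columns = ["Priority", "Project", "Status"]

results, remaining = elastic_filter(issues, columns, {"Priority": {"Major"}})
narrowed, _ = elastic_filter(
    issues, columns, {"Priority": {"Major"}, "Status": {"Open", "Closed"}}
)
```

With only "Priority" set to "Major", the "Status" filter no longer offers "Closed", since no major
issue has that status in the sample data.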

Figure 9: Initial screen for the dimension specification for the SMART Elastic component

Figure 10: Initial result presentation with no filter configuration

Since this component is strongly oriented towards exploring results, mainly by trial and error, very
quick response times are essential. With this in mind, the data used by the component for presentation
is not the actual raw contents, but a set of indexes and pre-aggregated values made available by the
SMART Aggregation server component. The computation and aggregation effort is thus carried out
incrementally when contents are created and updated, and at visualization time the data only needs to
be presented, as it has already been processed.
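
One common way to move the effort to write time is an inverted index that is updated whenever a
content is created or changed, so that a filter becomes a set intersection at read time. The sketch
below illustrates that general technique only; the paper does not describe the internal structures of
SMART Aggregation, and the class `AggregationIndex` and its methods are hypothetical.

```python
from collections import defaultdict


class AggregationIndex:
    """Inverted index from (attribute, value) to content ids, updated on every write."""

    def __init__(self):
        self._postings = defaultdict(set)
        self._contents = {}

    def upsert(self, content_id, attributes):
        """Index a new content, or re-index an updated one."""
        # Drop the entries of the previous version, if there was one.
        for pair in self._contents.get(content_id, {}).items():
            self._postings[pair].discard(content_id)
        self._contents[content_id] = dict(attributes)
        for pair in attributes.items():
            self._postings[pair].add(content_id)

    def query(self, criteria):
        """Ids of contents matching every attribute/value pair in criteria."""
        sets = [self._postings.get(pair, set()) for pair in criteria.items()]
        return set.intersection(*sets) if sets else set(self._contents)


index = AggregationIndex()
index.upsert(1, {"Priority": "Major", "Project": "K4T Mobile Apps"})
index.upsert(2, {"Priority": "Major", "Project": "Website"})
index.upsert(3, {"Priority": "Minor", "Project": "Website"})
index.upsert(2, {"Priority": "Minor", "Project": "Website"})  # content 2 is edited
```

Each write touches only the entries of the content being saved, which is what keeps the update
incremental, and a query never scans the raw contents.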
As an example, changing the filtering rule from "Priority" (with value "Major") AND "Project" (with
value "K4T Mobile Apps") to "Priority" (with value "Major") AND "Workflow Status" (with value
"NotAnIssue"), shown in Figure 11, takes two interactions (removing the "Project" filter and adding
the "Workflow Status" filter), reduces the number of results from 536 to 6, and completes within 1 to
2 seconds.

Figure 11: Result presentation upon dimension filtering

SMART Timeline
The SMART Timeline component specializes in representing contents on a time axis. However, it follows
a different approach: rather than placing a content according to one of its date-type attributes, it
considers the dates on which changes to the content occurred (either at attribute/field level or
through workflow changes).
The component addresses a very important question of accountability in content handling, which is
sometimes neglected at the content manager level or answered with a view of the information that is
too technical (e.g. in the form of log files). It makes it possible to inspect which contents have been
amended or have changed status/workflow, presenting the information as it was before and after the
changes in a graphic form, so that who made which change, and when, can easily be identified on the
time axis.
Figure 12 (left) shows a graphic presentation of the timeline. By default, with no option selected on
the component, time events are distributed evenly along the timeline. At the top of the screen (left
side), the start and end dates of the amendments and the period of time between those dates are shown
(this indicator can be of great importance for Service Level Agreement contract clauses).

Figure 12: Graphic representation of time occurrences

If the user clicks on the button with the symbol of an event on a timeline, the contents are
distributed proportionally along the time axis (Figure 12, right). This gives better insight into the
pace of events and, above all, allows immediate detection of long periods without
response/interaction.
If the user hovers over the space between events on the timeline (Figure 13), a text shows the time
elapsed between those events.
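
The difference between the even and the proportional distribution, and the elapsed time shown between
events, can be made concrete with a small sketch. The function names and the sample dates are
hypothetical, positions are normalized to the interval from 0 to 1, and events are assumed to be
sorted chronologically.

```python
from datetime import datetime


def even_positions(events):
    """Spread events at equal intervals over [0, 1], ignoring the real gaps."""
    count = len(events)
    if count == 1:
        return [0.0]
    return [i / (count - 1) for i in range(count)]


def proportional_positions(events):
    """Place events over [0, 1] in proportion to the time since the first event."""
    if not events:
        return []
    start = events[0]
    total = (events[-1] - start).total_seconds()
    if total == 0:
        return [0.0 for _ in events]
    return [(event - start).total_seconds() / total for event in events]


def elapsed_between(events):
    """Elapsed time between each pair of consecutive events."""
    return [later - earlier for earlier, later in zip(events, events[1:])]


# Hypothetical change dates for one content.
events = [
    datetime(2015, 1, 1, 9, 0),
    datetime(2015, 1, 1, 10, 0),
    datetime(2015, 1, 5, 9, 0),
]
even = even_positions(events)
proportional = proportional_positions(events)
gaps = elapsed_between(events)
```

In the even layout the second event sits in the middle of the axis, whereas in the proportional layout
it sits almost on top of the first one, which exposes the gap of nearly four days that follows.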
At the top of the component (Figure 14), a set of indicators shows each user that has carried out
operations on the content, the actions performed, and the workflow statuses the content has gone
through. Selecting an indicator highlights the corresponding events on the timeline, making it easier
to read (Figure 15). When a specific event is selected, the information before and after the change
is displayed. This example shows a change to the name of the content.

Figure 13: Graphic representation of time occurrences (with indication of the elapsed time between events)

Figure 14: Graphic representation of time occurrences (highlight of actions by the same user)

Figure 15: Inspection of changes made to a certain content
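
The before/after inspection of an event amounts to comparing two versions of a content field by
field. A minimal sketch, assuming each version is available as a dictionary of attribute values; the
function `field_changes` and the sample versions are hypothetical.

```python
def field_changes(before, after):
    """Map each changed field to its (old, new) pair; None marks an absent field."""
    fields = sorted(set(before) | set(after))
    return {
        field: (before.get(field), after.get(field))
        for field in fields
        if before.get(field) != after.get(field)
    }


# Hypothetical versions of one content around a single change event.
before = {"Name": "Login bug", "Status": "Open"}
after = {"Name": "Login failure on mobile", "Status": "Open", "Priority": "Major"}
changes = field_changes(before, after)
```

Fields that did not change are left out, so only the amended values need to be displayed for the
selected event.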

Evaluation and Future work


The correctness of the SMART CP prototype can only be assessed through its effective use. During the
final stages of development, a pilot was made available so that the platform could be refined
according to the feedback collected.
For the scope of the test pilot, a ticketing system already in use at VIATECLA, named One System, was
selected (communication between client and supplier). The One System is a collaborative tool aimed at
improving productivity, in which the client and the development team create and manage issues and can
give adequate follow-up to each situation, whether or not it is critical to the business, within the
context of the client's project. This solution simplifies communication with the development,
operation and project teams.
Clients have at their disposal facilities that allow them to communicate in real time about technical
questions, to request clarification on the use of functionalities provided by all VIATECLA platforms,
or simply to make general comments.
The test pilot was a major success and exceeded all initial expectations. According to the feedback
received, both the technical teams (not involved in the initial project) and the VIATECLA clients that
took part informally in the project validation were willing to keep using this environment in a more
operational way after the validation was finished. This constitutes recognition of the added value
the SMART CP platform brings.
As the test pilot focused on the administration of a high volume of contents (e.g. issues) with
different levels of priority, the components SMART Views, SMART Elastic, SMART Magic Board, SMART
Navigation and SMART Timeline received the most positive feedback, because of their impact on
information management: they made it more visual and comprehensible and allowed users to explore it
through drill-down tools and multi-filter criteria.
Future work on SMART CP includes defining strategies for launching the platform on the market, with
the aim of obtaining more and better feedback from effective use of the platform to improve and
innovate on the work carried out.

References
[1] Microsite SMART CP. 2015, http://www.viatecla.com/inovacao/smart_content_provider
[2] VIATECLA, Institutional website. 2015, http://www.viatecla.com
[3] University of Évora, Institutional website. 2015, http://www.uevora.pt/
[4] GTE, Institutional website. 2015, http://www.gte.pt/
[5] National Strategic Reference Framework (NSRF), Institutional website. 2015,
http://www.qren.pt/np4/home
[6] Clérigo, Filipe. Raminhos, Ricardo. Estevão, Rui. Gonçalves, Teresa. Melgueira, Pedro.: SMART
Content Provider, 2015
[7] Gonçalves, Teresa. Melgueira, Pedro. Clérigo, Filipe. Raminhos, Ricardo. Estevão, Rui.: Data
Clustering for heterogeneous data, 2015
[8] Draper, G.; Livnat, Y.; Riesenfeld, R.F., "A Survey of Radial Methods for Information Visualization," in
Visualization and Computer Graphics, IEEE Transactions on, vol.15, no.5, pp.759-776, Sept.-Oct. 2009,
doi: 10.1109/TVCG.2009.23
[9] Diehl, S.; Beck, F.; Burch, M., "Uncovering Strengths and Weaknesses of Radial Visualizations: an
Empirical Approach," in Visualization and Computer Graphics, IEEE Transactions on, vol.16, no.6,
pp.935-942, Nov.-Dec. 2010, doi: 10.1109/TVCG.2010.209
[10] Bruls, Mark. Huizing, Kees. Van Wijk, Jarke J.: Squarified Treemaps, in Book: Data Visualization
2010, pages: 33-42. Eurographics, Springer Vienna
[11] Benjamin B. Bederson, Ben Shneiderman, and Martin Wattenberg. 2002. Ordered and quantum
treemaps: Making effective use of 2D space to display hierarchies. ACM Trans. Graph. 21, 4 (October
2002), 833-854. DOI=10.1145/571647.571649
[12] Benjamin B. Bederson. PhotoMesa: a zoomable image browser using quantum treemaps and
bubblemaps. In Proceedings of the 14th annual ACM symposium on User interface software and
technology (UIST '01). ACM, New York, NY, USA, 71-80. DOI=10.1145/502348.502359
[13] Ben Shneiderman. Tree visualization with tree-maps: 2-d space-filling approach. ACM Trans. Graph.
11, 1, 92-99. DOI=10.1145/102377.115768
[14] Steven Noel and Sushil Jajodia. 2004. Managing attack graph complexity through visual hierarchical
aggregation. In Proceedings of the 2004 ACM workshop on Visualization and data mining for computer
security (VizSEC/DMSEC '04). ACM, New York, NY, USA, 109-118. DOI=10.1145/1029208.1029225
[15] Wakimoto, Kazumasa. Taguri, Masaaki.: Constellation graphical method for representing
multidimensional data. Annals of the Institute of Statistical Mathematics, Kluwer Academic Publishers
[16] Krzywinski, Martin. Birol, Inanc. Jones, Steven J.M. Marra, Marco A.: Hive plots: rational approach
to visualizing networks. Brief Bioinform (2012) 13 (5): 627-644, first published online December 9, 2011,
doi:10.1093/bib/bbr069
[17] Braun, Lothar. Volke, Mario. Schlamp, Johann. von Bodisco, Alexander. Carle, Georg.:
Flow-inspector: a framework for visualizing network flow data using current web technologies. Springer
Vienna
[18] Han-Ming Wu, Yin-Jing Tien, Chun-houh Chen, GAP: A graphical environment for matrix
visualization and cluster analysis, Computational Statistics & Data Analysis, Volume 54, Issue 3, 1 March
2010, Pages 767-778, ISSN 0167-9473
[19] Henry, Nathalie. Fekete, Jean-Daniel.: MatLink: Enhanced Matrix Visualization for Analyzing Social
Networks. Lecture Notes in Computer Science, Human-Computer Interaction INTERACT 2007
