UnifiedViews 2.0 – What’s new
UnifiedViews is an Extract-Transform-Load (ETL) framework that allows users (publishers, consumers, or analysts) to define, execute, monitor, debug, schedule, and share RDF data processing tasks. UnifiedViews is one of the core components of Open Data Node, a publication platform for open data.
The data processing tasks may use custom plugins created by users. UnifiedViews differs from other ETL frameworks by natively supporting RDF data and ontologies. UnifiedViews has a graphical user interface for the administration, debugging, and monitoring of the ETL process. In this blog post, we describe the new features of UnifiedViews 2.0, which was released on April 2, 2015. Please see the website unifiedviews.eu for the documentation of UnifiedViews, for information on how to obtain UnifiedViews, and to meet the community around UnifiedViews.
In UnifiedViews 2.0, the official helpers for developing UnifiedViews plugins were merged with the helpers provided by the XML and Web Engineering Research Group at Charles University in Prague. As a result, many further helpers were introduced that extend and improve the features available to plugin developers, so that they can implement plugins faster and more effectively. As part of this alignment of helpers, the API for plugins was adjusted. The existing core plugins were updated for UnifiedViews 2.0 to support the new helpers and API. UnifiedViews 2.0 is not backward compatible with UnifiedViews 1.X. If you need to use plugins developed for UnifiedViews 1.X in UnifiedViews 2.0, you may install an extra library that allows those older plugins to run in UnifiedViews 2.0.
In the following sections, we give a brief overview of the new features.
UnifiedViews 2.0 introduces a plugin template that makes it easy to start developing plugins. A developer only needs to specify a few pieces of plugin metadata, such as the plugin's name, its type, and the package for its Java classes; all the important Java classes needed for developing and running the plugin are then prepared for the developer. As a result, the creation of new plugins is simplified and developers may focus mainly on the business functionality their plugins should provide.
UnifiedViews 2.0 introduces the concept of plugin extensions: classes that plugin developers can easily use and that provide further functionality for developing plugins. The most important extensions are listed below:
- SimpleRDF and SimpleFiles. SimpleRDF and SimpleFiles are wrappers on top of the RDF data unit and the files data unit, respectively. They provide plugin developers with easy-to-use methods that cover basic functionality for working with RDF data or files; for example, there are methods that allow plugin developers to query RDF data within an RDF data unit or to add new RDF data to an RDF data unit using a single line of code.
- Dynamic configuration of plugins using RDF configuration. Dynamic configuration allows a plugin to be configured dynamically over one of its inputs (the configuration input data unit). If the plugin supports dynamic configuration, it may receive its configuration over the configuration input data unit; such a configuration is then automatically deserialized from the RDF data format and used instead of the configuration defined via the plugin's configuration dialog.
- Fault tolerance. Operations on RDF data can be time-intensive (e.g., fetching tens of millions of RDF triples over the SPARQL protocol or executing a set of SPARQL Update queries on tens of millions of RDF triples). Such operations typically consist of a set of calls against the target RDF store. Any such call can throw an exception at any time and, as our experiments revealed, the exception is often caused by the target RDF store temporarily not responding to a certain operation. It therefore makes sense to retry the particular call rather than the whole operation or pipeline execution, which could mean losing hours of work. Developers may use the fault tolerance extension to ensure that calls which may fail are retried for certain types of problems.
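The retry idea behind the fault tolerance extension can be illustrated in plain Java. This is a minimal sketch and not the UnifiedViews extension itself: `RetrySketch`, `TransientStoreException`, and `callWithRetry` are hypothetical names introduced here for illustration, and the fixed delay between attempts is an assumption.

```java
import java.util.function.Supplier;

/** Hypothetical retry helper; it does not use the UnifiedViews API. */
final class RetrySketch {

    /** Assumed marker for failures worth retrying, e.g. a store that is temporarily not responding. */
    static final class TransientStoreException extends RuntimeException {
        TransientStoreException(String message) {
            super(message);
        }
    }

    /**
     * Runs a single call and retries only that call when it fails with a
     * transient error. Any other exception propagates immediately.
     */
    static <T> T callWithRetry(Supplier<T> call, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TransientStoreException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (TransientStoreException e) {
                lastFailure = e;
                if (attempt < maxAttempts) {
                    sleepQuietly(delayMillis); // wait before the next attempt
                }
            }
        }
        throw lastFailure; // all attempts failed
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        // Simulated store call that fails twice before succeeding.
        String result = callWithRetry(() -> {
            if (++attempts[0] < 3) {
                throw new TransientStoreException("store not responding");
            }
            return "query finished";
        }, 5, 10L);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

Wrapping each store call separately, rather than the whole operation, means that a temporary outage costs one repeated call instead of a repeated pipeline run.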
In UnifiedViews 2.0, it is not necessary to explicitly initialise the localisation of a plugin; it is set up automatically based on the localisation of the framework. There is also native support for translating 1) messages published during the plugin's execution and 2) labels shown in the plugin's configuration dialog.
Versioning of configurations, better support for serialization/deserialization of the configuration
UnifiedViews 2.0 supports versioning of plugin configurations as well as their automatic migration. When a plugin developer changes the configuration of a plugin, the developer should 1) create a new version of the configuration class and 2) provide a method that migrates the previous configuration to the new version. As a result, when a new version of an existing plugin is imported into UnifiedViews, its stored configuration can be updated automatically, using the migration method provided by the developer, before it is used.
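The two steps above can be sketched as follows. This is only an illustration under stated assumptions: `Upgradable`, `ConfigV1`, `ConfigV2`, and `upgradeToLatest` are hypothetical names, the fields are invented for the example, and the actual UnifiedViews classes and method names may differ.

```java
/** Hypothetical configuration versions; these are not the UnifiedViews classes. */
final class ConfigMigrationSketch {

    /** Implemented by every configuration version that has a successor. */
    interface Upgradable<N> {
        N upgrade();
    }

    /** Version 1 of the configuration: stores only an endpoint. */
    static final class ConfigV1 implements Upgradable<ConfigV2> {
        String endpoint = "";

        /** Step 2: the migration method from version 1 to version 2. */
        @Override
        public ConfigV2 upgrade() {
            ConfigV2 next = new ConfigV2();
            next.endpointUrl = endpoint;
            // timeoutSeconds did not exist in version 1, so the default is kept.
            return next;
        }
    }

    /** Step 1: the new version of the configuration class. */
    static final class ConfigV2 {
        String endpointUrl = "";
        int timeoutSeconds = 30;
    }

    /** Applies migration methods until a version without a successor is reached. */
    static Object upgradeToLatest(Object config) {
        Object current = config;
        while (current instanceof Upgradable) {
            current = ((Upgradable<?>) current).upgrade();
        }
        return current;
    }

    public static void main(String[] args) {
        ConfigV1 stored = new ConfigV1();
        stored.endpoint = "http://example.org/sparql";
        ConfigV2 latest = (ConfigV2) upgradeToLatest(stored);
        System.out.println(latest.endpointUrl + " (timeout " + latest.timeoutSeconds + " s)");
    }
}
```

Chaining the migration methods lets a configuration stored several versions ago be brought up to date step by step, so each migration only has to know about its immediate predecessor.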
UnifiedViews 2.0 also changed the way plugin configurations are serialized, so that the persisted configurations contain information about their versions. Furthermore, the way plugin configurations are obtained and parsed on one side and stored on the other was refactored in UnifiedViews 2.0, so that the process can be easily configured and adjusted in the future.
In UnifiedViews 2.0, several groups of classes were refactored:
- The set of classes that new plugins should be based on was refactored, so that the development of new plugins is simplified.
- The data unit helpers were refactored, so that they are consistently named.
- The parent project that every plugin should be based on was refactored, so that the version of the parent project defines the versions of all UnifiedViews and other artifacts the plugin needs in order to be used in UnifiedViews. As a result, the management of plugins was simplified.
Article written by T.Knap
Tomas Knap received his Ph.D. from the Faculty of Mathematics and Physics, Charles University, Czech Republic, for his research on trustworthy Linked Data integration and consumption. In 2013, he co-founded the company Semantica.cz s.r.o., an SME focused on consulting in Linked Data and semantic web solutions for data integration and publishing.