Posted on: Mon, 05/31/2021 - 15:34 By: valentina.janev

The last decades have witnessed a significant evolution in data generation, management, and maintenance, especially for data in the RDF format. Semantic data is also finding its way into the energy domain, where it can be used for various data analytics tasks. However, as data sets grow to enormous sizes, technologies must evolve to scale with them. In this regard, tools and frameworks such as SANSA have emerged to facilitate analytics over semantic data. SANSA builds on big data technologies such as Apache Spark (as an analytics engine for large-scale data processing) and Apache Hadoop (as a distributed file system) to perform analytics in a distributed manner over a cluster of nodes. To use SANSA, however, one must first set up a cluster of nodes with Spark and Hadoop enabled. This requires extensive knowledge and expertise in computer systems, networking, distributed computing, and related areas. Moreover, even with sufficient technical knowledge, setting up such a cluster is labor-intensive and time-consuming. To tackle these issues, in this paper we introduce a microservice architecture that easily brings the power of SANSA and distributed semantic data analysis into the end-user ecosystem, without requiring technical knowledge in the aforementioned areas. The introduced architecture is based on Docker technologies and can be installed on-premises or in cloud systems.