Embedding-based Recommendations on Scholarly Knowledge Graphs

Posted on: Tue, 06/02/2020 - 09:37 By: valentina.janev

The increasing availability of scholarly metadata in the form of Knowledge Graphs (KGs) offers opportunities for studying the structure of scholarly communication and the evolution of science. Such KGs form the foundation for knowledge-driven tasks, e.g., link discovery, prediction and entity classification, which in turn make it possible to provide recommendation services. Knowledge graph embedding (KGE) models have been investigated for such knowledge-driven tasks in different application domains.
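To make the idea of embedding-based link prediction concrete, here is a minimal sketch using the TransE scoring function, one widely used KGE model (the lecture itself may use different models). The entity names, the relation name and the 2-dimensional vectors are illustrative assumptions, not data from the lecture; in practice the embeddings are learned from the graph.

```python
import math

# Toy 2-dimensional embeddings. All names and values are hypothetical and
# hand-picked for illustration; real embeddings are learned during training.
entities = {
    "paper_A": [0.0, 0.0],
    "author_X": [1.0, 0.0],
    "author_Y": [0.0, 3.0],
}
relations = {"written_by": [1.0, 0.0]}


def transe_distance(head, relation, tail):
    """TransE score: L2 distance ||h + r - t||. Smaller means more plausible."""
    h, r, t = entities[head], relations[relation], entities[tail]
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation):
    """Rank every other entity as a candidate tail, most plausible first."""
    candidates = [e for e in entities if e != head]
    return sorted(candidates, key=lambda t: transe_distance(head, relation, t))


# A recommendation is then simply the top of the ranked candidate list.
ranking = rank_tails("paper_A", "written_by")
```

With these toy vectors, `author_X` ranks ahead of `author_Y` because `paper_A + written_by` lands exactly on `author_X`.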

Scalable Knowledge Graph Processing using SANSA

Posted on: Fri, 05/29/2020 - 10:55 By: valentina.janev

The size and number of knowledge graphs have increased tremendously in recent years. Meanwhile, distributed data processing technologies have also advanced to deal with big data and large-scale knowledge graphs. This lecture introduces the Scalable Semantic Analytics Stack (SANSA), which addresses the challenge of dealing with large-scale RDF data and provides a unified framework for applications like link prediction, knowledge base completion, querying, and reasoning.

Big Data Outlook, Tools, and Architectures

Posted on: Fri, 05/29/2020 - 10:19 By: valentina.janev

Big data is a reality: it is generated and handled in almost all digitised scenarios. This chapter covers the history of big data and discusses prominent related terminology. The significant technologies, including architectures and tools, are reviewed. Finally, the chapter reviews big knowledge graphs, which attempt to address the challenges of big data (e.g. heterogeneity, interoperability, variety) through their specialised representation format. This chapter aims to provide an overview of the existing terms and technologies related to big data.

SANSA - Scalable Semantic Analytics Stack

Posted on: Wed, 04/01/2020 - 20:59 By: valentina.janev

The size of knowledge graphs has reached the scale where centralised analytical approaches have become infeasible. Recent technological progress has enabled powerful distributed in-memory analytics that have been shown to work well on simple data structures. However, the application of such distributed analytics approaches on semantic knowledge graphs lags significantly behind. To advance both scalability and accuracy of large-scale knowledge graph analytics to a new level, foundational research on methods leveraging distributed in-memory computing and semantic technologies in combination w

Distributed Semantic Analytics II

Posted on: Mon, 12/24/2018 - 16:22 By: valentina.janev

This module will cover the setup, APIs and different layers of SANSA. At the end of this module, the audience will be able to execute examples and create programs that use SANSA APIs. The final part of this lecture is planned as an interactive session to wrap up the introduced concepts and present attendees with some open research questions currently studied by the community.

Distributed Semantic Analytics I

Posted on: Mon, 12/24/2018 - 16:21 By: valentina.janev

This module will cover the needs and challenges of distributed analytics and then dive into the details of the Scalable Semantic Analytics Stack (SANSA), which is used to perform scalable analytics for knowledge graphs. It will cover the different SANSA layers and the underlying principles for achieving scalability in knowledge graph processing.

Please download from the following link.

Distributed Big Data Libraries

Posted on: Mon, 12/24/2018 - 16:21 By: valentina.janev

At the practical level, Big Data frameworks use different APIs for graph computations and graph processing. This lecture covers the important libraries built on top of Apache Spark: Spark SQL, GraphX and MLlib. The audience will learn to build scalable algorithms in Spark using Scala.

Please download from the following link.

Distributed Big Data Frameworks

Posted on: Mon, 12/24/2018 - 16:20 By: valentina.janev

“Processing frameworks” are among the most essential components of a Big Data system. There are three categories of such frameworks, namely: Batch-only frameworks (Hadoop), Stream-only frameworks (Storm, Samza), and Hybrid frameworks (Spark, Hive and Flink). In this lecture, we will introduce them and cover one of the major Big Data frameworks, Apache Spark. We will cover Spark fundamentals and the model of “Resilient Distributed Datasets (RDDs)”, which Spark uses to implement in-memory batch computation.
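The core idea behind RDDs can be sketched in a few lines of plain Python. The following is a toy model only: `ToyRDD` is a hypothetical class, not Spark's API, and it runs on a single machine. It illustrates two properties of the RDD model: transformations such as `map` and `filter` are recorded lazily as a lineage, and an action such as `collect` replays that lineage over each partition. Because any partition can be recomputed from the lineage, a lost partition can be rebuilt rather than restored from a replica.

```python
class ToyRDD:
    """Toy stand-in for an RDD: partitioned data plus a recorded lineage."""

    def __init__(self, partitions, lineage=()):
        self.partitions = partitions    # list of lists holding the source data
        self.lineage = tuple(lineage)   # transformations recorded, not yet run

    def map(self, fn):
        # Transformation: returns a new ToyRDD and performs no computation.
        return ToyRDD(self.partitions, self.lineage + (("map", fn),))

    def filter(self, pred):
        # Transformation: likewise only extends the lineage.
        return ToyRDD(self.partitions, self.lineage + (("filter", pred),))

    def _compute(self, partition):
        # Replay the lineage on one partition, independently of the others.
        items = list(partition)
        for kind, fn in self.lineage:
            if kind == "map":
                items = [fn(x) for x in items]
            else:
                items = [x for x in items if fn(x)]
        return items

    def collect(self):
        # Action: triggers the computation and gathers all partition results.
        return [x for p in self.partitions for x in self._compute(p)]


numbers = ToyRDD([[1, 2, 3], [4, 5, 6]])        # two partitions
even_squares = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
result = even_squares.collect()                 # computation happens here
```

In real Spark the partitions live on different worker nodes and the per-partition step runs in parallel; the sketch keeps only the lazy-lineage structure.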
