Survey on Big Data Tools

Posted on: Thu, 06/11/2020 - 09:53 By: valentina.janev

This introductory lecture discusses the Big Data processing pipeline and the Big Data landscape from the following perspectives:

  • Big Data Frameworks
  • NoSQL Platforms and Knowledge Graphs
  • Stream Processing Data Engines
  • Big Data Preprocessing
  • Big Data Analytics
  • Big Data Visualization Tools

Overview and Comparison of Machine Learning Algorithms

Posted on: Thu, 06/11/2020 - 09:24 By: valentina.janev

Big Data Analytics is a crucial component of the Big Data paradigm and refers to the process of extracting useful knowledge from large datasets or streams of data. Because of the sheer volume, high dimensionality, heterogeneity, and distributed nature of the data, traditional data mining techniques may be unsuitable for big data.

SCADA Intrusion Detection Systems

Posted on: Thu, 06/11/2020 - 08:44 By: valentina.janev

Dedicated intrusion detection systems (IDSs) are needed to secure modern supervisory control and data acquisition (SCADA) systems because of their architecture, stringent real-time requirements, network traffic features and specific application-layer protocols. This lecture aims to assess the state of the art, identify open issues and provide insight into areas for future study. To achieve these objectives, we start from the factors that affect the design of dedicated intrusion detection systems in SCADA networks and focus on network-based IDS solutions.

Reasoning on Financial Knowledge Graphs: The Case of Company Networks

Posted on: Fri, 06/05/2020 - 10:53 By: valentina.janev

Knowledge graphs (KGs) were first released on an industry scale by Google, followed by other large-scale KGs such as those of Facebook, Microsoft and Amazon, as well as DBpedia, Wikidata and many more. Driven by the growing interest in KGs and advanced AI-based services, many companies and organizations are adopting KG technology. The technology reached industry quickly, and big companies have started to build their own graphs, such as the industrial Knowledge Graph at Siemens.

Embedding-based Recommendations on Scholarly Knowledge Graphs

Posted on: Tue, 06/02/2020 - 09:37 By: valentina.janev

The increasing availability of scholarly metadata in the form of Knowledge Graphs (KGs) offers opportunities for studying the structure of scholarly communication and the evolution of science. Such KGs form the foundation for knowledge-driven tasks such as link discovery, prediction and entity classification, which in turn make it possible to provide recommendation services. Knowledge graph embedding (KGE) models have been investigated for such knowledge-driven tasks in different application domains.

Vadalog System

Posted on: Fri, 05/29/2020 - 12:51 By: valentina.janev

Over the past years, there has been a resurgence of Datalog-based systems in the database community as well as in industry. In this context, it has been recognized that to handle the complex knowledge-based scenarios encountered today, such as reasoning over large knowledge graphs, Datalog has to be extended with features such as existential quantification. Yet, Datalog-based reasoning in the presence of existential quantification is in general undecidable.

SANSA - Scalable Semantic Analytics Stack

Posted on: Wed, 04/01/2020 - 20:59 By: valentina.janev

The size of knowledge graphs has reached the scale where centralised analytical approaches have become infeasible. Recent technological progress has enabled powerful distributed in-memory analytics that have been shown to work well on simple data structures. However, the application of such distributed analytics approaches on semantic knowledge graphs lags significantly behind. To advance both scalability and accuracy of large-scale knowledge graph analytics to a new level, foundational research on methods leveraging distributed in-memory computing and semantic technologies in combination w

Data Analytics for Energy Sector

Posted on: Tue, 05/21/2019 - 14:56 By: valentina.janev

Big Data technologies are often used in domains where data is generated, stored and processed at rates that a single computer cannot handle efficiently. One such domain is energy. Here, the processes of energy generation, transmission, distribution and use have to be monitored and analyzed concurrently in order to ensure system stability without brownouts or blackouts. The transmission systems (grids) that transport electric energy are in general very large and robust infrastructures that are accompanied by an abundance of monitoring equipment.
