Learning Big Data tools at UBO and IAIS. Germany, 4-8 February 2019

Posted on: Tue, 02/26/2019 - 16:44 By: marko.jelic

From February 4th to February 8th, 2019, a staff exchange was arranged between the Mihajlo Pupin Institute[1] on one side and the Fraunhofer Institut für Intelligente Analyse und Informationssysteme (Fraunhofer IAIS)[2] and Universität Bonn[3] on the other, for Dea Pujić and Marko Jelić, junior researchers at Pupin.

Figure 1 – Fraunhofer IAIS premises (Birlinghoven Castle)

On the first working day of the staff exchange, Monday, February 4th, Dea Pujić and Marko Jelić presented their work to IAIS/UBO staff at the IAIS premises (Schloss Birlinghoven, 53757 Sankt Augustin, Germany, Figure 1). After the initial introduction, Dea presented the machine learning projects that she is currently working on (a renewable production forecaster and non-intrusive load monitoring), and Marko presented the optimization solutions that are being developed for long-term feasibility assessments of hybrid renewable energy plants and for automatic load scheduling in the presence of demand response events. With the presentations finished, the meeting continued with an introduction to the Smart Data Analytics (SDA) group[4] and SANSA[5] by Hajira Jabeen and Gezim Sejdiu. Once the SANSA presentation was concluded, the group discussed how the software solutions developed at Bonn could be used to enhance the projects carried out by Pupin and what possibilities exist for future joint research efforts. The meeting was also attended by Damien Graux from Fraunhofer and by Sahar Vahdati and Heba Mohamed from Bonn University.

The meetings of the second working day, Tuesday, February 5th, were organized at the Institut für Informatik (Informatics Institute, Endenicher Allee 19a, Bonn, Germany), where the headquarters of the SDA group are located. The day started with a hands-on session with Spark examples from the MA-INF-4223-DBDA-Lab (Figure 2 left). The participants first installed the required software packages for Scala development in Spark, then worked on some rudimentary examples such as word counts, and eventually moved on to parsing RDF data and performing basic operations on classes in linked data. The Spark hands-on session continued after the lunch break with advanced Spark operations, transforming RDDs to data frames and performing SQL queries on data frames. When the Spark session was finished, Hajira presented the Big Data Europe project[6], its underlying structure and the tested use cases (Figure 2 right).

Figure 2 – Second day presentations at UBO (Spark hands-on with Gezim on the left and BDE presentation by Hajira on the right)

The third working day, Wednesday, February 6th, was also held at the Institut für Informatik, where the participants from Pupin attended the Smart Data Analytics department meeting (Figure 3). At these meetings, group members present ongoing projects and current and future work, discuss issues, and report on experiences from conference travels. Dea and Marko were invited to give a short presentation on their projects, and the long-standing good collaboration between Pupin and the SDA group was reaffirmed.

Figure 3 – Smart Data Analytics department meeting (Pupin’s presentation on the left and Prof. Dr. Jens Lehmann’s presentation on the right)

After the SDA meeting and the lunch break, the Distributed Semantic Analytics group (DSA, Figure 4) held its internal meeting, where Rajjat Dadwal presented his ongoing work on clustering in SANSA. Afterwards, Sahar Vahdati organized a crash course whose main topics were Linked Open Data, RDF fundamentals and Knowledge Graphs. Finally, the guests from Pupin also attended a master's thesis seminar where two MSc students from the SDA group presented their thesis work. That evening, several of the project participants (Sahar, Damien, Dea and Marko) met for a team-building dinner.

Figure 4 – Distributed Semantic Analytics group meeting with guests from Pupin

Back at Fraunhofer, the activities of Thursday, February 7th, started with an introduction to the Data Lake concept by Damien (Figure 5 left) and continued with the Pupin delegation attending the department meeting of the Fraunhofer group, where the partnership between Pupin and Fraunhofer was also praised. Afterwards, the Data Lake approach was analyzed further and query processing for heterogeneous data sources was presented by Mohamed Nadjib Mami (Figure 5 right). After the lunch break, the AskNow platform and the Semantic Question Answering projects were presented by Mohnish Dubey. Then, with Rajjat's aid, the group began testing the SANSA installation ahead of the upcoming Big Data Summer School in Belgrade. The process started with the Docker installation and continued with testing the SANSA installation on both Windows and Linux (through a virtual machine).

Figure 5 – Working on heterogeneous data at Fraunhofer (Damien introducing Data Lake on the left and Mohamed presenting query processing on the right)

Finally, the last day of the visit, Friday, February 8th, was planned partly for testing the data processing code developed earlier and partly for traveling back home. In conclusion, this trip brought together two student-researchers from Pupin and the experts from Bonn and enabled knowledge transfer in the Big Data domain, as envisioned by the LAMBDA project.


[1] http://www.pupin.rs/

[2] https://www.iais.fraunhofer.de/

[3] https://www.uni-bonn.de/

[4] http://sda.cs.uni-bonn.de/

[5] http://sansa-stack.net/

[6] https://www.big-data-europe.eu/

