Keynote Lectures

Towards Interactive Data Exploration
Carsten Binnig, TU Darmstadt, Germany

Data-Driven Crowdsourcing
Tova Milo, Tel Aviv University, Israel

Data Mining in the XXI Century
João Gama, LIAAD - INESC TEC, University of Porto, Portugal



Towards Interactive Data Exploration

Carsten Binnig
TU Darmstadt

Brief Bio
Carsten Binnig is a Full Professor in the Computer Science department at TU Darmstadt and an Adjunct Associate Professor in the Computer Science department at Brown University. Carsten received his PhD at the University of Heidelberg in 2008. Afterwards, he spent time as a postdoctoral researcher in the Systems Group at ETH Zurich and at SAP working on in-memory databases. Currently, his research focus is on the design of data management systems for modern hardware as well as modern workloads such as interactive data exploration and machine learning. He has recently been awarded a Google Faculty Award and a VLDB Best Demo Award for his research.

Technology has been the key enabler of the current Big Data movement. Without open-source tools like R and Hadoop, as well as the advent of cheap, abundant computing and storage in the cloud, the ongoing trend toward datafication of almost every research field and industry could never have occurred. However, the current Big Data tool set is ill-suited for interactive data exploration, making the knowledge discovery process a major bottleneck in our data-driven society. In this talk, we will first give an overview of challenges for interactive data exploration on large data sets and then present current research results that revisit the design of existing data management systems, from the query interface to the storage and the underlying hardware, to enable interactive data exploration.



Data-Driven Crowdsourcing

Tova Milo
Tel Aviv University

Brief Bio
Tova Milo received her Ph.D. degree in Computer Science from the Hebrew University, Jerusalem, in 1992. After graduating, she worked at the INRIA research institute in Paris and at the University of Toronto, and returned to Israel in 1995, joining the School of Computer Science at Tel Aviv University, where she is now a full Professor and holds the Chair of Information Management. She served as the Head of the Computer Science Department from 2011 to 2014. Her research focuses on large-scale data management applications such as data integration, semi-structured information, data-centered business processes, and crowdsourcing, studying both theoretical and practical aspects.
Tova served as the Program Chair of several international conferences, including PODS, VLDB, ICDT, XSym, and WebDB, and as the chair of the PODS Executive Committee. She served as a member of the VLDB Endowment and the PODS and ICDT executive boards and as an editor of TODS, IEEE Data Eng. Bull., and the Logical Methods in Computer Science Journal. Tova has received grants from the Israel Science Foundation, the US-Israel Binational Science Foundation, the Israeli and French Ministries of Science, and the European Union. She is an ACM Fellow, a member of Academia Europaea, and a recipient of the 2010 ACM PODS Alberto O. Mendelzon Test-of-Time Award, the 2017 VLDB Women in Database Research award, the 2017 Weizmann award for Exact Sciences Research, and the prestigious EU ERC Advanced Investigators grant.

One of the foremost challenges for information technology over the last few years has been to explore, understand, and extract useful information from large amounts of data. Some particular tasks such as annotating data or matching entities have been outsourced to human workers for many years. But the last few years have seen the rise of a new research field called crowdsourcing that aims at delegating a wide range of tasks to human workers, building formal frameworks, and improving the efficiency of these processes.
What may be achieved with the help of the crowd depends heavily on the properties and knowledge of the given crowd. In this talk, we will focus on knowledgeable crowds. We will examine the use of such crowds, and in particular domain experts, for assisting in solving data management problems. Specifically, we will consider three dimensions of the problem: (1) how domain experts can help in improving the data itself, e.g., by gathering missing data and improving the quality of existing data, (2) how they can assist in gathering meta-data that facilitate improved data processing, and (3) how we can find and identify the most relevant crowd for a given data management task. Using examples from recent work, I will present several exciting and new directions that are opening up for database research.



Data Mining in the XXI Century

João Gama
LIAAD - INESC TEC, University of Porto

Brief Bio
João Gama is an Associate Professor at the University of Porto, Portugal. He is also a senior researcher and member of the board of directors of the Laboratory of Artificial Intelligence and Decision Support (LIAAD), a group belonging to INESC Porto. João Gama serves as a member of the editorial boards of Machine Learning Journal, Data Mining and Knowledge Discovery, Intelligent Data Analysis, and New Generation Computing. He served as Co-chair of ECML 2005, DS09, ADMA09 and a series of Workshops on KDDS and Knowledge Discovery from Sensor Data with ACM SIGKDD. He was also the chair of the Intelligent Data Analysis 2011 conference. His main research interest is in knowledge discovery from data streams and evolving data. He is the author of more than 200 peer-reviewed papers and of a recent book on Knowledge Discovery from Data Streams. He has published extensively in the area of data stream learning.

Nowadays, there are applications in which the data are modelled best not as persistent tables, but rather as transient data streams. In this keynote, we discuss the limitations of current machine learning and data mining algorithms. We discuss the fundamental issues in learning in dynamic environments, such as learning decision models that evolve over time, learning and forgetting, concept drift, and change detection. Data streams are characterized by huge amounts of data that introduce new constraints in the design of learning algorithms: limited computational resources in terms of memory, processing time, and CPU power. In this talk, we present some illustrative algorithms designed to take these constraints into account. We identify the main issues and current challenges that emerge in learning from data streams, and present open research lines for further developments.