4th Annual Advances in Data Science Conference Report


The Institute for Data Science and AI (IDSAI) held its 4th annual conference online on Tuesday 23rd June. Eleven leading international data scientists from industry and academia presented recent developments in data science, with talks ranging from application-focussed work to advanced data science methodologies. Thanks to IDSAI’s strong and productive relationship with event sponsor The Alan Turing Institute, this first virtual instalment of the annual conference was a great success. Over 500 participants streamed the day-long event, and a recording will be made available online shortly.

IDSAI underpins all other Digital Futures themes, and this conference was an excellent example of the capabilities that Data Science and AI afford to research across disciplines. The day began with a welcome from Magnus Rattray, IDSAI director and Digital Futures Data Science and AI theme lead, who introduced IDSAI’s work and the conference format: four themed sessions, each chaired by one of the event organisers.

Session 1 focussed on language and natural language, beginning with Ivan Titov, associate professor at both the University of Edinburgh and the University of Amsterdam. His talk ‘Interpretability in Natural Language Processing’ presented recent research into understanding deep neural models and making them more explainable, showing how neural models can be made more interpretable by having them provide ‘rationales’ for their predictions.

Next was Maria Liakata, Turing AI Fellow and Professor in Natural Language Processing at Queen Mary University of London and the University of Warwick. In her talk ‘Creating time sensitive sensors from language and heterogeneous content’ she presented the objectives of her five-year Turing AI Fellowship, through which she aims to establish personalised longitudinal language processing as a new area of natural language processing. This work centres on developing sensors that capture digital biomarkers from language and heterogeneous user-generated content, in order to understand how an individual changes over time. She also discussed the mental health applications of this technology in assessing and measuring conditions between clinician appointments.

Session 2 looked at humans, intention and causality, beginning with Sabina Leonelli, Professor in Philosophy and History of Science at the University of Exeter. Her cross-disciplinary talk ‘Intelligent data linkage and distributed semantics for (big) data interpretation’ proposed a conceptual framework through which different data types and related infrastructures can be linked globally, while preserving as much as possible the domain- and system-specific properties of the data and related metadata.

Following this was Samuel Kaski, Professor of AI at the University of Manchester and Professor of Computer Science at Aalto University, Finland. In his talk ‘Data analysis with humans’, he suggested three ways to improve modelling results in probabilistic data analysis by taking the human user into account, through joint modelling of the user and the domain data. Professor Kaski’s new position is part of a strategic cooperation between UoM and Aalto, which aims to further health-AI research via the Christabel Pankhurst Institute for Health Technology and the Finnish Center for Artificial Intelligence.

Session 3 looked at machine learning, deep learning and privacy, starting with Isabel Valera, Professor at Saarland University, Germany and Research Group Leader at the Max Planck Institute for Intelligent Systems. Her talk ‘Algorithmic recourse: from counterfactual explanations to interventions’ used causal reasoning to caution against the use of counterfactual explanations for recourse, and proposed a paradigm shift towards recourse through interventions.

Next was Freddie Kalaitzis, hybrid Applied Research Scientist / Research Engineer in the AI for Good lab at Element AI. In his talk ‘Multi-frame super-resolution by recursive fusion: HighRes-net, the tech and beyond’, he presented the first deep learning approach to multi-frame super-resolution (MFSR) that learns its sub-tasks in an end-to-end fashion: (i) co-registration, (ii) fusion, (iii) up-sampling, and (iv) registration-at-the-loss. This work has potential applications for NGOs that rely on cheap low-resolution satellite imagery to monitor human rights and the environment.

This was followed by Jan Peters, Professor for Intelligent Autonomous Systems at the Technische Universitaet Darmstadt, Germany and Research Group Leader at the Max Planck Institute for Intelligent Systems. In his talk ‘Learning Robot Skills from Data’ he outlined a general framework for learning motor skills in robotics. The framework represents motor skills as parameterized motor primitive policies, which act as building blocks of movement generation, and uses a learned task execution module to transform these movements into motor commands.

Next was Stephanie Hyland, Senior Researcher at Microsoft Research Cambridge. Her talk ‘Can randomness in stochastic gradient descent provide privacy?’ viewed stochastic gradient descent as a randomised mechanism in the sense of differential privacy. It looked at exploiting the natural correspondence between generalisation performance and privacy, so that randomness already present in the training procedure can be used to improve the performance of private models.

Session 4 focussed on humanitarian issues and climate change. The final speaker was Claire Monteleoni, Associate Professor of Computer Science at the University of Colorado Boulder. Her talk ‘Climate Informatics: Machine Learning for the Study of Climate Change’ gave an overview of her team’s climate informatics research, focusing on challenges in learning from spatiotemporal data, along with semi- and unsupervised deep learning approaches to studying rare and extreme events.

Advances in Data Science will return next year. In the meantime, the Advances in Data Science Seminar Series hosts regular virtual events on diverse topics relating to data science and AI.
