Bridging the Gap

Bridging the Gap: Travel Call 4 Results

BTG Travel Grants Fourth Call - Funds Awarded

The Bridging the Gap board granted two travel awards in the fourth round of calls. See below for an overview of the projects funded.

Optimising water treatment with data mining and artificial neural networks. John Bridgeman, Civil Engineering

Liaison between Dr Bridgeman and Dr Chris Chow (AWQC) has been ongoing since the IWA NOM2008 conference in the UK, at which the work of both researchers was presented. Through post-conference discussions, it has become apparent that we have closely aligned research interests, most notably in the use of data mining techniques to assess large, multivariate datasets arising from the characterisation of organic matter (OM) in water, and also in the optimisation of water treatment processes. Bridgeman is currently working on the development and application of robust data mining techniques for OM characterisation, including artificial neural networks and self-organising maps, techniques that have not so far been applied in this field. This innovative work dovetails neatly with Chow’s work on interpretation of spectra, data mining and optimisation. Bridgeman and Chow wish to develop two interdisciplinary, cross-institutional research proposals that combine data from their historical experimental work and develop and apply novel approaches to data mining and optimisation. During the visit, we will explore the two areas of research described below, with a view to developing and submitting a separate proposal for each.

1. Assessing organics removal in water treatment with data mining and artificial neural networks. Insufficient organic matter removal prior to disinfection in drinking water treatment leads to the formation of harmful disinfection by-products (DBPs) arising from the chemical reaction of organic compounds with the disinfectant. A rapid and accurate assessment of organic matter removal in the initial stages of water treatment is therefore crucial to the prediction of DBP formation potential. Recent work by Bridgeman and colleagues (e.g. Bieroza et al., 2009a, b) has considered the use of fluorescence spectroscopy to assess organic matter removal in water treatment. Fluorescence intensity is scanned over a range of excitation and emission wavelengths to produce a three-dimensional output comprising more than 4,000 fluorescence data points. The analysis and interpretation of this substantial amount of fluorescence data characterising water treatment performance consequently requires robust data mining techniques. In the work planned here, we will use the large multivariate datasets arising from fluorescence spectroscopy for the quantitative and qualitative characterisation of organic matter during water treatment.

2. Incorporating uncertainty into water treatment works design using a genetic algorithm and Monte Carlo-based simulation. The primary aim of this piece of work is to formulate a research proposal which will draw upon previous work undertaken at the University of Birmingham and AWQC to develop a methodology and software tool to minimise the operational cost of treatment according to performance characteristics, raw water input and risk acceptance, using a genetic algorithm (GA) and Monte Carlo-based simulation. Specifically, we will develop a detailed proposal to assess treatment performance via Monte Carlo simulation, and then to optimise treatment works design and operation from an operational cost perspective using a GA technique.
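
As an illustration of the data volume mentioned in item 1, the following minimal Python sketch builds a synthetic excitation-emission matrix and unfolds it into a single feature vector of the kind a data mining method would take as input. The scan grid (excitation 200-400 nm in 5 nm steps, emission 280-500 nm in 2 nm steps), the single Gaussian peak and all function names are assumptions made for illustration only; they are not taken from the projects described here.

```python
import math

# Hypothetical scan grid (an assumption, not a value from the source):
# excitation 200-400 nm in 5 nm steps, emission 280-500 nm in 2 nm steps.
EXCITATION_NM = list(range(200, 401, 5))   # 41 wavelengths
EMISSION_NM = list(range(280, 501, 2))     # 111 wavelengths


def synthetic_eem(peak_ex=340.0, peak_em=430.0, width=25.0):
    """Return a made-up matrix with one Gaussian peak (rows: excitation, columns: emission)."""
    return [
        [math.exp(-((ex - peak_ex) ** 2 + (em - peak_em) ** 2) / (2 * width ** 2))
         for em in EMISSION_NM]
        for ex in EXCITATION_NM
    ]


def flatten(eem):
    """Unfold the matrix row by row into one feature vector per water sample."""
    return [value for row in eem for value in row]


def peak_location(eem):
    """Return (excitation, emission, intensity) of the brightest point."""
    return max(
        ((ex, em, eem[i][j])
         for i, ex in enumerate(EXCITATION_NM)
         for j, em in enumerate(EMISSION_NM)),
        key=lambda point: point[2],
    )


eem = synthetic_eem()
features = flatten(eem)
print(len(features))        # 41 * 111 = 4551 intensity values per sample
print(peak_location(eem))   # brightest excitation/emission pair
```

With this assumed grid, each water sample yields 4,551 intensity values, which is why dimensionality reduction or pattern-recognition methods are needed before the data can be compared across treatment stages.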
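
The approach outlined in item 2 can be sketched, under heavy simplification, as a Monte Carlo estimate of failure risk nested inside a small genetic algorithm. Everything in the Python sketch below is hypothetical: the single decision variable (a dose), the exponential removal model, the lognormal raw water load, the cost and penalty values and the GA settings are placeholders chosen for illustration, not the methodology the proposal would develop.

```python
import bisect
import math
import random

# All numbers below are illustrative assumptions, not values from the proposal.
DOSE_MIN, DOSE_MAX = 0.0, 10.0   # hypothetical treatment dose range
UNIT_COST = 1.0                  # operating cost per unit dose
TARGET = 2.0                     # maximum acceptable treated organic load
RISK_ACCEPTANCE = 0.05           # tolerated probability of missing the target
PENALTY = 1000.0                 # cost added per unit of excess failure risk
K_REMOVAL = 0.5                  # toy model: treated = raw * exp(-k * dose)


def sample_raw_water(rng, n=1000):
    """Monte Carlo draws of raw water organic load (toy lognormal), sorted."""
    return sorted(rng.lognormvariate(1.6, 0.2) for _ in range(n))


def failure_probability(dose, sorted_loads):
    """Fraction of sampled raw waters whose treated load exceeds TARGET."""
    # treated > TARGET  <=>  raw > TARGET * exp(k * dose)
    threshold = TARGET * math.exp(K_REMOVAL * dose)
    failures = len(sorted_loads) - bisect.bisect_right(sorted_loads, threshold)
    return failures / len(sorted_loads)


def cost(dose, sorted_loads):
    """Operating cost plus a penalty when risk exceeds the accepted level."""
    excess = max(0.0, failure_probability(dose, sorted_loads) - RISK_ACCEPTANCE)
    return UNIT_COST * dose + PENALTY * excess


def tournament(scored, rng):
    """Return the dose of the cheaper of two random (cost, dose) pairs."""
    return min(rng.choice(scored), rng.choice(scored))[1]


def optimise(sorted_loads, rng, pop_size=30, generations=40):
    """Small real-coded GA: elitism, blend crossover, Gaussian mutation."""
    population = [rng.uniform(DOSE_MIN, DOSE_MAX) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted((cost(d, sorted_loads), d) for d in population)
        history.append(scored[0][0])
        next_gen = [d for _, d in scored[:2]]   # keep the two best unchanged
        while len(next_gen) < pop_size:
            a, b = tournament(scored, rng), tournament(scored, rng)
            w = rng.random()
            child = w * a + (1 - w) * b + rng.gauss(0.0, 0.2)
            next_gen.append(min(DOSE_MAX, max(DOSE_MIN, child)))
        population = next_gen
    best = min(population, key=lambda d: cost(d, sorted_loads))
    history.append(cost(best, sorted_loads))
    return best, history


rng = random.Random(0)           # fixed seed so the run is repeatable
loads = sample_raw_water(rng)
best_dose, history = optimise(loads, rng)
print(best_dose, failure_probability(best_dose, loads))
```

The same set of Monte Carlo samples is reused for every candidate so that two doses are always compared against identical raw water conditions. A real treatment works model would have many decision variables and a far richer performance model than this one-parameter toy.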

Symbiotic blending of advanced machine learning and data mining with astronomy. Somak Raychaudhury, School of Physics and Astronomy

Several researchers from the Schools of Physics and Astronomy, Computer Science and Mathematics are collaborating to develop data mining algorithms aimed at the emerging Virtual Observatory, a global system of databases and archives that is currently being set up (www.ivoa.net). Since 2003, we have worked on a small scale on various applications of machine learning to astronomy, with an emphasis on developing new algorithms rather than adapting off-the-shelf software. We now aim to build a large international collaboration that will apply our algorithms to actual observational data and to simulations, widen the scope of our algorithm development, and engineer the work into software packages. We will aim to write two major proposals, one for EPSRC and another for EU FP7, to recruit researchers in Birmingham, facilitate travel for us and our collaborators, and organise regular meetings.

The proposed work will bring together computer science (specifically machine learning and data mining) and astronomy. The project blends my expertise in physics and astronomy with that of Dr. Peter Tino (School of Computer Science) in machine learning and data mining. We will also collaborate with Dr. Prakash Patil (School of Mathematics), an expert on density estimation, with whom we have already established a working link. The project has a strong multidisciplinary character, combining astronomy with advanced machine learning and data mining on vast data sets using high-speed computing. Its objectives cannot be met without a true blending of expertise from both scientific disciplines. The Virtual Observatory, for which these data mining algorithms are being developed, is itself a project involving high-performance computing.