Title: Artificial neural network models based on QSAR for predicting rejection of neutral organic compounds by polyamide nanofiltration and reverse osmosis membrane
Journal: Journal of Membrane Science
Authors: V. Yangali-Quintanilla (a, b), A. Verliefde (b, c, d), T.-U. Kim (e), A. Sadmani (a), M. Kennedy (a) and G. Amy (a, b)
Corresponding author: V. Yangali-Quintanilla
Institute:
a UNESCO-IHE, Institute for Water Education, Westvest 7, 2611AX Delft, The Netherlands
b Delft University of Technology, Stevinweg 1, 2628CN Delft, The Netherlands
c KWR Watercycle Research Institute, Groningenhaven 7, 3433PE Nieuwegein, The Netherlands
d UNESCO Center for Membrane Science and Technology, University of New South Wales, NSW 2052, Australia
e Pennsylvania State University at Harrisburg, Middletown, PA 17057, USA
Originality and creativity of the paper: The authors use salt rejection as a membrane descriptor in QSAR-based ANN models to predict the rejection of neutral organic compounds by polyamide nanofiltration (NF) and reverse osmosis (RO) membranes.
Summary:
The authors found that artificial neural networks (ANNs) may be an important tool for predicting the rejection of neutral organic compounds by NF and RO membranes. ANN models can be enhanced by using quantitative structure-activity relationships (QSAR) that summarize the interactions between membrane characteristics, filtration operating conditions and physicochemical properties of the organic compounds. Rejection of neutral organic compounds is mainly governed by size exclusion and by hydrophobic interactions between solute and membrane. In addition, magnesium sulphate salt rejection may serve as a lumped parameter that describes the size exclusion capability of an NF or RO membrane towards neutral organic compounds. However, it may only be valid in combination with solute descriptors and within a bounded range of experimental conditions. The detailed results are summarized below.
1. Variable reduction with principal component analysis and QSAR model.
After application of principal component analysis (PCA), 15 variables were reduced to two components comprising six variables. Component 1 was related to membrane characteristics, with the variables molecular weight cut-off (MWCO) and salt rejection (SR). Component 2 was related to size and hydrophobicity, with four variables (equivalent width, depth, length and log Kow). Scores of the principal components are presented in Fig. 1, which shows two distinguishable compound groups: in general, hydrophobic neutral (HP-neu) and hydrophilic neutral (HL-neu) compounds cluster separately. However, not all cases cluster, because membrane characteristics also influence the components.
Fig. 1. Loading and score plots of principal components.
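To make the variable-reduction step more concrete, the following is a minimal sketch of PCA on standardized variables using NumPy. The data are synthetic and hypothetical (six stand-in variables driven by two latent factors, loosely mimicking a "membrane" and a "solute" component); the function name pca and all numbers are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def pca(X, n_components=2):
    """Return (scores, loadings, explained_variance_ratio) for data matrix X."""
    # Standardize each variable (zero mean, unit variance) so that variables
    # measured in different units contribute comparably.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized data; rows of Vt are the principal directions.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T      # shape (n_variables, n_components)
    scores = Z @ loadings               # shape (n_samples, n_components)
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, loadings, explained[:n_components]

# Hypothetical data: 40 cases x 6 variables generated from two latent factors.
rng = np.random.default_rng(0)
membrane = rng.normal(size=(40, 1))     # latent "membrane" factor
solute = rng.normal(size=(40, 1))       # latent "solute" factor
X = np.hstack([
    membrane @ np.array([[1.0, -0.9]]),          # stand-ins for MWCO, SR
    solute @ np.array([[1.0, 0.8, 0.9, 0.7]]),   # width, depth, length, log Kow
]) + 0.1 * rng.normal(size=(40, 6))

scores, loadings, explained = pca(X)
print("explained variance ratio:", explained)
print("loadings (variables x components):")
print(loadings)
```

In such a sketch, the loadings indicate which variables dominate each component, and plotting the two score columns against each other gives a score plot of the kind described for Fig. 1.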
2. Artificial neural network models
The ANN models used in this study were multilayer feed-forward backpropagation networks. The input layer contains the predictors, and the hidden layer contains a chosen number of neurons. The prediction performance of the QSAR model, a multiple linear regression (MLR) equation, is shown in Fig. 2. The main disadvantage of the MLR model is that it over- and under-predicts rejection values in many cases (Fig. 2), although it yields acceptable correlation coefficients R2 of 0.81 and 0.92 for the model dataset and prediction set S1, respectively.
Fig. 2. QSAR equation model and predictions sets.
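As an illustration of how a linear QSAR-type equation can be fitted and scored, the sketch below fits a multiple linear regression by least squares and computes R2 on a model set and a held-out prediction set. It assumes that R2 is the coefficient of determination, which the summary does not state explicitly. The descriptors, the saturating response and the helper names (fit_mlr, predict_mlr, r_squared) are hypothetical.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ...; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])   # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Evaluate the fitted linear equation on descriptor matrix X."""
    return coef[0] + X @ coef[1:]

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical descriptors and a saturating "rejection" response in percent.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 2.0, size=(60, 2))
y = 100.0 * (1.0 - np.exp(-(X[:, 0] + 0.5 * X[:, 1])))

X_model, y_model = X[:45], y[:45]       # model (fitting) set
X_pred, y_pred_true = X[45:], y[45:]    # held-out prediction set

coef = fit_mlr(X_model, y_model)
r2_model = r_squared(y_model, predict_mlr(coef, X_model))
r2_pred = r_squared(y_pred_true, predict_mlr(coef, X_pred))
residuals = y_model - predict_mlr(coef, X_model)

print("coefficients:", coef)
print("R2 (model set):", r2_model, " R2 (prediction set):", r2_pred)
```

Because the synthetic response saturates while the fitted equation is linear, the residuals change sign systematically across the descriptor range, which is one way a linear model can over- and under-predict despite a reasonable R2.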
The accuracy of prediction was improved by ANN models N1 and N2 (R2 = 0.97), as can be observed in Fig. 3, with STDE of 5.4 and 5.3 for models N1 and N2, respectively.
Fig. 3. Network model N1 and N2 with training, validation and prediction sets.
Moreover, the effect of the number of neurons in the hidden layer was evaluated with network models N5 and N6 (4). Two neurons (N5) were found to be sufficient in the hidden layer; using four neurons (N6) did not improve the predictions or performance (STDE of 6.2 and 6.3% for N5 and N6, respectively). Note that models N1, N2, N3 and N4 used random dataset 1, whereas models N5 and N6 used a different random dataset 2. Using different random datasets for training, validation and prediction did not improve performance either.
Fig. 6. Network model N5 and N6 with training, validation and prediction sets.
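The kind of network discussed in this section can be sketched as a single-hidden-layer feed-forward model trained by backpropagation, here written in plain NumPy and run with two and four hidden neurons. This is only an illustrative sketch and not the authors' implementation: the tanh activation, learning rate, number of epochs, synthetic data and helper names (train_mlp, predict_mlp) are all assumptions, and STDE is taken to mean the standard deviation of the prediction errors, which the summary does not define.

```python
import numpy as np

def train_mlp(X, y, n_hidden=2, lr=0.1, epochs=5000, seed=0):
    """Train a one-hidden-layer network (tanh hidden, linear output)
    with full-batch gradient descent on half the mean squared error."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.5, size=(n_hidden, 1))
    b2 = np.zeros(1)
    y = y.reshape(-1, 1)
    n = len(X)
    for _ in range(epochs):
        # Forward pass.
        H = np.tanh(X @ W1 + b1)
        out = H @ W2 + b2
        err = out - y
        # Backward pass: propagate the error from output to hidden layer.
        dW2 = H.T @ err / n
        db2 = err.mean(axis=0)
        dZ1 = (err @ W2.T) * (1.0 - H ** 2)   # derivative of tanh is 1 - tanh^2
        dW1 = X.T @ dZ1 / n
        db1 = dZ1.mean(axis=0)
        # Gradient descent update.
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return W1, b1, W2, b2

def predict_mlp(params, X):
    """Forward pass with trained parameters; returns a 1-D prediction array."""
    W1, b1, W2, b2 = params
    return (np.tanh(X @ W1 + b1) @ W2 + b2).ravel()

# Hypothetical predictors and a saturating rejection fraction in [0, 1].
rng = np.random.default_rng(2)
X = rng.uniform(0.0, 2.0, size=(120, 2))
y = 1.0 - np.exp(-(X[:, 0] + 0.5 * X[:, 1]))
X_tr, y_tr = X[:90], y[:90]      # training set
X_te, y_te = X[90:], y[90:]      # held-out prediction set

stde = {}
for n_hidden in (2, 4):
    params = train_mlp(X_tr, y_tr, n_hidden=n_hidden)
    errors = y_te - predict_mlp(params, X_te)
    stde[n_hidden] = np.std(errors)   # assumed meaning of STDE
    print(n_hidden, "hidden neurons: STDE =", stde[n_hidden])
```

Comparing the two STDE values on the held-out set mirrors the N5 versus N6 comparison in spirit: if the smaller network already captures the relationship, adding hidden neurons brings little benefit.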
Application & further study: ANNs are a useful approach for capturing and representing complex input-output relationships. They can learn linear as well as non-linear correlative patterns between sets of input data and corresponding target values, directly from the data set being modeled. They can also be used successfully in classification problems, since specific algorithms are available to group input patterns into clusters based on the similarities and dissimilarities between them. ANNs are therefore suitable for identifying relationships among groups of parameters and for pinpointing the most influential factors in a problem.
By Monruedee Moonkhum
Email: moon@gist.ac.kr