
Computers and Chemical Engineering journal homepage: www.elsevier.com/locate/compchemeng

Deep hybrid modeling of chemical process: Application to hydraulic fracturing

Mohammed Saad Faizan Bangi a,b, Joseph Sang-Il Kwon a,b,∗

a Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, TX 77845, USA
b Texas A&M Energy Institute, Texas A&M University, College Station, TX 77845, USA

Article history: Received 18 October 2019; Revised 27 November 2019; Accepted 22 December 2019; Available online 26 December 2019.

Keywords: Deep learning; First principles; Hybrid modeling; Levenberg–Marquardt algorithm; Hydraulic fracturing

Abstract

Process modeling began with the use of first principles, resulting in 'white-box' models which are complex but accurately explain the dynamics of the process. Recently, there has been tremendous interest in data-based modeling, as the resulting 'black-box' models are simple and easy to construct, but their accuracy depends strongly on the nature and amount of training data used. In order to balance the advantages and disadvantages of 'white-box' and 'black-box' models, we propose a hybrid model that integrates first principles with a deep neural network, and apply it to a hydraulic fracturing process. The unknown process parameters in the hydraulic fracturing process are predicted by the deep neural network and then utilized by the first principles model to calculate the hybrid model outputs. This hybrid model is easier to analyze, interpret, and extrapolate than a 'black-box' model, and has higher accuracy than the first principles model. Published by Elsevier Ltd.

1. Introduction

Process modeling is the task of obtaining a mathematical representation of knowledge about a physical process (Cameron and Hangos, 2001). Depending on the nature of that knowledge, models can be classified into various categories. First principles or mechanistic models, also known as 'white box' models, are obtained using the mass and energy conservation laws, kinetic laws, thermodynamic laws, transport laws, etc. This class of models is transparent and easy to understand, as they usually contain parameters with physical meaning and are valid over a wide range of operating conditions of the process, but they are complex and computationally expensive to solve. On the other hand, data-driven models or 'black box' models are obtained using process data and are computationally inexpensive to solve, but they are usually difficult to interpret as the nature of their parameters is unknown, and they have a narrow range of applicability. Hybrid models or 'grey box' models are a third class of models, which are a combination of white box and black box submodels (Thompson and Kramer, 1994). In process modeling, the concept of hybrid (grey box) models evolved from the field of neural networks (Psichogios and Ungar, 1992; Kramer et al., 1992; Johansen and Foss, 1992; Su et al., 1993). The idea was to build neural network based hybrid models through

∗ Corresponding author. E-mail address: [email protected] (J.S.-I. Kwon).

https://doi.org/10.1016/j.compchemeng.2019.106696
0098-1354/Published by Elsevier Ltd.

the use of first principles knowledge. This resulted in hybrid models with better prediction accuracy compared to first principles models, and better interpolation, extrapolation, and interpretation compared to purely neural network based models. There exist other kinds of grey box models that combine different types of first principles knowledge and/or empirical submodels. During the 1990s, the term 'grey box' models appeared in systems and control theory, wherein structural information from first principles models was incorporated into data-based models (Bohlin and Graebe, 1995; Jorgensen and Hangos, 1995; Tulleken, 1993). However, the understanding of the term 'grey box' has evolved to represent all types of hybrid models that combine first principles and data-based submodels. Nonetheless, hybrid modeling balances the advantages and disadvantages of strictly first principles and data-based modeling, and offers desirable benefits such as high prediction accuracy, better extrapolation capabilities, ease of calibration, and better interpretability. For these reasons, hybrid modeling has numerous applications in chemical and biochemical engineering, for instance in the modeling of chemical reactors (Zahedi et al., 2011; Gupta et al., 1999; Molga and Cherbanski, 1999; Qi et al., 1999), polymerization processes (Tsen et al., 1996; Fiedler and Schuppert, 2008), crystallization (Lauret et al., 2000; Georgieva and de Azevedo, 2009), metallurgic processes (Reuter et al., 1993; Hu et al., 2011; Jia et al., 2011), distillation columns (Safavi et al., 1999; Mahalec and Sanchez, 2012), drying processes (Cubillos and Acuña, 2007), thermal devices (Arahal et al., 2008), mechanical reactors (Nascimento et al., 1999),


M.S.F. Bangi and J.S.-I. Kwon / Computers and Chemical Engineering 134 (2020) 106696

milling (Aguiar and Filho, 2001; Kumar Akkisetty et al., 2010), modeling of yeast fermentations (Schubert et al., 1994b; 1994a; Eslamloueyan and Setoodeh, 2011), modeling of fungi cultivations (Preusting et al., 1996; Wang et al., 2010), modeling of bacteria cultivations (Simutis and Lübbert, 1997; Gnoth et al., 2008), modeling of mammalian cell cultivations (Dors et al., 1995; Teixeira et al., 2007), modeling of insect cell cultivations (Carinhas et al., 2011), modeling of hybridoma cell cultivations (Fu and Barford, 1995; 1996), etc. For more information, one can view von Stosch et al. (2014), an excellent review of hybrid modeling in the field of process systems engineering. More recently, Ghosh et al. (2019) proposed a hybrid modeling framework in which first principles are integrated with a subspace identification algorithm whose purpose is to minimize the mismatch between the process outputs and plant data. Building hybrid models is one way of taking advantage of the 'big data' available today. Apart from hybrid modeling, 'big data' can be utilized in multi-scale modeling and integrated decision making of process systems, which usually involve chemical processes functioning on different time/length scales; for more information on this topic, see the review paper by Tsay and Baldea (2019). Additionally, to understand the path forward and the challenges that lie ahead in chemical engineering related to integrating data-driven processing with scientific reasoning, one can consult the review by Venkatasubramanian (2018). Recall that hybrid modeling started in 1992 with the use of neural networks along with first principles knowledge (Psichogios and Ungar, 1992; Kramer et al., 1992; Johansen and Foss, 1992; Su et al., 1993). Neural networks are connectionist models that map an input space to an output space and were inspired by the biological networks present in the brain.
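This connectionist map from an input space to an output space can be sketched with a small fully connected network; the 2-4-4-1 layer sizes, random weights, and tanh activation below are illustrative choices, not values from the paper:

```python
import numpy as np

def forward(x, weights, biases):
    """Propagate an input through a fully connected network: each node sums
    its weighted inputs, adds a bias, and applies a nonlinear activation."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.tanh(W @ a + b)           # hidden layers: nonlinear transform
    return weights[-1] @ a + biases[-1]  # output layer: linear collection

rng = np.random.default_rng(0)
sizes = [2, 4, 4, 1]                     # a small 2-4-4-1 network
weights = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(m) for m in sizes[1:]]

y = forward(np.array([0.5, -0.2]), weights, biases)
print(y.shape)  # → (1,)
```

With the activation removed, the composition of layers collapses to a single affine map, which is why the nonlinear activation is essential for capturing nonlinear input-output data.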
Each neural network contains elements called 'neurons' or 'nodes' that process an input and give an output, and it comprises layers containing many such nodes. Each node in a particular layer is connected to every node in an adjoining layer, and the strength of each connection is represented by an assigned weight. Consequently, a node in a particular layer receives inputs from all the nodes in the previous layer, which are added together after weights are applied to each of them, and a 'bias' is added to the sum. If the neural network only contains weights and biases, then it behaves as a linear function. In order to capture nonlinearity in the input-output data, certain nonlinear functions called 'activation functions' are used in each node. With activation functions, the input to the neural network undergoes a nonlinear transformation through each layer and is collected as the output at the final layer. Overall, the number of nodes, the number of layers, the weights of the connections, and the biases are the internal parameters of the neural network. Under certain assumptions, sufficiently large neural networks have been shown to capture any nonlinear continuous function accurately (Stinchcombe and White, 1989). With advancements in machine learning, the field of neural networks has evolved from the use of a single hidden layer to multiple hidden layers, resulting in deep neural networks. Recently, there has been much interest in understanding and comparing the capabilities of shallow and deep neural networks as function approximators. For instance, it has been proven that deep sum-product networks for polynomial functions (Delalleau and Bengio, 2011), deep neural networks for piecewise smooth functions (Liang and Srikant, 2017), and a three-layer network for a specific function (Eldan and Shamir, 2016) require exponentially fewer neurons than their shallow counterparts. Also, it has been shown that the number of linear regions of the functions approximated by deep neural networks is exponentially larger than that of shallow neural networks (Montufar et al., 2014). These works demonstrate that shallow networks require an exponentially large number of neurons compared to deep networks for specific functions,

and hence demonstrate the power of deep networks over shallow networks.

In this work, we develop a hybrid model for a hydraulic fracturing process that combines its first principles model with a deep neural network that acts as an estimator of its unmeasured process parameters. Hydraulic fracturing is a complex process in which a fracturing fluid containing proppant is pumped into the reservoir to create fractures and sustain them, in order to extract oil/gas from the reservoir through them. There are two main issues when modeling this process: its moving boundary nature and its numerous process uncertainties. One such uncertainty is the leak-off rate of the fracturing fluid into the reservoir, which is difficult to explain accurately using first principles, and building a first principles based model with such a knowledge gap will result in an inaccurate model. Therefore, the novelty of this work is twofold. First, the use of deep neural networks for the purpose of building hybrid models. Second, the application of this hybrid modeling methodology to a complex hydraulic fracturing process in order to explain one of its underlying phenomena, fluid leak-off, and to build a superior model. The remainder of this text is organized as follows: Section 2 provides the methodology proposed by Psichogios and Ungar (1992) to build a neural network based hybrid model. Section 3 gives a brief background on deep neural networks and the Levenberg–Marquardt algorithm used in this work to train them. Section 4 presents our proposed methodology to build a deep neural network based hybrid model. Section 5 presents the first principles model of the hydraulic fracturing process utilized in this work, and Section 6 presents the application of our proposed technique.

2. Hybrid neural network model

Consider a dynamical system with the following generic representation:

dx/dt = f(x, u, p)   (1)

p = g(x, u)   (2)

where x, u, and p denote the states, control inputs, and model parameters, respectively. The dynamics of the states of the system are explained by the equations in f, which depend on the parameters p. These parameters p are related to the states x and the inputs u through the equations in g. In many chemical engineering processes, the parameters p are complex and unknown. For instance, in biological reactions, the cell growth rate and the corresponding reaction kinetics are usually unknown and difficult to derive from first principles. Consequently, the first principles models of such processes will contain unknown terms. Psichogios and Ungar (1992) developed the idea of a hybrid model wherein a first principles model works in conjunction with a neural network 'black box' model that can explain the unknown parameters. This methodology is superior to straightforward neural network based 'black box' modeling of the entire system, in that it has an internal structure which clearly represents the interactions between the state variables and the parameters. This structuring allows for easier analysis and interpretation compared to a 'black box' model. Also, the hybrid model has better generalization and extrapolation capabilities, as the basic structure, specifically the first principles model, is not affected during any process identification tasks and only the 'black box' model is altered. A schematic representation of the neural network based hybrid model is shown in Fig. 1. The neural network model's inputs are the current state of the system x_k and the external input applied to the system u_k, and its output is the current parameter value p.
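The serial structure of Eqs. (1) and (2) — a data-driven estimate of p feeding a first principles state update — can be sketched as below. The one-state linear system, the tiny untrained network, and all numerical values are hypothetical stand-ins, not the models used in the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((3, 2)), rng.standard_normal(3)
W2, b2 = rng.standard_normal((1, 3)), rng.standard_normal(1)

def nn_parameter(x, u):
    """'Black box' submodel: estimate the unknown parameter p = g(x, u)
    with a small neural network (illustrative, untrained weights)."""
    h = np.tanh(W1 @ np.array([x, u]) + b1)
    return float((W2 @ h + b2)[0])

def hybrid_step(x, u, dt):
    """'White box' submodel: advance dx/dt = f(x, u, p) by one explicit
    Euler step, with p supplied by the network. The linear f used here
    is a stand-in for an actual process model."""
    p = nn_parameter(x, u)
    dxdt = -p * x + u
    return x + dt * dxdt

x = 1.0
for _ in range(5):               # roll the hybrid model forward in time
    x = hybrid_step(x, u=0.1, dt=0.01)
print(np.isfinite(x))  # → True
```

The same two-block pattern holds regardless of how elaborate f and the network are: the network's prediction is consumed internally, and only the state prediction is compared against measurements.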


The output of the neural network is fed as input to the first principles model along with the current state and input, x_k and u_k, respectively. The first principles model then predicts the state at the next time step, i.e., x_{k+1}. Together, the neural network model and the first principles model constitute the hybrid model for the entire process.

Before the neural network can be utilized, it has to be trained. A set of input and state measurements is utilized for this purpose. The neural network is initialized by defining its parameters, and an error signal is backpropagated through the network to update them. However, for the hybrid model, the output of the neural network is not directly available as it is not measured. Hence, the first principles model is utilized to calculate the error signal, which is then used to update the parameters of the network. The network is fully trained when the difference between the hybrid model's predictions and the actual measurements falls below a predefined tolerance. Once trained, the neural network can be utilized along with the first principles model to form the hybrid model.

Fig. 1. Hybrid neural network model.

3. Preliminaries

In this section, the concept of deep neural networks is briefly discussed, followed by the Levenberg–Marquardt training algorithm utilized in this work.

3.1. Deep neural networks

Deep neural networks are neural networks with more than three layers, and each layer contains multiple neurons. The neurons in each layer are fully connected to the neurons in the subsequent layer, as shown in Fig. 2. The strength of each connection is assigned by its weight w, and each neuron processes the input through a nonlinear function called the activation function f.

Fig. 2. Deep neural networks.

Let n^{k+1}(i) be the net input to each unit i in layer k+1, given by

n^{k+1}(i) = Σ_{j=1}^{S_k} w^{k+1}(i, j) a^k(j) + b^{k+1}(i)   (3)

Let a^{k+1}(i) be the output of unit i in layer k+1, given by

a^{k+1}(i) = f^{k+1}(n^{k+1}(i))   (4)

Assuming there are M layers in the network, these equations can be represented in matrix form as

A^{k+1} = F^{k+1}(W^{k+1} A^k + B^{k+1}),   k = 0, 1, ..., M−1   (5)

A^0 = u_q,   q = 1, 2, ..., Q   (6)

where u_q is the input vector to the neural network and A^M_q is its corresponding output. The matrices A^k, F^k, W^k, and B^k contain the outputs, activation functions, weights, and biases of all the neurons in layer k, respectively. The aim of the neural network is to learn the relationship between the input-output pairs {(u_1, y_1), (u_2, y_2), ..., (u_Q, y_Q)}. The performance of the neural network is measured as follows:

V = (1/2) Σ_{q=1}^{Q} e_q^T e_q   (7)

e_q = y_q − A^M_q   (8)

where e_q is the error when the qth input is given to the neural network. The error matrix E can be defined as follows:

E = [e_1 e_2 ... e_Q]^T   (9)

3.2. Levenberg–Marquardt training

The Error Backpropagation (EBP) algorithm (Rumelhart et al., 1986; Werbos, 1988) is one of the most significant breakthroughs with regard to the training of neural networks, but it suffers from slow convergence, for two main reasons. First, the step size is kept small to prevent oscillations around the required minima, which slows down training. Second, the curvature of the error surface can vary in different directions, so the classical 'error valley' problem (Osborne, 1992) may arise and lead to a slow convergence rate. Despite its slow convergence, the steepest descent algorithm is widely used to train neural networks. Its update rule utilizes the gradient g, the first-order derivative of the total error function, defined as follows:

g = ∂V(u, w)/∂w = [∂V/∂w_1  ∂V/∂w_2  ...  ∂V/∂w_N]^T   (10)

The update rule of the steepest descent algorithm is written as:

w_{k+1} = w_k − α g_k   (11)

where α is the learning constant. The slow convergence of the steepest descent algorithm can be improved upon by the Gauss-Newton algorithm (Osborne, 1992). The Gauss-Newton algorithm utilizes the second-order derivatives of the error function to gauge the curvature of the error surface and find the appropriate step size in each direction. Convergence is fast if the error function has a quadratic surface; otherwise this algorithm is mostly divergent. The update rule of the Gauss-Newton method is defined as:

w_{k+1} = w_k − (J_k^T J_k)^{−1} J_k^T E_k   (12)
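A minimal numerical contrast between the two update rules, Eqs. (11) and (12), on a one-parameter least-squares problem; the data, step size, and iteration counts are made up for illustration:

```python
import numpy as np

# Toy least-squares problem: choose w so that e_q = y_q - w * u_q is small.
u = np.array([1.0, 2.0, 3.0])
y = 2.0 * u                       # true weight is 2

def errors(w):
    return y - w * u

def jacobian(w):
    # J[q, 0] = d e_q / d w = -u_q  (a one-column instance of Eq. (13))
    return -u.reshape(-1, 1)

# Steepest descent, Eq. (11): w <- w - alpha * g, with g = J^T e.
w_sd = 0.0
for _ in range(200):
    g = jacobian(w_sd).T @ errors(w_sd)
    w_sd -= 0.01 * g[0]

# Gauss-Newton, Eq. (12): w <- w - (J^T J)^(-1) J^T e.
w_gn = 0.0
J, e = jacobian(w_gn), errors(w_gn)
w_gn -= np.linalg.solve(J.T @ J, J.T @ e).item()  # one step is exact here

print(round(w_sd, 3), round(w_gn, 3))  # → 2.0 2.0
```

Because the residual is linear in w, the error surface is exactly quadratic and Gauss-Newton converges in a single step, while steepest descent needs many small steps; on non-quadratic surfaces the roles can reverse, which motivates the Levenberg–Marquardt compromise below.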

Table 1. Summary of the training algorithms.

Algorithm              Parameter update                                    Convergence      Computational issue
EBP                    w_{k+1} = w_k − α g_k                               stable, slow     Gradient
Gauss–Newton           w_{k+1} = w_k − (J_k^T J_k)^{−1} J_k^T E_k          unstable, fast   Jacobian
Levenberg–Marquardt    w_{k+1} = w_k − (J_k^T J_k + μI)^{−1} J_k^T E_k     stable, fast     Jacobian

where J is the Jacobian matrix, defined as:

J = ⎡ ∂e_1/∂w_1  ∂e_1/∂w_2  ...  ∂e_1/∂w_N ⎤
    ⎢ ∂e_2/∂w_1  ∂e_2/∂w_2  ...  ∂e_2/∂w_N ⎥
    ⎢     ⋮          ⋮       ⋱       ⋮     ⎥
    ⎣ ∂e_Q/∂w_1  ∂e_Q/∂w_2  ...  ∂e_Q/∂w_N ⎦   (13)
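When analytic derivatives of the errors are unavailable, each column of Eq. (13) can be filled in numerically. A forward-difference sketch; the linear residual function below is a made-up example chosen because its Jacobian is known exactly:

```python
import numpy as np

def fd_jacobian(residual, w, h=1e-6):
    """Assemble the Jacobian of Eq. (13) column by column with forward
    differences: J[q, n] = d e_q / d w_n. `residual` maps the N weights
    to the Q errors and is a generic callable here."""
    e0 = residual(w)
    J = np.zeros((e0.size, w.size))
    for n in range(w.size):
        w_pert = w.copy()
        w_pert[n] += h               # perturb one weight at a time
        J[:, n] = (residual(w_pert) - e0) / h
    return J

# Illustrative residual with a known Jacobian: e = A w - b, so J = A.
A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b = np.array([1.0, 1.0, 1.0])
J = fd_jacobian(lambda w: A @ w - b, np.array([0.5, -0.5]))
print(np.allclose(J, A, atol=1e-4))  # → True
```

The cost is one residual evaluation per weight, which foreshadows the memory and compute concerns for large networks discussed in Remark 2.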

The Levenberg–Marquardt algorithm (Levenberg, 1944; Marquardt, 1963) combines the steepest descent method and the Gauss-Newton algorithm. Consequently, it inherits the stability of the steepest descent algorithm and the speed of the Gauss-Newton algorithm. The Levenberg–Marquardt algorithm functions as the steepest descent algorithm when the curvature of the error surface is complex, until the local curvature becomes a reasonable approximation of a quadratic surface, at which point it behaves as the Gauss-Newton algorithm. Essentially, the Levenberg–Marquardt algorithm alternates between the superior aspects of the steepest descent and Gauss-Newton algorithms based on the situational requirements. Its update rule is given by

w_{k+1} = w_k − (J_k^T J_k + μI)^{−1} J_k^T E_k   (14)

where μ is the combination coefficient and I is the identity matrix. When the μ value is large, the Levenberg–Marquardt algorithm behaves as the steepest descent method; otherwise, it behaves as the Gauss-Newton method (Table 1).
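The update of Eq. (14), with μ adapted multiplicatively after each trial step, can be sketched as a generic Levenberg–Marquardt loop. The exponential test problem, the factor of 10 for the adaptation, and all tuning constants are illustrative, not values from the paper:

```python
import numpy as np

def levenberg_marquardt(residual, jac, w, mu=1e-2, beta=10.0,
                        tol=1e-10, max_iter=100):
    """Minimise 0.5 * e^T e with the update of Eq. (14):
    w <- w - (J^T J + mu I)^(-1) J^T e.
    mu shrinks by beta after a successful step (toward Gauss-Newton) and
    grows by beta when a step would increase the cost (toward steepest
    descent)."""
    e = residual(w)
    cost = 0.5 * (e @ e)
    for _ in range(max_iter):
        J = jac(w)
        step = np.linalg.solve(J.T @ J + mu * np.eye(w.size), J.T @ e)
        w_new = w - step
        e_new = residual(w_new)
        cost_new = 0.5 * (e_new @ e_new)
        if cost_new < cost:              # accept the trial step
            w, e, cost, mu = w_new, e_new, cost_new, mu / beta
        else:                            # reject and damp more heavily
            mu *= beta
        if cost < tol:
            break
    return w

# Illustrative nonlinear fit: y = exp(w * u) with true w = 0.5.
u = np.linspace(0.0, 1.0, 20)
y = np.exp(0.5 * u)
res = lambda w: np.exp(w[0] * u) - y
jac = lambda w: (u * np.exp(w[0] * u)).reshape(-1, 1)
w_fit = levenberg_marquardt(res, jac, np.array([0.0]))
print(round(w_fit[0], 4))  # → 0.5
```

This accept/reject adaptation of μ is the same mechanism used in the hybrid model training algorithm of Section 4.1.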

Fig. 3. Proposed deep hybrid model.

4. Proposed deep hybrid model

Consider the dynamical system defined using Eqs. (1) and (2), along with the following equation:

y = h(x)   (15)

where y denotes the output of the system, which is related to the states through the equation in h. In the proposed methodology, a DNN is trained using data to predict the originally unknown parameter values p, and this trained DNN, alongside a first principles model, is used as shown in Fig. 3 to build a hybrid model. A DNN is initialized by assigning values for the number of layers, the number of neurons in each layer, the type of activation function, and the initial weights and biases. A DNN has more than one hidden layer apart from the input and output layers. Many kinds of activation functions are in use today, but the most widely used are the sigmoid, hyperbolic tangent, rectified linear unit (ReLU), and leaky rectified linear unit (Leaky ReLU). If the inputs and outputs of the DNN vary over an order of magnitude, this causes issues during training; to avoid such scenarios, it is better to scale them or make them dimensionless.

4.1. Training algorithm

We use an input-output training data set {(u_1, y_1), (u_2, y_2), ..., (u_q, y_q), ..., (u_Q, y_Q)} for the purpose of training the hybrid model, which includes a DNN to approximate the unknown parameters p. The inputs u are presented to the hybrid model, specifically to the first principles model and the DNN. The inputs to the DNN are propagated through its layers, and the network's outputs are obtained at the final layer. Eqs. (5) and (6) are utilized to calculate the outputs of the DNN, which are the predicted parameter values p. These predictions are used as inputs to the first principles model to calculate the outputs of the hybrid model ŷ. For the deep hybrid model, the squared prediction error of the output over all Q training patterns is minimized:

V̂ = (1/2) Σ_{q=1}^{Q} ê_q^T ê_q   (16)

ê_q = y_q − ŷ_q   (17)

The DNN's output p_q does not explicitly appear in the above error equation, as it is generated and utilized internally in the hybrid model. In order to update the parameters of the DNN using Eqs. (13) and (14), the effect of the DNN's output p_q on the prediction error of the hybrid model ê_q needs to be quantified. For this purpose, we utilize finite difference methods to calculate the gradient of the hybrid model's output ŷ_q with respect to the DNN's output p_q. Hence, we obtain the following equations:

∂ê_q/∂ŷ_q = −1   (18)

∂ŷ_q/∂p_q = (ŷ_{q+1} − 2ŷ_q + ŷ_{q−1}) / (p_{q+1} − p_{q−1})   (19)

∂ê_q/∂p_q = (∂ê_q/∂ŷ_q)(∂ŷ_q/∂p_q) = −(ŷ_{q+1} − 2ŷ_q + ŷ_{q−1}) / (p_{q+1} − p_{q−1})   (20)

Let us define the sensitivity of the error ê_q to changes in the net input n_q^k(i) of unit i in layer k, when input vector u_q is given to the DNN, as:

δ_q^k(i) = ∂ê_q/∂n_q^k(i)   (21)
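Eqs. (18)-(20) reduce to a single finite-difference expression over neighbouring training patterns. A sketch with hypothetical values for the DNN outputs p_q and the hybrid model outputs ŷ_q (the arrays below are made up for illustration):

```python
import numpy as np

def error_sensitivity(y_hat, p, q):
    """Eqs. (18)-(20): d e_q / d p_q =
    -(y_hat[q+1] - 2*y_hat[q] + y_hat[q-1]) / (p[q+1] - p[q-1]),
    a finite-difference estimate built from the neighbouring training
    patterns q-1, q, and q+1."""
    num = y_hat[q + 1] - 2.0 * y_hat[q] + y_hat[q - 1]
    den = p[q + 1] - p[q - 1]
    return -num / den

p = np.array([0.10, 0.12, 0.15, 0.19])      # hypothetical DNN outputs p_q
y_hat = np.array([1.00, 1.05, 1.12, 1.22])  # hypothetical hybrid outputs
s = error_sensitivity(y_hat, p, q=1)
print(round(s, 6))  # → -0.4
```

This scalar is exactly the quantity propagated backward through the network in Eqs. (21)-(27) to fill the Jacobian of Eq. (13).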


Now, Eq. (21) can be rewritten using Eq. (4) as:

δ_q^k(i) = ∂ê_q/∂n_q^k(i) = (∂ê_q/∂a_q^k(i)) (∂a_q^k(i)/∂n_q^k(i)) = (∂ê_q/∂a_q^k(i)) ḟ^k(n_q^k(i))   (22)

where a_q^k(i) is the output of unit i in layer k when input vector u_q is given to the DNN. For the last layer M, the above equation leads to:

δ_q^M = (∂ê_q/∂A_q^M) Ḟ^M(n_q^M)   (23)

But the output A_q^M from the final layer M is the predicted parameter p_q, which results in

∂ê_q/∂p_q = ∂ê_q/∂A_q^M   (24)

Using Eqs. (20), (23), and (24), the δ_q^M value can be calculated. Recall that the Jacobian matrix contains the sensitivities of the error ê_q to changes in the parameters of the DNN, i.e., W^k and B^k. In mathematical form, these sensitivities can be represented as ∂ê_q/∂W^k and ∂ê_q/∂B^k, and they can be calculated for the neurons in the final layer M using δ_q^M and Eq. (3) in the following way:

∂ê_q/∂W^M = δ_q^M A_q^{M−1}   (25)

∂ê_q/∂B^M = δ_q^M   (26)

For the other layers, k = 1, ..., M−1, the δ_q^k value can be calculated using the following recurrence relation:

δ_q^k = Ḟ^k(n_q^k) (W^{k+1})^T δ_q^{k+1}   (27)

Using the δ_q^k values, the Jacobian matrix can be calculated in a manner similar to Eqs. (25) and (26). Following the calculation of the Jacobian matrix, the parameters of the DNN can be updated using Eq. (14). The Levenberg–Marquardt training algorithm starts with an initial value for μ, which is multiplied by a factor β whenever an update would increase V̂; on the other hand, when an update reduces V̂, μ is divided by β. The training algorithm is repeated until the value of V̂ reaches a predefined tolerance. The proposed algorithm is summarized in Algorithm 1, and the flow diagram of the proposed hybrid modeling framework is presented in Fig. 6.

Algorithm 1. Deep hybrid model training.
1: Present all the inputs u_q to the hybrid model and calculate the corresponding outputs ŷ_q. Compute the errors ê_q using Eq. (17) and the sum of squared errors over all inputs V̂ using Eq. (16).
2: Compute the Jacobian matrix using Eqs. (13) and (20)-(27).
3: Update the parameters of the DNN using Eq. (14) and recalculate V̂ using Eq. (16). If it is smaller than that computed in Step 1, reduce μ by β and proceed to Step 1 with the new parameters of the DNN. If V̂ is greater than that computed in Step 1, increase μ by β and repeat this step.
4: The algorithm is terminated when V̂ is less than a predetermined value.

Remark 1. To reiterate, the important novelty of this work is the algorithm which utilizes data for DNN-based hybrid model training. Our proposed algorithm is based on the Levenberg–Marquardt optimization method and includes a customized step for the Jacobian calculation. Apart from the Levenberg–Marquardt algorithm, the steepest descent and Gauss-Newton methods are popular techniques for neural network training, but on their own they cannot be directly applied to training the proposed DNN-based hybrid model.

Remark 2. The Levenberg–Marquardt algorithm has a faster convergence rate than the steepest descent method, and it is stable compared to the Gauss-Newton method. These properties make it superior to the other two methods. However, the Levenberg–Marquardt algorithm has the issue of the calculation of the Jacobian matrix. If the size of the neural network or of the training data matrix is large, then the calculation of the Jacobian matrix, and consequently the update of the parameters of the DNN, would be computationally expensive, or the algorithm could face memory issues. Therefore, our Levenberg–Marquardt based hybrid model training algorithm is more suitable for systems that require small DNNs to capture their unknown parameters and small training data sets. Otherwise, the steepest descent or the Gauss-Newton method would be more suitable for hybrid model training.

5. Hydraulic fracturing process

Shale gas is natural gas trapped in rocks with low porosity and permeability, which is difficult to extract using conventional oil/gas extraction techniques. Horizontal drilling and the hydraulic fracturing process have enabled the extraction of oil/gas from such rocks. In a hydraulic fracturing process, controlled explosions are carried out inside the wellbore to create initial fractures, followed by the pumping of a clean fluid called pad to extend the geometry of these fractures. Later, a fracturing fluid containing water, proppant, and additives is pumped into the system to further extend these fractures. During and after the pumping process, the fracturing fluid leaks off into the reservoir, leaving behind proppant in the fractures. The natural stresses present in the reservoir cause the closure of these fractures with proppant inside them. The proppant creates a conductive medium for the oil/gas to flow through these fractures, making their extraction possible. The phenomenon of interest in this work is the leak-off of the fracturing fluid into the reservoir during the hydraulic fracturing process, as its knowledge is essential to obtain the optimal fracture geometry. Obtaining the fracture geometry requires huge amounts of fracturing fluid, and hence fluid leak-off indirectly affects the economic efficiency of the hydraulic fracturing process. In our past work on modeling and control of the hydraulic fracturing process, we utilized a mathematical expression of a time-dependent fracturing fluid leak-off rate. In practice, advances in pressure analysis have made it possible to estimate the leak-off rate from the pressure decline following the injection of fracturing fluid. But this method requires knowledge of the gross fracture height, and hence it is suitable for formations with a large net permeable height.
As previously described, in this work we utilize a DNN to estimate the fluid leak-off rate from input-output data of the hydraulic fracturing process. To achieve this objective, we also utilize a first principles based model of the hydraulic fracturing process, which comprises three subprocesses: (a) fracture propagation, (b) proppant transport, and (c) proppant bank formation. One can refer to Yang et al. (2017), Siddhamshetty et al. (2017), and Siddhamshetty et al. (2018) to understand the model used in our application. These subprocesses are briefly discussed in the following subsections.


Fig. 4. The PKN fracture model.

5.1. Fracture propagation

We assume the fracture propagation follows the Perkins, Kern, and Nordgren (PKN) model (Perkins and Kern, 1961; Nordgren, 1972), which is shown in Fig. 4. The assumptions are as follows: (a) the fracture length is assumed to be greater than the fracture width, and hence the fluid pressure in the vertical direction remains constant; (b) the fracture is confined to a single layer because of the large stresses in the rock layers above and below the fracture; (c) the fracturing fluid is assumed to be incompressible, and rock properties such as Young's modulus and Poisson's ratio are assumed to be constant. Under these assumptions, the fracture geometry and the fracture cross-sectional area take an elliptical and rectangular shape, respectively. Lubrication theory is utilized to explain the fluid momentum inside the fracture, relating the fluid flow rate q_z in the horizontal direction to the pressure gradient −(∂P/∂z) ẑ as follows:

q_z = −(πHW³ / (64μ)) ∂P/∂z   (28)

where H is the fracture height, W is the fracture width, and μ is the fracturing fluid viscosity. The maximum width of the fracture can be calculated from the net pressure exerted by the fracturing fluid as follows:

W = 2PH(1 − ν²) / E   (29)

where ν is the Poisson's ratio and E is the Young's modulus of the rock formation. The local mass conservation of an incompressible fluid leads to the continuity equation:

∂A/∂t + ∂q_z/∂z + HU = 0   (30)

where A = πWH/4 is the cross-sectional area of the fracture, U is the fluid leak-off rate per unit height, t is the time elapsed since the beginning of pumping, and z is the spatial coordinate in the horizontal direction. Plugging Eqs. (28) and (29) into Eq. (30) results in the following partial differential equation for W:

(πH/4) ∂W/∂t − (πE / (128μ(1 − ν²))) (3W² (∂W/∂z)² + W³ ∂²W/∂z²) + HU = 0   (31)

The two boundary conditions and an initial condition for the process are shown below (Gu and Hoo, 2015):

q_z(0, t) = Q_0,   W(L(t), t) = 0   (32)

W(z, 0) = 0   (33)

where Q_0 is the fracturing fluid rate at the wellbore, and L(t) is the length of the fracture, which varies with time.

5.2. Proppant transport

In this model, it is assumed that the proppant moves at a rate equal to the superficial velocity of the fracturing fluid in the horizontal direction, and with the settling velocity relative to the fracturing fluid in the vertical direction. The other assumptions are: (1) the proppant particles are large enough that their diffusive flux can be neglected and only their convective flux is considered; (2) proppant particle-particle interactions are neglected, and only drag and gravity effects are considered; and (3) the proppant particles are of uniform size. Under these assumptions, the advection of proppant can be defined in the following manner:

∂(WC)/∂t + ∂(WCV_p)/∂z = 0   (34)

C(0, t) = C_0(t)   and   C(z, 0) = 0   (35)

where C(z, t) is the proppant concentration at distance z in the horizontal direction from the wellbore at time t, and C0 (t) is the proppant concentration at the wellbore varied with time. Vp is the net velocity of proppant and is calculated using the following equation (Adachi et al., 2007):

Vp = V − (1 − C )Vs

(36)

where V is the superﬁcial ﬂuid velocity, and Vs is the gravitational settling velocity which is calculated using the following equation (Daneshy, 1978):

V_s = \frac{(1 - C)^2 (\rho_{sd} - \rho_f) g d^2}{10^{1.82 C} \cdot 18 \mu}    (37)

where ρ_f is the pure fluid density, ρ_sd is the proppant particle density, g is the gravitational constant, d is the proppant particle diameter, and μ is the fracture fluid viscosity, which is dependent on the proppant concentration C (Barree and Conway, 1995):

\mu(C) = \mu_0 \left( 1 - \frac{C}{C_{max}} \right)^{-\alpha}    (38)

where μ_0 is the pure fluid viscosity, α is an exponent in the range of 1.2 to 1.8, and C_max is the maximum theoretical concentration calculated using the equation C_{max} = (1 - \phi) \rho_{sd}, where φ is the proppant bank porosity.

5.3. Proppant bank formation

When proppant settles down in a fracture, a proppant bank is formed whose height δ can be calculated using the following equations (Gu and Hoo, 2014; Novotny, 1977):

\frac{d(\delta W)}{dt} = \frac{C V_s W}{1 - \phi}    (39)

\delta(z, 0) = 0    (40)

where Eq. (40) is the initial condition for Eq. (39).
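The algebraic relations in Eqs. (36)–(39) are straightforward to evaluate; the sketch below collects them into helper functions. All numerical values (proppant and fluid densities, particle diameter, bank porosity, α) are illustrative assumptions, and C/C_max is treated as a dimensionless fraction purely for demonstration:

```python
def viscosity(C, mu0=0.56, C_max=1.0, alpha=1.5):
    """Concentration-dependent fracturing fluid viscosity, Eq. (38)."""
    return mu0 * (1.0 - C / C_max) ** (-alpha)

def settling_velocity(C, rho_sd=2650.0, rho_f=1000.0, d=6e-4, g=9.81):
    """Gravitational settling velocity of proppant, Eq. (37)."""
    return ((1.0 - C) ** 2 * (rho_sd - rho_f) * g * d ** 2
            / (10.0 ** (1.82 * C) * 18.0 * viscosity(C)))

def proppant_velocity(V, C):
    """Net proppant velocity, Eq. (36)."""
    return V - (1.0 - C) * settling_velocity(C)

def bank_height_step(delta, C, dt, phi=0.35):
    """Explicit Euler step of Eq. (39); with W held fixed over dt, W cancels."""
    return delta + dt * C * settling_velocity(C) / (1.0 - phi)
```

Note how hindered settling enters twice in Eq. (37): through the (1 − C)² factor and through the 10^{1.82C} correction, so the settling velocity drops quickly as the slurry loads up with proppant.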

M.S.F. Bangi and J.S.-I. Kwon / Computers and Chemical Engineering 134 (2020) 106696


Fig. 5. Schematic of deep hybrid model for hydraulic fracturing process.

Remark 3. Considering the complexity of the hydraulic fracturing process, arising from its moving-boundary nature and a first principles model that involves parabolic partial differential equations along with other system equations, one can opt for simpler reduced-order models to explain its dynamics. In our past work, we have developed multiple reduced-order modeling techniques and applied them to the hydraulic fracturing process (Narasingam et al., 2017; Narasingam and Kwon, 2017; Narasingam et al., 2018; Sidhu et al., 2018; Narasingam and Kwon, 2018; Bangi et al., 2019).

6. Deep hybrid model for hydraulic fracturing process

Recall that the deep hybrid model structure shown in Fig. 3 is utilized to predict the unknown parameters of the first principles model. We utilized the first principles model of the hydraulic fracturing process discussed above along with a DNN to build a hybrid model. The objective of the DNN, once it is trained, is to predict the leak off rate U of the fracturing fluid into the reservoir using time t as input. The input to the hybrid model is the fracturing fluid injection rate Q_0(t), and the output of the hybrid model is the width of the fracture W_z at points z = 1, 2, 3, ..., 251 along the length of the fracture in the horizontal direction. The schematic of the deep hybrid model for the hydraulic fracturing process is shown in Fig. 5. Unlike the schematic of the hybrid model shown in Fig. 3, wherein both the state and the input applied to the system are used as inputs to the DNN, we only utilize the time t as input to the DNN, as seen in Fig. 5. For the DNN, we considered a fully connected network consisting of 5 layers, i.e., 3 hidden layers, 1 input layer, and 1 output layer. Each hidden layer contains 20 neurons, and the input and output layers contain 1 neuron each. The hyperbolic tangent was used as the activation function in the hidden layers, and the linear function in the output layer.

To generate training data for the hybrid model, we simulated the first principles model using a constant input flow rate of Q_0(t) = 0.03 m³/s and the leak off model shown in Eq. (41) (Howard and Fast, 1957; Economides and Nolte, 2000), with the assumption that the leak off rate U is independent of the spatial location z:

U = \frac{2 C_{leak}}{\sqrt{t - \tau(z)}}    (41)

The other parameters of the first principles model used in our process calculations are as follows (Gu and Hoo, 2014): H = 20 m, μ = 0.56 Pa·s, E = 5 × 10³ MPa, and ν = 0.2. The output from the simulation was the widths W_z at points z = 1, 2, 3, ..., 251 along the length of the fracture. These 251 points are spaced equidistantly, 0.2 m apart, and we collected 2841 time snapshots of the output data with a time difference of 0.1 s between two consecutive snapshots. Since we used the model in Eq. (41) for the leak off rate when generating the simulation data, assuming τ(z) to be constant, we expect the DNN in the hybrid model to capture a similar variation of the leak off rate U.
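Under the stated assumption that U varies only with time (τ taken constant), the Carter-type leak-off model of Eq. (41) reduces to a simple function of t, and the stated DNN architecture (1 input, three hidden layers of 20 tanh neurons, 1 linear output) is a small feedforward map. The sketch below shows both; the C_leak and τ values and the random placeholder weights are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def leak_off_rate(t, C_leak=6.3e-5, tau=0.0):
    """Ground-truth leak-off model of Eq. (41); valid for t > tau."""
    return 2.0 * C_leak / np.sqrt(t - tau)

# DNN with the stated architecture: 1 input neuron, three hidden layers of
# 20 tanh neurons each, and 1 linear output neuron (untrained placeholders).
sizes = [1, 20, 20, 20, 1]
weights = [rng.standard_normal((m, n)) * 0.1 for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def dnn_forward(t):
    """Map time t to a leak-off rate prediction U(t)."""
    a = np.atleast_2d(t).T.astype(float)
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.tanh(a @ W + b)                      # tanh hidden layers
    return (a @ weights[-1] + biases[-1]).ravel()   # linear output layer

U_pred = dnn_forward(np.arange(1.0, 10.0))  # untrained predictions of U(t)
```

Training then amounts to adjusting `weights` and `biases` so that `dnn_forward` reproduces the leak-off behavior implied by the width data.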

Now that the hybrid model structure and its components have been defined, and the training data is available, we initialized the parameters of the DNN (i.e., weights W and biases B) and began the Levenberg–Marquardt based training algorithm discussed above in Algorithm 1. The inputs Q_0(t) were presented to the hybrid model, with the DNN utilizing t as its input and the first principles model utilizing Q_0(t) as its input. The input t to the DNN undergoes transformation through its layers and gives the output U(t), which is presented to the first principles model as an input. The first principles model utilizes Q_0(t) along with U(t) to calculate the outputs W_z(t). Since the parameters of the DNN were randomly initialized, the initial prediction of U(t) by the DNN is likely to be inaccurate, which in turn affects the error e(t) calculated using Eq. (17). As the algorithm progresses, the parameters of the DNN are updated, and hence its predictions of U(t) move closer to the actual value. A point to note here is that although the hybrid model predicts widths at 251 locations, we considered only the width at the wellbore, i.e., W_1(t), in the objective function calculation of Eq. (16), and hence in the error e(t) calculation of Eq. (14) as well. This is because we assumed earlier that U does not vary with spatial location, and hence the U(t) approximated by the DNN is applicable at all locations; this implies that when the W_1(t) values predicted by the hybrid model move towards convergence as the parameters of the DNN are updated, the widths at the other locations, i.e., W_z where z = 2, 3, ..., 251, also move towards convergence. In order to reduce the error and reach the tolerance of 10^{-6} for V̂, we needed to update the parameters of the DNN, i.e., its weights W and biases B, using Eq. (14). To do so, we calculate the Jacobian matrix as explained previously, which includes initializing the value of μ.
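A single Levenberg–Marquardt update of the parameters, with the Jacobian obtained by finite differences as described above, can be sketched as follows. The residual function, step size, and toy fitting problem are generic stand-ins for the hybrid model, not the authors' implementation:

```python
import numpy as np

def lm_step(theta, residual_fn, mu=1e-2, eps=1e-6):
    """One Levenberg-Marquardt update: theta - (J^T J + mu I)^{-1} J^T e."""
    e = residual_fn(theta)
    J = np.empty((e.size, theta.size))
    for j in range(theta.size):          # finite-difference Jacobian, column by column
        tp = theta.copy()
        tp[j] += eps
        J[:, j] = (residual_fn(tp) - e) / eps
    A = J.T @ J + mu * np.eye(theta.size)
    return theta - np.linalg.solve(A, J.T @ e)

# Toy residual: fit y = a*x + b; one LM step should reduce the squared error.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0
res = lambda th: th[0] * x + th[1] - y
theta0 = np.array([0.0, 0.0])
theta1 = lm_step(theta0, res)
```

As in Algorithm 1, the damping factor μ would then be decreased by the factor β when the objective improves and increased otherwise, interpolating between Gauss–Newton and gradient-descent behavior.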
We used an initial value of 10^{-2} for μ and 2 for β, and updated the weights. Using the updated weights, we recalculated V̂ and continued with the training process as explained in Algorithm 1. Once the tolerance for V̂ was reached, we stopped the training process. We compared the outputs of the hybrid model with the actual outputs from the training data at the wellbore, and this comparison is presented in Fig. 7. It can be seen from Fig. 7 that the outputs obtained using the proposed deep hybrid model closely mimic the outputs from the training data. This indicates that the DNN has been well trained and that the hybrid model accurately predicts the outputs W_z. We used a relative error metric, RE(t), to quantify the performance of the hybrid model in comparison to the training data. The relative error is calculated using the Frobenius norms of the state vectors as follows:

RE(t) = \frac{\| W_{training}(t) - W_{hybrid}(t) \|_{fro}}{\| W_{training}(t) \|_{fro}}    (42)

where ‖·‖_fro denotes the Frobenius norm, and W_training(t) and W_hybrid(t) are the width vectors obtained from the training data and the hybrid model, respectively, at time t. The relative error in the widths predicted by the hybrid model in comparison with the training data at different times is presented in Fig. 8. In order to assess the performance of the DNN in approximating the underlying phenomenon of leak-off, we used the trained DNN to compute the leak off rate U by presenting time inputs t. The predicted U values are compared against the actual leak off rate values obtained using Eq. (41), and the comparison is shown in Fig. 9. From Fig. 9, we note that the DNN has been well trained and was able to accurately approximate the underlying leak off rate in the training data. A point to observe here is that there is a slight inaccuracy in the approximation of the leak off rate U by the DNN in the initial stages of pumping, whose effect can also be seen in the larger relative errors shown in Fig. 8.

Fig. 6. Block diagram for Levenberg–Marquardt based deep hybrid model training.

Fig. 7. Comparison of wellbore widths obtained from the hybrid model, and the training data.

Remark 4. In general, the performance of the DNN is affected by hyper-parameters such as the number of layers and nodes in its structure. In our work, we used a trial-and-error method to arrive at the final structure for the DNN in the hybrid model. The use of sophisticated hyper-parameter tuning methods is advised, as it will speed up the selection of the optimal hyper-parameters for the DNN.

6.1. Comparison of deep hybrid model and black box model

The deep hybrid model utilizes a DNN to approximate the unobserved process parameters. As discussed previously, the structure of the hybrid model is similar to the first principles model except for the DNN, which does not alter the nature and characteristics of the known parameters in the first principles model. The

presence of the first principles model in the hybrid model endows it with better extrapolation properties. On the other hand, the parameters of a black box model, which are purely data-driven, do not carry any physical meaning, and such models have narrow applicability. To quantitatively demonstrate this point, we built a black box model using a DNN. As the structure of the DNN greatly affects not only its training performance but also the extrapolation properties of the black box model, it is necessary to select an optimal structure in terms of the number of layers and nodes. For this purpose, we selected 9 DNNs with two to four hidden layers and 5, 10, or 20 neurons per hidden layer. Each DNN was trained using the same training data as in the case of the hybrid model. Next, the performance of the DNNs was assessed using three different test inputs, i.e., Q_0 = 0.02, 0.04, and 0.05 m³/s, which were never seen by the DNNs. The performances were quantified using the relative error as defined in Eq. (42); the average relative error (ARE) was calculated for each scenario, and the values are tabulated in Table 2.



Fig. 8. Relative error of the hybrid model predictions in comparison to the training data.

Fig. 9. Comparison of leak off rates predicted from the DNN and actual values calculated using Eq. (41).

Based on the data presented in Table 2, the DNN with 4 layers and 20 neurons in each hidden layer was selected as the black box model. A point to be noted here is that the network structure with 6 layers and 20 neurons in each hidden layer could also have been selected, but this structure utilizes more resources in terms of the number of hidden layers and neurons while only slightly improving the black box model performance. Once the structure with 4 layers and 20 neurons in each hidden layer was finalized, we compared the training result of the black box model with the actual output from the training data. The comparison of the widths at the wellbore is shown in Fig. 10. It can be seen from Fig. 10 that the output obtained using the black box model closely mimics the output from the training data. This indicates that the DNN has been well trained and the black box model accurately predicts the width when the training input Q_0 =

0.03 m³/s is presented to it. In contrast to the hybrid model, the parameters of the black box model (i.e., the DNN parameters) do not carry any physical meaning and hence cannot give any insights about the process. To compare the extrapolation properties of the deep hybrid model and the black box model, we presented to them another input of Q_0 = 0.04 m³/s, which is a small deviation from the training input. Fig. 11 shows the comparison of the widths W_z from the hybrid model, the black box model, and the true data at the wellbore. Although the DNN-based black box model captures the trend in the outputs, it clearly has poor accuracy, whereas the hybrid model accurately predicts the output when presented with the test input. An important point to note here is that although the difference between the training input (Q_0 = 0.03 m³/s) and the test input (Q_0 = 0.04 m³/s) is small, the performance of the black box model varies



Fig. 10. Comparison of wellbore widths obtained from the DNN-based black box model, and the training data.

Fig. 11. Comparison of wellbore widths obtained from the hybrid model, black box model and the test data.

Table 2
Average relative errors for different network structures.

Network structure       Training ARE    Test ARE      Test ARE      Test ARE
                        Q0 = 0.03       Q0 = 0.02     Q0 = 0.04     Q0 = 0.05
4 layer, 5 neurons      0.0108          0.1091        0.0716        0.1229
4 layer, 10 neurons     0.0037          0.1097        0.0711        0.1225
4 layer, 20 neurons     0.0019          0.1097        0.0709        0.1224
5 layer, 5 neurons      0.0036          0.1100        0.0706        0.1222
5 layer, 10 neurons     0.0033          0.1095        0.0710        0.1225
5 layer, 20 neurons     0.0047          0.1093        0.0717        0.1232
6 layer, 5 neurons      0.0048          0.1093        0.0712        0.1227
6 layer, 10 neurons     0.0042          0.1095        0.0710        0.1226
6 layer, 20 neurons     0.0015          0.1098        0.0708        0.1223

considerably between these two cases, as shown in Fig. 11. As explained previously, this inaccuracy stems from the fact that the black box model approximates the relationship between the training inputs and the training outputs by utilizing parameters which do not carry any physical significance. Hence, when a test input different from the training input is presented to it, the black box model fails to replicate the accuracy seen in the case of the training data. On the other hand, since the hybrid model retains the structure of the first principles model and its parameters carry physical meaning, it is able to accurately predict the output in the case of the test input.
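The ARE values reported in Table 2 can be reproduced from Eq. (42) by averaging the relative error over the collected time snapshots; the snapshot-mean aggregation and the toy data below are our assumptions for illustration:

```python
import numpy as np

def relative_error(W_training, W_hybrid):
    """RE(t) of Eq. (42): Frobenius-norm error relative to the training data."""
    return np.linalg.norm(W_training - W_hybrid) / np.linalg.norm(W_training)

def average_relative_error(training_snapshots, model_snapshots):
    """ARE: mean of RE(t) over all time snapshots."""
    return float(np.mean([relative_error(wt, wm)
                          for wt, wm in zip(training_snapshots, model_snapshots)]))

# Toy example: model widths off by a uniform 5% at every snapshot
snaps_true = [np.full(251, 4e-3) for _ in range(10)]
snaps_pred = [w * 1.05 for w in snaps_true]
are = average_relative_error(snaps_true, snaps_pred)
```

A uniform 5% offset at every snapshot yields an ARE of 0.05, which gives a sense of scale for the test-input AREs of roughly 0.07–0.12 in Table 2.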



Fig. 12. Comparison of wellbore widths obtained from the hybrid model, ﬁrst principles model and the actual data.

6.2. Comparison of deep hybrid model and first principles model

The main objective of the deep hybrid model is to develop a DNN that enhances the performance of the first principles model by accurately predicting its unknown parameters. In this work, we developed a DNN to extract the parameter associated with the leak-off rate from the training data. In order to compare the performance of the hybrid model with the first principles model alone, we assumed a constant value for the leak off rate in the first principles model and made no changes to the deep hybrid model developed above. Three different cases of the first principles model were considered, with three different values (U = 1.2 × 10^{-5}, 1.2 × 10^{-6}, and 6.3 × 10^{-6}) for the leak off rate. The comparison between the outputs at the wellbore is shown in Fig. 12. From Fig. 12, it can be seen that, as expected, the hybrid model predicts accurately when compared to the first principles model. This superior performance is attributed to the accurate prediction of the parameter associated with the leak-off rate by the DNN in the hybrid model. On the other hand, the first principles model with U = 1.2 × 10^{-5} performed on par with the hybrid model, but the process described by it takes a longer time to reach the desired fracture length, and the first principles models with U = 1.2 × 10^{-6} and 6.3 × 10^{-6} predicted poorly both in terms of accuracy and the time to reach the desired fracture length. The above comparison demonstrates the superiority of the hybrid model over the first principles model in terms of accuracy, as the hybrid model contains a DNN to accurately predict the leak off rate U values. A point to be noted here is that all of the above results utilized simulation data. In our future work, we plan to apply our proposed methodology to field/experimental data.

7. Conclusions

In this work, we proposed a deep neural network-first principles-based hybrid modeling strategy wherein the purpose of the deep neural network is to predict the unknown process parameters in the hydraulic fracturing process. We utilized simulation data for training the hybrid model, specifically its deep

neural network. Additionally, we utilized the Levenberg–Marquardt algorithm and finite difference-based sensitivity analysis to update the parameters of the deep neural network during the training process. Once trained, the hybrid model demonstrated superior extrapolation properties compared to a purely data-driven deep neural network model. We also showed its superior accuracy compared to the first principles model, as it utilizes a trained deep neural network to accurately predict the initially unknown process parameters.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

CRediT authorship contribution statement

Mohammed Saad Faizan Bangi: Conceptualization, Methodology, Validation, Formal analysis, Investigation, Writing - original draft, Writing - review & editing, Visualization. Joseph Sang-Il Kwon: Conceptualization, Formal analysis, Resources, Writing - review & editing, Supervision, Project administration, Funding acquisition.

Acknowledgment

The authors gratefully acknowledge financial support from the National Science Foundation (CBET-1804407), the Department of Energy (DE-EE0007888-10-8), the Texas A&M Energy Institute, and the Artie McFerrin Department of Chemical Engineering.

Supplementary material

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.compchemeng.2019.106696.



References

Adachi, J., Siebrits, E., Pierce, A., Desroches, J., 2007. Computer simulation of hydraulic fractures. Int. J. Rock Mech. Min. Sci. 44, 739–757. Aguiar, H.C., Filho, R.M., 2001. Neural network and hybrid model: a discussion about different modeling techniques to predict pulping degree with industrial data. Chem. Eng. Sci. 56 (2), 565–570. Arahal, M.R., Cirre, C.M., Berenguel, M., 2008. Serial grey-box model of a stratified thermal tank for hierarchical control of a solar plant. Sol. Energy 82 (5), 441–451. Bangi, M.S.F., Narasingam, A., Siddhamshetty, P., Kwon, J.S., 2019. Enlarging the domain of attraction of the local dynamic mode decomposition with control technique: application to hydraulic fracturing. Ind. Eng. Chem. Res. 58 (14), 5588–5601. Barree, R., Conway, M., 1995. Experimental and numerical modeling of convective proppant transport. J. Pet. Technol. 47, 216–222. Bohlin, T., Graebe, S.F., 1995. Issues in nonlinear stochastic grey box identification. Int. J. Adapt. Control Signal Process. 9 (6), 465–490. Cameron, I.T., Hangos, K., 2001. Process Modelling and Model Analysis. Academic Press. Carinhas, N., Bernal, V., Teixeira, A.P., Carrondo, M.J.T., Alves, P.M., Oliveira, R., 2011. Hybrid metabolic flux analysis: combining stoichiometric and statistical constraints to model the formation of complex recombinant products. BMC Syst. Biol. 5 (34), 34–46. Cubillos, F.A., Acuña, G., 2007. Adaptive control using a grey box neural model: an experimental application. In: Proceedings of the Advances in Neural Networks ISNN 2007, pp. 311–318. Daneshy, A., 1978. Numerical solution of sand transport in hydraulic fracturing. J. Pet. Technol. 30, 132–140. Delalleau, O., Bengio, Y., 2011. Shallow vs. deep sum-product networks. In: Proceedings of the Advances in Neural Information Processing Systems, 24, pp. 666–674. Dors, M., Simutis, R., Lübbert, A., 1995. Advanced supervision of mammalian cell cultures using hybrid process models.
Computer Applications in Biotechnology, 72–77. Economides, M.J., Nolte, K.G., 2000. Reservoir Stimulation. Wiley, Chichester. Eldan, R., Shamir, O., 2016. The power of depth for feedforward neural networks. In: Proceedings of the Twenty-Ninth Annual Conference on Learning Theory, 49, pp. 907–940. Eslamloueyan, R., Setoodeh, P., 2011. Optimization of fed-batch recombinant yeast fermentation for ethanol production using a reduced dynamic flux balance model based on artificial neural networks. Chem. Eng. Commun. 198 (11), 1309–1338. Fiedler, B., Schuppert, A., 2008. Local identification of scalar hybrid models with tree structure. IMA J. Appl. Math. 73 (3), 449–476. Fu, P.-C., Barford, J., 1995. Integration of mathematical modelling and knowledge-based systems for simulations of biochemical processes. Expert Syst. Appl. 9 (3), 295–307. Fu, P.-C., Barford, J., 1996. A hybrid neural network-first principles approach for modelling of cell metabolism. Comput. Chem. Eng. 20 (6), 951–958. Georgieva, P., de Azevedo, S., 2009. Computational Intelligence Techniques for Bioprocess Modelling, Supervision and Control, 218. Springer. Ghosh, D., Hermonat, E., Mhaskar, P., Snowling, S., Goel, R., 2019. Hybrid modeling approach integrating first-principles models with subspace identification. Ind. Eng. Chem. Res. 58 (30), 13533–13543. Gnoth, S., Jenzsch, M., Simutis, R., Lübbert, A., 2008. Product formation kinetics in genetically modified E. coli bacteria: inclusion body formation. Bioprocess. Biosyst. Eng. 31 (1), 41–46. Gu, Q., Hoo, K.A., 2014. Evaluating the performance of a fracturing treatment design. Ind. Eng. Chem. Res. 53 (25), 10491–10503. Gu, Q., Hoo, K.A., 2015. Model-based closed-loop control of the hydraulic fracturing process. Ind. Eng. Chem. Res. 54 (5), 1585–1594. Gupta, S., Liu, P.-H., Svoronos, S.A., Sharma, R., Abdel-Khalek, N.A., Cheng, Y., El-Shall, H., 1999. Hybrid first-principles/neural networks model for column flotation. AIChE J. 45 (3), 557–566.
Howard, G.C., Fast, C.R., 1957. Optimum fluid characteristics for fracture extension. Drill. Prod. Pract. 24, 261–270. Hu, G., Mao, Z., He, D., Yang, F., 2011. Hybrid modeling for the prediction of leaching rate in leaching process based on negative correlation learning bagging ensemble algorithm. Comput. Chem. Eng. 35 (12), 2611–2617. Jia, R., Mao, Z., Chang, Y., Zhao, L., 2011. Soft-sensor for copper extraction process in cobalt hydrometallurgy based on adaptive hybrid model. Chem. Eng. Res. Des. 89 (6), 722–728. Johansen, T.A., Foss, B.A., 1992. Representing and learning unmodeled dynamics with neural network memories. In: Proceedings of the American Control Conference, pp. 3037–3043. Jorgensen, S.B., Hangos, K.M., 1995. Grey box modelling for control: qualitative models as a unifying framework. Int. J. Adapt. Control Signal Process. 9 (6), 547–562. Kramer, M.A., Thompson, M.L., Bhagat, P.M., 1992. Embedding theoretical models in neural networks. In: Proceedings of the American Control Conference, pp. 475–479. Kumar Akkisetty, P., Lee, U., Reklaitis, G.V., Venkatasubramanian, V., 2010. Population balance model-based hybrid neural network for a pharmaceutical milling process. J. Pharm. Innov. 5 (4), 161–168. Lauret, P., Boyer, H., Gatina, J., 2000. Hybrid modelling of a sugar boiling process. Control Eng. Pract. 8 (3), 299–310.

Levenberg, K., 1944. A method for the solution of certain non-linear problems in least squares. Q. Appl. Math. 2 (2), 164–168. Liang, S., Srikant, R., 2017. Why deep neural networks for function approximation? In: Proceedings of the Fifth International Conference on Learning Representations. Mahalec, V., Sanchez, Y., 2012. Inferential monitoring and optimization of crude separation units via hybrid models. Comput. Chem. Eng. 45, 15–26. Marquardt, D., 1963. An algorithm for least-squares estimation of nonlinear parameters. SIAM J. Appl. Math. 11 (2), 431–441. Molga, E., Cherbański, R., 1999. Hybrid first-principle-neural-network approach to modelling of the liquid-liquid reacting system. Chem. Eng. Sci. 54 (13), 2467–2473. Montufar, G.F., Pascanu, R., Cho, K., Bengio, Y., 2014. On the number of linear regions of deep neural networks. In: Proceedings of the Advances in Neural Information Processing Systems, 27, pp. 2924–2932. Narasingam, A., Kwon, J.S., 2017. Development of local dynamic mode decomposition with control: application to model predictive control of hydraulic fracturing. Comput. Chem. Eng. 106, 501–511. Narasingam, A., Kwon, J.S.-I., 2018. Data-driven identification of interpretable reduced-order models using sparse regression. Comput. Chem. Eng. 119 (2), 101–111. doi:10.1016/j.compchemeng.2018.08.010. Narasingam, A., Siddhamshetty, P., Kwon, J.S., 2017. Temporal clustering for order reduction of nonlinear parabolic PDE systems with time-dependent spatial domains: application to a hydraulic fracturing process. AIChE J. 63 (9), 3818–3831. Narasingam, A., Siddhamshetty, P., Kwon, J.S., 2018. Handling spatial heterogeneity in reservoir parameters using proper orthogonal decomposition based ensemble Kalman filter for model-based feedback control of hydraulic fracturing. Ind. Eng. Chem. Res. 57 (11), 3977–3989. Nascimento, C.A.O., Giudici, R., Scherbakoff, N., 1999. Modeling of industrial nylon-6,6 polymerization process in a twin-screw extruder reactor. II.
Neural networks and hybrid models. J. Appl. Polym. Sci. 72 (7), 905–912. Nordgren, R., 1972. Propagation of a vertical hydraulic fracture. Soc. Petrol. Eng. J. 12, 306–314. Novotny, E.J., 1977. Proppant transport. In: Proceedings of the Fifty-Second SPE Annual Technical Conference and Exhibition. Denver, CO. (SPE 6813). Osborne, M.R., 1992. Fisher’s method of scoring. Int. Stat. Rev. 86, 271–286. Perkins, T.K., Kern, L.R., 1961. Widths of hydraulic fractures. J. Pet. Technol. 13, 937–949. Preusting, H., Noordover, J., Simutis, R., Lübbert, A., 1996. The use of hybrid modelling for the optimization of the penicillin fermentation process. CHIMIA Int. J. Chem. 50 (9), 416–417. Psichogios, D.C., Ungar, L.H., 1992. A hybrid neural network-ﬁrst principles approach to process modeling. AIChE. J. 38 (10), 1499–1511. Qi, H., Zhou, X.-G., Liu, L.-H., Yuan, W.-K., 1999. A hybrid neural network-ﬁrst principles model for ﬁxed-bed reactor. Chem. Eng. Sci. 54 (13), 2521–2526. Reuter, M., Deventer, J.V., Walt, T.V.D., 1993. A generalized neural-net kinetic rate equation. Chem. Eng. Sci. 48 (7), 1281–1297. Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1986. Learning representations by back-propagating errors. Nature 323, 533–536. Safavi, A., Nooraii, A., Romagnoli, J., 1999. A hybrid model formulation for a distillation column and the on-line optimisation study. J. Process Control 9 (2), 125–134. Schubert, J., Simutis, R., Dors, M., Havlik, I., Luebbert, A., 1994a. Hybrid modelling of yeast production processes combination of a priori knowledge on different levels of sophistication. Chem. Eng. Technol. 17 (1), 10–20. Schubert, J., Simutis, R., Dors, M., Havlik, I., Lübbert, A., 1994b. Bioprocess optimization and control: application of hybrid modelling. J. Biotechnol. 35 (1), 51–68. Siddhamshetty, P., Kwon, J.S., Liu, S., Valko, P.P., 2017. Feedback control of proppant bank heights during hydraulic fracturing for enhanced productivity in shale formations. AIChE J. 
64 (5), 1638–1650. Siddhamshetty, P., Yang, S., Kwon, J.S.-I., 2018. Modeling of hydraulic fracturing and designing of online pumping schedules to achieve uniform proppant concentration in conventional oil reservoirs. Comput. Chem. Eng. 114, 306–317. Sidhu, H.S., Narasingam, A., Siddhamshetty, P., Kwon, J.S., 2018. Model order reduction of nonlinear parabolic PDE systems with moving boundaries using sparse proper orthogonal decomposition: application to hydraulic fracturing. Comput. Chem. Eng. 112, 92–100. Simutis, R., Lübbert, A., 1997. Exploratory analysis of bioprocesses using artiﬁcial neural network based methods. AIChE J. 13 (4), 479–487. Stinchcombe, M., White, H., 1989. Universal approximation using feedforward networks with non-sigmoid hidden layer activation functions. In: Proceedings of the International Joint Conference on Neural Networks, pp. 613–617. von Stosch, M., Oliveira, R., Peres, J., de Azevedo, S.F., 2014. Hybrid semi-parametric modeling in process systems engineering: past, present and future. Comput. Chem. Eng. 60, 86–101. Su, H., Bhat, N., Minderman, P.A., McAvoy, T.J., 1993. Integrating neural networks with ﬁrst principles models for dynamic modeling. In: Proceedings of the IFAC Symposium on Dynamics and Control of Chemical Reactors, Distillation Columns and Batch Processes, pp. 327–332. Teixeira, A.P., Alves, C., Alves, P.M., Corrondo, M.J.T., Oliveira, R., 2007. Hybrid elementary ﬂux analysis/nonparametric modeling: application for bioprocess control. BMC Bioinform. 8 (30). Thompson, M.L., Kramer, M.A., 1994. Modeling chemical processes using prior knowledge and neural networks. AIChE. J. 40 (8), 1328–1340. Tsay, C., Baldea, M., 2019. 110th anniversary: using data to bridge the time and length scales of process systems. Ind. Eng. Chem. Res. 58 (36), 16696–16708.

Tsen, A.Y.D., Jang, S.S., Wong, D.S.H., Joseph, B., 1996. Predictive control of quality in batch polymerization using hybrid ANN models. AIChE J. 45 (2), 455–465. Tulleken, H.J.A.F., 1993. Grey-box modelling and identification using physical knowledge and Bayesian techniques. Automatica 29 (2), 285–308. Venkatasubramanian, V., 2018. The promise of artificial intelligence in chemical engineering: is it here, finally? AIChE J. 65 (2), 466–478. Wang, X., Chen, J., Liu, C., Pan, F., 2010. Hybrid modeling of penicillin fermentation process based on least square support vector machine. Chem. Eng. Res. Des. 88 (4), 415–420.


Werbos, P.J., 1988. Backpropagation: past and future. In: Proceedings of the Second International Conference on Neural Network, 1, pp. 343–353. Yang, S., Siddhamshetty, P., Kwon, J.S., 2017. Optimal pumping schedule design to achieve a uniform proppant concentration level in hydraulic fracturing. Comput. Chem. Eng. 101 (C), 138–147. Zahedi, G., Lohi, A., Mahdi, K., 2011. Hybrid modeling of ethylene to ethylene oxide heterogeneous reactor. Fuel Process. Technol. 92 (9), 1725–1732.
