Lattice Boltzmann methods in porous media simulations: From laminar to turbulent flow

Lattice Boltzmann methods in porous media simulations: From laminar to turbulent flow

Accepted Manuscript Lattice Boltzmann methods in porous media simulations: from laminar to turbulent flow Ehsan Fattahi, Christian Waluga, Barbara Wo...

3MB Sizes 9 Downloads 299 Views

Accepted Manuscript

Lattice Boltzmann methods in porous media simulations: from laminar to turbulent flow Ehsan Fattahi, Christian Waluga, Barbara Wohlmuth, Ulrich Rude, ¨ Michael Manhart, Rainer Helmig PII: DOI: Reference:

S0045-7930(16)30306-1 10.1016/j.compfluid.2016.10.007 CAF 3295

To appear in:

Computers and Fluids

Received date: Revised date: Accepted date:

30 May 2016 9 September 2016 7 October 2016

Please cite this article as: Ehsan Fattahi, Christian Waluga, Barbara Wohlmuth, Ulrich Rude, ¨ Michael Manhart, Rainer Helmig, Lattice Boltzmann methods in porous media simulations: from laminar to turbulent flow, Computers and Fluids (2016), doi: 10.1016/j.compfluid.2016.10.007

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT

Highlights • Systematic evaluation of lattice Boltzmann models for porous media

CR IP T

simulation, from creeping to turbulent flow regimes.

• Evaluation of collision models and boundary schemes in terms of parallel effciency and numerical accuracy.

• Effect of the lattice model on permeability and drag force.

AC

CE

PT

ED

M

AN US

• Parallel simulation of pore scale resolved flow scenarios.

1

ACCEPTED MANUSCRIPT

Lattice Boltzmann methods in porous media simulations: from laminar to turbulent flow

CR IP T

Ehsan Fattahia,b , Christian Walugaa , Barbara Wohlmutha,c , Ulrich Rüdeb , Michael Manhartd , Rainer Helmige

a M2 - Zentrum Mathematik, Technische Universität München, Garching Department of Computer Science 1, Universität Erlangen-Nürnberg, Erlangen c Department of Mathematics, University of Bergen, Bergen, Norway d Fachgebiet Hydromechanik, Technische Universität München, Munich e Institut für Wasser- und Umweltsystemmodellierung, Universität Stuttgart, Stuttgart

AN US

b

Abstract

The Lattice Boltzmann method has become a popular tool for determining the correlations for drag force and permeability in porous media for a wide range

M

of Reynolds numbers and solid volume fractions. In order to achieve accurate and relevant results, it is important not only to implement very efficient

ED

code, but also to choose the most appropriate simulation setup. Moreover, it is essential to accurately evaluate the boundary conditions and collision

PT

models that are effective from the Stokes regime to the inertial and turbulent flow regimes. In this paper, we compare various no-slip boundary schemes

CE

and collision operators to assess their efficiency and accuracy. Instead of assuming a constant volume force driving the flow, a periodic pressure drop

AC

boundary condition is employed to mimic the pressure-driven flow through the simple sphere pack in a periodic domain. We first consider the convergence rates of various boundary conditions with different collision operators in the Stokes flow regime. Additionally, we choose different boundary conditions that are representatives of first- to third-order schemes at curved boundaries Preprint submitted to Elsevier

October 7, 2016

ACCEPTED MANUSCRIPT

in order to evaluate their convergence rates numerically for both inertial and turbulent flow. We find that the multi-reflection boundary condition yields second order for inertial flow while it converges with third order in

CR IP T

the Stokes regime. Taking into account the both computational cost and

accuracy requirements, we choose the central linear interpolation bounce-

back scheme in combination with the two-relaxation-time collision model. This combination is characterized by providing viscosity independent results

AN US

and second-order spatial convergence. This method is applied to perform simulations of touching spheres arranged in a simple cubic array. Full- and reduced-stencil lattice models, i.e., the D3 Q27 and D3 Q19 , respectively, are

compared and the drag force and friction factor results are presented for Reynolds numbers in the range of 0.001 to 2, 477. The drag forces computed

M

using these two different lattice models have a relative difference below 3% for the highest Reynolds number considered in this study. Using the evaluation

ED

results, we demonstrate the flexibility of the models and software in two large scale computations, first a flow through an unstructured packing of spherical

PT

particles, and second for the turbulent flow over a permeable bed. Keywords: Lattice Boltzmann method, pore scale simulation, turbulence, Darcy flow, no-slip boundary condition, collision operator, periodic pressure

CE

boundary

AC

1. Introduction The simulation of fluid flow and transport processes through porous media

is an important field that contributes to a number of scientific and engineering applications, e.g., oil extraction, groundwater pollution, nuclear waste storage, 3

ACCEPTED MANUSCRIPT

and chemical reactors. The application of pore-scale simulations is challenging in most practical situations because the system under study is often larger by orders of magnitude than the characteristic size of the pores. Thus, for

CR IP T

practical purposes, many computational techniques are based on macroscopic models that average over many pores and consider average flow rates.

While the rigorous, analytic up-scaling of the pore-scale problem at lower Reynolds numbers has received much attention in the literature [1, 2], similar

AN US

approaches for higher Reynolds numbers have not been demonstrated yet primarily because of the immense mathematical difficulties that arise in

flow models if a moderate Reynolds number cannot be assumed. However, with the availability of ever-more powerful supercomputers, direct numerical simulation (DNS) has been established as a third possibility for the analysis

M

of homogenized models that can complement the classical experimental and rigorous mathematical averaging approaches.

ED

In the past two decades, the lattice Boltzmann method (LBM) has attracted the interest of researchers in CFD-related fields [3, 4]. In contrast

PT

to traditional CFD approaches based on the conservation of macroscopic quantities such as mass, momentum, and energy, the LBM models a fluid

CE

using the kinetics of discrete particles that propagate (streaming step) and collide (relaxation step) on a discrete lattice mesh. Owing to this kinetic nature, microscopic interactions in fluid flow can be handled even in com-

AC

plex geometries such as those in micro-fluidic devices or porous media [5–7]. Moreover, the inherently local dynamics used in LBM afford efficient implementation and parallelization of both of the fundamental algorithmic stages.

This allows to harness the computational power of currently available and 4

ACCEPTED MANUSCRIPT

emerging supercomputing architectures [8–11]. In this study, we use the waLBerla framework (widely applicable LatticeBoltzmann from Erlangen) [10], which is specifically designed to be used for

CR IP T

massively parallel fluid flow simulations; this enables us to compute problems with resolutions of more than one trillion (1012 ) cells and up to 1.93 trillion

cell updates per second using 1.8 million threads [12]. waLBerla has already been used to study the flow through moderately dense fluid-particle systems

AN US

[13] and to simulate large-scale particulate flows [14]. It is based on four main software concepts: blocks, sweeps, communication, and boundary handling [15]. The modular structure of this framework allows for the implementation of new LBM schemes and models.

However, having immense computational power at hand is not enough to

M

solve relevant problems. For a three-dimensional LBM simulation, stencils that differ with respect to the velocity directions can be used. Lattice models

ED

generally require an exact evaluation of their velocity moments up to secondorder to consistently recover Navier-Stokes dynamics in the continuum limit.

PT

However, they may behave differently at a discrete level, which in turn can lead to violation of some important physical requirements. Investigating

CE

an intermediate Reynolds number flow (50 < Re < 500) through a nozzle, White and Chong [16] demonstrated that the D3 Q19 lattice model, a three dimensional model with a 19 directional stencil can cause a lack of isotropy

AC

in the flow field. This deficiency, which was found to be independent of the collision model, the grid resolution, and the Mach number, can be removed by using the D3 Q27 lattice model which has a 27-directional stencil. It was found that lattice models with a plane having less than six velocity vectors 5

ACCEPTED MANUSCRIPT

are not fully isotropic and can produce qualitatively different results. Similar observations have been reported by other researchers [4, 17, 18]. Recently, Silva and Semiao [19] theoretically analyzed the truncation errors of different

CR IP T

lattice models and showed that reduced lattice schemes, i.e., D3 Q15 and

D3 Q19 , produce spurious angular velocities. They showed that, for convectiondominated flows the rotational invariance will be violated if a reduced lattice scheme is used. Nash et al. [20], on the other hand, investigated the Dean

AN US

flow at Re ≈ 400 with strong secondary flow and demonstrated that none

of the respective lattice models produces large artifacts in the velocity field. Their results suggest that using a greater number of velocity sets for a lattice model, produces better results for interpolating boundary schemes but worse results when using a simple bounce-back boundary scheme.

M

Moreover, the explicitness of classical LBM means that the spatial and temporal discretization characteristics are strongly coupled. Hence, special

ED

care must be taken when performing pore-scale simulations to properly incorporate the physics at the boundaries and inside the domain without

PT

over-resolving the problem. Several methods have been proposed for the implementation of the LBM on non-uniform grids to improve the geometrical

CE

flexibility [21–23], as well as the interpolating boundary conditions that can be used by the classical LBM [24–26]. As was already pointed out in the evaluation of Pan et al. [27], this requires a suitable combination of collision

AC

and boundary operators. The evaluation of different boundaries is mostly done for Stokes flow regimes [28, 29]; however, combinations of boundary conditions and collision models must also be evaluated for high Reynolds number flow. 6

ACCEPTED MANUSCRIPT

A common way to simulate pressure-driven flow in the LBM is to replace the pressure gradient with an equivalent body force and apply stream-wise periodic boundary conditions. However, previous studies [30–33] have shown

CR IP T

that using this approach does not lead to correct flow fields for flow through

complex geometries, such as e.g., porous media. In this study, we drive flow by imposing a pressure gradient while applying a periodic boundary condition in the stream-wise direction in order to allow the flow to develop based on the

AN US

geometry. We employ a simple periodic pressure scheme for an incompressible

flow and investigate it in combination with different types of collision operators and a number of first-, second-, and third-order bounce-back schemes. We simulate flow through simple sphere packs using different LBM approaches, which we will briefly outline in Section 2. Their accuracy and

M

convergence rates for flow in the Stokes regime, are investigated in Section 3. In addition, the computational costs of different boundary schemes are as-

ED

sessed in order to choose the best combination for highly resolved simulations in high Reynolds numbers flow. After finding a suitable configuration, we

PT

then examine the spatial convergence of the boundary schemes in the laminar steady and fluctuating flow regime. Using the results of this evaluation, in

CE

Section 3.2 we simulate the flow through a simple sphere pack for two different solid volume fractions and investigate the effect of lattice velocity sets on the drag force and permeability in high Reynolds number flow by comparing

AC

the full and the reduced stencil lattice models. By sampling over the regime Rep ∈ [0.001, 2, 477] for a regular packing of touching spheres, we numerically investigate the friction factor based on the Reynolds number. To highlight

the ability of the proposed LBM to deal with complex geometries Section 4 7

ACCEPTED MANUSCRIPT

reports on simulations with a more realistic sphere packing geometry that is generated using a rigid body dynamics simulator. In the first example we study the flow through such a random sphere pack. Finally, the flow in a models the turbulent flow over a permeable wall. 2. LBM approaches for porous media flow

CR IP T

model porous region is combined with a free-flow so that the coupled regime

AN US

The lattice Boltzmann equation (LBE) is a simplification of the Boltzmann

kinetic equation where we assume that particle velocities are restricted to a discrete set of values ek , k = 0, 1, . . . , q, i.e., that the particles can only move along a finite number of directions, connecting the nodes of a regular lattice [34, 35]. In the following, we consider a three dimensional setting with

M

q = 18, 26, the so-called D3 Q19 and D3 Q27 model, respectively. Discretizing in time using a time-step size of ∆t = tn+1 − tn , the semi-discrete LBE then

ED

becomes

k = 0, . . . , q,

(1)

PT

∆t−1 (fk (x + ek ∆t, tn+1 ) − fk (x, tn )) = gk (x, tn ),

where fk (x, tn ) represents the probability of finding a particle at position

CE

x and time tn with velocity ek . The left-hand side of (1) corresponds to a discrete representation of the Boltzmann streaming operator, while the

AC

right-hand side g := [gk ]qk=0 is responsible for controlling the relaxation to a local equilibrium. It is generally split into g := ∆t−1 Ω(x, tn ) + F(x, tn ),

where Ω(x, tn ) is a collision term function of f := [fi ]qi=0 , and F is a forcing term that drives the flow. Provided that the modeled fluid is close to its equilibrium state, Bhatnagar, Gross, and Krook (BGK) [36] proposed that 8

ACCEPTED MANUSCRIPT

the discrete local equilibria in the collision processes can be modeled using a local Maxwell distribution. The collision operator takes the general form (2)

CR IP T

Ω(x, tn ) = −R(f (x, tn ) − f eq (x, tn )),

where R is a relaxation operator and f eq (x, tn ) is an equilibrium distribution function of f (x, tn ) that, for incompressible flow, is given by [37]

   2 1 −2 1 −4 , fkeq (x, tn ) = wk ρ + ρ0 c−2 s ek · u + 2 cs (ek · u) − 2 cs u · u

(3)

AN US

where wk is a set of weights normalized to unity, ρ = ρ0 + δρ ( where δρ is the

density fluctuation, and ρ0 is the mean density which we set to ρ0 = 1), and √ cs = ∆x/( 3∆t) is the lattice speed of sound ( where ∆x is the lattice cell width). The macroscopic values of density ρ and velocity u can be calculated i.e.,

Xq

k=0

fk ,

u = ρ−1 0

ED

ρ=

M

from f as zeroth and first order moments with respect to the particle velocity, Xq

k=0

ek fk .

(4)

In a lattice Boltzmann scheme, the computation is typically split into a

PT

collision and streaming step, that are given as

CE

f˜k (x, tn ) − fk (x, tn ) = ∆t gk (x, tn ), fk (x + ek ∆t, tn+1 ) = f˜k (x, tn ),

(collision) (streaming)

respectively, for k = 0, . . . , q. The execution order of these two steps is

AC

arbitrary and may vary from code to code for implementation reasons. For instance, in waLBerla, the order is first streaming and then collision. This has the benefits that the stream and collision steps can be fused and that macroscopic values do not need to be stored and then later retrieved from memory. 9

ACCEPTED MANUSCRIPT

When studying the numerical properties (such as stability, convergence, and accuracy) and the computational cost-efficiency of LBM-based approaches, two important components must be investigated. First, the treatment of the

CR IP T

collision term Ω(x, t) in the relaxation step must be considered. Second, as the nodes of the lattice in the LBM approach are distinguished only into fluid and solid nodes, the choice of the bounce-back operator for the fluid-wall interaction is important for obtaining numerical accuracy.

AN US

Unfortunately, relaxation and bounce-back cannot be treated as two independent components as their interplay is essential for finding an optimal

trade-off between computational effort and physical accuracy. For instance, in the context of porous media flow, use of the BGK collision operator in an approach with a single-relaxation-time (SRT) ω, i.e., RSRT := ωI has

M

been reported to suffer from numerical instabilities and to exhibit unwanted boundary effects [27, 38]. More precisely, the effective pore-size modeled by

ED

the numerical scheme becomes viscosity-dependent. On a macroscopic level this has the effect of making the calculated permeability in the Stokes flow

PT

viscosity-dependent, even though it physically should be independent of fluid properties [27]. The limitations of the simple and computationally appealing

CE

approach outlined above motivate us to investigate more sophisticated schemes. Our goal, which is similar to that of Pan et al. [27], is to investigate different LBM approaches with the application of porous media flow in mind. However,

AC

instead of lumping all boundary conditions into a body-force, we place our focus on boundary conditions that are suitable for high Reynolds number flow driven by pressure-gradients. Before we present a detailed evaluation of suitable combinations for pore10

ACCEPTED MANUSCRIPT

scale simulations, we will briefly summarize the various common approaches. 2.1. Collision operators

CR IP T

Pan et al. [27] reported that, by using multi-relaxation-time [39, 40] (MRT) schemes, it is possible to overcome the deficiencies of the SRT approach when

applied to porous media flow. In the MRT model, the relaxation-times

for different kinetic modes can be treated separately, which allows for finetuning of the free relaxation-times in order to improve numerical stability

AN US

and accuracy. This is done using the following ansatz: RMRT := M−1 S M,

(MRT)

where the components of the collision matrix in the MRT are developed to

M

reflect the underlying physics of collision as a relaxation process. The rows of the orthogonal matrix M are obtained using the Gram-Schmidt orthogo-

ED

nalization procedure applied to polynomials of the Cartesian components of the particle velocities ek , and S := diag[s0 , s1 , ..., sq ] is the diagonal matrix holding the relaxation rates sk : cf., e.g., d’Humières [40] for details. Clearly,

PT

within this class of relaxation schemes, SRT approaches are contained as a special case. In the D3 Q19 lattice model, the parameters s0 , s3 , s5 and s7

CE

are the relaxation parameters corresponding to the collision invariant ρ, and j := ρu, respectively, which are conserved quantities during a collision. These

AC

are set to zero in the absence of a forcing term, i.e., if Fk = 0. Pan et al. [27] proposed to choose the remaining modes as follows: (i) setting the viscous stress vectors s9 , s11 , s13 , s14 , and s15 equal to sν , which is related to the kinematic viscosity ν = µ/ρ as sν = (3ν + 0.5)−1 , and (ii), setting the kinetic modes s1 , s2 , s4 , s6 , s8 , s10 , s12 , s16 , s17 , and s18 to sζ = 8(2 − sν )(8 − sν )−1 , 11

ACCEPTED MANUSCRIPT

which is also related to the kinematic viscosity. However, as their evaluation showed, this does not lead to viscosity-independent results for the macroscopic quantities, except for the multi-reflection boundary scheme. As Khirevich

CR IP T

et al. [29] recently stated, this problem can be solved if one modifies the

aforementioned model in a way such that the symmetric energy modes, i.e.,

s1 and s2 , maintain a constant ratio (1/sν − 0.5)/(1/si − 0.5) while sν varies. Based on a symmetry argument, Ginzburg [38] proposed a model based on

AN US

two-relaxation-times (TRT), which can be seen as the minimal configuration

that provides just enough free relaxation parameters to avoid non-linear dependencies of the truncation errors on the viscosity in the context of porous media simulations [28, 41, 42]. The scheme was derived from the MRT approach by splitting the probability density functions fk into symmetric and

M

anti-symmetric components fk+ := 12 (fk + fk¯ ) and fk− := 12 (fk − fk¯ ), where k¯

is the diametrically opposite direction to k [28, 41, 42]. Performing a separate

ED

relaxation using the two corresponding relaxation rates ω + and ω − yields the operator

(TRT)

PT

RTRT := ω + R+ + ω − R− ,

where R+ and R− are the tensorial representations of the operators extracting

CE

the symmetric and antisymmetric components, respectively. The eigenvalue of the symmetric components is again related to the kinematic viscosity as

AC

ω + = (3ν + 0.5)−1 , and the second eigenvalue ω − ∈ (0, 2) is a free parameter. For steady non-linear flow situations, it has been demonstrated that most of the macroscopic errors depend on the so-called “magic” parameter    1 1 1 1 Λ= − − ω+ 2 ω− 2 12

ACCEPTED MANUSCRIPT

which must be determined for the specific flow setup. The choice Λ = given as an optimal value for stability. Another choice, namely Λ =

3 , 16

1 4

is

yields

the exact location of bounce-back walls in case of Poiseuille flow in a straight

CR IP T

channel [28, 29]. Previous studies [26, 29] show that the accuracy depends on the magic number. In our studies, we choose different values in the range Λ ∈ (0, 34 ]. The exact numbers will be given in the corresponding section. 2.2. Boundary conditions

AN US

In pore-scale simulations, two types of boundary conditions are typically encountered: a solid-wall interaction of fluid particles that come into contact

with the porous matrix (corresponding to a no-slip condition in continuummechanics terminology); and a periodic pressure forcing that is applied to

M

drive the flow via a pressure gradient in order to compute fluid or matrix properties such as the Forchheimer coefficient or the (apparent) permeability. this study.

ED

In the following, we will outline the different schemes under consideration for

PT

2.2.1. Solid-wall interaction

In a simple bounce-back (SBB) scheme, the wall location is represented

CE

through a zeroth order interpolation (staircase approximation), and the collision of particles with a wall is incorporated by mimicking the bounce-back

AC

phenomenon of a particle reflecting its momentum upon collision with a wall, as is expected to happen half-way between a solid and fluid node. Hence, the unknown distribution function is calculated as fk¯ (xf1 , tn+1 ) = f˜k (xf1 , tn ),

13

(5)

ACCEPTED MANUSCRIPT

where, as we recall, k¯ is the diametrically opposite direction to k, and we take the values fek after collision but before streaming on the right hand

side. However, in complex geometries, the wall position is not always located

CR IP T

half-way on a regular lattice. Hence, especially at coarse resolutions, this

boundary treatment introduces severe geometric errors, which in turn lead to

boundary layer effects that can be particularly problematic in low-porosity media in which a large part of the domain is occupied by solid nodes. In

AN US

this case, implementing a discretization that adequately resolves the solid boundaries is often computationally prohibitive as the meshes would become too large; consequently, several advanced schemes have been proposed to more accurately represent non mesh aligned boundaries using spatial interpolations [24–26, 43–45].

M

In our numerical study, we compare five different interpolating boundary condition schemes to model curved boundaries: the linear interpolation

ED

bounce-back (LIBB) [24] scheme; the quadratic extension (QIBB) [44] scheme; the interpolation/extrapolation bounce-back (IEBB) [25] scheme; the multi-

PT

reflection (MR) [28] scheme; and the central linear interpolation (CLI) [28] scheme. For the sake of completeness, we briefly recapitulate these approaches.

CE

Let xfi denote a lattice node located at a distance of at most i > 0 cells from the boundary and let q = |xf1 − xw |/|xf1 − xb | define a normalized wall distance; cf. Fig. 1. Bounce-back schemes based on higher-order interpolations

AC

require more than one fluid node between nearby solid surfaces. For instance, the LIBB condition proposed by Bouzidi et al. [24] is given by   (1 − 2q)f˜k (xf2 , tn ) + 2q f˜k (xf1 , tn ), q < 1/2, fk¯ (xf1 , tn+1 ) =     1 − 1 f˜¯ (xf , tn ) + 1 f˜k (xf , tn ), q ≥ 1/2. k 1 1 2q 2q 14

(6)

ACCEPTED MANUSCRIPT

𝑥𝑓3

𝑥𝑓3

𝑒𝑘

𝑥𝑓2

𝑒𝑘

𝑥𝑓1 𝑥𝑓1

𝑞 < 0.5 𝑞 > 0.5 𝑞 = 0.5

𝑥𝑓3

𝑥𝑓2

𝑥𝑓1

𝑥𝑤

𝑥𝑤

𝑥𝑤 𝑥𝑏

CR IP T

𝑥𝑓2

AN US

Figure 1: Example of different distances of the wall to the first fluid node. Simple bounceback occurs when a lower or higher value of q = 1/2, q emerges the need of interpolated bounce-back schemes. For simplicity, we only display a two-dimensional illustration.

It is also possible to use quadratic interpolation to obtain the QIBB scheme [44], where the unknown distribution function is calculated as

M

  q(1 + 2q)f˜k (xf1 , tn ) + (1 − 4q 2 )f˜k (xf2 , tn ) − q(1 − 2q)f˜k (xf3 , tn ), q < 1/2, fk¯ (xf1 , tn+1 ) =    1−2q ˜ 1  2q−1 f˜¯ (x , t ) + ˜ q ≥ 1/2. ¯ (xf2 , tn ), k f1 n q q(2q+1) fk (xf1 , tn ) + 2q+1 fk

(7)

ED

While extrapolation can possibly give higher order consistency, it is sensitive to numerical instabilities. As an improvement to the scheme of Filippova and

PT

Hänel [46], an interpolation/extrapolation bounce-back (IEBB) model was

CE

proposed by Mei et al. [25] in which a velocity at the wall is computed via   ux , q < 1/2, f ubf =  2    1 − 3 ux , q ≥ 1/2 f1 2q

AC

to determine an auxiliary distribution function n h fk∗ (xb , tn ) = wk ρ + ρ0 c32 ek · ubf + 2c94 (ek · uxf1 )2 −

3 u 2c2 xf1

that is then used to linearly interpolate the unknown as fk¯ (xf1 , tn+1 ) = (1 − X)f˜k (xf1 , tn ) + Xfk∗ (xb , tn ). 15

· ux f 1

io

(8)

(9)

ACCEPTED MANUSCRIPT

Above we let X = (2q − 1)/(1/ω − 2) for q < 12 , while for q ≥

1 2

we let

X = (2q − 1)/(1/ω + 1/2), where ω = ω + in case of the TRT scheme and ω = sν for the MRT scheme.

CR IP T

An alternative approach to obtaining highly accurate bounce-back conditions is the multi-reflection (MR) scheme [26, 28], which reads as fk¯ (xf1 , tn+1 ) =

1−2q−2q 2 ˜ fk (xf2 , tn ) (1+q)2

1−2q−2q 2 ˜ fk¯ (xf1 , tn ) (1+q)2



q2 f˜¯ (xf2 , tn ) (1+q)2 k

+ f˜k (xf1 , tn ) + Fkpc ,

AN US



q2 f˜ (x , t ) (1+q)2 k f3 n

+

(10)

where Fkpc is the post-collision correction term. The post correction term is related to the third-order moments and for the MRT we use the one (2)

4Λk presented in [26] for the MR1 model. This reads as Fkpc = − 3ν(1+q) 2 fk

M

with Λk = (1/sν − 0.5)/(1/sk − 0.5), and fk2 is obtained from the third-order non-equilibrium and can be calculated as

ED

f (2) = M−1 S[t − teq ]

(11)

PT

where t are the third-order moments. The correction term for the TRT collision is employed as it is introduced

CE

in [28] as

Fkpc =

1 1 4 ω − ( − − )(f − − f eq− ). 2 (1 + q) ω 2

(12)

AC

It is worth to mention that the non-equilibrium term should be calculated before the collision. Note that this scheme must access five distribution values at three fluid

nodes to effect the update. A computationally cheaper variant is given by the central linear interpolation scheme (CLI), which only requires three values at 16

ACCEPTED MANUSCRIPT

two fluid nodes, i.e., 1−2q ˜ f (x , t ) 1+2q k f2 n



1−2q ˜ f¯ (xf1 , tn ) 1+2q k

+ f˜k (xf1 , tn ).

(13)

CR IP T

fk¯ (xf1 , tn+1 ) =

It should be noted that neither of the two latter schemes involves a distinction

of cases for different values of q, which allows for an efficient implementation. We note that the CLI scheme can also be modified with a post collision

correction fkpc in order to eliminate the second-order error for steady flows.

AN US

This additional correction is not considered in the present study, but we refer to [27–29] for a further discussion and results. 2.2.2. Periodic pressure boundary condition

In many applications, fluid flow is driven by a pressure differential. For can be written as

M

an incompressible flow, the corresponding periodicity boundary conditions

ED

u(x + L i, t) = u(x, t),

p(x + L i, t) = p(x, t) + ∆p,

(14)

where L is the length of the domain in the periodic direction i, and ∆p/L is domain.

PT

the pressure gradient applied between the inlet and outlet boundaries of the

CE

In LBM-based approaches, applying this type of boundary condition is

not straightforward. Simply adjusting the pressures at the inlet and outlet

AC

boundaries produces non-physical mass defects at the periodic boundary, as

was reported in [47, 48]. The most commonly employed approaches replace the pressure gradient by incorporating an equivalent body force; see e.g. [49–53]; however, these approaches suffer from the inability to accurately predict the pressure gradient locally for flow situations in general geometries [30–33]. For 17

ACCEPTED MANUSCRIPT

porous media applications, in which there are complex pore geometries it is therefore desirable to apply boundary conditions that are not lumped into a volume forcing and hence does not rely on rough predictions of the pressure

CR IP T

field. In this work, we employ a pressure boundary condition that can be applied for incompressible periodic flows. Here we specify the equilibrium distribution function fieq and the non-equilibrium distribution fineq = fi − fieq

separately, as we shall describe below. As the extension to multiple periodic

AN US

boundaries is straightforward, we consider in our description an essentially one-dimensional setting in which flow propagates from the left (L) to the right (R) boundary.

Since the pressure is related to the density via the equation of state p = ρc2s , we consider a density difference instead of a pressure difference in

M

the following. The density at the left boundary is obtained by (15)

ED

ρL = ρR + ∆ρ.

The relaxation dynamics of the non-equilibrium distribution, in presence of

PT

the periodic boundaries can be approximated as neq neq fi,L = fi,R .

(16)

CE

By using the above formulations, the unknown distribution functions can be

AC

computed as

neq eq fi,L = fi,R + fi,R (ρL , uR ),

(17)

neq eq fi,R = fi,L + fi,L (ρR , uL ).

(18)

As the momentum in a periodic channel does not change, the implementation of this approach is simple. For instance, by considering (3), updates (17) and 18

ACCEPTED MANUSCRIPT

(18) can be performed as (19)

fi,R = fi,L − wk ∆ρ.

(20)

3. Results and discussion

CR IP T

fi,L = fi,R + wk ∆ρ,

For steady-state flow at a low Reynolds number, a generally accepted and

experimentally confirmed macroscopic relation between the pressure gradient

AN US

and flow rate is given by Darcy’s law:

−1 ∇P = −µKD U,

(21)

where µ is the fluid dynamic viscosity, KD is the permeability, and U and P

M

are the volume averaged velocity and pressure, respectively. This model, which was first proposed by Darcy [54] based on experimental investigation, is valid

ED

in the regime Rep  1, where Rep := ρdp U/µ is the Reynolds number based

on a characteristic particle diameter dp (in this study, the sphere diameter),

PT

ρ denotes the density of the fluid, and U := |U · i| is the scalar velocity in

stream-wise direction i. In the following sections, we use the volume averaged

CE

velocity, i.e., the Darcy velocity, to calculate the Reynolds number of the flow. 3.1. Pore-scale simulation at Re  1

AC

To verify the implementation and assess the quality of different schemes

before tackling more complex flow problems, in the following we will first evaluate different combinations of the LBM strategies discussed in Section

2 in the Darcy regime. As the flow is linear and irrotational in the Darcy (Stokes) regime, in this section, we conduct our simulations using the D3 Q19 19

CR IP T

ACCEPTED MANUSCRIPT

AN US

Figure 2: 3D view of simple sphere array with equal radii.

lattice model. We consider flow through a periodic array of spheres arranged on a regular lattice as depicted in Fig. 2. To quantify the errors, we compute the dimensionless drag coefficient

FD , 6πµU r

M

CD =

(22)

ED

where r is the sphere radius, FD is the drag force acting on the sphere and U is the volume averaged velocity (Darcy velocity). To reduce computational cost, we compute the wall distances required for the higher-order boundary

PT

conditions only once before the time-stepping loop begins and reuse it in each boundary handling step. This is possible because we are assuming a

CE

non-deformable porous medium. Moreover, we exploit the periodicity in the domain and consider only one cubic representative elementary volume (REV)

AC

containing a single sphere. For details on boundary handling, we refer to the previous Section 2.2. In the following sections, we shall summarize our results related to the spa-

tial discretization effect, as well as the undesired effect of viscosity dependence of the schemes. 20

ACCEPTED MANUSCRIPT

3.1.1. Convergence analysis In this test, we change the size of the sphere and increase the computational

CR IP T

domain accordingly, such that the relative solid volume fraction is fixed to χ = D/L = 0.6. Although Sangani and Acrivos [55] prepared a semi-analytical

solution for a simple cubic sphere pack based on the solid volume fraction, the correlated limits are not sufficient to evaluate the convergence rate of

the boundary conditions. Therefore, first we find the limit value of each

AN US

boundary condition by least-squares curve fitting. Subsequently, we calculate

the convergence rate of the corresponding boundary scheme. We compute the dimensionless drag force CD and consider the relative error |CD /CD,∞ − 1|. Our results are plotted in Figs. 3 and 4.

It can be confirmed that the higher order boundary conditions generally

M

converge with higher order than the SBB boundary scheme which is expected to be asymptotically of first order. As Fig. 4(a) indicates, all of the interpola-

ED

tion boundary schemes have almost second-order accuracy when combined with the SRT model. Moreover, for this example, we observe that the MR

PT

scheme converges with third order. In Figs. 4(b) and 4(c), we list the results for the TRT and MRT models,

CE

respectively. Here we observe a difference between the CLI and the MR schemes, i.e., the CLI scheme converges with second-order while the MR scheme converges with third-order. The results also exhibit that the QIBB

AC

and IEBB schemes converge with second-order. Here, it is worth mentioning that the simulation results using the TRT and MRT collision can be altered by changing the magic number. In this section, the magic number is set to

3/16 in which the interpolation boundary schemes result in better accuracy 21

ACCEPTED MANUSCRIPT

[29]. The results of the convergence rate are summarized in Table 1. 5

5

SBB 4.02 LIBB QIBB 4 IEBB MR 3.98 CLI

5 4.02

75 100

25

50

75 100

CD

CD

50

4

25

50 75100

Radius

3.98 25

4

3.5

4

4.5

3.98

50

75 100

CR IP T

4.5

25

4.02

4

CD

4.5

In

4

3.5

25

50 75100

Radius

(a) SRT

3.5

25

50 75100

Radius

(b) TRT

(c) MRT

AN US

Figure 3: Dependence of the dimensionless drag CD on the resolution of the sphere for simple sphere pack with relative volume fraction of χ = 0.6. Three collision models are presented: (a) SRT, (b) TRT, and (c) MRT. Each figure depicts 6 boundary conditions scheme results. All simulations were conducted with Λ = 3/16 with the D3 Q19 lattice model.

N

-2

10-2 10-3 10

-4

10

-5

10

-6

N 25

-3

50 75100

PT

Radius

10 10

0

-1

N-1

100

N-2

10-2 10-3 10

-4

10

-5

10

-6

N 25

Radius

(b) TRT

-3

10

-1

N-1

N-2

10-2 10-3 10-4

N-3

10-5 50 75100

10

-6

25

Radius

50 75100

(c) MRT

CE

(a) SRT

101

|CD/CD, -1|

N-1

101

M

10

-1

SBB LIBB QIBB IEBB MR CLI

ED

|CD/CD, -1|

10

0

|CD/CD, -1|

101

Figure 4: Logarithmic relative error of the dimensionless drag |CD /CD,∞ − 1| as a function

of the sphere radius (in lattice units). Three collision models are presented, (a) SRT, (b)

AC

TRT, and (c) MRT. Each figure depicts 6 boundary conditions scheme results. The solid lines are eye showing convergence rate. All simulations were conducted with Λ = 3/16

with the D3 Q19 lattice model.

these experiments we also observe pre-asymptotic discretization error modes where the errors for most models do not smoothly decrease with respect to the 22

ACCEPTED MANUSCRIPT

Table 1: Limit value of the dimensionless drag force and the convergence rate α calculated from the last three results of each boundary condition in Fig. 3.

TRT

QIBB

IEBB

MR

CLI

α

1.29

1.74

2.05

2.60

2.95

1.87

CD,∞

3.98340812

3.976854186

3.97337407

3.97329727

3.97282669

3.97363293

α

1.14

1.26

2.41

2.06

2.94

2.05

CD,∞

3.98180682

3.97455861

3.97369797

3.97372456

3.97373716

3.97383257

α

1.15

1.26

2.40

2.54

2.95

1.98

CD,∞

3.98167626

3.97460279

3.97319284

3.97402321

3.97378283

3.97362

AN US

MRT

LIBB

CR IP T

SRT

SBB

resolution (r < 15). Comparing for instance the LIBB model with the QIBB model reveals that the higher order scheme is less affected by this discrepancy. To investigate the source of these fluctuations, we conduct a second series of tests where the center of the sphere is shifted stream-wise inside the cell. In

M

all cases, the radius of the sphere is kept fixed at 4.5 lattice units and the relative volume fraction was set to χ = 0.6. In Fig. 5, we show the results for

ED

a displacement of 0.0 − 0.5 times a cell width. Although the results obtained

by the SRT-SBB are relatively good for this setup, we observe that the SBB

PT

scheme for all considered collision schemes is more sensitive with respect to the positioning of the center. As this behavior stems only from the geometry

CE

representation, the use of interpolation models significantly increases the reliability of the results. Since the displacement along the span-wise showed

AC

a similar behavior, we do not list these results. 3.1.2. Effect of the viscosity on the computed permeability As many researchers have previously reported, the choice of boundary

conditions and collision models may have a strong impact on the simulation accuracy [13, 27, 38, 56]. To have a complete understanding of the effects 23

ACCEPTED MANUSCRIPT

0

-0.05

0.1

0.05

0.05

0

-0.05

-0.1

0

-0.05

-0.1

CR IP T

|CD/CDref-1|

0.05

0.1

|CD/CDref-1|

SBB LIBB QIBB IEBB MR CLI

|CD/CDref-1|

0.1

-0.1

-0.15 0 0.1 0.2 0.3 0.4 0.5 Displacement of the sphere center

-0.15 0 0.1 0.2 0.3 0.4 0.5 Displacement of the sphere center

-0.15 0 0.1 0.2 0.3 0.4 0.5 Displacement of the sphere center

(a) SRT

(b) TRT

(c) MRT

Figure 5: Relative error of the drag force (CD /CD,ref − 1) as a function of the sphere center

AN US

displacement (in a lattice cell). Three collision models are presented, (a) SRT, (b) TRT, (c) MRT. Each figure depicts the results of 6 boundary conditions. All simulations were conducted by the D3 Q19 lattice model, Λ = 3/16, r = 4.5, with a solid volume fraction of χ = 0.6.

of the boundary conditions and collision schemes on the results, we conduct

M

a viscosity dependency study for a setting where the flow is driven by a periodic-pressure-drop boundary condition. This is especially important when

ED

the simulation across a wide range of Reynolds numbers is of interest, since the viscosity is one of the key parameters to achieve different Reynolds numbers

PT

in the LBM simulation. To exclude pre-asymptotic discretization error modes from our observations and only concentrate on the physical inconsistencies

CE

introduced by a specific combination of collision and boundary treatment, we use a sufficient resolution of 16.5 (in lattice units) for the sphere radius.

AC

The permeability in the Darcy regime is only a geometrical characteristic; therefore, it is used to show the effect of viscosity. As proposed by Adler [57], the average pressure gradient in the flow direction can be computed as ∇P · i = −FD L−3 , where L is the length of the domain in flow direction. By combining this equation with Eqs. (22) and (21), we find the reference 24

ACCEPTED MANUSCRIPT

permeability Kref . Here, we use the drag force provided by Sangani and Acrivos [55] to calculate the reference permeability. The plots in Fig. 6 display the ratio of the computed permeability to the

CR IP T

reference permeability, K/Kref , for a simple sphere pack with a relative solid

volume fraction of χ = 0.6. We consider viscosity in the range [0.029, 0.45] (LB unit) and compare different collision models and boundary conditions. In

Fig. 6(a) we show that for all boundary schemes, the SRT collision results in

AN US

a permeability that is strongly dependent on the viscosity. While the IEBB

scheme is less sensitive than the other schemes, the errors we can expect from an SRT collision model are still unacceptable for our purposes. The results plotted in Figs. 6(b) and 6(c) depict the same set of experiments conducted with the TRT and MRT models, respectively. The result achieved

M

by the TRT collision operator shows that the SBB, CLI, and MR schemes all produce a viscosity-independent permeability. The LIBB scheme results

ED

in a clear viscosity-dependency of the permeability. Also QIBB and IEBB result in a slightly viscosity-dependent permeability though the effect is less

PT

pronounced than for LIBB. The results for the MRT model show the same behavior as the TRT, however, we observe a small deviation in the MR

CE

scheme, in which the relative difference is below 0.02%. The CLI and SBB also reproduce viscosity independent results. In case of the SBB we note that though they are viscosity-independent, the results severely under-predict

AC

the permeability due to the staircase representation of the geometry. We point out that our results are in accordance with those found with the TRT collision operator in Pan et al. [27]. We further note that in the comparison between the two linear boundary 25

ACCEPTED MANUSCRIPT

relaxation time 1 1.2 1.4 1.6 1.8

relaxation time 1 1.2 1.4 1.6 1.8

0.6 0.8

1.04

1.04

1.02

1.02

1.02

0.98

0.98

1

SBB LIBB QIBB IEBB MR CLI

0.98 0.96 0.1

0.2 0.3 Viscosity

1

1

0.96

0.4

0.96 0.1

(a) SRT

relaxation time 1 1.2 1.4 1.6 1.8

CR IP T

K/Kref

1.04

K/Kref

0.6 0.8

K/Kref

0.6 0.8

0.2 0.3 Viscosity

(b) TRT

0.4

0.1

0.2 0.3 Viscosity

0.4

(c) MRT

Figure 6: Viscosity dependence of the permeability for the solid volume fraction of χ = 0.6.

AN US

The results represent the normalized permeability as K/Kref for different boundary conditions in the viscosity range [0.029, 0.45] in the LB unit corresponding to the relaxation time of the range [0.58, 1.85], which covers both the over- and under-relaxed modes.

schemes, namely, the CLI and LIBB, we find that the LIBB does not yield

M

viscosity-independent results with any of the collision models under consideration, while the CLI exhibits better properties in this respect and provides

ED

viscosity independent results with the TRT and MRT collision models. It is worth mentioning that the SRT collision scheme should not be used

PT

when the flow behavior is being studied across a wide range of Reynolds numbers, i.e., correlation of the drag force or the permeability, because the

CE

viscosity in the LBM is one of the key parameters that can be adjusted to simulate a certain Reynolds number of the flow. There are also some relaxation combinations proposed for the MRT model for high Reynolds

AC

number flow, which fixes the ghost relaxation rates to achieve stability. This also leads to viscosity dependent results. For further details, we refer to [29].

26

ACCEPTED MANUSCRIPT

3.1.3. Computational cost Our results obtained in Sections 3.1.1 and 3.1.2 indicate that the TRT

CR IP T

and MRT collision operators in combination with the MR boundary scheme results in the most accurate solutions. Given that the highly optimized split

kernel of the TRT implementation in waLBerla is about as fast as the SRT model, there is no justification to use the MRT scheme, as in the optimized version the computation is about a factor of two more expensive. Although

AN US

control of the higher order moments in high Reynolds numbers is only possible with the MRT collision scheme, the TRT collision model can also be used as it can control anti-symmetric moments.

While the identification of a favorable collision operator is clearly apparent, the choice of the right boundary treatment is less obvious. It should be noted

M

that, especially for a DNS of turbulent flow, where one has to provide enough resolution to resolve the smallest dissipative scales, massive parallelism cannot

ED

be avoided. Unfortunately, the more accurate boundary handling methods must access LBM nodes from up to three cells away from the particle boundary

PT

cell, see Section 2.2. In a massively parallel setting, such physical boundary points may be near a logical processor boundary that has been created by the

CE

domain partitioning for a distributed memory machine [10]. In waLBerla this situation is handled by extra ghost-layer exchanges, i.e., by communicating a significantly extended set of distribution functions between neighboring

AC

processors along the sub-domain boundaries. The data dependencies also cause additional synchronization overhead in the parallel execution. When saving both the pre- and post-collision PDF, the nodes necessary for the interpolation along boundaries can be reduced by one. However, this still 27

ACCEPTED MANUSCRIPT

results in an additional communication for the extra saved field which again reduces the efficiency of the code. Even in a weak scaling scenario with only few spheres embedded in the flow and where large sub-domains can

CR IP T

be handled on each processor, we have measured that the more advanced

boundary schemes may already cause a slowdown of between 10% and 50% as compared to the SBB method.

However, for our computational objectives, more complex geometric con-

AN US

figurations must be considered. In these cases, the communication of data

quickly becomes the critical bottleneck. For our application, it is particularly important to save on communication, as for production runs many time steps are necessary to find the desired solution. Thus, to keep the computing times acceptable, only moderately large blocks of LBM nodes can be assigned to

M

each processor. In such a strong-scaling regime, the ratio between computation and communication can easily become a bottleneck despite the highly

ED

optimized waLBerla program. This holds, even when just one layer of ghost nodes is exchanged in every time step and it is essential that further

PT

communication is avoided wherever possible. The efficiency of the LBM both in weak and strong scaling situations is analyzed in detail in [12].

CE

As a good compromise between the cost (including communication on parallel computers) and numerical accuracy, we choose here the TRT-CLI scheme for large pore-scale problems. It leads to a reasonably good accu-

AC

racy, has no viscosity-dependence, and needs less communication than the MR scheme. Hence, using the slightly less accurate but significantly better parallelizable method results in a considerable reduction in run-time. A more concise quantitative analysis of the performance trade-offs will require the 28

ACCEPTED MANUSCRIPT

Table 2: Dimensionless drag force of steady laminar flow at Rep = 46 and χ = 0.9 for the SBB, CLI, and MR boundary schemes. All of the simulations are conducted with the TRT collision operator and the D3 Q27 lattice model.

CLI

MR

36

26.546

22.7539

22.3431

72

24.6073

22.6939

22.5754

144

23.6916

22.6731

22.6392

288

23.2805

22.6681

22.6535

1.17

1.79

1.86

AN US

α

CR IP T

SBB

D/∆x

development of sophisticated parallel performance models analogous to the techniques of Godenschwager et al. [12], Feichtinger et al. [58], but this is beyond the scope of the current paper. These articles study the program op-

M

timization for the waLBerla software framework on various supercomputer architectures for simulations with simple boundary conditions.

ED

3.2. Pore-scale simulation at Re > 1 3.2.1. Convergence rate

PT

In this section, we investigate the consistency of the boundary condition schemes under steady laminar and weakly turbulent flows. To examine the

CE

convergence rate, we use the low Reynolds results from the previous sections to choose a first-, second- and third-order boundary scheme, namely the SBB,

AC

CLI, and MR, respectively. For all of the evaluation in this section, we use the TRT collision operator with the D3 Q27 velocity set and fix the magic number to 3/16. For the laminar steady flow, we simulate the fluid flow through the simple sphere pack at Rep = 46. In this regime, inertia of the fluid flow plays an 29

ACCEPTED MANUSCRIPT

Table 3:

Dimensionless drag force of weakly turbulent flow at Rep ≈ 315 and χ = 0.9

for the SBB, CLI, and the MR boundary schemes without correction term. All of the simulations are conducted with the TRT collision operator and the D3 Q27 lattice model.

SBB

CLI

MR

102.6

64.8742

64.8312

64.8332

144

61.6376

63.9441

63.8481

202.5

62.6216

63.3753

63.2392

288

64.0017

63.0316

63.7219

-

1.58

-

AN US

α

CR IP T

D/∆x

important role and the flow is non-linear. The results of the dimensionless drag is presented in Table 2 for different boundary conditions. The SBB and CLI boundary schemes converge with first- and second-order, respectively,

M

as in the Stokes regime, while the convergence rate of the MR drops to second-order. This is because of the second-order accuracy of the LBM in

ED

bulk fluids in the non-linear regime. The results also show that the SBB boundary scheme with the highest resolution conducted here differs from the

PT

CLI and MR schemes by 2.7%.

Here we investigate the capability of the LBM in the unsteady weakly

CE

turbulent flow at Rep ≈ 315. We conduct the simulation for a sufficiently long time such that the relative difference of the time-averaged drag force is

below 0.1%. The results of the dimensionless drag force for various boundary

AC

schemes are shown in Table 3. Since the flow with this Reynolds number is fluctuating, we averaged the dimensionless drag force for 50 L/UD for each case. We use the diffusive scaling to keep the compressibility error in the same scale as the discretization error. Increasing the resolution beyond the 30

ACCEPTED MANUSCRIPT

necessary one for a stable simulation, the drag force computed in different boundary conditions does not change significantly. The MR boundary scheme in this test exhibits an instability that is associated with the correction

CR IP T

term. Since the MR model without correction term already yields a second order boundary condition and is stable, we will thus present the result of the MR model in this section without the correction term. The computed

convergence rate of the CLI based on the three point values is 1.58, while for

AN US

other boundary schemes an asymptotic convergence rate cannot be computed. The results suggest that for the simulation of turbulent flow in porous media, where high enough resolution is needed to capture the small vortices effects and reach a stable simulation, increasing the resolution does not significantly change the computed drag force. However, the boundary layer

M

effects at high Reynolds numbers should be investigated closely for different boundary schemes. In the above tests, the SBB results differ by at most 2%

ED

from the CLI and MR results. 3.2.2. Lattice model effect

PT

In this section, we evaluate the effect of the lattice models on turbulent flow and compare the results of the simulations conducted with two different

CE

lattice models of the D3 Q19 and D3 Q27 . The flow dynamics are obtained for touching spheres arranged in a simple cubic array. The magic number for the

AC

simulation is fixed at 3/16, and the TRT collision operator in combination

with the CLI boundary scheme is used. To simulate the touching sphere array (χ = 1.0), the flow field is initialized

with the result of the simulation using the SBB scheme and is continued with the CLI boundary scheme. This method of initialization helps to avoid 31

ACCEPTED MANUSCRIPT

the instability arising from the narrow gaps of the pores. For the boundary nodes for which the neighbor fluid cell does not exist, the SBB scheme is used instead of the CLI. By handling the boundary in this way, the convergence

CR IP T

rate will decrease but the approach is still stable, especially for unsteady flow. To calculate the time average Darcy velocity in the turbulent flow, we average the volume averaged velocity for a period of 100–150 L/UD flow through times.

AN US

In order to resolve the boundary layer, Tenneti et al. [59] point out

that the grid resolution in the DNS approaches should increase properly with increasing Reynolds number. The boundary layer thickness can be p approximated as δb ∼ D/ Rep . To check the grid-independency of the

turbulent flow computations, we conducted the simulation at Rep = 1045

M

with different resolutions. The results are presented in Table 4 which depict that for δb > 5.5, the time-averaged dimensionless drag force (Cd ) is not

ED

changing significantly. For the following simulation, we use 252 lattice cells as sphere diameter.

PT

Figure 7 shows the instantaneous velocity field for the highest Reynolds number of 2477 that is considered in this study. The time evolution of the

CE

dimensionless drag force is also presented in this figure which shows the chaotic behavior of the flow. A more detailed analysis of the time-averaged results gives that the same pressure drop will lead to different Darcy velocities

AC

for different lattice models. For D3 Q27 , the average velocity is 2.8% less than the D3 Q19 model and leads to the Reynolds number Rep = 2477, while for the D3 Q19 model Rep = 2547. On the other hand, the drag force calculated by these two models differs by only 0.2%. Taking into account the dimensionless 32

ACCEPTED MANUSCRIPT

Table 4: Dimensionless drag force of turbulent flow at Rep ≈ 1045 and χ = 1.0, and also

the approximate boundary layer thickness δb in lattice unit. The simulations are conducted

Rep

Cd

δb

150

1045

356.1

4.63

180

1043

355.2

5.57

252

1045

350.2

7.79

360

1048

348.9

11.12

AN US

D/∆x

CR IP T

with the TRT collision operator, the CLI boundary scheme and the D3 Q19 lattice model.

drag coefficient, the relative difference is 2.98%.

F/( AU2D)

ED

M

4.5

4

3.5 3

2.5

0.46

0.47 2

t /r

0.48

CE

PT

0.45

(a)

(b)

AC

Figure 7: Instantaneous velocity field contour of turbulent flow of touching spheres in simple periodic array arrangement at Rep = 2477 simulated with D3 Q27 (left), time series of the drag force simulated with different Rep (right). The simulations were conducted

with the TRT-CLI and Λ = 3/16 with the same simulation parameters.

Figure 8 shows the relative differences between the D3 Q19 and D3 Q27 33

0.49

1

10

0

10-1

10-2

10

-3

10

-4

10

-4

10

-3

10

-2

10

-1

10

Rep

0

10

1

10

2

10

3

CR IP T

10

AN US

Relative difference %

ACCEPTED MANUSCRIPT

Figure 8: Relative difference of the dimensionless drag force |CD,reduced /CD,f ull − 1|% of

the two lattice models, D3 Q19 and D3 Q27 , reduced stencil and full stencil, respectively. The simulations were conducted with the same setup using the TRT-CLI and Λ = 3/16.

models for different Reynolds numbers of the flow. Here we can see that for

M

Reynolds numbers below 100, the violation of the rotational invariance by the D3 Q19 model does not significantly affect the dimensionless drag force.

ED

In the regime beyond that Reynolds number the rotational invariance starts

PT

to exert an effect on the physical behavior of the flow. 3.3. Friction factor for the touching spheres

CE

At Reynolds numbers beyond the Stokes flow, or Darcy regime, Forchheimer [60] observed a non-linear deviation from this relation, for which he

AC

proposed the addition of a quadratic term to the Darcy equation as follows −1/2

−1 ∇P = −µKD U − CF KD

ρ|U|U,

(23)

where CF is the constant inertial factor, which mainly depends on the flow path and is usually found experimentally. A non-dimensional form of Eq. 34

ACCEPTED MANUSCRIPT

(23) can be provided in the form of a friction factor, i.e., FK =

1 + F, ReK

(24)

CR IP T

√ √ where FK = (∆p/L) KD /ρU 2 , and ReK = ρU KD /µ.

Figure 9(a) presents the results of the friction factor for different Reynolds numbers of flow through touching spheres. As can be seen, for ReK < 1 the

results are inversely proportional to ReK which indicates the Darcy regime.

AN US

The deviation starts at ReK = 0.5 (Rep = 11) where the inertial effect of the

flow is starting to become dominant. The unsteady laminar flow starts when ReK > 6.9 (Rep > 139) and the transition regime from unsteady to chaotic flow can be identified in the range 18 < ReK < 28 (369 < Rep < 582). By further increasing the Reynolds number, we can see the turbulent flow regime

M

in which the friction factor converges to a nearly constant value of 0.16 for this porous geometry. The obtained friction factor is in line with the recent

ED

experimental results presented by Huang et al. [61] for Rep < 1000. In figure 9(a), the results of the simulation with the D3 Q19 lattice models

PT

are also presented. It is not easy to distinguish the difference between these two models, as the application of both models results in a similar trend for the

CE

friction factor versus Reynolds number. Figure 9(b) shows the friction factor results in the transition regime. It can be seen that when Rep > 139, the same pressure gradient in the boundary can result in different Darcy velocities

AC

for the two models, with the D3 Q19 , giving the highest Darcy velocity. The maximum relative difference in our simulation is below 3% which occurs with the highest Reynolds number.

35

10-2

102

1 0.8 0.6

D3Q27 D3Q19 1/ReK

FK

FK=0.16

0

10-2 10-2 100 ReK

102

M

10-4

Rep 102

D3Q19 1/ReK

FK=0.16

0.2

100

101 ReK

102

(b)

ED

(a)

103

D3Q27

0.4

FK

4

102 10

104

AN US

10

Rep 100

CR IP T

ACCEPTED MANUSCRIPT

Figure 9: Permeability-based friction factor versus permeability-based Reynolds numbers

PT

and particle diameter-based Reynolds numbers. a) whole range of Reynolds number considered in this study, b) a zoom view representing the transition regime. The simulations were performed for touching spheres in simple array arrangement. The plot shows the results

CE

of the D3 Q19 and D3 Q27 models when the simulations were conducted with TRT-CLI

AC

method and Λ = 3/16.

36

ACCEPTED MANUSCRIPT

4. Turbulent flow over a porous bed By using the results of the evaluation which is presented in the previous

CR IP T

sections, we present a pore-scale simulation for semi-realistic porous structure consisting spherical particles arranged in a random sphere packing. To

construct such a semi-real porous structure, the in-house physics engine

framework [62, 63] is used to simulate the rigid body interaction between spherical bodies. In this simulation, several spherical particles with the same

AN US

diameter are initialized such that they fall under their own weight and collect

at the bottom of the bed. After the particles come to rest, the geometry is imported as a permeable bed in the waLBerla framework for fluid flow

PT

ED

M

simulation.

(b)

CE

(a)

Figure 10: (a) Packing structure with mono-sized spherical particles, (b) Instantaneous

AC

velocity field contour at Rep = 65. The simulation was conducted with the TRT-CLI and Λ = 3/16.

Figure 10 illustrates the packing structure containing 592 mono-sized

spherical particles in a domain with periodic stream- and span-wise flow. At the top and bottom of the domain no-slip conditions are applied so that part 37

ACCEPTED MANUSCRIPT

0.0045 0.004 0.0035

0.0025 0.002 0.0015 0.001 0.0005 20

40

60

80 100

AN US

Rep

CR IP T

K/D

2

0.003

Figure 11: Dimensionless permeability versus Reynolds number.

of the spheres are cut out of the domain. The diameter of the spheres is set to 40 lattice units, and we apply the TRT model of collision and the CLI

M

scheme for the wall boundary condition using the D3 Q19 model. The flow field is shown in the right picture with two 2D slices. The permeability is

ED

calculated at different Reynolds number in laminar flow that is presented in Fig. 11.

A free flow over a sphere bed, similar to [11] is also simulated to show

PT

a large scale simulation of a complex geometry. To resolve all scales, the porous part of the domain is refined using three different levels of refinement.

CE

Periodic boundary conditions have been used in the stream-wise and span-wise direction while the top boundary is a free slip boundary. The simulation is

AC

executed on the LIMA supercomputer [64], with 64 nodes and 24 cores on each node. With this configuration it takes roughly 24 hours of computation for the whole simulation when using a spatial resolution of 9 × 107 cells. Fig.

12 shows the contours of the flow velocity of the simulation after 5 × 106 38

AN US

CR IP T

ACCEPTED MANUSCRIPT

M

Figure 12: Turbulent flow over a porous bed: From left to right, the block structure, a 2D velocity contour plot and a 2D cut through the fine lattice.

ED

time-steps. The skeleton grid resulting from the block structure as it is used in waLBerla is shown on the left hand side of the figure to indicate the

PT

static refinement of the meshes. The grids are also displayed on the right hand side of the contour to show the grid resolution. We point out that

CE

the results are stored after coarsening the grids throughout the domain by a factor of 8 to keep the size of the outputs reasonable for post-processing and

AC

visualization.

5. Conclusions In this study, the lattice Boltzmann method is evaluated for the three-

dimensional simulation of the flow through periodic simple sphere pack. A 39

ACCEPTED MANUSCRIPT

periodic-pressure drop boundary condition has been adapted to drive the flow. This article studies different collision operators and various types of boundary conditions. Two different lattice models, namely, D3 Q19 and D3 Q27

CR IP T

have been examined for a wide range of Reynolds numbers of the flow.

The evaluation at low Reynolds numbers shows that the periodic pressure

boundary can accurately simulate porous media flows. The convergence study in the Stokes regime is investigated by computing the drag force with

AN US

various sizes of spheres while the solid volume fraction is fixed. The results show that the simple bounce-back (SBB) boundary scheme converges with

first order, the linear interpolation bounce-back (LIBB) converges between first and second-order, the quadratic interpolation bounce-back (QIBB) and the interpolation-extrapolation bounce-back (IEBB) converge with second-

M

order and the multi-reflection method (MR) converges with third-order. The SRT collision operator produces a viscosity-dependent permeability for all

ED

boundary schemes, while the MRT and the TRT models can produce a viscosity independent permeability for the SBB, CLI, and MR boundary

PT

conditions. We also investigate the effect of the displacement of the center of the sphere on the drag force to show the source of scattering results at

CE

low resolution. The results show that, the use of the interpolating boundary offers smooth results, while the SBB is highly dependent on the number of solid cells representing the sphere.

AC

Since the TRT in its optimized version results in a similar run-time to the

SRT collision model, and MRT, which is computationally more expensive, does not significantly improve the accuracy, the TRT collision scheme has been used for the simulation of high Re number flows. 40

ACCEPTED MANUSCRIPT

Investigating the convergence of various boundary conditions with the TRT collision scheme in inertial flow show that the convergence rate of the MR boundary scheme decreases to second order accuracy while the CLI

CR IP T

and SBB maintain the same second-order convergence rate as in the Stokes

regime. The degradation in the convergence rate of the MR results from the

second-order inheritance of the LBM for the bulk flow in the inertial regime. Evaluating the boundary schemes in weakly turbulent flow at Rep ≈ 315

AN US

shows that the CLI converges with between the first and second order, while

calculating the convergence rate of the other boundary schemes is not possible because of the scattering results. The drag coefficient calculated with the SBB boundary condition differs by at most 2% compared to the results calculated with the CLI and MR boundary schemes.

M

Based on this simulation, the application of the CLI combined with the TRT can be considered as a good balance between accuracy and efficiency.

ED

Comparing the two lattice models with full and reduced stencils in turbulent flow at Rep = 2477, reveals the clear differences in the flow field caused by the

PT

lack of isotropy in the reduced scheme. However, porous media properties such as permeability and the drag force show less than 3% difference between these

CE

two lattice models. For flow at Rep < 100, the difference is negligible, and that beyond this Reynolds number, the difference increases with increasing Reynolds number.

AC

We also examined the ability of the model and the framework with two

large scale simulations, such as flow through random packing of spherical particles in the laminar regime, and the turbulent flow over a permeable bed.

The results demonstrate the capability of the model for wide range of flow 41

ACCEPTED MANUSCRIPT

regimes in complex flows.

CR IP T

Acknowledgements Financial support from the German Research Foundation (DFG, Project WO 671/11-1) and also the International Graduate School of Science and

Engineering (IGSSE) of the Technische Universität München for research training group 6.03 are gratefully acknowledged. Our special thanks goes to

AN US

Simon Bogner and Dominik Bartuschat for many fruitful discussions and the

waLBerla primary authors Florian Schornbaum, Christian Godenschwager and Martin Bauer for their essential help with the optimization of the code. UR and BW wish to thank the Institute of Mathematical Sciences at National

M

University of Singapore for their hospitality. References

ED

[1] S. Whitaker, Flow in porous media I: A theoretical derivation of Darcy’s law, Transport in porous media 1 (1986) 3–25.

PT

[2] S. Whitaker, The Forchheimer equation: a theoretical development,

CE

Transport in Porous media 25 (1996) 27–61. [3] E. Fattahi, M. Farhadi, K. Sedighi, H. Nemati, Lattice boltzmann

AC

simulation of natural convection heat transfer in nanofluids, International

Journal of Thermal Sciences 52 (2012) 137 – 144.

[4] S. Geller, S. Uphoff, M. Krafczyk, Turbulent jet computations based on MRT and cascaded lattice Boltzmann models, Computers and Mathematics with Applications 65 (2013) 1956 – 1966. 42

ACCEPTED MANUSCRIPT

[5] M. Singh, K. Mohanty, Permeability of spatially correlated porous media, Chemical Engineering Science 55 (2000) 5393 – 5403.

CR IP T

[6] J. Bernsdorf, G. Brenner, F. Durst, Numerical analysis of the pressure

drop in porous media flow with lattice Boltzmann (BGK) automata, Computer Physics Communications 129 (2000) 247 – 255.

[7] J. Kim, J. Lee, K.-C. Lee, Nonlinear correction to Darcy’s law for a flow through periodic arrays of elliptic cylinders, Physica A: Statistical

AN US

Mechanics and its Applications 293 (2001) 13 – 20.

[8] A. Peters, S. Melchionna, E. Kaxiras, J. Lätt, J. Sircar, M. Bernaschi, M. Bison, S. Succi, Multiscale simulation of cardiovascular flows on the IBM Bluegene/P: Full heart-circulation system at red-blood cell resolu-

M

tion, in: Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis,

ED

IEEE Computer Society, Washington DC, USA, 2010, pp. 1–10. [9] M. Schönherr, K. Kucher, M. Geier, M. Stiebler, S. Freudiger,

PT

M. Krafczyk, Multi-thread implementations of the lattice Boltzmann method on non-uniform grids for CPUs and GPUs, Computers and

CE

Mathematics with Applications 61 (2011) 3730–3743.

AC

[10] C. Feichtinger, S. Donath, H. Köstler, J. Götz, U. Rüde, WaLBerla: HPC software design for computational engineering simulations, Journal of Computational Science 2 (2011) 105 – 112.

[11] E. Fattahi, C. Waluga, B. Wohlmuth, U. Rüde, Large scale lattice boltzmann simulation for the coupling of free and porous media flow, 43

ACCEPTED MANUSCRIPT

in: High Performance Computing in Science and Engineering: Second International Conference, HPCSE 2015, Soláň, Czech Republic, May 25-28, 2015, Revised Selected Papers, Springer International Publishing,

CR IP T

Cham, 2016, pp. 1–18.

[12] C. Godenschwager, F. Schornbaum, M. Bauer, H. Köstler, U. Rüde, A framework for hybrid parallel flow simulations with a trillion cells in

complex geometries, in: Proceedings of SC13: International Conference

AN US

for High Performance Computing, Networking, Storage and Analysis, SC ’13, ACM, New York, NY, USA, 2013, pp. 35:1–35:12.

[13] S. Bogner, S. Mohanty, U. Rüde, Drag correlation for dilute and moderately dense fluid-particle systems using the lattice Boltzmann method,

M

International Journal of Multiphase Flow 68 (2015) 71 – 79. [14] J. Götz, K. Iglberger, M. Stürmer, U. Rüde, Direct numerical simulation

ED

of particulate flows on 294912 processor cores, in: Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing,

PT

Networking, Storage and Analysis, SC ’10, IEEE Computer Society, Washington, DC, USA, 2010, pp. 1–11.

CE

[15] D. Bartuschat, D. Ritter, U. Rüde, Parallel multigrid for electrokinetic simulation in particle-fluid flows, in: High Performance Computing and

AC

Simulation (HPCS), 2012 International Conference on, pp. 374 –380.

[16] A. T. White, C. K. Chong, Rotational invariance in the three-dimensional lattice Boltzmann method is dependent on the choice of lattice, Journal of Computational Physics 230 (2011) 6367–6378. 44

ACCEPTED MANUSCRIPT

[17] G. Mayer, G. Házi, Direct numerical and large eddy simulation of longitudinal flow along triangular array of rods using the lattice Boltzmann method, Mathematics and Computers in Simulation (MATCOM) 72

CR IP T

(2006) 173–178.

[18] S. K. Kang, Y. A. Hassan, The effect of lattice models within the lattice

Boltzmann method in the simulation of wall-bounded turbulent flows, Journal of Computational Physics 232 (2013) 100 – 117.

AN US

[19] G. Silva, V. Semiao, Truncation errors and the rotational invariance

of three-dimensional lattice models in the lattice Boltzmann method, Journal of Computational Physics 269 (2014) 259 – 279. [20] R. W. Nash, H. B. Carver, M. O. Bernabeu, J. Hetherington, D. Groen,

M

T. Krüger, P. V. Coveney, Choice of boundary condition for latticeBoltzmann simulation of moderate-reynolds-number flow in complex

ED

domains, Physical Review E 89 (2014) 023303. [21] T. Lee, C.-L. Lin, An eulerian description of the streaming process in

PT

the lattice Boltzmann equation, Journal of Computational Physics 185

CE

(2003) 445 – 471.

[22] G. Eitel-Amor, M. Meinke, W. Schröder, A lattice-Boltzmann method

AC

with hierarchically refined meshes, Computers & Fluids 75 (2013) 127 –

139.

[23] A. Fakhari, T. Lee, Numerics of the lattice Boltzmann method on nonuniform grids: Standard LBM and finite-difference LBM, Computers & Fluids 107 (2015) 205 – 213. 45

ACCEPTED MANUSCRIPT

[24] M. Bouzidi, M. Firdaouss, P. Lallemand, Momentum transfer of a Boltzmann-lattice fluid with boundaries, Physics of Fluids (1994-present)

CR IP T

13 (2001) 3452–3459. [25] R. Mei, W. Shyy, D. Yu, L.-S. Luo, Lattice Boltzmann method for

3-D flows with curved boundary, Journal of Computational Physics 161 (2000) 680 – 699.

[26] I. Ginzburg, D. d’Humières, Multireflection boundary conditions for

AN US

lattice Boltzmann models, Physical Review E 12 (2003) 666–614.

[27] C. Pan, L.-S. Luo, C. T. Miller, An evaluation of lattice Boltzmann schemes for porous medium flow simulation, Computers & Fluids 35 (2006) 898 – 909.

M

[28] I. Ginzburg, F. Verhaeghe, D. d’Humières, Two-Relaxation-Time lattice Boltzmann Scheme: About Parametrization, Velocity, Pressure and

ED

Mixed Boundary Conditions, Communications in Computational Physics 3 (2008) 427+.

PT

[29] S. Khirevich, I. Ginzburg, U. Tallarek, Coarse-and fine-grid numerical behavior of MRT/TRT lattice Boltzmann schemes in regular and random

CE

sphere packings, Journal of Computational Physics 281 (2015) 708–742.

AC

[30] S. Chen, G. D. Doolen, Lattice Boltzmann method for fluid flows, Annual Review of Fluid Mechanics 30 (1998) 329–364.

[31] J. Zhang, D. Y. Kwok, Pressure boundary condition of the lattice Boltzmann method for fully developed periodic flows, Physical Review E 73 (2006) 047702. 46

ACCEPTED MANUSCRIPT

[32] S. H. Kim, H. Pitsch, A generalized periodic boundary condition for lattice Boltzmann method simulation of a pressure driven flow in a

CR IP T

periodic geometry, Physics of Fluids 19 (2007) –. [33] O. Gräser, A. Grimm, Adaptive generalized periodic boundary conditions for lattice Boltzmann simulations of pressure-driven flows through confined repetitive geometries, Physical Review E 82 (2010) 016702.

[34] R. Benzi, S. Succi, M. Vergassola, The lattice Boltzmann equation:

AN US

theory and applications, Physics Reports 222 (1992) 145–197.

[35] S. Succi, The Lattice-Boltzmann Equation, Oxford university press, Oxford, 2001.

M

[36] P. L. Bhatnagar, E. P. Gross, M. Krook, A model for collision processes in gases. I. Small amplitude processes in charged and neutral one-component

ED

systems, Physical Review 94 (1954) 511–525. [37] X. He, L.-S. Luo, Lattice Boltzmann model for the incompressible Navier–

PT

Stokes equation, Journal of Statistical Physics 88 (1997) 927–944. [38] I. Ginzburg, Consistent lattice Boltzmann schemes for the Brinkman

CE

model of porous flow and infinite Chapman-Enskog expansion, Physical

AC

Review E 77 (2008) 066704.

[39] I. Ginzburg, P. M. Adler, Boundary flow condition analysis for the three-dimensional lattice Boltzmann model, J. Phys. II France 4 (1994)

191–214.

47

ACCEPTED MANUSCRIPT

[40] D. d’Humières, Multiple–relaxation–time lattice Boltzmann models in three dimensions, Philosophical Transactions of the Royal Society of (2002) 437–451.

CR IP T

London. Series A: Mathematical, Physical and Engineering Sciences 360

[41] I. Ginzburg, Lattice Boltzmann modeling with discontinuous collision components: Hydrodynamic and advection-diffusion equations, Journal of Statistical Physics 126 (2007) 157–206.

AN US

[42] I. Ginzburg, F. Verhaeghe, D. d’Humières, Study of simple hydrodynamic solutions with the two-relaxation-times lattice-Boltzmann scheme, Communications in Computational Physics 3 (2008) 519+. [43] D. Yu, R. Mei, L.-S. Luo, W. Shyy, Viscous flow computations with the

ED

39 (2003) 329 – 367.

M

method of lattice Boltzmann equation, Progress in Aerospace Sciences

[44] P. Lallemand, L.-S. Luo, Lattice Boltzmann method for moving bound-

PT

aries, Journal of Computational Physics 184 (2003) 406 – 421. [45] B. Chun, A. J. C. Ladd, Interpolated boundary condition for lattice

CE

Boltzmann simulations of flows in narrow gaps, Physical Review E 75

(2007) 066705.

AC

[46] O. Filippova, D. Hänel, Grid refinement for lattice-BGK models, Journal of Computational Physics 147 (1998) 219–228.

[47] L. Talon, D. Bauer, N. Gland, S. Youssef, H. Auradou, I. Ginzburg, Assessment of the two relaxation time lattice-Boltzmann scheme to 48

ACCEPTED MANUSCRIPT

simulate stokes flow in porous media, Water Resources Research 48 (2012) n/a–n/a. W04526.

CR IP T

[48] A. Dupuis, From a Lattice Boltzmann model to a parallel and reusable

implementation of a virtual river, Ph.D. thesis, University of Geneva, 2002.

[49] N. Martys, X. Shan, H. Chen, Evaluation of the external force term in the

AN US

discrete Boltzmann equation, Physical Review E 58 (1998) 6855–6857.

[50] J. M. Buick, C. A. Greated, Gravity in a lattice Boltzmann model, Physical Review E 61 (2000) 5307–5320.

[51] Z. Guo, C. Zheng, B. Shi, Discrete lattice effects on the forcing term in

M

the lattice Boltzmann method, Physical Review E 65 (2002) 046308. [52] A. Mohamad, A. Kuzmin, A critical evaluation of force term in lattice

ED

Boltzmann method, natural convection problem, International Journal of Heat and Mass Transfer 53 (2010) 990 – 996.

PT

[53] H. Huang, M. Krafczyk, X. Lu, Forcing term in single-phase and shanchen-type multiphase lattice Boltzmann models, Physical Review E 84

CE

(2011) 046710.

AC

[54] H. Darcy, Recherches expérimentales relatives au mouvement de l’eau dans les tuyaux, Mallet-Bachelier, Paris (1857).

[55] A. Sangani, A. Acrivos, Slow flow through a periodic array of spheres, International Journal of Multiphase Flow 8 (1982) 343 – 360.

49

ACCEPTED MANUSCRIPT

[56] D. Bartuschat, U. Rüde, Parallel multiphysics simulations of charged particles in microfluidic flows, Journal of Computational Science 8 (2015)

CR IP T

1 – 19. [57] P. M. Adler, Porous media : geometry and transports, ButterworthHeinemann Limited, 1992.

[58] C. Feichtinger, J. Habich, H. Köstler, U. Rüde, T. Aoki, Performance modeling and analysis of heterogeneous lattice boltzmann simulations

AN US

on cpu–gpu clusters, Parallel Computing 46 (2015) 1–13.

[59] S. Tenneti, R. Garg, S. Subramaniam, Drag law for monodisperse gas–solid systems using particle-resolved direct numerical simulation of flow past fixed assemblies of spheres, International Journal of Multiphase

M

Flow 37 (2011) 1072 – 1092.

ED

[60] P. Forchheimer, Wasserbewegung durch Boden, Z. Ver. Deutsch. Ing 45 (1901) 1782–1788.

PT

[61] K. Huang, J. Wan, C. Chen, L. He, W. Mei, M. Zhang, Experimental investigation on water flow in cubic arrays of spheres, Journal of

CE

Hydrology 492 (2013) 61 – 68.

[62] K. Iglberger, U. Rüde, Massively parallel rigid body dynamics simulations,

AC

Computer Science-Research and Development 23 (2009) 159–167.

[63] T. Preclik, U. Rüde, Ultrascale simulations of non-smooth granular dynamics, Computational Particle Mechanics (2015) 1–24.

50

ACCEPTED MANUSCRIPT

[64] R. R. Erlangen, Lima cluster, https://www.rrze.fau.de/dienste/

AC

CE

PT

ED

M

AN US

CR IP T

arbeiten-rechnen/hpc/systeme/lima-cluster.shtml, 2016.

51