SAP Project:
Large-Scale Numerical Simulations via Parallel Computing: An Application Using Discrete Element Modeling of Granular
Youssef M.A. Hashash
Civil and Environmental Engineering
University of Illinois at Urbana-Champaign
Research Objectives
SCIENTIFIC GOALS
Over the past several years we have developed advanced simulation codes,
including DBLOKS3D, a code that uses the discrete element method to simulate
the flow of granular particles. The code can be used to simulate large-scale
particle movement, including landslides, avalanches, and bulk material
handling. We wish to simulate very large assemblages of particles, up to
several million, to model more realistically the movement of particulate
material in problems such as soil-machine interaction, material testing of
granular materials, landslides, and the behavior of lunar and Martian soils.
COMPUTATIONAL GOALS AND METHODS
The discrete element method (DEM) was first introduced by Cundall
for simulations of jointed rocks in 1971. Since then, significant
progress has been made in development of DEM methodologies and
applications including simulations of landslides, ice flows, mine
dragline excavation, particle mixing, and silo filling. In all of
these codes, the algorithms that identify adjacent particles and detect
contacts are the most computationally demanding components.
Over the past four years we have been developing DBLOKS3D, a new DEM code for the simulation of polyhedral particles. The code has the following distinguishing features:
- A library of polyhedral particles that includes particles scanned from images of gravel particles used in railroad ballast.
- An advanced integration algorithm for the equations of motion.
- A new two-level neighbor search algorithm to identify particles that are potentially in contact. This algorithm is faster and more efficient than other available algorithms.
- A new contact detection algorithm (shortest link algorithm) that is used to rapidly identify the common plane between two particles that are potentially in contact. The algorithm is up to 20 times faster than the original common plane algorithm proposed by Cundall.
- A new bi-linear contact damping law to simultaneously capture larger-scale relative particle movement (flowing particles) as well as small-scale relative particle movement.
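To illustrate what a two-level neighbor search can look like, the sketch below bins particle centers into a uniform grid (first level) and then keeps only those pairs whose bounding spheres overlap (second level). This is a generic textbook scheme, not the DBLOKS3D algorithm, whose details are not given here. The function name `candidate_pairs`, the choice of cell size, and the bounding-sphere test are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product
import math


def candidate_pairs(centers, radii, cell_size):
    """Return sorted index pairs (i, j), i < j, whose bounding spheres overlap.

    Assumes cell_size >= 2 * max(radii), so that two overlapping spheres
    always lie in the same grid cell or in adjacent grid cells.
    """
    # Level 1: bin each particle center into a uniform grid cell.
    grid = defaultdict(list)
    for i, (x, y, z) in enumerate(centers):
        key = (math.floor(x / cell_size),
               math.floor(y / cell_size),
               math.floor(z / cell_size))
        grid[key].append(i)

    pairs = []
    for (cx, cy, cz), members in grid.items():
        # Visit the cell itself and its 26 neighbors.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            # .get() avoids inserting empty cells while iterating the grid.
            for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
                for i in members:
                    if i >= j:
                        continue  # each unordered pair is tested exactly once
                    # Level 2: cheap bounding-sphere overlap test.
                    if math.dist(centers[i], centers[j]) <= radii[i] + radii[j]:
                        pairs.append((i, j))
    return sorted(pairs)
```

For scale, an all-pairs check on 45,000 particles would need roughly one billion pair tests per step (45,000 × 44,999 / 2), which is why a coarse first pass of this kind is worthwhile before any detailed polyhedral contact test.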
POTENTIAL BENEFITS
The ability to perform large-scale, realistic simulations of particulate
material will have a significant impact on our ability to understand
the behavior of these large assemblages in problems ranging from the
interaction of earthmoving equipment with soil, to avalanche flow, to the
interaction of lunar and Martian soils with proposed permanent bases for
human exploration and habitation.
COMPUTATIONAL APPROACH
Despite significant algorithmic enhancements in DBLOKS3D, the code
remains computationally demanding. Simulating 1 sec (real time) of motion
for 45,000 particles requires 14 hrs of CPU time on a 3.2 GHz Pentium.
During the Fall of 2004 we ported DBLOKS3D to the SGI ALTIX machine to
take advantage of its multiple processors and the OpenMP environment to
increase the speed of the code. We successfully developed a single version
of DBLOKS3D that can be compiled in Windows, LINUX and SGI ALTIX Shared
Memory OpenMP environments. However, the speedup achieved was limited: it
did not exceed a factor of 4 even when more than 4 CPUs were used, which is
insufficient given that large-scale simulations with millions of particles
are envisioned. This result, however, was obtained using very simple
techniques to parallelize the code. The computational demands of the DEM
code and the limited speedup on available parallel systems pose significant
challenges to the use of DBLOKS3D for large-scale problems with more
realistic numbers of particles, which can easily exceed millions. There is
a need to explore new parallelization frameworks that significantly improve
the speed of DBLOKS3D on multi-processor systems.
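One way to read the observed speedup ceiling is through Amdahl's law, which bounds the speedup by the fraction of runtime that is parallelized. The sketch below is a back-of-the-envelope illustration, not a measurement. It assumes the ceiling of 4 comes entirely from unparallelized code, which would imply a parallel fraction of 0.75; load imbalance or memory contention could equally be responsible. It also extrapolates the 14-hour figure under the assumption that cost grows linearly with particle count, which may be optimistic. All function names are hypothetical.

```python
def amdahl_speedup(parallel_fraction, n_cpus):
    """Ideal speedup when only `parallel_fraction` of the runtime scales."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cpus)


def parallel_fraction_for_ceiling(max_speedup):
    """Parallel fraction whose Amdahl limit (n_cpus -> infinity) is max_speedup."""
    return 1.0 - 1.0 / max_speedup


def extrapolated_cpu_hours(n_particles, ref_particles=45_000, ref_hours=14.0):
    """CPU hours per simulated second, ASSUMING cost linear in particle count."""
    return ref_hours * n_particles / ref_particles


if __name__ == "__main__":
    # Assumption: the reported ceiling of 4 is purely an Amdahl limit.
    p = parallel_fraction_for_ceiling(4.0)
    for n in (2, 4, 8, 16, 64):
        print(f"{n:3d} CPUs -> speedup {amdahl_speedup(p, n):.2f}")
    print(f"1e6 particles -> {extrapolated_cpu_hours(1_000_000):.0f} CPU hours")
```

Under these assumptions, one simulated second with a million particles would take about 311 CPU hours on the reference machine, and adding processors alone could never cut that by more than a factor of 4 unless a larger share of the code is parallelized.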
ACCOMPLISHMENTS AND SIGNIFICANCE
The developed code has been extensively verified and can accurately
replicate experiments of soil-tool interaction. The code has been
used to simulate a series of soil-tool interaction experiments,
which have been performed by the authors, and involve manipulation
of a pile of soil with a scaled prototype of a typical commercial
loader bucket. Results of more than 13 simulations of different
experiments reveal that DBLOKS3D can accurately replicate the forces
acting on the bucket for each simulation.
PUBLICATIONS
Erfan G. Nezami, Youssef M.A. Hashash, Dawei Zhao, and Jamshid Ghaboussi, "A
fast contact detection algorithm for 3D discrete element method", Computers
and Geotechnics, 2004, 31, 575-587.
Youssef M.A. Hashash, Erfan G. Nezami, and Jamshid Ghaboussi, "DBLOCK3D: A 3-D discrete element analysis code for simulation of granular media and soil-machine interaction", Workshop on Granular Materials in Lunar and Martian Exploration, Orlando, Florida, February 2-3, 2005.
Erfan G. Nezami, Youssef M.A. Hashash, Dawei Zhao, and Jamshid Ghaboussi, "Shortest link method for contact detection in discrete element method", submitted to Int. J. Numer. Anal. Meth. Geomech, 2005.
Status Report
March 28, 2006
The discrete element code dbloks3 simulates the motion of granular material subject to external forces. For example, the forces on the bucket of a front-loader can be obtained as it scoops the granular material. The granular material is described by polyhedral blocks with specific material properties determined from experiment. The bucket is described by a composite group of planar blocks, also with specific material properties.
The goal is to parallelize the regions of the code that dominate the runtime as the problem size is scaled up to the target size of 300,000 blocks.
The OpenMP parallelization of the dbloks3 code is being done in stages. The first stage is to parallelize the dominant region of the code: contact detection between blocks that have moved far enough to require recalculation of the ‘common-plane’ used to determine the type of contact. Because of the nature of the system being modeled, parallelizing this region requires intelligent load balancing; standard loop-parallelization or domain decomposition techniques are not adaptive enough. We are trying a work-queue/dispatch type load balancing scheme that shows promising speed-up but has run into issues with the updating of global shared structures. These issues are currently being examined by the code authors.
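The difference between static loop partitioning and a work-queue scheme can be illustrated with a small scheduling model. The sketch below is not the dbloks3 implementation: the per-block costs are made up, `static_makespan` and `work_queue_makespan` are hypothetical names, and the model ignores synchronization overhead and the shared-structure updates mentioned above. It only shows why dispatching work on demand helps when a few contact updates cost far more than the rest.

```python
import heapq


def static_makespan(costs, n_workers):
    """Finish time when the task list is split into contiguous equal-size chunks."""
    if not costs:
        return 0.0
    chunk = -(-len(costs) // n_workers)  # ceiling division
    # The slowest chunk determines when the whole loop finishes.
    return max(sum(costs[k:k + chunk]) for k in range(0, len(costs), chunk))


def work_queue_makespan(costs, n_workers):
    """Finish time when each idle worker pulls the next task from a shared queue."""
    # Min-heap of the times at which each worker becomes free.
    free_at = [0.0] * n_workers
    heapq.heapify(free_at)
    for cost in costs:
        start = heapq.heappop(free_at)  # earliest available worker takes the task
        heapq.heappush(free_at, start + cost)
    return max(free_at)


if __name__ == "__main__":
    # Made-up workload: four expensive contact updates followed by twelve cheap ones.
    costs = [8, 8, 8, 8] + [1] * 12
    print("static chunks:", static_makespan(costs, 4))
    print("work queue:   ", work_queue_makespan(costs, 4))
```

With this made-up workload on 4 workers, the static split finishes at time 32 because one worker receives all four expensive tasks, while the work queue finishes at time 11, which equals the total work divided by the number of workers. In OpenMP terms this loosely corresponds to `schedule(static)` versus `schedule(dynamic)` on a parallel loop.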
The next stage will be to parallelize the remaining regions of the code: bookkeeping, boxing, rough contact detection, contact force processing, and I/O for check-pointing and visualization.











