We now present an accurate, linear-scaling method for
calculating the kinetic energy matrix elements
of equation (3). We use a mixed space Fourier
transform approach that is applicable to any Bravais
lattice symmetry. Fourier transformation is a natural
method to adopt for this task since in a total-energy
calculation one computes other terms, such as the
electron density and the Hartree energy, using reciprocal
space techniques. This implicitly defines the basis set that
we use to be plane-waves and for consistency we should
calculate the kinetic energy using the same basis
set, i.e. using Fourier transform methods. Thus we calculate
the
term in reciprocal space, where
the Laplacian operator is easy to apply, then transform
the result back to real space and obtain the matrix
elements
by summation over grid
points (4). One way to achieve this would
be to perform a discrete FFT on each support
function
, using the periodicity of the entire
simulation cell. However, unlike the FD algorithm, the
FFT is not a local operation and the cost of applying
the Laplacian to all the support functions in this
way would be proportional to
, which
clearly does not scale linearly with system size.
It is possible to overcome this undesirable scaling
without compromising accuracy by performing the FFT
over a restricted region of the simulation cell, which
we call the `FFT box' (figure 1). Before
defining the FFT box, there are two points that should
be noted. Firstly, the operator must be
Hermitian. This will ensure that the kinetic energy
matrix elements,
are Hermitian, and
hence the eigenvalues real. Secondly, when calculating
two matrix elements such as
and
, we require the quantity
in both cases. To be consistent, our method for
calculating the matrix elements must be such
that
is the same in both
cases, i.e. we require
to have
a unique and consistent representation throughout the
calculation. It is important that both these conditions
are satisfied when it comes to optimisation of the support
functions during a total-energy calculation, and
we shall return to this point later.
In order to fulfil the above requirements, it can be seen that for a given calculation the FFT box must be universal in shape and dimensions. As a result, it must be large enough to enclose any pair of overlapping support functions within the simulation cell. To define a suitable FFT box, we first consider a box with the same unit lattice vectors as the simulation cell, but of dimensions such that it exactly circumscribes the largest support region present in the simulation cell. We then define a box that is commensurate with this, but with sides that are twice as long (and hence a volume eight times as large). This we define to be the FFT box. It is clear that this FFT box is large enough to enclose any pair of support functions exhibiting any degree of overlap.
To calculate a particular matrix element
for
two overlapping support
functions
and
, we imagine
them as being enclosed within
the FFT box defined above and we treat
this region of real space as a miniature simulation cell.
We Fourier transform
using the
periodicity of the FFT box and apply the
Laplacian at each reciprocal lattice point
using standard plane-wave techniques [1].
It is then a simple matter of using one more FFT to
back-transform
to real space and subsequently calculate
by summation over the grid
points of the FFT box, according to equation (4).
The result obtained by this process is equivalent
to performing a Fourier transform of
over the whole simulation cell, applying the
Laplacian and then interpolating to a coarse, but still
regular, reciprocal space grid with only
points,
being the number of grid
points in the FFT box, before back-transforming to
real space. This coarse sampling in reciprocal
space has a negligible influence on the result
because each support function is strictly localised
in real space and therefore smooth in reciprocal space.
It is worth noting the implicit approximation that
we make in calculating the kinetic energy
in the way prescribed above. In
general,
is nonzero outside the support
region of
itself, and it is
essential to take this into account in the calculations. By
construction, we neglect contributions to the
kinetic energy from support functions whose support
regions do not overlap as we expect them to be
negligibly small. This approximation may be
controlled via a single parameter, the FFT box
size, with respect to which the calculation may be
converged if necessary. The same approximation
is of course present in the FD method as well.
We expect certain advantages to the FFT box algorithm over FD based methods. Firstly, the FFT box method should be more accurate than any FD scheme since it takes into account information from every single point of the support function and not only locally. However, it is still perfectly local as far as parallelisation is concerned since we only deal with the points within a single FFT box each time, and this constitutes a very small region of the simulation cell. The parallelisation strategy in this case would still consist of partitioning the real space grid of the simulation cell into subregions of equal size and distributing them amongst PEs. Then, FFTs local to each PE are performed on FFT boxes enclosing pairs of overlapping support regions belonging completely to the simulation cell subregion of the given PE. For pairs of overlapping support regions containing grid points common to the subregion of more than one PE, the pair would have to be attributed to one PE and copied as a whole to it for the local FFT to proceed. This would involve some communication overhead, as in the FD case for pairs of overlapping support functions with points in more than one subregion. Another important advantage of the FFT box method is that it is applicable, without any modification, to regular grids of any Bravais lattice symmetry. This is not true of FD methods.
The number of grid points in a cubic FFT box
is (which is related to
by
). Therefore
the computational cost of applying the FFT method
to a single support function in such an FFT box is
, and thus for all support functions
the cost is proportional to
,
where
is independent of
. In
other words the cost scales linearly
with the number of atoms in the system.