### Summary

Spatial weights matrices, *W*, represent potential interaction
between all *i*, *j* in a study area and play an essential role in
many spatial analysis tasks. Commonly applied binary adjacency
algorithms using decomposition and tree representations scale
quadratically and are ill-suited for large data sets. We present a
linearly scaling adjacency algorithm with significantly improved
performance.

### Description

Spatial weights matrices, *W*, play an essential role in many
spatial analysis tasks including measures of spatial association,
regionalization, and spatial regression. A spatial weight
*w*_{i, j} represents potential interaction between each
*i*, *j* pair in a set of *n* spatial units. The weights are
generally defined either as binary, *w*_{i, j} ∈ {1, 0}, depending
on whether or not *i* and *j* are considered neighbors, or as a
continuous value reflecting some general distance relationship between
*i* and *j*. This work focuses on the case of binary weights
using a contiguity criterion, where *i* and *j* are rook
neighbors when they share an edge and queen neighbors when they share a
vertex.
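These two criteria can be illustrated with a minimal sketch, assuming
polygons are represented as ordered lists of (x, y) vertex tuples (the
names `is_rook` and `is_queen` are illustrative, not from the
implementation described here):

```python
def is_queen(a, b):
    # Queen neighbors share at least one vertex.
    return bool(set(a) & set(b))

def edges(poly):
    # Consecutive vertex pairs, closing the ring; frozenset
    # normalizes edge direction so (u, v) == (v, u).
    return {frozenset(e) for e in zip(poly, poly[1:] + poly[:1])}

def is_rook(a, b):
    # Rook neighbors share a full edge, not just a vertex.
    return bool(edges(a) & edges(b))

sq1 = [(0, 0), (1, 0), (1, 1), (0, 1)]
sq2 = [(1, 0), (2, 0), (2, 1), (1, 1)]  # shares an edge with sq1
sq3 = [(1, 1), (2, 1), (2, 2), (1, 2)]  # shares only the corner (1, 1) with sq1
```

Here `sq1` and `sq2` are both rook and queen neighbors, while `sq1` and
`sq3` are queen neighbors only.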

Populating *W* is computationally expensive, requiring, in
the naive case, *O*(*n*^{2}) point or edge comparisons. To improve
efficiency, data decomposition techniques in the form of regular grids
and quad-trees, as well as spatial indexing techniques using r-trees,
have been utilized to reduce the total number of local point or edge
comparisons. Unfortunately, these algorithms still scale quadratically.
Recent research has also shown that even with the application of
parallel processing techniques, the gridded decomposition method does
not scale as *n* increases.
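The naive quadratic cost can be seen in a brute-force sketch, where
every pair of geometries is compared directly (assuming the same
vertex-list polygon representation; `naive_w` is a hypothetical name):

```python
from itertools import combinations

def queen(a, b):
    # Queen contiguity: at least one shared vertex.
    return bool(set(a) & set(b))

def naive_w(polys):
    # Every (i, j) pair is compared, so the number of geometry
    # checks grows as O(n^2) in the number of spatial units.
    w = {i: set() for i in range(len(polys))}
    for i, j in combinations(range(len(polys)), 2):
        if queen(polys[i], polys[j]):
            w[i].add(j)
            w[j].add(i)
    return w

# A 1 x 3 strip of unit squares: 0-1 and 1-2 are neighbors, 0-2 are not.
polys = [[(x, 0), (x + 1, 0), (x + 1, 1), (x, 1)] for x in range(3)]
```

Decomposition and indexing reduce the constant factor by limiting
comparisons to nearby candidates, but the pairwise structure remains.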

This work presents the development and testing of a high performance
implementation,
written in pure Python, using constant-time and *O*(*n*) operations,
by leveraging high performance containers and a vertex comparison
method. The figures below depict results of initial testing using
synthetically generated lattices of triangles, squares, and hexagons
with rook contiguity in black and queen contiguity in gray. These
geometries were selected to control for average neighbour cardinality
and average vertex count per geometry. From these initial tests, we
report a significant speedup over r-tree implementations and a more
modest speedup over gridded decomposition methods. In addition to
scaling linearly, while existing methods scale quadratically, this
method is also more memory efficient. Ongoing work focuses on testing
with randomly distributed data and U.S. Census data, on the
application of parallelization techniques to test further performance
improvements, and on the use of fuzzy operators to account for spatial
error.
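The core idea of a vertex-comparison method can be sketched in a few
lines: a single pass hashes each vertex to the set of geometry ids
that touch it, so neighbors are found by dictionary lookup rather than
pairwise comparison. This is a simplified illustration of the
approach, not the tested implementation; `vertex_w` and the use of
`collections.defaultdict` as the high-performance container are
assumptions for the sketch:

```python
from collections import defaultdict

def vertex_w(polys):
    # Pass 1 (O(n) in total vertex count): map each vertex to
    # the ids of the geometries that touch it.
    vertex_to_ids = defaultdict(set)
    for i, poly in enumerate(polys):
        for v in poly:
            vertex_to_ids[v].add(i)
    # Pass 2: geometries sharing a vertex are queen neighbors.
    # With bounded neighbor cardinality per vertex, as in the
    # lattices described above, this pass is also linear.
    w = defaultdict(set)
    for ids in vertex_to_ids.values():
        for i in ids:
            w[i] |= ids - {i}
    return w

# Same 1 x 3 strip of unit squares as a smoke test.
polys = [[(x, 0), (x + 1, 0), (x + 1, 1), (x, 1)] for x in range(3)]
```

Rook contiguity follows the same pattern with edges (vertex pairs) as
the hash keys instead of single vertices.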