This package provides R functions for running the SpeakEasy 2 community detection algorithm using the SpeakEasy2 C library. See the Genome Biology article.
SpeakEasy 2 (SE2) is a graph community detection algorithm that aims to be performant on large graphs and robust, returning consistent results across runs. SE2 does not require prior knowledge of the number of communities in the network. Additionally, while the user can provide parameters to alter how the algorithm is run, the default options work well on a wide range of graphs, and tweaking options generally has little effect on the results, reducing the risk of biasing the outcome through parameter choices.
The core algorithm is written in C, providing speed and keeping the memory requirements low. This implementation can take advantage of multiple computing cores without increasing memory usage. SE2 can detect community structure across scales, making it a good choice for biological data, which is often organized hierarchically.
Graphs can be passed to the algorithm as adjacency matrices using the
Matrix library, igraph graphs, or any data
that can be coerced into a matrix.
For most users, this package should be installed from CRAN using:
    install.packages("speakeasyR")
It can also be installed using devtools:
    devtools::install_github("speakeasy-2/speakeasyR")
Additionally, it’s possible to download the latest release from the
release
page (the speakeasyR_${release}.tar.gz asset) and
install it using install.packages:
    install.packages("speakeasyR_${release}.tar.gz")
where ${release} must be replaced with the value in the
tarball’s name.
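As a minimal sketch of the substitution above, the following shell snippet locates a downloaded tarball by pattern rather than typing the release by hand. The result can then be passed to `R CMD INSTALL`, which installs a source tarball from the command line. The helper name `find_tarball` and the download location are illustrative assumptions, not part of the package.

```shell
# Print the path of the first speakeasyR release tarball found in a directory.
# find_tarball is a hypothetical helper, not something shipped with speakeasyR.
find_tarball() {
  dir="${1:-.}"
  for f in "$dir"/speakeasyR_*.tar.gz; do
    # An unmatched glob stays literal, so confirm the file really exists.
    if [ -f "$f" ]; then
      printf '%s\n' "$f"
      return 0
    fi
  done
  return 1
}

# Usage (assumes the asset was saved to the current directory):
#   R CMD INSTALL "$(find_tarball .)"
```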
Installation with devtools::install_github has been
tested in clean VMs running Ubuntu and Fedora.
To set up the development environment on Windows, install the version of Rtools that matches your R installation. Using Rtools’ MSYS2, install the required build tools. This has been tested with the ucrt64 environment but likely works in other environments.
    pacman -S mingw-w64-ucrt-x86_64-toolchain git
For development, clone this repository and use:
    git submodule update --init --recursive
to set up the vendored dependencies.
For development, astyle is recommended for formatting C
code, while texlive/latex, qpdf,
and checkbashisms are expected by R for
building the documentation and checking shell scripts during the
R CMD build process.
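Before building, it can help to confirm these tools are on the PATH. The snippet below is a minimal sketch: `missing_tools` is a hypothetical helper, and `pdflatex` is assumed here as the command that stands in for the texlive/latex requirement.

```shell
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is a hypothetical helper for illustration.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# pdflatex stands in for the texlive/latex requirement (an assumption).
missing="$(missing_tools astyle pdflatex qpdf checkbashisms)"
if [ -n "$missing" ]; then
  printf 'Missing development tools:\n%s\n' "$missing"
else
  printf 'All development tools found.\n'
fi
```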
It should now be possible to run devtools::load_all() in
R.
GNU autotools is used to generate the configuration script and the files
needed to run it. R’s build commands
do not run autoconf. Instead, if changes are made to the
configuration.ac file, autoconf (and possibly
autoreconf -i) needs to be run manually, and the
resulting files should be committed along with the source
configuration.ac file. The Makefile
determines when the autoconf programs need to be run, either
when the configure target is called directly
(i.e. make configure) or when a build target is run
(i.e. make build, make check, or
similar).
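The staleness check that such a Makefile rule performs can be sketched in shell as a timestamp comparison. This illustrates the idea only and is not the project's actual rule. `needs_regen` is a hypothetical helper, the file names follow the text above (autoconf's conventional input name is configure.ac, so adjust them to match the repository), and the `-nt` test is a widely supported extension rather than strict POSIX.

```shell
# Succeed (exit 0) when the generated file is missing or older than its source.
# needs_regen is a hypothetical helper; the project's Makefile may use a
# different rule.
needs_regen() {
  src="$1"
  out="$2"
  [ ! -e "$out" ] || [ "$src" -nt "$out" ]
}

# File names follow the text above; adjust them to match the repository.
if needs_regen configuration.ac configure; then
  echo "configure is stale: run autoconf (and possibly autoreconf -i)"
fi
```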
The makefile in the top-level directory is intended for
development. It will automate recreating committed generated files when
needed. These generated files must be committed with changes to the
source files that created them as they are not created by the
R CMD build command. It should always be possible to run
R CMD build to build the project in a clean state without
needing to run make to generate other files. The
makefile also sets some flags to provide stricter checks
than those run during the normal build process.
As clang and gcc can behave differently,
changes should be tested against both. To explicitly set the compiler
used, run make ${target} CC=${CC}, where target is likely
build or check and CC is either
clang or gcc.
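Testing against both compilers can be wrapped in a small loop. The following is a sketch under the assumption that both compilers are installed. `check_compilers` is a hypothetical helper, and the `MAKE` override exists only to allow a dry run.

```shell
# Run one make target once per compiler, stopping at the first failure.
# check_compilers is a hypothetical helper; MAKE can be overridden for a dry run.
check_compilers() {
  target="${1:-check}"
  for cc in gcc clang; do
    "${MAKE:-make}" "$target" CC="$cc" || return 1
  done
}

# Usage, from the top-level directory:
#   check_compilers check
#   MAKE=echo check_compilers build   # dry run that only prints the arguments
```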