Software

collapse

Advanced and Fast Statistical Computing and Data Transformation in R

collapse is a large C/C++-based infrastructure package facilitating complex statistical computing, data transformation, and exploration tasks in R, at outstanding levels of performance and memory efficiency. It also implements a class-agnostic approach to R programming, supporting vector, matrix and data frame-like objects and their popular variants (e.g., factor, ts, xts, tibble, data.table, sf), ensuring seamless compatibility with large parts of the R ecosystem. collapse is among the world's fastest data frame libraries and has been benchmarked on both Linux and Windows for standard tasks. It excels at complex statistical tasks, such as weighted quantiles, statistical modes, and time series and panel data operations. It has been downloaded 2 million times.
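To make one of the tasks mentioned above concrete, here is a minimal conceptual sketch of a grouped weighted statistical mode. It is written in plain Python purely for illustration; it is not collapse's API or its C/C++ implementation, and the function names `weighted_mode` and `grouped_weighted_mode` are hypothetical. The tie-breaking rule (first-occurring value wins) is also an assumption of this sketch.

```python
from collections import defaultdict

def weighted_mode(values, weights):
    """Return the value with the largest total weight.

    Ties are broken in favour of the value that occurs first
    (an assumption of this sketch, not a documented package rule).
    """
    totals = defaultdict(float)
    for v, w in zip(values, weights):
        totals[v] += w
    # dicts preserve insertion order, and max() returns the first maximal key
    return max(totals, key=totals.get)

def grouped_weighted_mode(groups, values, weights):
    """Compute the weighted mode separately within each group."""
    by_group = defaultdict(lambda: ([], []))
    for g, v, w in zip(groups, values, weights):
        by_group[g][0].append(v)
        by_group[g][1].append(w)
    return {g: weighted_mode(vs, ws) for g, (vs, ws) in by_group.items()}

if __name__ == "__main__":
    groups = [1, 1, 1, 2, 2]
    values = ["x", "y", "y", "z", "z"]
    weights = [3, 1, 1, 1, 1]
    print(grouped_weighted_mode(groups, values, weights))
```

In group 1 the value "x" occurs only once, but it carries a total weight of 3 against 2 for "y", so it is the weighted mode there even though "y" is the unweighted mode.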

fastverse

An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation in R

The fastverse is a suite of complementary high-performance packages for statistical computing and data manipulation in R. The fastverse R package enables easy installation, loading and management of these packages. It is an extensible framework that allows users to flexibly attach and manage multiple packages or create separate verses of packages. An active list of suggested extension packages endorses high-performance statistical computing packages for R with few dependencies. I highly recommend using the fastverse to teach students scientific computing in R, as all core packages and most suggested extensions have stable APIs, are lightweight to install, and scale to large and complex tasks.

dfms

Dynamic Factor Models for R

dfms enables numerically robust and computationally efficient estimation of the (linear Gaussian) Dynamic Factor Model via the EM algorithm. Convenient methods for forecasting and extracting factors facilitate its application to time series dimensionality reduction, forecasting, and nowcasting tasks. The implementation is based on efficient C++ code, making dfms orders of magnitude faster than other R packages.
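For orientation, a linear Gaussian dynamic factor model of the kind described above is commonly written as an observation equation and a factor transition equation. The notation below (observed series $x_t$, latent factors $f_t$, loadings $C$, transition matrices $A_j$, covariances $R$ and $Q$, lag order $p$) is a conventional choice assumed here for illustration and may differ from the symbols used in the package documentation.

```latex
\begin{aligned}
x_t &= C f_t + e_t, & e_t &\sim \mathcal{N}(0, R), \\
f_t &= \sum_{j=1}^{p} A_j f_{t-j} + u_t, & u_t &\sim \mathcal{N}(0, Q).
\end{aligned}
```

In this setup the EM algorithm alternates between estimating the latent factors given the current parameters (typically with a Kalman filter and smoother) and updating $C$, $A_j$, $R$, and $Q$ given those factor estimates.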

OptimalTransportNetworks.jl

Optimal Transport Networks in Spatial Equilibrium

Julia implementation of the quantitative spatial model and algorithms presented in: Fajgelbaum, P. D., & Schaal, E. (2020). Optimal transport networks in spatial equilibrium. Econometrica, 88(4), 1411-1452. The software uses duality principles to optimize over the space of networks, nesting an optimal flows problem and a neoclassical general-equilibrium trade model into a global network design problem to derive the optimal (welfare maximizing) transport network (extension) from any primitive set of economic fundamentals [population per location, productivity per location for each of N traded goods, endowment of a non-traded good, and (optionally) a pre-existing transport network].

osmclass

Classify Open Street Map Features

osmclass classifies Open Street Map (OSM) features into meaningful functional or analytical categories. It is designed for OSM PBF files, e.g. from geofabrik.de, imported as spatial data frames. A classification consists of a list of categories that map to certain OSM tags and values. Given a layer from an OSM PBF file and a classification, the main osm_classify() function returns a classification data table giving, for each feature, the primary and alternative categories (if there is overlap) assigned, and the tag(s) and value(s) matched on. The package also contains a classification of OSM features by economic function/significance, following Krantz (2023).
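The idea of a classification that maps categories to tags and values can be sketched in a few lines. The following Python example is a conceptual illustration only: it is not the osmclass API, and the example categories, the output field names, the function `classify_feature`, and the rule that the first matching category becomes the primary one are all assumptions made for this sketch.

```python
from typing import Dict, Optional, Set

# Hypothetical classification: category -> {tag: accepted values}.
# An empty set means "any value of this tag matches".
CLASSIFICATION: Dict[str, Dict[str, Set[str]]] = {
    "education": {"amenity": {"school", "university", "college"}},
    "health": {"amenity": {"hospital", "clinic", "pharmacy"}},
    "shopping": {"shop": set()},
}

def classify_feature(tags: Dict[str, str],
                     classification: Dict[str, Dict[str, Set[str]]]) -> Optional[dict]:
    """Return the primary and alternative categories for one feature.

    Categories are checked in the order they are listed; the first match
    is treated as primary (an assumption of this sketch).
    """
    matches = []
    for category, rules in classification.items():
        for tag, accepted in rules.items():
            value = tags.get(tag)
            if value is not None and (not accepted or value in accepted):
                matches.append((category, tag, value))
                break  # one match per category is enough
    if not matches:
        return None  # feature is not covered by the classification
    category, tag, value = matches[0]
    return {
        "primary": category,
        "matched_tag": tag,
        "matched_value": value,
        "alternatives": [m[0] for m in matches[1:]],
    }

if __name__ == "__main__":
    feature = {"amenity": "pharmacy", "shop": "chemist", "name": "Example"}
    print(classify_feature(feature, CLASSIFICATION))
```

A feature tagged as both a pharmacy and a shop overlaps two categories here, so it receives "health" as its primary category and "shopping" as an alternative.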

dggridR

Discrete Global Grids for R: Spatial Analysis Done Right

dggridR builds discrete global grids which partition the surface of the Earth into hexagonal, triangular, or diamond cells, all of which have the same size. I am a contributor to dggridR and its current maintainer.

API Packages

I have created several open databases (see Data Products) for which I have written API packages:

Contributions

I have contributed to the R packages fixest (Fast Fixed-Effects Estimations), plm (Linear Models for Panel Data), decompr (Global Value Chain Decomposition), and kit (Data Manipulation Functions Implemented in C).