
The R Data Scientist

December 30, 2025

The R Data Scientist 30-12-2025

🧵 R Ecosystem Radar

Weekly Recap (December 24, 2025) (blog​.stephenturner​.us). R updates, AI in wet lab biology, Docker hardened images, LLMs in review, funding debates, and new papers from Stephen Turner and peers

Drop #746 (2025-12-26): Boxing Day Grab Bag (dailydrop​.hrbrmstr​.dev). mq, UBLOCKAI, and Friendly SQL showcased for Markdown processing, AI content blocking, and DuckDB data tricks

🧰 R Package Engineering

revdeprun 2.1.0: hunting bottlenecks and a new speedrun record (nanx​.me). revdeprun 2.1.0 speeds up reverse dependency checks with an optimized install scheduler, parallel tarball downloads, and streamlined prep for data.table on a 256-core machine using Rust and R

R Package Development Advent Calendar 2025: A Complete Journey (drmowinckels​.io). Modern R package development walkthrough using usethis, devtools, GitHub Actions, pkgdown, and testthat for CI/CD and CRAN workflows

Creating an R package with C++ and Armadillo code (video) (pacha​.dev). Video walkthrough of creating an R package with C++ and Armadillo code using armadillo4r, covering Rtools setup, templates, and workflow for Windows users

🧪 Hands-on R Analysis

Railway population (r​.iresmi​.net). Using R with sf, osmdata, and ggplot2 to map railway corridors and population exposure in France

Implementation of DBSCAN Clustering in R (jmsallan​.netlify​.app). DBSCAN clustering in R with dbscan and tidyverse, exploring core/border/noise points on irregular shapes

sfReapportion (f​.briatte​.org). sfReapportion performs areal-weighted interpolation for sf and sp objects in R, porting spReapportion to sf to support reproducible mapping of French census data

Understanding Data Import and Export in R: Working with CSV and Excel Files (mfatihtuzen​.netlify​.app). Tutorial on importing and exporting CSV and Excel files in R, using the tips dataset, read.table, read.csv, openxlsx, and related workflows

🎲 Stats & Inference

A problem with correlations: rare traits usually have small correlations to other things, just by virtue of being rare (spencergreenberg​.com). Rare traits skew correlations; introduces Generalized Cohen’s d and practical guidance for binary and non-binary variables

The Raven Paradox (allendowney​.com). Bayesian analysis of the Raven Paradox in Python, covering scenarios, priors, and sampling ambiguity

Two Ways to See Abadie-Imbens Bias Correction (And Why It Might Matter) (causalinf​.substack​.com). Scott Cunningham explains Abadie-Imbens bias correction for nearest-neighbor matching, showing imputation and augmentation are equivalent

📚 Academic Research

Learning from Neighbors with PHIBP: Predicting Infectious Disease Dynamics in Data-Sparse Environments (arxiv:stat). Details the Poisson Hierarchical Indian Buffet Process (PHIBP) for sparse count prediction, borrowing strength across regions to forecast outbreaks with coherent uncertainty; relevant for Bayesian R pipelines

Ranked Set Sampling in Survival Analysis (arxiv:stat). Extends the Kaplan–Meier and Nelson–Aalen estimators to ranked set sampling with censoring and derives asymptotics and variance estimators; promises efficiency gains and an R package for survival analysis
