The R Data Scientist

December 23, 2025

The R Data Scientist 23-12-2025

Festive ML, next-gen R contributors, and R's return to the top ten languages

🌐 R Community Pulse

R Weekly 2025-W52 Festive ML, next gen R contributors, The Economist style ggplot2 (rweekly​.org). This week's roundup covers festive ML, next-generation R contributors, and Economist-style ggplot2 charts in R

Weekly Recap (December 19, 2025) (blog​.stephenturner​.us). Weekly recap covering R updates, local LLMs, AI in peer review, Biothreat Benchmark Generation, code quality tools, and insights from notable AI researchers and practitioners

rOpenSci News Digest, December 2025 (ropensci​.org). rOpenSci shares updates on LatinR, uRos talks, podcast features, coworking, new packages, and peer review activity in December 2025

2025 in Review: Growth, Community, & Momentum (r-consortium​.org). Global R community growth in 2025: RUGs, R-Ladies+, R/Medicine, R+AI, governance, and industry/regulatory momentum with ISC funding

Empowering Government Professionals in Nepal Using R programming for Forestry Data Analysis (r-consortium​.org). Nepalese government forestry professionals learn data wrangling, visualization, statistics, and geospatial mapping in R through a seven-day training program

2025 Year in Review (rfortherestofus​.com). Reflection on 2025: course updates, Clarity Data Studio spin-off, charitable giving, and upcoming AI-enhanced teaching and reporting projects

RSMF: Enabling the Next Generation of Contributors to R (blog​.r-project​.org). Mentoring a cohort of expert contributors and improving governance, communication, and sustainability for R via RSMF funding and community initiatives

R climbs back up into the top ten programming languages (flowingdata​.com). R re-enters the top ten of the TIOBE index, signaling growing interest in statistics and data visualization tools like R and Mathematica

🔓 Open Science & Publishing

Co-Creating the Future of Research Assessment: Highlights from DORA’s RFO Guide Workshop (sfdora​.org). DORA hosts a full-day co-creation workshop in Copenhagen to refine the RFO Guide with GRC RRA WG and Science Europe

Weekly digest: AI literacy, open publishing models and equity in OA (openpharma​.blog). Practical copyright guidance, AI literacy, transparency in global health research, equity in OA, and a shift to Publish, Review, Curate models

Code Hosting Options Beyond GitHub (ropensci​.org). Explores mirroring GitHub repos to Codeberg and GitLab, managing multiple remotes, and keeping primary code locations while reducing platform dependence

🧰 R Performance & Setup

Finally figured out a way to port python packages to R using uv and reticulate: example with nnetsauce (thierrymoudiki​.github​.io). Using uv and reticulate to port Python nnetsauce into R with examples and benchmarks

R Code Optimization III: Hardware Utilization and Performance (blasbenito​.com). Explores vectorization, parallelization, and memory management in R, leveraging SIMD, BLAS/LAPACK, and tools like data.table, Arrow, and DuckDB

R Code Optimization IV: Practical Tools and Workflow (blasbenito​.com). Practical profiling with profvis, benchmarking with microbenchmark and bench, and a structured optimization workflow in R

Installing R Packages in Your Own Directories (nas​.nasa​.gov). Installing R packages in a user directory on NAS x86_64 with R_LIBS and sample steps for xts and zoo packages

A Quack-Packed Fall (motherduck​.com). MotherDuck showcases sessions from Big Data London, Small Data SF, and AI-focused events, highlighting DuckDB, serverless Lakehouse ideas, and cost-sensitive analytics with expert speakers

📑 Reporting & Tables

Explore the Pharmaverse Examples: Your Gateway to Clinical Reporting with Open-Source Tools (pharmaverse​.github​.io). Explore open-source tooling for end-to-end clinical reporting with Pharmaverse examples and step-by-step code, interactive teal apps, and community-driven guidance

Introducing docorator to the pharmaverse (pharmaverse​.github​.io). docorator decorates gt tables, ggplot2 figures, and related outputs in R for production-ready PDFs and downstream reuse

What’s New in gt 1.2.0: Better Tables Through Collaboration (posit​.co). gt 1.2.0 advances table collaboration with centralized management for R, Python, and cloud environments

Drop #743 (2025-12-22): Monday Afternoon Grab Bag (dailydrop​.hrbrmstr​.dev). Daff diff tool for tabular data, MDXport converts Markdown to PDFs in-browser, and GrAIphViz structures AI instructions with GraphViz DOT

🗺️ Spatial Data & Maps

App: visualizador de mapas comunales del Censo 2024 por manzanas (bastianoleah​.netlify​.app). Shiny app in R to map Censo 2024 data by comuna and manzana using arrow-backed datasets

2025 geocompx report: advancing spatial data analysis across languages (geocompx​.org). Geocompx reports 2025 milestones across R, Python, Julia; books, blogs, translations, and new visualisation and Julia resources

Into the void (dosull​.github​.io). Geospatial exploration of Paparoa Track viewsheds using R (terra, sf, tmap) and QGIS data in a guided walk by David O’Sullivan, December 2025

Visualiza datos del Censo 2024 en mapas a nivel de manzana con R (bastianoleah​.netlify​.app). Visualizing Censo 2024 data at the city-block (manzana) level in Chile with R, using ggplot2 and mapgl

Estimated crime rates are ~134% higher in London’s most deprived neighborhoods (stevenponce​.netlify​.app). London crime gradient by income deprivation deciles analyzed with R, per-capita rates, and equal-population estimates

📈 Inference & Modeling

Frequently Asked Questions (metafor-project​.org). Overview of metafor package for R, validation, funding, usage, and technical details on I2, H2, R2, and Freeman–Tukey transformations

Predicting survival using a super learner and right-censored data (aliceinstatisticsland​.wordpress​.com). Survival analysis with a super learner on right-censored data in R (survivalSL, flexsurv, glmnet), with methods such as randomForestSRC and survival neural networks

Power analysis – A flexible simulation approach using R (nicolaromano​.net). Monte Carlo power analysis in R comparing designs for plant growth using nlme and custom simulations

Corrupción, libertad y por qué debemos ser criteriosos al usar regresión lineal (pacha​.dev). R, ggplot2, dplyr, readxl, and regression critique of corruption vs economic freedom with Cook's distance in a Spanish-language blog

Construcción de intervalos de confianza para gráficos de calibración vía "bootstrap" y algunos asuntos más (datanalytics​.com). Bootstrap confidence intervals for calibration plots, plus related topics in statistics and ML, using R and Python

🤖 Bayesian & ML Notes

Local models are not there (yet) (posit​.co). Local models are not there yet; Posit discusses R, Python, Jupyter, and Shiny alongside partnerships and tooling

Machine Learning Powered Naughty List: A Festive Jumping Rivers Story (jumpingrivers​.com). Festive ML demo using R and Random Forest to classify 'naughty' team traits with playful features

a (sunny, crisp) day at ICSDS 2025 (xianblog​.wordpress​.com). Bayesian learning sessions at ICSDS 2025 in Xi’an; proper prior minimaxity, variational inference, DIC, AI priors, martingale prediction, and urn-based math discussed by George, Margossian, Christensen, Rockova, Ng, Cappello, Ghiglietti

Good if make prior after data instead of before (dynomight​.substack​.com). Explores Bayesian priors, data-driven categories, and infinite possibilities using aliens as a thought experiment

Elo rating systems via Markov Chains (xianblog​.wordpress​.com). Explores Elo ratings via Markov Chains, Bradley–Terry–Luce models, spectral gap optimization, SGD updates, and Bayesian ranking discussions

New ZeMKI Working Paper on Longitudinal Social Media Engagement (nicolarighetti​.net). Longitudinal Bayesian multilevel analysis of anger-driven climate-skeptic propagation on Facebook during the 2021 German election

📚 Academic Research

Inference for high dimensional repeated measure designs with the R package hdrm (arxiv:stat). hdrm adds high-dimensional repeated-measures mean tests to R, using unbiased trace estimators, subsampling, and Pearson-type approximations. Useful for EEG/omics when d≈N, delivering practical split-plot inference

Deep Gaussian Processes with Gradients (arxiv:stat). Introduces deep Gaussian processes that incorporate gradient observations, improving nonstationary surrogate modeling. Provides CRAN deepgp code with Vecchia scaling for faster Bayesian inference in R

Hazard-based distributional regression via ordinary differential equations (arxiv:stat). Models survival hazards via autonomous ODE systems, letting covariates change hazard shapes beyond proportional hazards. Bayesian computation and asymptotics aid flexible, interpretable inference for trials

Bayesian Markov-Switching Partial Reduced-Rank Regression (arxiv:stat). Bayesian Markov-switching partial reduced-rank regression mixes low-rank linear and GP components, learning groups and rank over time. Useful for multivariate time-series forecasting with uncertainty quantification

Enhancing Line Density Plots with Outlier Control and Bin-based Illumination (arxiv:cs). Proposes bin-based illumination for line density plots, separating structure from density to preserve paths and highlight outliers. Inspires better trajectory visualizations in R at scale
