The R Data Scientist 09-12-2025
R's growth in Latin America and Pakistan, accessible Shiny apps, new tools, visualizations, and key updates in AI
🌍 R Community & Events
LatinR <- Latinamerican Conference About the Use of R in R&D 2025 (ropensci.org). LatinR 2025 conference on the use of R in R&D with tutorials, talks by rOpenSci members and Latin American researchers
A First for Pakistan: Karachi R User Group Builds R Ecosystem from Scratch (r-consortium.org). Karachi R User Group has grown since 2023 to 500+ members by hosting online events featuring Quarto, ggplot2, and international speakers
Social Coworking and Office Hours - Getting to know The Carpentries (ropensci.org). Social coworking and office hours focusing on getting to know The Carpentries with Angelique Trusler and Steffi LaZerte
The Test Set: Now on YouTube + a look at what’s next (posit.co). The Test Set, Posit's data science video podcast, is now on YouTube, with a look at future guests and the 2026 lineup
Make Your Shiny Apps Accessible to Everyone – Free Jumping Rivers Webinar! (jumpingrivers.com). Free 55-minute webinar on building accessible Shiny apps, covering inclusivity, standards, and practical techniques
📰 R & AI Newsletters
R Weekly 2025-W50 New AI Newsletter, Test Set on Youtube, Haskell for Data Science (rweekly.org). New AI newsletter, YouTube Test Set, Haskell for Data Science, Shiny plots, and R ecosystem updates
Weekly Recap (December 4, 2025) (blog.stephenturner.us). Weekly recap covering R updates, Posit open source investments, AI in academia and biology, funding cuts, bedder bedtools, and recent papers
2025-12-05 AI Newsletter (posit.co). Posit highlights AI news, Databricks collaboration, model pricing insights, and tools for R, Python, and notebooks across RStudio, Jupyter, and VS Code
Ready for 4 2025-12-03: More Databot, and Package Management (buttondown.com/ready4r). Databot review and exploration of rv and uv package managers for R and Python dependency management
🧰 R Tooling & Infrastructure
Procesa datos con R sin programar y de forma interactiva (bastianoleah.netlify.app). Interactively process data in R with block-based GUI using blockr for data manipulation and visualization
New release of maplegend: version 0.4.0 (rcarto.github.io). maplegend 0.4.0 adds histo legends, aspect-ratio support, and decimal/big-separator controls for R graphics
dplyr and Oracle database with DatabaseConnector and JDBC on Windows (guillaumepressiat.github.io). Using JDBC and DatabaseConnector with R, dplyr, and Oracle on Windows for cross-bitness database access
Simplicity of a Database, but the Speed of a Cache: OLAP Caches for DuckDB (motherduck.com). Caching OLAP with DuckDB extensions like QuackStore, cache_httpfs, and DiskCache to speed dashboards
📊 Applied R Visualizations
Pet cats rest 70–80% of the day regardless of season (stevenponce.netlify.app). Makeover Monday redesign using R, ggplot2, and tidyverse to compare cat resting behavior across seasons
Nightlife of Barcelona Neighborhoods (jmsallan.netlify.app). Analyzes Barcelona nightlife by neighborhoods using sf, BAdatasetsSpatial, tidyverse, and kableExtra in R
Net migration in Pacific island countries (freerangestats.info). Peter Ellis uses R for data wrangling and ggplot2 visualisations of Pacific net migration from UN 2024 data, focusing on Fiji, PNG, Samoa and others
What’s In The Box: Wrapped but not streamed 2025 (quantixed.org). R and XML parsing of iTunes libraries to compute 2025 listening stats, plus ggplot visualisations and album recommendations
Analyzing my music listening data with Databot (simonpcouch.com). Explores analyzing Apple Music library.xml with Databot in Positron using tidyverse to extract Play Count and identify top tracks
Gráfico exploratorio para el Sistema de Indicadores y Estándares Territoriales (bastianoleah.netlify.app). Exploratory chart placing Chile’s 346 communes by potable water access and paved road network, using urban, mixed, and rural classifications
🧪 R for Research & Open Science
freeCount Bioinformatics Analysis Apps on Posit Cloud (morphoscape.wordpress.com). freeCount R Shiny apps on Posit Cloud for bioinformatics data analysis, using a GitHub-hosted project and guided deployment
Expanded FDA eCTD File Format Support for R Packages — A Milestone Achieved Through Industry–FDA Collaboration (r-consortium.org). FDA expands accepted R file formats for eCTD submissions, enabling direct .zip and R packaging with open-source encouragement
CMiNet: Building Reliable Microbiome Networks Through Consensus. (methodsblog.com). CMiNet builds consensus microbiome networks by integrating multiple tools (SparCC, SPIEC-EASI, SPRING) via an R package and Shiny app for reliable, reproducible interactions
Introducing openESM: A database of openly available experience sampling datasets including R/Python interface (jmbh.github.io). OpenESM harmonizes openly available experience sampling datasets with R and Python interfaces for robust, cross-study analyses
Research Integrity in an Era of AI and Massive Amounts of Data (sensible-med.com). Massive data, AI in medicine, and replication challenges; calls for registration, AI-assisted peer review, and robust governance
📈 Statistical Methods & Inference
Frequently Asked Questions (metafor-project.org). Meta-analysis in R with metafor: validation, implementations, and statistical details in Higgins–Thompson theory and RF-based methods
New Preprint: Model Checking for Vector Autoregressive Models (jmbh.github.io). Tutorial on VAR model checking with diagnostics, plots, simulations, and R-code for multilevel VAR in psychological time series
Dogmatic Bayesianism Disorder (daniellakens.blogspot.com). Humorous critique of dogmatic Bayesianism framed as a fictional DSM-5 style disorder in an academic statistics blog post
Gaussian mixture via continuous sparse regularization (xianblog.wordpress.com). Gaussian mixtures with diagonal covariances estimated via continuous sparse regularization and BLASSO using point processes
Bayesian Decision Agents: The Next Frontier in Real-Time Risk Intelligence (magazine.amstat.org). Bayesian decision agents, real-time risk intelligence, and tools like Stan, PyMC, and Python for decision making
Modest replication probabilities of p-values–desirable, not regrettable: a note from Stephen Senn (errorstatistics.com). Stephen Senn discusses replication probabilities of p-values and their interpretation in error statistics and inference
The perplexing “connected cluster axiom” (skewed.de). Explores why the connected-cluster axiom is flawed in SBM-based community detection and its implications for inference
📚 Academic Research
DeeDeeExperiment: Building an infrastructure for integrating and managing omics data analysis results in R/Bioconductor (arxiv:q-bio). DeeDeeExperiment introduces an R/Bioconductor S4 class unifying multi-omics differential expression and enrichment results. R users gain a reproducible container that simplifies handling contrasts, metadata and analysis workflows
Eye of the Beholder: Towards Measuring Visualization Complexity (arxiv:cs). Crowdsourced ratings link human-perceived visualization complexity to image metrics, manual annotation and zero-shot GPT-4o-mini feature extraction. Results suggest LLMs approximate complexity and guide visualization design
The Bag-and-Whisker Plot: A New Bagplot for Bivariate Data (arxiv:stat). The bag-and-whisker plot refines bivariate boxplots using multiple-testing-based fences and granular whiskers for robust, sample-size-adaptive outlier detection. R users gain multivariate diagnostics and stabler graphics
When are novel methods for analyzing complex chemical mixtures in epidemiology beneficial? (arxiv:stat). Extensive simulations compare generalized linear models with novel chemical-mixture methods for estimating health effects under varying correlation and interaction structures. Results guide epidemiologic method choice
Generalised Bayesian Inference using Robust divergences for von Mises-Fisher distribution (arxiv:stat). The authors develop robust Bayesian estimators for von Mises–Fisher directional data using density-power and gamma divergences with weighted Bayesian bootstrap computation. Supports outlier-resistant spherical modeling
👋 Before you go...
I've got a big favor to ask - keeping Blaze running isn't expensive, but it does all add up, so I'm asking readers like you to help, if you can, by joining the Patreon page. Nothing flashy, just a way for folks who find value in these newsletters to chip in a little each month.
If you are getting value from Blaze, checking this out would mean the absolute world. But if you can't contribute, no worries - the newsletters keep coming either way. Thanks for reading and being part of this nerdy corner of the internet. All the best for the coming week - Alastair.