The R Data Scientist

October 7, 2025

Data Scientist (with R)

Community news, IDEs, quarto, ggplot2 mapping

🌐 Community news and updates

The Sovereign Tech Fund invests $450,000 in R Foundation to Enhance R’s Sustainability and Security (r-consortium​.org). Sovereign Tech Fund funds R Foundation to modernize core infrastructure, strengthen supply chain security, and boost reproducibility

Weekly recap (Oct 3, 2025) (blog​.stephenturner​.us). Weekly recap of R updates, Slidecrafting, AI trends, structural variation, biotech, RAG, tinytable, and related papers and talks

posit::conf(2025) Recap (posit​.co). Posit conf highlights: Positron IDE, AI-powered workflows, Snowflake and Databricks partnerships, AWS demos, open-source updates, and enterprise deployments

R Weekly 2025-W40 Ducklake, Slidecrafting, Shiny & LLMs (rweekly​.org). R Weekly highlights ducklake, Slidecrafting, Shiny & LLMs with ggplot2 styling and rOpenSci updates


🛠️ IDEs, Quarto and automation

Real-time pricing with a pretrained probabilistic stock return model (thierrymoudiki​.github​.io). Real-time pricing with a pretrained probabilistic stock return model using Python FastAPI and R Plumber

Automating the GitHub Copilot Agent from the command line with Copilot CLI (seascapemodels​.org). Using Copilot CLI from R to automate agent runs, tool permissions, and directory isolation for reproducible experiments

Resolved: Bug affecting neonUtilities in latest RStudio version on Windows (neonscience​.org). Bug fixed in neonUtilities on Windows with latest RStudio 2025.09.1; download package conflict resolved

2025 MAPOR Fall Webinar Series (mapor​.org). MAPOR's Fall Webinar Series covers career transitions, Quarto automation, and improved questionnaire design for survey research

Create a Quarto Document in Positron (posit​.co). Positron offers integrated Quarto support with pre-installed extension, YAML validation, code cells, and a built-in terminal for publishing and collaboration


📊 ggplot2 mapping and visuals

Rising Fastball Velocities are Suppressing the Home Run (conormclaughlin​.net). Hard fastballs suppress home runs; analysis uses 325k pitches from 2025, velocity bins, and R-like plotting with baseballr, tidyverse, and stringr

Mapping locations related to the Amelia Earhart disappearance (freerangestats​.info). Mapping key Earhart/Noonan locations in the Pacific with R, ggplot2, and custom map-building code

Still here. Still writing occasional posts for a tiny audience. (nsaunders​.wordpress​.com). Explores AFL jumpers, data wrangling in R with dplyr and fitzRoy, and rare cases of players wearing different numbers in consecutive Grand Finals

ggplot2 styling (tidyverse​.org). Styling ggplot2 with complete themes, theme elements, and extensions for typography, grids, panels, strips, and axis customization

European Basketball Success by Nation (stevenponce​.netlify​.app). Greece leads with 27 Final Four appearances and 10 titles in a faceted bar chart analysis using TidyTuesday data

Double y-axis plots with ggplot2 and purrr (pacha​.dev). Double y-axis plotting in ggplot2 using spuriouscorrelations data, scaling with purrr and tintin palette


📘 Statistics, text, and reporting

2025(1)The leisurely cruise begins: Excerpt from Excursion 1 Tour 1 of Statistical Inference as Severe Testing (SIST) (errorstatistics​.com). Explores severity testing in statistics, anti-pseudoscience philosophy, and the 'severe testing' framework for evaluating evidence

Latent Semantic Scale based on Word2vec (blog​.koheiw​.net). Latent Semantic Scaling with Word2vec: probabilistic LSS using seed words and quanteda tokens

Welcome to Missing Data Solutions (missingdatasolutions​.rbind​.io). Missing Data Solutions covers missing data handling, pooling methods, and R packages like psfmi for Rubin’s Rules, D1-D3, and median pooling

Recreating APA Manual Table 7.23 in R with apa7 (wjschne​.github​.io). Recreating APA Table 7.23 in R using apa7, flextable, ftExtra, and tidyverse with hanging indents and decimal alignment

Recreating APA Manual Table 7.24 in R with apa7 (wjschne​.github​.io). Recreating APA Table 7.24 in R using apa7, flextable, ftExtra, tidyverse, and easystats with lme4 for data visualization


🧩 Iteration, simulation, idiomatic R

Iterating some sample data (kieranhealy​.org). Iterates sample data to illustrate LLM evaluation via confusion matrices, R code, and tibble-based data frames

Mapply: When You Need to Iterate Over Multiple Inputs (drmowinckels​.io). Using mapply to pair multiple varying inputs in R, with examples of scaling, labeling, and handling constants

A new simstudy function to make simulating replications easier (rdatagen​.net). New simstudy function scenario_list generates all combinations for simulation setups, with grouping and replication via each

Construct objects with idiomatic R code (blog​.stephenturner​.us). Construct human-readable R objects with the constructive package using construct() for reproducible examples

Building a Command-Line Quiz Application in R (towardsdatascience​.com). Step-by-step guide to building a command-line quiz in R using readline, trimws, tolower, lists, and functions


📚 Academic Research

False Discovery Rate Control via Bayesian Mirror Statistic (arxiv:stat). Bayesian Mirror Statistics for FDR control in high-dimensional variable selection using ADVI without data splitting

Compressed Bayesian Tensor Regression (arxiv:stat). Generalized tensor random projection with Bayesian inference for compressed tensor regression using low-rank representations and model averaging

Forecasting intraday particle number size distribution: A functional time series approach (arxiv:stat). Multilevel functional time series with a functional factor model for one-day-ahead forecasting of 51 intraday particle size curves in London

Total Robustness in Bayesian Nonlinear Regression for Measurement Error Problems under Model Misspecification (arxiv:stat). Bayesian nonparametric total robustness for measurement error in nonlinear regression using Dirichlet process priors and latent input pseudo-samples

One-shot variable-ratio matching with fine balance (arxiv:stat). One-shot variable-ratio matching with fine balance using one-shot optimization to achieve exact covariate balance in observational studies
