# Science Research Weekly

## By Mark R Williamson

A weekly overview of helpful scientific research topics and the successor of the podcast Statistics Weekly. Every week, I will cover 3 topic areas: Research Articles, Research Tools, and Research Funding. Science On.

## Episode 2: Lies, Two-Truths and a Lie, and Statistical R-packages

Apr 08, 2022 · 07:17

#### Episode 10: Sticks and Cones May Break My Bones

Get ready for super microscopes, tardigrade blood stabilization, broken stick modeling, an R version codenamed ‘Already Tomorrow’, a ‘pinecone’ made of Python, and a pitch-style competition for infectious disease funding. Science On.
References:
Parallelized computational 3D video microscopy of freely moving organisms at multiple gigapixels per second
Natural and engineered mediators of desiccation tolerance stabilize Human Blood Clotting Factor VIII in a dry state
Hyperactive nanobacteria with host-dependent traits pervade Omnitrophota
Electrochemical degradation of PFOA and its common alternatives: Assessment of key parameters, roles of active species, and transformation pathway
Mother-to-infant microbiota transmission and infant microbiota development across multiple body sites
Journal of Statistical Software Volume 106
R 4.3.0 scheduled for April 21
dummy coding in spss glm–more on fixed factors, covariates, and reference groups
Partial correlation: controlling for confounding variables
Pynecone
A suite of tools to scrape and parse search engine results
NIH: Molecular Transducers of Physical Activity Bioinformatics Center
NIH: Instrumentation Grant Program for Resource-Limited Institutions
NSF: Synthesis Center for Understanding Organismal Resilience
ACS: Postdoctoral Fellowships
HFSP: Cross-Disciplinary Fellowships
IDSA: IDea Incubator
R Packages:
crayons: Color Palettes from Crayon Boxes
baseq: Basic Sequence Processing Tool for Biological Data
reappraised: Statistical Tools for Assessing Publication Integrity of Groups of Trials
ridgregextra: Ridge Regression Parameter Estimation

Mar 24, 2023 · 10:23

#### Episode 9: Robot Parade!

Get ready for self-healing soft robots, modular NASA robots, Shiny recaps, Statistical Rethinking training, statistical kindergarten, and the NIH Pioneer Award Program. Science On.
References:
Optical tweezers throw and catch single atoms
The connectome of an insect brain
A hydrogenotrophic Sulfurimonas is globally abundant in deep-sea oxygen-saturated hydrothermal plumes
A self-healing electrically conductive organogel composite
Engineering breakthrough in softbotics
Mix-and-match kit could enable astronauts to build a menagerie of lunar exploration bots
The Young Supernova Experiment Data Release 1 (YSE DR1)
Observation of a critical charge mode in a strange metal
6 predictions for AI and data in health care and life sciences
Use the metalog distribution in SAS
R 4.2.3 is released
ShinyConf 2022 – Short Recap
Statistical Rethinking (2023 Edition)
Introduction to data analysis with {Statgarten}.
WebR IS HERE!
NIH: Director’s Pioneer Award Program
NIH: Entry-Level Modules for Training the Genomics Research Workforce
NIH: Enhancement of external environmental exposure assessment for cancer epidemiology research
Falk Medical Research Trust: Catalyst Awards Program
CRI: Immuno-Informatics Postdoctoral Fellowship
ACS: Early Career Postdoctoral-Faculty Bridge Grant
Other Articles:
LEXpander: Applying colexification networks to automated lexicon expansion
Surface changes observed on a Venusian volcano during the Magellan mission
Why is Pumas-QSP So Fast? Some Insights Into Differentiable Simulation Performance
R-packages:
pharmaverse: Navigate 'Pharmaverse'
toweranNA: A Method for Handling Missing Values in Prediction Applications
waspasR: Tool Kit to Implement a W.A.S.P.A.S. Based Multi-Criteria Decision Analysis Solution

Mar 17, 2023 · 09:36

#### Episode 8: Flash Grants in the Pan

Get ready for technological tribology, tingible translational cell biology, flipping fossils, Volume 14/4 of the R Journal, ebook tips and tricks, and flash grants to fixate on. Science On.
References:
Novel friction stabilization technology for surface damage conditions using machine learning
Crevasse refreezing and signatures of retreat observed at Kamb Ice Stream grounding zone
Apoptotic cell fragments locally activate tingible body macrophages in the germinal center
Dynamic interplay between RPL3- and RPL3L-containing ribosomes modulates mitochondrial activity in the mammalian heart
An abundant sea anemone from the Carboniferous Mazon Creek Lagerstätte, USA
Paleontologists flip the script on anemone fossils
Python 3.12.0 alpha 6 released
The R Journal Volume 14/4
7 New books added to Big Book of R
What I’ve learned making an .epub Ebook with Quarto
DOD: DEPSoR - Research Collaboration
DOD: Ovarian Cancer CTR Award
DOD: Ovarian Cancer Clinical Trial Award
NHLBI: Research Project Grants in Valvular Heart Disease
NHLBI: Integrative Omics Analysis of NHLBI TOPMed Data
HHS: Small R01s for Clinical Trials Targeting Diseases for NIDDK
Biogen: Sponsorships
AHA: Strategically Focused Research Networks
CRI: Technology Impact Award
IAFNS: Live Dietary Microbe Intake and Health Outcomes
NC Life Sciences: Flash Grants
R-packages:
BayesFluxR: Implementation of Bayesian Neural Networks
lorem: Generate Lorem Ipsum Text
HDRFA: High-Dimensional Robust Factor Analysis
guildai: Track Machine Learning Experiments
aihuman: Experimental Evaluation of Algorithm-Assisted Human Decision-Making

Mar 10, 2023 · 10:22

#### Episode 7: Life, the R-Universe, and Everything

Get ready for an exploration of the R-Universe, simulated octopus tentacles, effects of Vitamin D and exercise on health, fossil wings and fins, statistical conferences to set your calendar by, and a very special 23rd birthday. Science On.
References:
Energy-shaping control of a muscular octopus arm moving in three dimensions
Injury primes mutation-bearing astrocytes for dedifferentiation in later life
Origin of the propatagium in non-avian dinosaurs
Effectiveness of physical activity interventions for improving depression, anxiety and distress: an overview of systematic reviews
MoBIE: a Fiji plugin for sharing and exploration of multi-modal cloud-hosted big image data
3D printing of living structural biocomposites
Network expansion of genetic associations defines a pleiotropy map of human cell biology
A shape-shifting nuclease unravels structured RNA
Systematics and Phylogenetic Interrelationships of the Enigmatic Late Jurassic Shark Protospinax annectans Woodward
Vitamin D supplementation and incident dementia: Effects of sex, APOE, and baseline cognitive status
JuliaHub Newsletter February 2023
Bioconductor Conference 2023
PhyloPic
R 4.2.3
R-universe
January 2023: “Top 40” New CRAN Packages
Happy 23rd Birthday R!
NIH: Translational Research in Maternal and Pediatric Pharmacology and Therapeutics (R01)
NIH: Translational Research in Maternal and Pediatric Pharmacology and Therapeutics (R21)
NIH: Genomic Community Resources (U24)
CDC: Clinical and Applied Research Strategies for the Prevention and Control of Fungal Diseases
USDA: USDA-ARS Postdoctoral Fellowship in Spatial Statistics for Continental-Scale Soil Health Modeling
TSC Alliance: 2023 Grant Cycle
CHEST: COPD Research Grant
CHEST: Young Investigator Research Grant
R-packages:
aihuman: Experimental Evaluation of Algorithm-Assisted Human Decision-Making
reactCheckbox: Checkbox Group Input for 'Shiny'
treediff: Testing Differences Between Families of Trees
rutledge: Real-Time PCR Data Sets by Rutledge et al. (2004)

Mar 03, 2023 · 11:07

#### Episode 6: Boss-Level Antibiotics

Get ready for better arsenic detection, broad-spectrum antibiotics, model-breaking massive galaxies, Julia and Python working in harmony, a hard-to-pronounce R-package on Gaussian mixture models, and Department of Defense lung cancer grants aplenty. Science On.
References:
Scalable selective absorber with quasiperiodic nanostructure for low-grade solar energy harvesting
Toward surface-enhanced Raman scattering using electroless substrate for trace arsenic detection and speciation
Site-specific encoding of photoactivity and photoreactivity into antibody fragments
Complete Genome Sequence of the Lysogenic Pseudomonas Bacteriophage Fyn8
A broad-spectrum synthetic antibiotic that does not evoke bacterial resistance
Discovery of massive early galaxies defies prior understanding of the universe
Julia and Python better together
DOD: Multidisciplinary Research Program of the University Research Initiative (MURI)
DOD: Lung Cancer, Concept Award
DOD: Lung Cancer, Translational Research Award
DOD: Lung Cancer, Idea Development Award
DOD: Lung Cancer, Career Development Award
NASA: Early Career Faculty
FDA: Natural History and Biomarker Studies of Rare Neurodegenerative Diseases
LCRF: Leading Edge Research Grant Program
UMDF: Research Grant Program
R-packages:
dataMojo: Reshape Data Table
ggmapinset: Add Inset Panels to Maps
bubbleHeatmap: Produces 'bubbleHeatmap' Plots for Visualising Metabolomics Data
gmmsslm: Semi-Supervised Gaussian Mixture Model with a Missing-Data Mechanism
rWCVP: Generating Summaries, Reports and Plots from the World Checklist of Vascular Plants

Feb 24, 2023 · 05:16

#### Episode 5: A Bouquet of Space ROSEs

Get ready for exoplanetary systems, missing Neptunes, a Dungeons & Dragons R package, GIFs for top R packages, Tidyverse functions to save your day, and a whole host of NASA grant announcements. Science On.
References
Framework for the architecture of exoplanetary systems
RNA's 'joints' play key role in our gene expression
Help! Our Neptune-sized exoplanets are missing
The ins and outs of an undergrad-run journal
the difference between the bernoulli and binomial distributions
Top 9+ Interesting R Project Ideas To Try In 2023
10 Tidyverse functions that might save your day
Putting the top 100 R packages into a GIF
NSF: Plant Genome Research Program
NSF: Computer and Information Science and Engineering Core Programs
NIH: HEAL Initiative
NIH: Aging Research
NIH: IDeA Networks of Biomedical Research Excellence (INBRE)
DOD: Ovarian Cancer Early-Career Investigator Award
DOD: Ovarian Cancer Investigator-Initiated Research Award
DOD: Ovarian Cancer Pilot Award
NASA: Research Opportunities in Space and Earth Science (ROSES) 2023
R-packages
HanStat: Package for Easy Interpretation of Statistical Methods
hidecan: Create HIDECAN Plots for Visualising Genome-Wide Association Studies and Differential Expression Results
dndR: Dungeons & Dragons Functions for Players and Dungeon Masters

Feb 17, 2023 · 06:12

#### Episode 4: To Err in R is Human

Get ready for the top 10 errors in R, self-assembling peptides, microbiome mishaps, a theory-defying dwarf planet ring system, and a chance to save tiny hearts. Science On.
References
The virtual microbiome: A computational framework to evaluate microbiome analyses
A dense ring of the trans-Neptunian object Quaoar outside its Roche limit
3D Printing of Self-Assembling Nanofibrous Multidomain Peptide Hydrogels
This one-atom chemical reaction could transform drug discovery
logitr: Fast Estimation of Multinomial and Mixed Logit Models with Preference Space and Willingness-to-Pay Space Utility Parameterizations
Python 3.11.2, Python 3.10.10 and 3.12.0 alpha 5 are available
How to use Git to share SAS programs
Top 10 errors in R and how to fix them
NIH: Using Multimodal Biomarkers to Differentially Diagnose ADRDs for Clinical Trials
NSF: Accelerating Research Translation
Saving Tiny Hearts Society: Research Grants
R packages
deFit: Fitting Differential Equations to Time Series Data
synMicrodata: Synthetic Microdata Generator
animate: A Web-Based Graphics Device for Animated Visualisations
formatBibtex: Format BibTeX Entries and Files

Feb 10, 2023 · 05:21

#### Episode 3: The Ocean that Snuck up on Me

Get ready for heart-attack healing hydrogels, super rare supernovas, volumes of viroid sequences, stealth oceans, finding hex logos and packages by topic in CRAN, and NASA funding for exobiology. Science On.
References
This groundbreaking biomaterial heals tissues from the inside out
A high-mass X-ray binary descended from an ultra-stripped supernova
Mining metatranscriptomes reveals a vast world of viroid-like circular RNAs
Tracking the Evolution of an Ocean Within Mimas Using the Herschel Impact Basin
Journal of Statistical Software Volume 105
Annual Julia Growth Statistics - January 2023
Tips for organizing your R code
December 2022: “Top 40” New CRAN Packages
NASA: Exobiology- Living Systems Science
NIH: Data Harmonization, Curation and Secondary Analysis of Existing Clinical Datasets
DOD: Peer Reviewed Medical, Discovery Award
DE-FOA: 2023 Exploratory Research for Extreme Scale Science
NIH NOSI: Development of Functional Assay Sites to Evaluate Candidate -Omics Variants Associated with Heart, Lung, Blood, or Sleep Disease
NIH NOSI: Revision Applications to add a Curation and Informatics Component to existing Animal and Biological Material Resource Centers
R-packages
hexFinder: Find Hex Logos for CRAN Packages
vvsculptor: Apply Manipulations to Data Frames
findPackage: Find 'CRAN' Package by Topic

Feb 03, 2023 · 05:40

#### Episode 2: What Double Teeth You Have

Get ready for new deep-sea bacterial phyla, extra-ancient caecilian fossils, R Journal Volume 14/3, JuliaHub 6.0, visualizing star maps in R, and an IceCube neutrino competition! Science Research Weekly covers 3 topic areas: Research Articles, Research Tools, and Research Funding. Science On.
References
Global patterns of diversity and metabolism of microbial communities in deep-sea hydrothermal vent deposits
Chem-map profiles drug binding to chromatin in cells
Study offers first glimpse of how many suffer from previously unknown illness
Molecular mechanism of topoisomerase poisoning by the peptide antibiotic albicidin
Triassic stem caecilian supports dissorophoid origin of living amphibians
The R Journal Volume 14/3
Star Map with R
Show all data in the background of your faceted ggplot
Fitting robust non-Gaussian models in Stan and R-INLA
JuliaHub 6.0 Features Released
Kaggle Competition: IceCube - Neutrinos in Deep Ice
NSF: Molecular Foundations for Biotechnology
NSF: Expanding Capacity in Quantum Information Science and Engineering
DE: Scientific Machine Learning for Complex Systems
AFOSR: Fiscal Year 2023 Science, Technology, Engineering and Mathematics (STEM) Program
R-packages
neatR: Neat Data for Presentation
polyglotr: Translate Text
clarify: Simulation-Based Inference for Regression Models
poolHelper: Simulates Pooled Sequencing Genetic Data
vDiveR: Visualization of Viral Protein Sequence Diversity Dynamics

Jan 27, 2023 · 07:49

#### Episode 1: An Explosive Beginning

Welcome to Science Research Weekly, the successor to my first podcast, Statistics Weekly. This abbreviated trial run of an episode includes: a research article on microbes from a volcanic island, the largest stellar catalogue to date, Arduino project ideas for 2023, and NSF Research and Development grants.
References:
The Early Microbial Colonizers of a Short-Lived Volcanic Island in the Kingdom of Tonga
A high-quality Ixodes scapularis genome advances tick science
Haplotype-aware pantranscriptome analyses using spliced pangenome graphs
The molecular evolution of genes previously associated with large sizes reveals possible pathways to cetacean gigantism
The Dark Energy Camera Plane Survey 2 (DECaPS2): More Sky, Less Bias, and Better Uncertainties
Volume 105 of the Journal of Statistical Software
12 blog posts from 2022 that deserve a second look
Top 15 Stunning Arduino Project Ideas For 2023
Some R Conferences for 2023
Web scraping in R
Imputation in R: Top 3 Ways for Imputing Missing Data
FDA: Fellowship in Understanding Pathogenesis and Improving Detection of Flaviviruses
NSF: Future Manufacturing
NSF: Division of Environmental Biology
NSF: Division of Molecular and Cellular Biosciences Core Programs
NSF: Division of Integrative Organismal Systems Core Programs
NIH NOSI: Research on the Etiology, Early Detection, Screening and Prevention of Early-Onset Colorectal Cancer
NIH NOSI: Computational and Statistical Methods to Enhance Discovery from Health Data
R packages:
ggplate: Create Layout Plots of Biological Culture Plates and Microplates
glmmrOptim: Approximate Optimal Experimental Designs Using Generalised Linear Mixed Models
shiny.benchmark: Benchmark the Performance of 'shiny' Applications
blindreview: Blind Review Using Forward Search Procedures
metabias: Meta-Analysis for Within-Study and/or Across-Study Biases
evoTS: Analyses of Evolutionary Time-Series
OssaNMA: Optimal Sample Size and Allocation with a Network Meta-Analysis

Jan 25, 2023 · 07:48

#### Science Research Weekly Trailer

Welcome to Science Research Weekly. This is the successor to my first podcast, Statistics Weekly. Every week, I will cover 3 topic areas: Research Articles, Research Tools, and Research Funding. Check out the accompanying blog at: https://scienceresearchweekly.wordpress.com/

Jan 24, 2023 · 00:60

#### Episode 40: The Close of 2022

Take the poll to weigh in on the future of Statistics Weekly.

Dec 30, 2022 · 01:01

#### Episode 39: Last Stat Before Christmas

In this episode, I pondered the problem of false positives in publications, rotated around the JuliaHub, machinated over machine learning projects, and put R package ‘drawCell’ to the test.
References:
Are most published research findings false in a continuous universe?
Principal component analysis
A no-free-lunch theorem for multitask learning
JuliaHub
Signal-to-noise ratio and statistical tests
Large Enough Sample Condition
Top 11 Interesting Machine Learning Project Ideas For 2023
Code longevity of the R programming language
{drawCell} – R package and Shiny app for drawing cell structures
R-packages:
colorRamp2: Generate Color Mapping Functions
rworkflows: Test, Document, Containerise, and Deploy R Packages
MetChem: Chemical Structural Similarity Analysis
panelsummary: Create Publication-Ready Regression Tables with Panels
arf: Adversarial Random Forests

Dec 23, 2022 · 04:58

#### Episode 38: Ho, Ho, Holiday Statistics Projects

In this episode, I marched through machine learning abstractions, brought home the BQN, discovered statistics projects to work on over Christmas, and found the Gosling that lays the golden egg.
References:
Quantifying uncertainty of machine learning methods for loss given default
Evaluating the performance of memory type logarithmic estimators using simple random sampling
A fast kernel independence test for cluster-correlated data
Autoencoders for sample size estimation for fully connected neural network classifiers
[R-pkgs] onetime 0.1.0: Run Code Only Once
🚧 WIP 🚧 From Julia to BQN
Using SAS to score a test
Using the cspade action to find frequent gene sequences
The easygoing relationship between computer scientists and null hypothesis significance testing
Top 11+ Coding Projects for Beginners (2023 Edition)
Gosling: Interactive Genomics Charts in R Shiny
Gosling Main Page
R-packages:
POMS: Phylogenetic Organization of Metagenomic Signals
macroBiome: A Tool for Mapping the Distribution of the Biomes and Bioclimate
reservr: Fit Distributions and Neural Networks to Censored and Truncated Data
NPCox: Nonparametric and Semiparametric Proportional Hazards Model

Dec 16, 2022 · 06:28

#### Episode 37: Hexadecimal Codes of Christmas

In this episode, I got into the spirit of Sinterklaas with Python, jumpstarted Julia learning, cleared up confusion between ANOVA and regression results, and color paletted the night away.
References:
Estimation of population variance under ranked set sampling method by using the ratio of supplementary information with study variable
10 Simple Reasons to Learn the Julia Programming Language (in 2023)
Python 3.11.1, 3.10.9, 3.9.16, 3.8.16, 3.7.16, and 3.12.0 alpha 3 are now available
Art in SAS: Christmas wrapping paper
when the results of your anova table and regression coefficients disagree
Select colours from an image in R with {eyedroppeR}
How to make your own #RStats Wrapped!
R-packages:
HodgesTools: Common Use Tools for Genomic Analysis
DependencyReviewer: Tool Suite to Investigate Other Packages
gmvjoint: Joint Models of Survival and Multivariate Longitudinal Data
r2social: App Inclusion of Social Sharing and Connect Buttons
lterpalettefinder: Extract Color Palettes from Photos and Pick Official LTER Palettes
jackknifeR: Jackknife Linear Regression and Correlation
POMADE: Power for Meta-Analysis of Dependent Effects
HTT: Hypothesis Testing Tree

Dec 09, 2022 · 05:39

#### Episode 36: Curb your Stats-thusiasm

In this episode, I picked the choicest R journal articles, served up a digital poker hand in SAS, and tried not to let a defective R-package get me down.
References:
brolgar: An R package to BRowse Over Longitudinal Data Graphically and Analytically in R
shinybrms: Fitting Bayesian Regression Models Using a Graphical User Interface for the R Package brms
R-miss-tastic: a unified platform for missing values methods and workflows
Bias reduction in the logistic model parameters with the LogF(1,1) penalty under MAR assumption
A simulation study: Using dual ancillary variable to estimate population mean under stratified random sampling
Multi-omics peripheral and core regions of cancer
Efficient and automated bandwidth selection in SAS PROC KPCA
Simulate poker hands in SAS
Covariates: Definition & Uses
Our First Peer-Reviewed Statistical R Packages!
Navigating folders in RStudio: a beginner’s guide
October 2022: “Top 40” New CRAN Packages
R-packages:
HTT: Hypothesis Testing Tree
boxfilter: Filter Noisy Data
ggpcp: Parallel Coordinate Plots in the 'ggplot2' Framework
StatTeacherAssistant: An App that Assists Intro Statistics Instructors with Data Sets
tidyclust: A Common API to Clustering

Dec 02, 2022 · 05:36

#### Episode 35: Turkey v. Tukey

In this episode, I skimmed over a dimension estimator, skipped past a piecewise linear curve, and slipped by the launch of a simple data science ebook.
References:
The generalized ratios intrinsic dimension estimator
The area under a piecewise linear curve
Adapted Randomization
confusing statistical term #13: missing at random and missing completely at random
When to Use plotly?
PCA for Categorical Variables in R
How to combine Multiple Plots in R
RObservations #42: Using the jinjar and tidyRSS packages to make a simple newsletter template
Ebook launch – Simple Data Science (R)
R-packages:
ASRgenomics: 'ASReml-R' Genomics Tools
flashr: Create Flashcards of Terms and Definitions
scistreer: Maximum-Likelihood Perfect Phylogeny Inference at Scale

Nov 24, 2022 · 02:31

#### Episode 34: Genomics, My Dear Watson

In this episode, I dusted off my genomic gloves with the HCLC-FC and CLIMB methods, waxed poetic with Python’s latest alpha release, and solved plotting mysteries with R-package ‘sherlock’.
References:
Regression discontinuity threshold optimization
Explaining predictive factors in patient pathways using autoencoders
HCLC-FC: A novel statistical method for phenome-wide association studies
Uncertainty-aware mixed-variable machine learning for materials design
Closed-form continuous-time neural networks
New statistical method improves genomic analyzes
synr: Analyze stimulus-color consistency test data
Profile plots in SAS
Optimal linear profile plots in SAS
Python 3.12.0 alpha 2 released
What is Interactive Analytics?
Top 9 Highest Paying Programming Languages (2023)
How to Analyze Likert Scale Data?
Top 10 Data Visualisation Tools Every Data Science Enthusiast Must Know
Open source is a hard requirement for reproducibility
R-packages:
exactLTRE: An Exact Method for Life Table Response Experiment (LTRE) Analysis
sherlock: Graphical Displays to Aid Structured Problem Solving and Diagnosis
cipheR: Encryption and Decryption with Text Ciphers
rjtools: Preparing, Checking, and Submitting Articles to the 'R Journal'
commafree: Call Functions Without Commas Between Arguments

Nov 18, 2022 · 06:17

#### Episode 33: Healthy Heaping of Free Stats

In this episode, I took advantage of more than one free data science course, mused over mappings for R, and helped myself to a heap of helper functions with R package ‘mlmhelpr’.
References:
The Hybrid Model of Autoregressive Integrated Moving Average and Fuzzy Time Series Markov Chain on Long Memory Data
Predicting prediction: A systematic workflow to analyze factors affecting the classification performance in genomic biomarker discovery
A supervised topic embedding model and its application
Shrinkage estimators of large covariance matrices with Toeplitz targets in array signal processing
SAS® Fast-KPCA: An efficient and innovative nonlinear principal components method
TDApplied version 2.0.0
Issues While Switching R to UTF-8 and UCRT on Windows
Datacamp Free Access Week
Map any region in the world with R – Part II: Obtaining the coordinates
R ggmap – How to Visualize Spatial Data in R
R-packages:
cauchypca: Robust Principal Component Analysis Using the Cauchy Distribution
nett: Network Analysis and Community Detection
pwrss: Power and Sample Size Calculation Tools
SASdates: Convert the Dates to 'SAS' Formats
planningML: A Sample Size Calculator for Machine Learning Applications in Healthcare
mlmhelpr: Multilevel/Mixed Model Helper Functions
ExomeDepth: Calls Copy Number Variants from Targeted Sequence Data

Nov 10, 2022 · 05:35

#### Episode 32: Spooky Scary Statistics

In this episode, I waved my hand across several financial data journal articles, took advantage of free data science tutorials, learned how to run a dev environment from a flash drive, and did a double take at a seeming CRAN package duplication.
References:
Analyzing Intraday Financial Data in R: The highfrequency Package
A Practitioner's Guide and MATLAB Toolbox for Mixed Frequency State Space Models
spsur: An R Package for Dealing with Spatial Seemingly Unrelated Regression Models
Bias reduction in the logistic model parameters with the LogF(1,1) penalty under MAR assumption
The Annals of Statistics Volume 50 Issue 5
R 4.2.2 is released
Call for proposals to organize a hybrid useR! in 2023 or 2024
A trick to combine and split strings
365 Data Science courses free until November 21
A Linux Live USB as a statistical programming dev environment
Top 5 Shiny UI Add-On Packages
September 2022: “Top 40” New CRAN Packages
R-packages:
ExomeDepth: Calls Copy Number Variants from Targeted Sequence Data
minsample1: The Minimum Sample Size
minsample2: The Minimum Sample Size
palaeoverse: Prepare and Explore Data for Palaeobiological Analyses
SAP: Statistical Analysis and Programming
tidyfit: Regularized Linear Modeling with Tidy Data
robustbetareg: Robust Beta Regression

Nov 04, 2022 · 07:24

#### Episode 31: Awash in Acronyms

In this episode, I tried to keep my acronyms straight, dealt with big and tall (data), poked the ribs of SPSS, and contrasted the PIC with the AIC (more acronyms on the march!).
References:
Fast Penalized Regression and Cross Validation for Tall Data with the oem Package
synthACS: Spatial Microsimulation Modeling with Synthetic American Community Survey Data
BGVAR: Bayesian Global Vector Autoregressions with Shrinkage Priors in R
Facilitating time series classification by linear law-based feature space transformation
Severe testing with high-dimensional omics data for enhancing biomedical scientific discovery
Visualize dependencies of missing values
Formatting your Microsoft Excel output using ODS Excel
How to use the image function in R
How to create a heatmap in R
The pheatmap function in R
SAS or R-Yould Should Know!
Understanding ShinyApps
R-packages:
worrrd: Generate Wordsearch and Crossword Puzzles
picR: Predictive Information Criteria for Model Selection

Oct 28, 2022 · 06:31

#### Episode 30: The GitHub Secret Sauce

In this episode, I spilled the secret sauce behind my research methods, predicted extreme events, pursued plots off the beaten ggplot path, and added another simulation-based power analysis tool to my belt.
References:
Statistics Weekly Research GitHub Repository
Flexible time-to-event models for double-interval-censored infectious disease data with clearance of the infection as a competing risk
Predicting the data structure prior to extreme events from passive observables using echo state network
SCpubr: Generate high quality, publication-ready plots of single-cell transcriptomics data
Concordances
the difference between chi square tests of independence and homogeneity
How to create a ggalluvial plot in R?
How to create a Sankey plot in R?
Understanding the Basics of Package Writing in R
R-packages:
VIGoR: Variational Bayesian Inference for Genome-Wide Regression
ggstats: Extension to 'ggplot2' for Plotting Stats
mlpwr: A Power Analysis Toolbox to Find Cost-Efficient Study Designs
ARIMAANN: Time Series Forecasting using ARIMA-ANN Hybrid Model
makeunique: Make Character Strings Unique

Oct 21, 2022 · 06:06

#### Episode 29: Business or Bust

In this episode, I contributed to the Python Developers Survey, wiped a sweaty brow over the Isoband Incident, and booted up package 'bupaverse' for business process analytics.
References:
The R Journal Volume 14/1
gsDesign 3.4.0 is released
Improvements in handling bytes encoding
Join the Python Developers Survey 2022: Share and learn about the community
Who rotated my cookie?
CRAN and the Isoband Incident – Is Your Project at Risk and How to Fix It
Map any region in the world with R – Part I: The basic map
R-packages:
dfms: Dynamic Factor Models
precmed: Precision Medicine
RobustANOVA: Robust One-Way ANOVA Tests under Heteroscedasticity and Nonnormality
bakR: Analyze and Compare Nucleotide Recoding RNA Sequencing Datasets
nestedmodels: Tidy Modelling for Nested Data
powRICLPM: Perform Power Analysis for the Random Intercept Cross-Lagged Panel Model
bupaverse: Easily Install and Load the 'bupaverse'

Oct 14, 2022 · 06:20

#### Episode 28: Everything’s Coming Up Machine Learning

In this episode, I netted new ways to knock neural networks out of the park, let R do my calculus calculations for me, garnered gentle overviews of many machine learning topics, and made sure my phylogenetic trees were publication ready with the R package ‘CancerEvolutionVisualization’.
References:
Pathogen.jl: Infectious Disease Transmission Network Modeling with Julia
calculus: High-Dimensional Numerical and Symbolic Calculus in R
Deep Image Prior for medical image denoising, a study about parameter initialization
On Physics-Informed Neural Networks for Quantum Computers
Not frequentist enough.
ggradar: radar plots with ggplot in R
Mastering Debugging in R
Understanding leaf node numbers when using rpart and rpart.rules
A Gentle Introduction to using Support Vector Machines for Classification
Boosting in Machine Learning: A Brief Overview
Algorithm Classifications in Machine Learning
R-packages:
pirouette: Create a Bayesian Posterior from a Phylogeny
CancerEvolutionVisualization: Publication Quality Phylogenetic Tree Plots
odetector: Outlier Detection Using Partitioning Clustering Algorithms
stats4teaching: Simulate Pedagogical Statistical Data
camcorder: Record Your Plot History
openxlsx2: Read, Write and Edit 'xlsx' Files

Oct 07, 2022 · 05:47

#### Episode 27: Statistical Models on the Quick

In this episode, I kept it quick with appropriate antimicrobial resistance data procedures, used a TI 83 calculator for statistics, and hoped for an easy entry to NEON data wrangling with package ‘neonOS’.
References:
Statistical Network Analysis with Bergm
ParMA: Parallelized Bayesian Model Averaging for Generalized Linear Models
AMR: An R Package for Working with Antimicrobial Resistance Data
new version of package declared
3 ways to enhance productivity with AI
TI 83 for Statistics: Easy Steps for Common Problems
Hazard Ratio: Interpretation & Definition
Writing Functions in R
August 2022: “Top 40” New CRAN Packages
R-packages:
easydb: Easily Connect to Common Types of Databases
neonOS: Basic Data Wrangling for NEON Observational Data
getpublist: Obtain a List of Publications from 'PubMed' and 'Google Scholar'
adjustedCurves: Confounder-Adjusted Survival Curves and Cumulative Incidence Functions

Sep 30, 2022 · 04:54

#### Episode 26: Random Learning on the Short

In this episode, I kept it short and sweet with a new statistical journal reference, data science job scraping, and random learning for R-packages.
References:
Generalized resilience and robust statistics
Stochastic continuum-armed bandits with additive models: Minimax regrets and adaptive algorithm
Generate random ID values for subjects in SAS
Linear Regression Equation Explained
An Example of Specifying Within-Subjects Factors in Repeated Measures
Designing Online Data Science Training for the Modern Age
Optimizing my search for Data scientist jobs by scraping Indeed with R
Seeking Inspiration from Random Learning
R-packages:
BiOFI: Feature Identification Between Metabolome and Microbiome in Disease and Health
glmmrBase: Specification of Generalised Linear Mixed Models
REDCapTidieR: Extract 'REDCap' Databases into Tidy 'Tibble's
verbaliseR: Make your Text Mighty Fine
betaSandwich: Robust Confidence Intervals for Standardized Regression Coefficients
DynForest: Random Forest with Multivariate Longitudinal Predictors
roBTT: Robust Bayesian T-Test
alqrfe: Adaptive Lasso Quantile Regression with Fixed Effects
recurrentpseudo: Creates Pseudo-Observations and Analysis for Recurrent Event Data
alkahest: Pre-Processing XY Data from Experimental Methods

Sep 23, 2022 · 04:01

#### Episode 25: Machine Learning vs. Magic

In this episode, I found that Bambi (Bayesian Model Building Interface) is more than a little deer, rooted for the round-robin operator for genetic algorithms, and had my conviction that machine learning is not magic reinforced.
References:
HighFrequencyCovariance: A Julia Package for Estimating Covariance Matrices Using High Frequency Financial Data
Bambi: A Simple Interface for Fitting Bayesian Linear Models in Python
Spbsampling: An R Package for Spatially Balanced Sampling
plot3logit: Ternary Plots for Interpreting Trinomial Regression Models
Learning Base R (2nd Edition)
Python and R for the Modern Data Scientist
Genetic algorithm with a new round-robin based tournament selection: Statistical properties analysis
Healthcare researchers must be wary of misusing AI
New method to identify symmetries in data using Bayesian statistics
Python 3.11.0rc2 is now available
Complex Layouts using the SG Procedures
Factor Analysis Guide with an Example
How to Choose Appropriate Clustering Method for Your Dataset
How to Apply AI to Small Data Sets?
The R Consortium Needs Your Help with satRdays
Visualizing OLS Linear Regression Assumptions in R
R-packages:
TPCselect: Variable Selection via Threshold Partial Correlation
historicalborrow: Non-Longitudinal Bayesian Historical Borrowing Models
historicalborrowlong: Longitudinal Bayesian Historical Borrowing Models
kgp: 1000 Genomes Project Metadata
DBIsqldf: Manipulate R Data Frames Using SQL
latentFactoR: Data Simulation Based on Latent Factors
TSdeeplearning: Deep Learning Model for Time Series Forecasting

Sep 16, 2022 · 08:05

#### Episode 24: The Way of the R Warrior

In this episode, I chose mediation analysis over a hospital heist, gained additional sample size calculation insights, memorized machine learning methods, and battled my way up a text-based tower using package ‘rwarrior’.
References:
Hierarchical Clustering with Contiguity Constraint in R
Blang: Bayesian Declarative Modeling of General Data Structures and Inference via Algorithms Based on Distribution Continua
Robust Mediation Analysis: The R Package robmed
A Software Tool For Sparse Estimation Of A General Class Of High-dimensional GLMs
bayesanova: An R package for Bayesian Inference in the Analysis of Variance via Markov Chain Monte Carlo in Gaussian Mixture Models
blindrecalc - An R Package for Blinded Sample Size Recalculation
Power and Sample Size for Longitudinal Models in R – The longpower Package and Shiny App
RFpredInterval: An R Package for Prediction Intervals with Random Forests and Boosted Forests
Alternative stopping rules to limit tree expansion for random forest models
gMCPLite: Lightweight Graph Based Multiple Comparison Procedures
pkglite 0.2.1 is released
Python releases 3.10.7, 3.9.14, 3.8.14, and 3.7.14 are now available
Ordinal Data: Definition, Examples & Analysis
How the Population Distribution Influences the Confidence Interval
Talks to watch at the RSS International Conference
How to Avoid Overfitting?
Beneath and Beyond the Cox Model (www.r-bloggers.com/2022/09/beneath-and-beyond-the-cox-model/)
What are the algorithms used in machine learning? (www.r-bloggers.com/2022/09/what-are-the-algorithms-used-in-machine-learning/)
23 New books added to Big Book of R (www.r-bloggers.com/2022/09/23-new-books-added-to-big-book-of-r/)
R Warrior Leader Board (www.r-bloggers.com/2022/08/r-warrior-leader-board/)
R-packages:
oceanexplorer: Explore Our Planet's Oceans with NOAA
OneArm2stage: Optimal One-Arm Two-Stage Phase II Design with Survival Endpoint
rwarrior: R Warrior - An AI Programming Game
manymome: Mediation, Moderation and Moderated-Mediation After Model Fitting
voice: Tools for Voice Analysis, Speaker Recognition and Mood Inference
shinylight: Web Interface to 'R' Functions

Sep 09, 2022 · 10:09

#### Episode 23: Biting Off More Statistics Than I Can Chew

In this episode, I bit off more journal articles than I could chew, bookmarked how to reorder SGPLOT bars, was underwhelmed by R surprises, and muddled through the meta-package ‘easystats’.
References:
On the Programmatic Generation of Reproducible Documents
Automatic Identification and Forecasting of Structural Unobserved Components Models with UComp
exuber: Recursive Right-Tailed Unit Root Testing with R
The R Journal Volume 14/1
A unified Framework for analyzing complex systems: Juxtaposing the (Kernel) PCA method and Graph Theory
Grand Challenges in Mathematical Biology: Integrating Multiscale Modelling and Data
CohensdpLibrary: Compute Cohen's d and its confidence interval in any experimental design
Two types of syntax for the SELECT-WHEN statement in SAS
Order the bars in a bar chart with PROC SGPLOT
Top 10 online data science programs
How to handle Imbalanced Data?
July 2022: “Top 40” New CRAN Packages
Surprising Things You Can Do With R
SAS vs R Programming: Which to Choose and How to Switch
R-packages:
easystats: Framework for Easy Statistical Modeling, Visualization, and Reporting
once: Execute Expensive Operations Only Once
snSMART: Small N Sequential Multiple Assignment Randomized Trial Methods
tidyplus: Additional 'tidyverse' Functions
transPlotR: Visualize Transcript Structures in Elegant Way

Sep 02, 2022 · 11:30

#### Episode 22: Python with a Side of Waffles

In this episode, I grabbed the tools to conduct mobility data analysis and interactive plots in Python, detected outliers in spherical data, considered the next programming language to learn, and readied the statistical syrup for waffle plots using package ‘baffle’.
References:
scikit-mobility: A Python Library for the Analysis, Generation, and Risk Assessment of Mobility Data
spNNGP R Package for Nearest Neighbor Gaussian Process Models
Feller-Pareto and Related Distributions: Numerical Implementation and Actuarial Applications
A new outlier detection method for spherical data
Confidence bands in survival analysis
How to stagger labels on an axis in PROC SGPLOT
Experimental Design: Definition and Types
The Difference Between R-squared and Adjusted R-squared
Top 5 Python Libraries for Data Science (2023) Edition
Base-R and Tidyverse Code, Side-by-Side
R Design Patterns, Base-R vs. Tidyverse With a view toward the teaching of R beginners
Hello Shiny Python
Which programming language should I learn?
R Markdown Tips: Code, Images, Comments, Tables, and more
Is Data Science a Dying Profession?
R-packages:
evalR: Evaluation of Unverified Code
goldfish: Statistical Network Models for Dynamic Network Data
baffle: Make Waffle Plots with Base Graphics
ggDoE: Modern Graphs for Design of Experiments with 'ggplot2'
jjAnno: An Annotation Package for 'ggplot2' Output
multinomialLogitMix: Clustering Multinomial Count Data under the Presence of Covariates
IntLIM: Integration of Omics Data Using Linear Modeling
scrutiny: Error Detection in Science
shinyHugePlot: Efficient Plotting of Large-Sized Data
ctmva: Continuous-Time Multivariate Analysis
ggcoverage: Visualize Genome Coverage with Various Annotations

Aug 26, 2022 · 07:31

#### Episode 21: Behind the Statistical Software Curve

In this episode, I played catch-up with the Journal of Statistical Software, schemed how to snag a top spot in the XP Learning Challenge, and roared alongside the R-package ‘TRexSelector’.
References:
modelsummary: Data and Model Summaries in R
stringi: Fast and Portable Character String Processing in R
evgam: An R Package for Generalized Additive Extreme Value Models
Julia 1.8 Highlights
The Box-Cox transformation for a dependent variable in a regression
TDApplied: Machine Learning and Inference for Topological Data Analysis
Some project opportunities for Ph.D. students!
Learn data skills, earn points and win money prizes daily with Datacamp!
Designing #TidyTuesday visualisations for mobile (with Quarto)
Find out which data skills are most in demand?
R-packages:
ctmva: Continuous-Time Multivariate Analysis
ggcoverage: Visualize Genome Coverage with Various Annotations
TRexSelector: T-Rex Selector: High-Dimensional Variable Selection & FDR Control
countdown: A Countdown Timer for HTML Presentations, Documents, and Web Apps
TDApplied: Machine Learning and Inference for Topological Data Analysis
brassica: 1970s BASIC Interpreter

Aug 19, 2022 · 08:37

#### Episode 20: The Count of Monte Carlo

In this episode, I browsed yet more Monte Carlo simulation papers, introduced myself to various structural equation models, championed base R, and went neural to neural with the R package ‘cito’.
References:
A kernel mixing strategy for use in adaptive Markov chain Monte Carlo and stochastic optimization contexts
Bootstrap-based inferential improvements to the simplex nonlinear regression model
Python 3.10.6 is available
Stanine Score: Definition, Examples, How to Convert
The Four Models You Meet in Structural Equation Modeling
Downstream Bioinformatics Analysis of Omics Data with edgeR
Simulating data from a non-linear function by specifying a handful of points
RObservations #36: Opinions on RStudio’s name change. A Bayesian approach with Stan
Base-R Is Alive and Well
R-packages:
cito: Building and Training Neural Networks
cbioportalR: Browse and Query Clinical and Genomic Data from cBioPortal
gtreg: Regulatory Tables for Clinical Research
diffdfs: Compute the Difference Between Data Frames
fake: Flexible Data Simulation Using the Multivariate Normal Distribution
seeker: Simplified Fetching and Processing of Microarray and RNA-Seq Data
ympes: Collection of Helper Functions

Aug 12, 2022 · 05:44

#### Episode 19: Get Python Not Sleep

In this episode, I didn’t get a lot of sleep. But I did get a lot of quirky Python, classy SAS, and ridiculous R.
References:
A systematic review of statistical methodology used to evaluate progression of chronic kidney disease using electronic healthcare records
Statistical inference for a constant-stress partially accelerated life tests based on progressively hybrid censored samples from inverted Kumaraswamy distribution
Python 3.11.0b5 is now available
How to reveal new connections in a knowledge graph with link prediction
https://blogs.sas.com/content/iml/2022/08/01/examples-monte-carlo-simulation.html
5 pillars for a successful DataOps strategy
June 2022: “Top 40” New CRAN Packages
Bioinformatics Analysis of Omics Data with the Shell & R
Script monkey!
Getting fun with ggdogs, ggcats and gganimate
R-packages:
gganimate

Aug 05, 2022 · 04:34

#### Episode 18: The Software Formerly Known as RStudio

In this episode, RStudio gets a Posit-tively new name, SAS estimates area with Monte Carlo simulation, and I take in the view of package ‘P2C2M.Skyline’.
References:
Progressive censoring schemes for Marshall-Olkin Pareto distribution with applications
Assessing model adequacy for Bayesian Skyline plots using posterior predictive simulation
Testing for Benford’s Law in very small samples: Simulation study and a new test proposal
Inferring the underlying multivariate structure from bivariate networks with highly correlated nodes
R Can Use Your Help: Translating R Messages
Monte Carlo estimates of area
effectsize: A new Python package for computing effect sizes / standardized differences
Hoeffding’s Inequality
Exogenous and Endogenous Variables in Structural Equation Modeling
Announcing Quarto, a new scientific and technical publishing system
RStudio is becoming Posit
Posit – Why RStudio is changing its name
How to Create an Interaction Plot in R?
How to Standardize Data in R?
R code snippet : Transform from long format to wide format
Programming a simple minimax chess engine in R
R-packages:
P2C2M.Skyline

Jul 29, 2022 · 07:53

#### Episode 17: Tidal Waves of R Packages

In this episode, I pared down three important -omics data considerations, coded my way into a wave of new R packages, boasted about the Big Book of R, and havered over the ‘dots’ package.
References:
Editorial: Statistical Data Science - Theory and Applications in Analyzing Omics Data
Introducing riskCommunicator: An R package to obtain interpretable effect estimates for public health
On the mixed Kibria–Lukman estimator for the linear regression model
Rejection Region (Critical Region) for Statistical Tests
Join rstudio::conf(2022) Virtually
ggdensity: A new R package for plotting high-density regions
R Color
Generalized Linear Models, Part I: The Logistic Model
Part 1 of 3: 300+ milestone for Big Book of R
Part 2 of 3: 300+ milestone for Big Book of R
Part 3 of 3: 300+ milestone for Big Book of R
R-packages:
htmcglm: Hypothesis Testing for McGLMs
nextGenShinyApps: Advanced Tools for Building the Next Generation of 'Shiny' Applications and Dashboards
robustmeta: Robust Inference for Meta-Analysis with Influential Outlying Studies
lnmCluster: Perform Logistic Normal Multinomial Clustering for Microbiome Compositional Data
binaryTimeSeries: Analyzes a Binary Variable During a Time Series
codebookr: Create Codebooks from Data Frames
dots: Dot Density Maps
ggpackets: Package Plot Layers for Easier Portability and Modularization
trouBBlme4SolveR: Troubles Solver for 'lme4'

Jul 22, 2022 · 06:40

#### Episode 16: The Wizard of Stats

In this episode, I clued in on the claims that Linear Programming Discriminant Analysis is effective, mused on the methods of PROC SQL in SAS, and fired up a virtual machine to use for package ‘vmr’.
References:
moreThanANOVA: A user-friendly Shiny/R application for exploring and comparing data with interactive visualization
LPDA: A new classification method based on linear programming
Speedups in operations with regular expressions
Why learn SQL?
Correlated Errors in Confirmatory Factor Analysis
Network Graphs in R
vmrPoster
R-packages:
htmldf: Simple Scraping and Tidy Webpage Summaries
gpboost: Combining Tree-Boosting with Gaussian Process and Mixed Effects Models
vmr: Virtual Machines for R

Jul 15, 2022 · 04:50

#### Episode 15: Data Synthetic and the Multiverse of Modeling

In this episode, I assessed applications for AI-generated synthetic data, read up on the radical R tutorial rflow, and stepped into the ‘multiverse’ package to manage reporting on many different statistical analyses.
References:
Machine learning techniques for forecasting agricultural prices: A case of brinjal in Odisha, India
The risk of bias in denoising methods: Examples from neuroimaging
Bayesian model averaging for nonparametric discontinuity design
AI-generated Synthetic Data, easy and fast access to high quality data?
Use functions in a WHERE statement to filter observations
Cronbach’s Alpha: Definition, Calculations & Example
Updated Comparison of R Graphical User Interfaces
A Major Contribution to Learning R
R-packages:
multiverse: 'Explorable Multiverse' Data Analysis and Reports

Jul 08, 2022 · 05:60

#### Episode 14: Outliers from the Reaches of Sample Space

In this episode, I explored model selection methods for fending off invading outliers, marveled at May’s Top 40 CRAN packages, refreshed my suppositions on sample space, and rallied up to replace words in text using package ‘rflashtext’.
References:
Robust model selection using the out-of-bag bootstrap in linear regression
Data-informed deep optimization
Compute the multivariate t density function
Sample Space Examples and The Counting Principle
Cohens D: Definition, Using & Examples
Statistical Inference: Definition, Methods & Example
Replace NA with Zero in R
May 2022: “Top 40” New CRAN Packages
R-packages:
sciRmdTheme: Upgraded 'Rmarkdown' Themes for Scientific Writing
rflashtext: FlashText Algorithm for Finding and Replacing Words

Jul 01, 2022 · 05:14

#### Episode 13: Funny-Looking Bioinformatics

In this episode, I broadened my bioinformatics background, found that the latest R release was a Funny-Looking Kid, and extracted lobster species using package ‘OpenTreeChronograms’.
References:
A New Tobit Ridge-Type Estimator of the Censored Regression Model with Multicollinearity Problem
Selecting predictive biomarkers from genomic data
COBREXA.jl: constraint-based reconstruction and exascale analysis
Partial leverage plots
R 4.2.1 is released
{CIPerm}: Computationally-Efficient Confidence Intervals for Mean Shift from Permutation Methods
When Linear Models Don’t Fit Your Data, Now What?
SVG to PNG
How to Find Unmatched Records in R
Create new variables from existing variables in R
R-packages:
OpenTreeChronograms: Open Tree of Life Chronograms
phacking: Sensitivity Analysis for p-Hacking in Meta-Analyses
RCTS: Clustering Time Series While Resisting Outliers
MakeYourPalette: Make Your Palette
ontologics: Code-Logics to Handle Ontologies
CIPerm: Computationally-Efficient Confidence Intervals for Mean Shift from Permutation Methods

Jun 24, 2022 · 07:09

#### Episode 12: Bayesian Fire Engine

In this episode, I tackled ‘omics’ data with a Bayesian fire engine, soared over a timeline of statistical ideas, and decreased statistical bias by increasing bias understanding.
References:
A Survey of Statistical Methods for Microbiome Data Analysis
PyCon US: Successful Return to In-Person in 2022
PUSH, POP, and reset options for ODS graphics
Using Bayesian Additive Regression Trees for Flexible Outcome Modeling
R 4.2.1 scheduled for June 23
Bias in Statistics: Definition, Selection Bias & Survivorship Bias
X-MR (X-Moving Range) Chart
From Anna Menacher: A timeline of the most important statistical ideas of the past 50 years
useR! 2022 – all virtual – is next week!
R-packages:
bnClustOmics: Bayesian Network-Based Clustering of Multi-Omics Data
starticles: A Generic, Publisher-Independent Template for Writing Scientific Documents in 'rmarkdown'

Jun 17, 2022 · 05:11

#### Episode 11: Putting the Meta in Meta and Drooling Over Donut Plots

In this episode, I learned there are both good and bad R books, umbrella reviews are meta on top of meta, flowcharts can be fashioned in ggplot2, and package ‘ggpie’ is more than mere fluff.
References:
Monotone Regression: A Simple and Fast O(n) PAVA Implementation
Pro Data Visualization Using R and JavaScript: Analyze and Visualize Key Data on the Web
Doing Meta-Analysis with R - A Hands-On Guide
Convergence analysis and approximate optimal temporal step sizes for some finite difference methods discretising Fisher’s equation
Early warning signals of financial crises using persistent homology and critical slowing down: Evidence from difference correlation tests
Methodological approaches for assessing certainty of the evidence in umbrella reviews: A scoping review
Python 3.10.5 is available
Weights for residuals in robust regression
The effect of weight functions in a robust regression method
Considerations for generating time-accurate animation with SAS
Extra! Extra! Read all about how SAS is leading in analytics!
Bug BBQ
What Is Specification Error in Statistical Models?
Becoming an R developer: the workshop
Creating flowcharts with {ggplot2}
R-packages:
clinicalsignificance: Determine the Clinical Significance in Clinical Trials
mlrpro: Stepwise Regression with Assumptions Checking
ggpie: Pie, Donut and Rose Pie Plots
gTestsMulti: New Graph-Based Multi-Sample Tests
plnr: A System for Planing Analyses
holiglm: Holistic Generalized Linear Models

Jun 10, 2022 · 08:41

#### Episode 10: Multiple Last Names and Growing a Backbone

In this episode, I chained together multiple scientific last names, found heaps of PLOS ONE journal articles, and learned how to grow a backbone in R using package ‘backbone’.
References:
Event History Regression with Pseudo-Observations: Computational Approaches and an Implementation in R
More on Multidimensional Scaling and Unfolding in R: smacof Version 2
A Practical Guide to the Numerical Implementation of Tensor Networks I: Contractions, Decompositions, and Gauge Freedom
Rank-polyserial correlation. Quest for a "missing" coefficient of correlation
Assessing and visualizing fragility of clinical results with binary outcomes in R using the fragility package
backbone: An R package to extract network backbones
Evaluation of spatial Bayesian Empirical Likelihood models in analysis of small area data
Clinical artificial intelligence quality improvement: towards continual monitoring and updating of AI algorithms in healthcare
The interplay between ranking and communities in networks
Random assignment of subjects to groups in SAS
Using formatted CLASS variables
Python 3.11.0b2 is now available
shadowr: Selenium plugin to manage multiple levels of shadow elements on web page
Difference Between Standard Deviation and Standard Error
Clopper-Pearson Exact Method
Applications of Python | Top 10 uses of Python for The Real World
15 Fun & Easy Excel Project Ideas To Sharpen Your Skills
2022-01 Vectorised Pattern Fills in R Graphics
April: “Top 40” New CRAN Packages
R Packages Download Stats
R-packages:
AHSurv: Flexible Parametric Accelerated Hazards Models
misPRIME: Partial Replacement Imputation Estimation for Missing Covariates
tracee: Easily Save Output and Trace it Back to Code
backbone: Extracts the Backbone from Graphs

Jun 03, 2022 · 09:33

#### Episode 9: Boxes and Babies

In this episode, I thought outside the Box-Pierce; fuzzed, sweated, and flustered my way through basic stats; encountered functions and AI for babies; and chronicled the R-package ‘chronicler’.
References:
NeuralSens: Sensitivity Analysis of Neural Networks
econet: An R Package for Parameter-Dependent Network Centrality Measures
Modified Quantile Regression For Modeling the Low Birth Rate
A Novel Correction for the Adjusted Box-Pierce Test
Comments on identifying causal relationships in nonlinear dynamical systems via empirical mode decomposition
Adaptive numerical simulations with Trixi.jl: A case study of Julia for scientific computing
The balanced bootstrap in SAS
Game-changer AI tool will save mothers and babies
D. Mayo & D. Hand: “Statistical significance and its critics: practicing damaging science, or damaging scientific practice?”
Fuzzy Clustering: Definition
How to Find the P value: Process and Calculations
The Difference Between an Odds Ratio and a Predicted Odds
‘Data analysis with tidyverse’ workshop
Think like a programmeR: the workshop
How to add labels at the end of each line in ggplot2?
Subsetting with multiple conditions in R
Hierarchical data visualization with Shiny and D3
chronicler: Add Logging to Functions
R-packages:
Rwclust: Random Walk Clustering on Weighted Graphs
ScaleSpikeSlab: Scalable Spike-and-Slab
webshot2: Take Screenshots of Web Pages
mverse: Tidy Multiverse Analysis Made Simple
chronicler: Adding Logging to Functions

May 27, 2022 · 10:09

#### Episode 8: Shortened Blankets

In this episode, I got lost among the statistical trees, flipped a Bernoulli trial coin, and wrapped myself up in package ‘ggblanket’.
References:
sensobol: An R Package to Compute Variance-Based Sensitivity Indices
The R Package stagedtrees for Structural Learning of Stratified Staged Trees
Wave Effects of the Fractional Shallow Water Equation and the Fractional Optical Fiber Equation
Two Domains of Meandering Spiral Waves in a Modified Barkley Model
Access for success: SAS Certification practice exams now offered for free
How to unroll frequency data
Sampling Methods: Different Types in Research
Bernoulli Trials: Definition, Examples
ANCOVA: Analysis of Covariance
Measures of Model Fit for Linear Regression Models
ggblanket: making beautiful ggplot2 visualisation simpler
Self-documenting {ggplot}s thanks to the power of monads!
Launch of R Screencasts
How to draw heatmap in r: Quick and Easy way
R-packages:
DemographicTable: Creating Demographic Table
Linea: Linear Regression Interface
linelist: Tagging and Validating Epidemiological Data
NovelDistns: Computes PDF, CDF, Quantile, Random Numbers and Measures of Inference for 3 General Families of Distributions
semtree: Recursive Partitioning for Structural Equation Models
getwiki: R Wrapper for Wikipedia Data
simer: Data Simulation for Life Science and Breeding
ggblanket: Wrappers to Make Beautiful 'ggplot2' Visualization Simpler
dimensionalAnalysis: Dimensional Analysis

May 18, 2022 · 08:05

#### Episode 7: Riding the Python Lightning

In this episode, I dived into non-normal distributions, went from rags to ridges, rode the Python Language Summit lightning, and looked at the ‘GenericML’ package.
References:
Covsim: An R package for Simulating Non-Normal Data for Structural Equation Models Using Copulas
Rags2ridges: A One-stop-L2-shop for Graphical Modeling of High-Dimensional Precision Matrices
Transmissibility in Interactive Nanocomposite Diffusion: The Nonlinear Double-Diffusion Model
Editorial: Modelling collective motion across scales
The Application of the Functional Variable Method for Solving the Loaded Nonlinear Evaluation Equations
The derivative of the determinant of a matrix
Python 3.11.0b1 is now available
The 2022 Python Language Summit: Upstreaming optimisations from Cinder
The 2022 Python Language Summit: Performance Improvements by the Faster CPython team
The 2022 Python Language Summit: Python in the browser
The 2022 Python Language Summit: A per-interpreter GIL
The 2022 Python Language Summit: Achieving immortality
Fiducial Inference: Fisher’s Great Failure
Beta Distribution: Uses, Parameters & Examples
Geometric Distribution: Uses, Calculator & Formula
Alignment cheatsheet
5 Key Data Visualization Principles Explained – Examples in R
Three packages that port the tidyverse to Python
How to embed a Shiny app into your blog posts
R-packages:
biogeom: Biological Geometries
GenericML: Generic Machine Learning Inference
PMA2: Penalized Multivariate Analysis

May 13, 2022 · 08:36

#### Episode 6: Mixes and Metabolomics

In this episode, I dig into the first articles of a new Journal of Statistical Software volume, try not to mix up package ‘nvmix’ with ‘nmix’, scratch my head over Navier-Stokes equations, and take the random R-package section in a metabolomics direction.
References:
Journal of Statistical Software Volume 102
Convolutional Neural Networks for Very Low-Dimensional LPV Approximations of Incompressible Navier-Stokes Equations
Simulate the null distribution for a hypothesis test
Nmix package - Bayesian inference about univariate Gaussian mixtures with an unknown number of components
What is Power in Statistics?
Member Training: Introduction to SPSS Software Tutorial
Announcing R/Medicine 2022!
New Workshops Series Kick-off
Little useless-useful R functions – Animating datasets
March: “Top 40” New CRAN Packages
An introduction to the metabolomicsR
R-packages:
arthistory: Art History Textbook Data
nls2: Non-Linear Regression with Brute Force
metabolomicsR: Tools for Metabolomics Data
conductor: Create Tours in 'Shiny' Apps Using 'Shepherd.js'

May 06, 2022 · 09:26

#### Episode 5: Stretching the Limits and Vigorous Calisthenics

In this episode, I stretch before vigorous calisthenics, give a thumbs up to O’Reilly coding books, get to the bottom of R vectorization, try not to panic during debugging, and showcase the R-package ‘fauxnaif’.
References:
A NSFD Discretization of Two-Dimensional Singularly Perturbed Semilinear Convection-Diffusion Problems
Misnomers regarding outliers and their usefulness in statistical modeling
On Bartlett's sphericity test for correlation
Two perspectives on numerical integration
R 4.2.0 Released
Parameter vs Statistic: Examples & Differences
Spurious Correlation: Definition, Examples & Detecting
Hey, check this out, it’s really cool: A Bayesian framework for interpreting findings from impact evaluations
Top 10 Best R Books For Beginners
How to create your own functions in R
RObservations #30: Fixing R’s “messy string concatenation” with a special function
Where for (loop) ARt Thou?
RStudio Community Table Gallery
Don’t Panic! a Scientific Approach to Debugging Production Failure
R-packages:
CircularSilhouette: Fast Silhouette on Circular or Linear Data Clusters
forplo: Flexible Forest Plots
pspatreg: Spatial and Spatio-Temporal Semiparametric Regression Models with Spatial Lags
SLBDD: Statistical Learning for Big Dependent Data
cleaR: Clean the R Console and Environment
gameR: Color Palettes Inspired by Video Games
prettymapr: Scale Bar, North Arrow, and Pretty Margins in R
fauxnaif: Convert Values to NA

Apr 29, 2022 · 07:37

#### Episode 4: Stealing Python’s Thunder and Rolling Your Rs

In this episode, I jump into Julia packages, estimate how to pronounce the generalized Kibria-Lukman estimator, commiserate with Python, and showcase the R-package ‘rsconnect’.
References:
Generalized Kibria-Lukman Estimator: Method, Simulation, and Application
Trending Julia Packages
Use a heat map to visualize an ordinal response in longitudinal data
The McNemar test in SAS
Happy Birthday Neyman: What was Neyman opposing when he opposed the ‘Inferential’ Probabilists? Your weekend Phil Stat reading
How to Interpret Adjusted R-Squared and Predicted R-Squared in Regression Analysis
Three Principles of Experimental Designs
MLDataR – Real-world Datasets for Machine Learning Applications
Getting started with Python using R and reticulate
Text Analysis of Job Descriptions for Data Scientists, Data Engineers, Machine Learning Engineers and Data Analysts
R-packages:
bookdown: Authoring Books and Technical Documents with R Markdown
rconfig: Manage R Configuration at the Command Line
BayesPPD: Bayesian Power Prior Design
spatialreg: Spatial Regression Analysis
pagoda2: Single Cell Analysis and Differential Expression
microeco: Microbial Community Ecology Data Analysis
paletteknife: Create Colour Scales and Legend from Continuous or Categorical Vectors
starter: Starter Kit for New Projects
rsconnect: Deployment Interface for R Markdown Documents and Shiny Applications

Apr 22, 2022 · 06:22

#### Episode 3: Jackknifing, Leibniz, and Feeling the Python Breeze

In this episode, I play with jackknives, chuckle over Python, list upcoming R conferences, and showcase the R-package ‘RcppBDT’.
References:
Additive Noise-Induced System Evolution (ANISE)
Face Recognition in Single Sample Per Person Fusing Multi-Scale Features Extraction and Virtual Sample Generation Methods
An Epidemiological Compartmental Model With Automated Parameter Estimation and Forecasting of the Spread of COVID-19 With Analysis of Data From Germany and Brazil
Accelerating Jackknife Resampling for the Canonical Polyadic Decomposition
The last Python 3.11 alpha (3.11.0a7) is available
New methods for solving differential equations in SAS
Pascal matrices and inverses
Enhancements to HTML Documentation
Some discussion of “how to get confident with statistics”: Reading, practicing, and questioning
Permutation vs Combination: Differences & Examples
Chi-Square Goodness of Fit Test: Uses & Examples
SPSS Syntax 101
The sftime Package
To impute or not: the case of an RCT with baseline and follow-up measurements
Why you should(n’t) care about Monads if you’re an R programmer
Bayesian Estimation by using rjags Package
Upcoming R conferences (2022)
R-packages:
clusterHD: Tools for Clustering High-Dimensional Data
Allspice: RNA-Seq Profile Classifier
cheatR: Catch Cheaters
spooky: Time Feature Extrapolation Using Spectral Analysis and Jack-Knife Resampling
taxonbridge: Create Custom Taxonomies Based on the NCBI Taxonomy and GBIF Backbone Taxonomy
wrMisc: Analyze Experimental High-Throughput (Omics) Data
findInFiles: Find Pattern in Files
mclogit: Multinomial Logit Models, with or without Random Effects or Overdispersion
aRtsy: Generative Art with 'ggplot2'
glmm: Generalized Linear Mixed Models via Monte Carlo Likelihood Approximation
glmtoolbox: Set of Tools to Data Analysis using Genera
RcppBDT: 'Rcpp' Bindings for the Boost Date_Time Library

Apr 15, 2022 · 08:38

#### Episode 2: Lies, Two-Truths and a Lie, and Statistical R-packages

In this episode, I drink from the firehose of CRAN packages, learn a use for ‘two truths and a lie’ in the statistics classroom, and showcase the R-package ‘pipeliner’.
References:
https://www.frontiersin.org/articles/10.3389/fams.2022.848898/abstract
https://www.frontiersin.org/articles/10.3389/fams.2022.852314/full
https://stat.ethz.ch/pipermail/r-announce/2022/000681.html
https://cran.r-project.org/web/packages/available_packages_by_date.html
https://www.spss-tutorials.com/open-excel-file-in-spss/
https://www.spss-tutorials.com/spss-missing-values-for-string-variables/
https://errorstatistics.com/2022/04/05/insevere-tests-of-severe-testing/
https://statmodeling.stat.columbia.edu/2022/04/04/two-truths-and-a-lie-as-a-class-participation-activity/
https://statmodeling.stat.columbia.edu/2022/04/05/confidence-intervals-compatability-intervals-uncertainty-intervals/
https://www.statisticshowto.com/research-methods-qualitative-research-and-quantitative-research/
https://statisticsbyjim.com/basics/cohort-study/
https://statisticsbyjim.com/hypothesis-testing/sampling-error/
https://www.r-bloggers.com/2022/04/paper-semi-markov-modeling-for-cancer-insurance/
https://www.r-bloggers.com/2022/04/teaching-data-science-in-the-cloud/
https://www.r-bloggers.com/2022/04/new-features-in-r-4-2-0/
https://www.r-bloggers.com/2022/04/bio7-3-4-released/
https://cran.r-project.org/web/packages/pipeliner/index.html
https://cran.r-project.org/web/packages/pipeliner/vignettes/Introduction.html

Apr 08, 2022 · 07:17

#### Episode 1: Getting a Flat Distribution Under Us

In this episode, I brace myself for the R release of Vigorous Calisthenics, scratch my head at Lyapunov Exponents, survey the brewing battlefield in the AI/ML Wars, and showcase the R-package 'numform'.
References:
https://journal.r-project.org/archive/2021/RJ-2021-109/RJ-2021-109.pdf
https://journal.r-project.org/archive/2021/RJ-2021-080/RJ-2021-080.pdf
https://www.jstatsoft.org/article/view/v101i12
https://www.frontiersin.org/articles/10.3389/fams.2022.818799/full
https://www.frontiersin.org/articles/10.3389/fams.2022.869999/full
https://www.frontiersin.org/articles/10.3389/fams.2022.839669/full
https://cran.r-project.org/index.html
https://pythoninsider.blogspot.com/2022/03/python-3110a6-is-available.html
https://blog.python.org/2022/03/python-3103-3911-3813-and-3713-are-now.html
https://blog.python.org/2022/03/python-3104-and-3912-are-now-available.html
https://julialang.org/blog/2022/02/10years/
https://www.statistics.com/data-literacy-the-chainsaw-case/
https://errorstatistics.com/2022/03/23/the-ai-ml-wars-explain-or-test-black-box-models/
https://statmodeling.stat.columbia.edu/2022/03/27/the-examples-you-see-in-the-textbooks-are-not-representative-of-the-sorts-of-problems-you-see-in-the-real-world/
https://statisticsbyjim.com/regression/interpret-coefficients-p-values-regression/
https://www.r-bloggers.com/2022/03/february-2022-top-40-new-cran-packages/
https://www.r-bloggers.com/2022/03/how-i-analyze-100-ggplots-at-once/
https://cran.r-project.org/web/packages/numform/numform.pdf

Apr 01, 2022 · 07:39

#### Statistics Weekly Trailer

Trailer for the podcast Statistics Weekly.

Mar 30, 2022 · 01:03