jbrnbrg

Plotly Scattermapbox with R and Python

I recently tried out plot.ly’s open source graphing library and found it challenging but worth the effort: challenging in that the documentation has some gaps, worth it in that the standard plots are responsive (via .js) and feature-rich right out of the box. I tried both the Python and R versions of plot.ly and found the R version the more straightforward to use and deploy via shinyapps.

Binary Response and GGally

When I am working with data that has a binary response, I like to use the GGally package for its ggpairs function. It provides a way to look at many different data types at the same time, but the setup and customization can be a little daunting. In this example, which leverages this crime data, I demonstrate how ggpairs can reveal a lot of information in a single figure.

Predicting Car Crashes with Insurance Data

Purpose: Today I review a data set containing information on approximately 8,000 customers of an insurance company. Each record contains two response variables that indicate whether a customer was in a car crash or not and how much said car-crash cost.

library(tidyverse)
library(stargazer)
library(GGally)
library(kableExtra)
library(ResourceSelection)
library(RCurl)
library(pROC)

clr_dollar <- function(x){
  # cleans out commas and $ of string-formatted currency
  return(as.numeric(gsub('[$,]','',x)))
}

# Mappings to clean-up data entries for easier manipulation and analysis
job_map <- data.
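
The clr_dollar helper in the excerpt strips dollar signs and commas from string-formatted currency before converting to a number. For readers following along in Python, a rough equivalent might look like the sketch below. The name clean_dollar and the sample values are hypothetical and are not taken from the post or its data set.

```python
import re


def clean_dollar(text):
    """Strip '$' and ',' from a currency string and return a float.

    Hypothetical Python analogue of the R helper clr_dollar.
    Unlike R's as.numeric, which yields NA for an empty string,
    float() raises ValueError when nothing numeric remains.
    """
    return float(re.sub(r"[$,]", "", text))


# Made-up sample values, for illustration only
samples = ["$1,234.50", "$67,349", "0"]
cleaned = [clean_dollar(s) for s in samples]
```

If missing entries are expected, one option is to catch ValueError and substitute None or float("nan"), which would be closer to the NA behavior in R.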

Multiple Linear Regression Modelling; Money Ball

In this post I will pre-process, explore, transform, and model data covering baseball team seasons from 1871 to 2006 inclusive (stats adjusted to match a 162-game season). Feature additions and transformations can improve the predictive power of a model, and in this post I’ll employ a Box-Cox transformation to achieve this. The goal is to create a model that can predict baseball team wins based on the performance metrics captured in the training dataset moneyball-training-data.
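
The Box-Cox transformation mentioned above has a standard one-parameter form: (x^λ - 1) / λ for λ ≠ 0, and log(x) for λ = 0, defined for strictly positive x. The post itself works in R, so the Python sketch below only illustrates the formula. The box_cox helper, the choice of λ, and the sample values are all assumptions made for illustration and are not the author's code or data.

```python
import math


def box_cox(values, lam):
    """Apply the one-parameter Box-Cox transform to strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive inputs")
    if lam == 0:
        # Limiting case of (x**lam - 1) / lam as lam approaches 0
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


# Made-up values and an arbitrary lambda, for illustration only
example_values = [50.0, 75.0, 81.0, 100.0]
transformed = box_cox(example_values, 0.5)
```

In practice λ is usually estimated from the data, commonly by maximum likelihood, rather than picked by hand as it is here.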