
jbrnbrg

Tag: prediction

Predicting House Sale Price with the Ames Housing Data

In this post I’m going to perform a simplified bottom-up EDA, model development, prediction, and model evaluation using a public data set that contains data for houses that sold in Ames, Iowa. This post essentially revisits a previous analysis I performed in a group for my MS degree program at CUNY. The main differences here are that my analysis will be abbreviated to illustrate my skillset, I’ll be using Python instead of R, and I’ll be relying on Python’s fantastic scikit-learn library to perform the regressions.

Fraudulent Transaction Detection with GBM

Introduction: In this post I will create a predictive model to identify fraudulent credit card transactions in a public data set from Kaggle. Along the way, I will review some of the functionality of R’s gbm package for predictive modelling. Data Overview: The data set contains 280K+ records of credit card transactions from a two-week period, in which a small percentage of transactions have been labeled as fraudulent.

LSTM for EMS Call Volume Prediction

Multivariate time-series forecasting is a non-trivial task when it comes to complex seasonality. Forecasting: Principles and Practice, by R.J. Hyndman and G. Athanasopoulos, gives several powerful examples if you’re using R and dealing with seasonality using Fourier terms for each seasonal period (kinda like I did in this post). In this post I’ll be using Keras’s RNN module for LSTM in Python to forecast the next 24 hours of call volume, per hour, using the past 24 hours of my EMS data along with hourly weather data from Central Park via the NOAA.
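To make the "past 24 hours in, next 24 hours out" framing concrete, here is a minimal sketch of how an hourly series can be cut into input/target windows before it is handed to a sequence model. It is an illustration only, not the code from the post: the helper `make_windows` and the made-up `hourly_calls` list are hypothetical, and it handles a single series rather than the call volume plus weather features described above.

```python
def make_windows(series, n_in=24, n_out=24):
    """Cut a sequence into (past, future) pairs of back-to-back windows.

    Hypothetical helper for illustration: each pair holds n_in observed
    values followed by the n_out values the model should predict.
    """
    pairs = []
    last_start = len(series) - n_in - n_out
    for start in range(last_start + 1):
        past = series[start:start + n_in]
        future = series[start + n_in:start + n_in + n_out]
        pairs.append((past, future))
    return pairs


# Made-up stand-in for three days of hourly call counts (not real EMS data).
hourly_calls = list(range(72))
pairs = make_windows(hourly_calls)
```

Each pair slides forward by one hour, so three days of hourly data yield only a handful of training examples; a real data set would need far more history.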

Forecasting Seasonal Time Series in R

Traditional forecasting methods require stationary data to make a valid forecast. In this post I’ll run through a relatively simple time-series forecast that employs the forecast package’s auto.arima function and regression trees via the rpart library.
library(RCurl)
library(forecast)
library(lubridate)
library(tidyverse)
library(ggforce)
library(kableExtra)
library(rpart)
library(rpart.plot)
library(data.table)
Here’s a preview of the data I’m working with. It’s the same EMS data (with different aggregations) from my EMS Flexdashboard post. https://jbrnbrg.
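As a rough illustration of what "stationary" means in practice, the sketch below applies differencing, the standard transformation for removing trend and repeating seasonal patterns. It is a plain-Python toy rather than the R workflow from the post; the `difference` helper and the synthetic series are assumptions made up for the example. In R, auto.arima selects the order of differencing itself.

```python
def difference(series, lag=1):
    """Return series[i] - series[i - lag] for every index where both exist."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# Synthetic series: a linear trend plus a pattern that repeats every 4 steps.
seasonal_pattern = [0, 5, 2, 7]
series = [2 * t + seasonal_pattern[t % 4] for t in range(12)]

# Differencing at the seasonal lag cancels the repeating pattern and reduces
# the linear trend to a constant.
deseasoned = difference(series, lag=4)
```

Differencing at lag 1 alone would remove the trend but leave the period-4 pattern in place, which is why the lag is matched to the seasonal period here.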

Binary Response and GGally

If I am working on data with a binary response, I like to use the GGally package for its ggpairs function. It provides a way to look at a lot of different data types at the same time but the setup and customization can be a little daunting. In this example, which leverages this crime data, I demonstrate how ggpairs can be used to reveal a lot of information in a single figure.