Sample Academic Exercises in Analytics

Data Mining Comparative Analysis with WEKA 

The objective of this paper is to conduct a comparative analysis of various data mining methodologies. The paper focuses first on how these models handle different datasets, particularly in terms of dimensionality, including the effects of correlation

Evaluating Machine Learning Classification for Internet Ad data

A study of various machine learning classification methods for predicting whether a given website image is an advertisement. It focuses particularly on the rule-based JRip algorithm and the decision tree methods J48, Random Forest, and RPART. The data contain 1,558 attributes describing 3,279 image observations.

Hierarchical Model – Pricing Diamonds Across Channels

Examines a hierarchical model fitted to the diamonds data. If there are significant differences not only within channels, based on the 4-Cs explanatory variables, but also between channels (internet vs. offline), then we can segment our approach, whether as buyers looking for the best deal or as sellers looking to maximize sales and price discriminate.

Mathematical Programming for Call Center Optimization

Solves a management problem: determining the optimal scheduling of operator shifts for a call center.
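
As a rough illustration of what such a scheduling problem looks like, the sketch below finds the smallest staffing plan that covers demand in every period. The demand figures, shift names, and shift coverage are all hypothetical, and plain enumeration stands in for a mathematical programming solver, so this is not the method used in the exercise itself.

```python
from itertools import product

# Hypothetical demand: operators required in each of four time periods.
demand = [2, 4, 3, 1]

# Hypothetical shifts: each shift covers the listed period indices.
shifts = {
    "early": (0, 1),
    "mid": (1, 2),
    "late": (2, 3),
}


def covers(staffing):
    """Return True if the staffing plan meets demand in every period."""
    for period, required in enumerate(demand):
        on_duty = sum(n for name, n in staffing.items() if period in shifts[name])
        if on_duty < required:
            return False
    return True


def solve(max_per_shift=5):
    """Enumerate all plans up to max_per_shift operators per shift and
    return a feasible plan with the fewest operators in total."""
    best = None
    for counts in product(range(max_per_shift + 1), repeat=len(shifts)):
        staffing = dict(zip(shifts, counts))
        if covers(staffing):
            if best is None or sum(counts) < sum(best.values()):
                best = staffing
    return best


plan = solve()
total_operators = sum(plan.values())
print(plan, total_operators)
```

Enumeration only works for a toy instance like this one; a realistic call center schedule would normally be written as an integer program and handed to a solver.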

Panel Data Analysis

Panel data consists of repeated measurements on the same subject over a period of time, allowing for modeling differences in behavior across subjects.
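
One common way to model differences across subjects, assumed here for illustration and not necessarily the approach taken in the exercise, is the within (fixed-effects) estimator: each subject's observations are demeaned before fitting, which removes subject-specific intercepts. The panel below is hypothetical and built so that both subjects share the same slope but have different baselines.

```python
from statistics import mean

# Hypothetical panel: two subjects observed over three periods each.
panel = {
    "A": {"x": [1, 2, 3], "y": [12, 14, 16]},
    "B": {"x": [4, 5, 6], "y": [8, 10, 12]},
}


def slope(xs, ys):
    """Slope of a simple least-squares fit of ys on xs (with intercept)."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Pooled fit: ignores which subject each observation belongs to.
pooled_x = [x for s in panel.values() for x in s["x"]]
pooled_y = [y for s in panel.values() for y in s["y"]]
pooled_slope = slope(pooled_x, pooled_y)

# Within fit: subtract each subject's own means before fitting.
within_x, within_y = [], []
for s in panel.values():
    mx, my = mean(s["x"]), mean(s["y"])
    within_x += [x - mx for x in s["x"]]
    within_y += [y - my for y in s["y"]]
within_slope = slope(within_x, within_y)

print(pooled_slope, within_slope)
```

In this toy panel the pooled slope comes out negative because the subject with larger x values has a lower baseline, while the within slope recovers the relationship that holds inside each subject.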

Spatio-Temporal Housing Data Analysis with R

The data for this assignment comprise observations on housing prices and nine economic covariates—these represent public-domain data for 20,640 California block groups from the 1990 U.S. Census. The original data were provided by R. Kelly Pace and Ronald Barry and are available from the StatLib archive: data file houses.zip at http://lib.stat.cmu.edu/datasets/.

Sports Analytics

The baseball dataset contains 337 observations and 18 variables; we regress salary on performance measures.

Wine Cultivar Data Classification with R

This dataset consists of results from a chemical analysis of 178 wines grown in the same region of Italy from three different cultivars (our Class): Barbera (48), Barolo (59), and Grignolino (71). The chemical analysis determined the quantities of 13 constituents found in each of the three types of wine; these quantities are used to predict which cultivar each sample came from. Dataset: https://archive.ics.uci.edu/ml/datasets/Wine


 
