Monthly Archives: September 2012

Association Rule Learning and the Apriori Algorithm

Association Rule Learning (also called Association Rule Mining) is a common technique used to find associations between many variables. It is often used by grocery stores, retailers, and anyone with a large transactional database. It’s how Target knows you’re pregnant, or how, when you’re buying an item on Amazon.com, they know what else you want… Read More »

Category: Uncategorized

Data Frames and Transactions

Transactions are a very useful tool when dealing with data mining. They provide a way to mine itemsets or rules on datasets. In R the data must be in transactions form. If the data is only available in a data.frame, then to coerce the data frame to transactions the researcher may use the… Read More »

Category: Uncategorized

Chi Square Probability Distribution Code Using PHP

This example is a piece of Web application code that I wrote to add an (approximate) p-value to some dynamically generated crosstabs. It lets a researcher deliver data over the web and calculate the p-value from a distribution based on input data.… Read More »

Category: Uncategorized

N-Way ANOVA

Two-way analysis of variance is where the rubber hits the road, so to speak. It extends the concepts of one-factor ANOVA to two factors. With two factors, there can be an interaction between them that should be tested. As one might expect… Read More »

Category: Uncategorized

One-Way ANOVA

Analysis of variance is a tool used for a variety of purposes. Applications range from a common one-way ANOVA, to experimental blocking, to more complex nested designs. This first ANOVA example shows a basic one-way ANOVA and provides the tools needed to analyze data using this technique. I will save the… Read More »

Category: Uncategorized

Using R to connect to a SQL Server and MySQL Database using MS Windows

Connecting to a MySQL database or MS SQL Server from the R environment can be extremely useful. It gives a researcher direct access to the data without having to first export it from a database and then import it from a CSV file or entering it directly into… Read More »

Category: Uncategorized

Kendall’s Tau

This is an example of Kendall’s Tau rank correlation. It is similar to Spearman’s Rho in that it is a non-parametric measure of correlation on ranks. It is an appropriate measure for ordinal data and is fairly straightforward when there are no ties in the ranks. When ties do exist then variations… Read More »
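
To give a feel for the no-ties case mentioned in the excerpt, here is a minimal Python sketch of the basic statistic: count concordant and discordant pairs and divide their difference by the total number of pairs. This is only an illustration under that no-ties assumption, not the code from the full post; the function name `kendall_tau_no_ties` and the sample rankings are made up for the example.

```python
from itertools import combinations


def kendall_tau_no_ties(x, y):
    """Kendall's tau for two equal-length sequences with no tied values.

    Hypothetical helper for illustration: it handles only the simple
    no-ties case and raises an error if any tie is found.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")

    concordant = 0
    discordant = 0
    # Compare every pair of observations exactly once.
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables order the pair the same way
        elif direction < 0:
            discordant += 1  # the variables order the pair in opposite ways
        else:
            raise ValueError("ties are not handled in this sketch")

    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Made-up rankings of five items by two judges.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]
tau = kendall_tau_no_ties(judge_a, judge_b)
print(round(tau, 3))
```

Of the ten pairs in this sample, eight are concordant and two are discordant, so the statistic is (8 - 2) / 10 = 0.6. Data with ties needs one of the adjusted variants, which this sketch deliberately rejects.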

Category: Uncategorized