INTRODUCTION dplyr is an R package used for transforming and summarizing tabular data with rows and columns. It includes a set of functions that filter rows, select specific columns, re-order rows, add new columns, and summarize data. Moreover, dplyr contains a useful function to perform another common task, which is the “split-apply-combine” […]

# Tutorials

## Regression Model Assumptions Tutorial

Regression is used to explore the relationship between one variable (often termed the response) and one or more other variables (termed explanatory). Several exercises are already available on simple linear regression or multiple regression. These are fantastic tools that are used frequently. However, each has a number of assumptions that need to be met. Unfortunately, […]

## How to plot basic charts with plotly

INTRODUCTION Plotly’s R graphing library makes interactive, publication-quality web graphs. More specifically, it gives us the ability to make line plots, scatter plots, area charts, bar charts, error bars, box plots, histograms, heatmaps, subplots, multiple-axes, and 3D charts. In this tutorial we are going to take a first step into plotly’s world by learning to […]

## How to prepare and apply machine learning to your dataset

INTRODUCTION Dear reader, if you are a newbie in the world of machine learning, then this tutorial is exactly what you need to get started with this exciting new part of the data science world. This post includes a full machine learning project that will guide you step by step to create a […]

## Bayesian A/B Testing Made Easy

A/B Testing is a familiar task for many working in business analytics. Essentially, A/B Testing is a simple form of hypothesis testing with one control group and one treatment group. Classical frequentist methodology instructs the analyst to estimate the expected effect of the treatment, calculate the required sample size, and perform a test to determine […]

## How to create interactive data visualizations with ggvis

INTRODUCTION The ggvis package is used to make interactive data visualizations. The fact that it combines shiny’s reactive programming model and dplyr’s grammar of data transformation makes it a useful tool for data scientists. This package allows us to implement features like interactivity, but on the other hand every interactive ggvis plot must be […]

## How to create reports with R Markdown in RStudio

INTRODUCTION R Markdown is one of the most popular data science tools and is used to save and execute code to create exceptional reports which are easily shareable. The documents that R Markdown provides are fully reproducible and support a wide variety of static and dynamic output formats. R Markdown uses markdown syntax, which provides […]

## How to create visualizations with iPlots package in R

INTRODUCTION iPlots is a package which provides interactive statistical graphics, written in Java. You can find many interesting plots such as histograms, barcharts, scatterplots, boxplots, fluctuation diagrams, parallel coordinates plots and spineplots. The amazing part is that all of these plots support querying, linked highlighting, color brushing, and interactive changing of parameters. Furthermore, iPlots includes […]

## How to create your first vector in R

Are you an expert R programmer? If so, this is *not* for you. This is a short tutorial for R novices, explaining vectors, a basic R data structure. Here’s an example: 10 150 30 45 20.3 And here’s another one: -5 -4 -3 -2 -1 0 1 2 3 still another one: “Darth Vader” “Luke […]