The rvest package allows for simple and convenient extraction of data from the web into R, which is often called “web scraping.” Web scraping is a basic and important skill that every data analyst should master. You’ll often see it as a job requirement. In the following exercises, you will practice your scraping skills on […]
importing data
Tidy Data Reading: Exercises
Every analysis starts with data, and reading data from different sources into R can be very challenging: multiple formats, multiple libraries, multiple interfaces, and so on. Fortunately, the authors of the tidyverse created a number of packages to handle reading data in the most common formats in a simple, intuitive way. This exercise set will give you […]
Easy Web Scraping With Rvest: Exercises
The Internet is full of interesting data; there’s no doubt about it. Some sites, such as Twitter, provide users with systematic access (an API), around which some neat R packages have been built. In this exercise set, we practice much more general techniques of extracting/scraping data from the web directly, using the rvest package. Note […]
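As a rough illustration of the kind of extraction these exercises practice, here is a minimal sketch using rvest. The HTML page, the `h1.title` and `#prices` selectors, and the variable names are all invented for this example so that it runs without network access; a real exercise would pass a URL to `read_html()` instead.

```r
library(rvest)

# Hypothetical inline page (an assumption for illustration), so no network is needed.
page <- minimal_html('
  <h1 class="title">Quarterly prices</h1>
  <table id="prices">
    <tr><th>item</th><th>price</th></tr>
    <tr><td>bread</td><td>2.5</td></tr>
    <tr><td>milk</td><td>1.2</td></tr>
  </table>
')

# Select a node with a CSS selector and pull out its text.
title <- page %>% html_element("h1.title") %>% html_text2()

# Select the table node and convert it to a data frame (tibble).
prices <- page %>% html_element("#prices") %>% html_table()
```

CSS selectors such as these are what a tool like selectorgadget helps you find on a real page.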
Protected: Bonus: Obtaining Twitter Data Exercises
R with remote databases Exercises (Part-2)
When working with data, it is common for your source to be a remote database. The usual ways to cope with this in R are either to load all the data into R or to perform the heaviest joins and aggregations in SQL before loading the data. Both approaches have drawbacks: the former is […]
R with remote databases Exercises (Part-1)
When working with data, it is common for your source to be a remote database. The usual ways to cope with this in R are either to load all the data into R or to perform the heaviest joins and aggregations in SQL before loading the data. Both approaches have drawbacks: the former is […]
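The two approaches mentioned in the excerpt can be sketched side by side. This is only an illustration under stated assumptions: an in-memory SQLite database (via the DBI and RSQLite packages) stands in for the remote database, and the `orders` table and its columns are hypothetical.

```r
library(DBI)

# Assumption: an in-memory SQLite database stands in for a remote one.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "orders", data.frame(
  customer = c("a", "a", "b"),
  amount   = c(10, 5, 7)
))

# Approach 1: load every row into R, then aggregate in R.
all_rows <- dbGetQuery(con, "SELECT * FROM orders")
totals_r <- aggregate(amount ~ customer, data = all_rows, FUN = sum)

# Approach 2: aggregate in SQL and load only the summary.
totals_sql <- dbGetQuery(con, "
  SELECT customer, SUM(amount) AS amount
  FROM orders
  GROUP BY customer
  ORDER BY customer
")

dbDisconnect(con)
```

Both give the same totals; the difference is how much data crosses the connection and where the work is done.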
Protected: Loading data from the web Exercises
Web Scraping Exercises
[For this exercise, before proceeding, first read the rvest package help and the selectorgadget help.] Answers to the exercises are available here. Exercise 1 Consider the URL ‘http://statbel.fgov.be/en/statistics/figures/economy/indicators/prix_prod_con/’. Extract all the information in the table ‘Third Quarter 2016’. Exercise 2 Consider the URL ‘http://www2.sas.com/proceedings/sugi30/toc.html’. Extract all the paper names, from 001-30 to 268-30. Exercise 3 […]
R-SQL Exercises
How do you write Structured Query Language (SQL) code in R? There are many packages on CRAN that relate to databases. In the exercises below, we cover some of the important data manipulation operations using SQL in R. We will use sqldf, an R package for running SQL statements on data frames. Answers […]
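For a sense of what running SQL on a data frame looks like, here is a minimal sketch with sqldf. The choice of the built-in `mtcars` data set and the particular query are assumptions made for illustration, not taken from the exercises.

```r
library(sqldf)

# Run an SQL aggregation directly against the built-in mtcars data frame:
# average miles per gallon for each cylinder count.
avg_mpg <- sqldf("
  SELECT cyl, AVG(mpg) AS avg_mpg
  FROM mtcars
  GROUP BY cyl
  ORDER BY cyl
")
```

The result is an ordinary data frame, so it can be passed on to any other R function.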
Scan exercises
In the exercises below we cover the basics of the scan function. Before proceeding, first read section 7.2 of An Introduction to R. Answers to the exercises are available here. For each exercise we provide a data set that can be accessed through the link shown in the exercise. You can scan the data from […]
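As a quick taste of the function, the sketch below calls `scan()` on inline text rather than on one of the linked data sets, so the inputs here are invented for illustration.

```r
# Read whitespace-separated numbers; scan() returns a numeric vector by default.
nums <- scan(text = "1 2 3\n4 5 6", quiet = TRUE)

# Read comma-separated strings by setting the type (what) and the separator (sep).
words <- scan(text = "a,b,c", what = character(), sep = ",", quiet = TRUE)
```

Replacing the `text` argument with a file path or URL as the first argument reads the same way from a file.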