Introduction

Purpose

This document illustrates how R and RStudio can be used for the four steps of data analysis:

  1. Data acquisition
  2. Data “cleaning” (processing the data into a form suitable for analysis)
  3. Analysis
  4. Reporting (conveying your results)

If you have correctly installed R and RStudio, this document should run and produce both graphs and regression results.

What this does

This example code uses R to

Output

This program uses tidyquant’s tq_get function to obtain stock price data for Ford from January 1, 2010 to December 31, 2016.
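A call along the following lines would retrieve the data just described. This is a minimal sketch, not necessarily the code this document ran: the ticker symbol "F" for Ford and the wrapper name fetch_prices are assumptions, and the call needs the tidyquant package and an internet connection.

```r
# Hypothetical wrapper around tidyquant::tq_get(); "F" is assumed to be
# the ticker used for Ford.
fetch_prices <- function(symbol = "F",
                         from = "2010-01-01",
                         to = "2016-12-31") {
  if (!requireNamespace("tidyquant", quietly = TRUE)) {
    stop("Package 'tidyquant' is required: install.packages('tidyquant')")
  }
  # Returns a tibble of daily prices (date, open, high, low, close,
  # volume, adjusted); the exact column set may vary by tidyquant version.
  tidyquant::tq_get(symbol, get = "stock.prices", from = from, to = to)
}

# Usage (requires network access):
# ford <- fetch_prices()
# head(ford, 4)
```

Wrapping the call in a function keeps the date range and symbol in one place, so the rest of the analysis can be rerun for another stock by changing a single argument.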

The program writes the dataset as a CSV file to your R working directory, which in this run is /datadisk/home/rmcd/tex/rclass/code
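Writing a data frame to the working directory can be sketched as follows. The file name ford_prices.csv is hypothetical, and the small data frame only reproduces a few columns of the four rows printed in this document so that the example runs without a download.

```r
# Three columns of the four rows printed in this document
prices <- data.frame(
  date     = as.Date(c("2010-01-04", "2010-01-05", "2010-01-06", "2010-01-07")),
  close    = c(10.28, 10.96, 11.37, 11.66),
  adjusted = c(8.201456, 8.743967, 9.071067, 9.302429)
)

# "ford_prices.csv" is a hypothetical file name
csv_path <- file.path(getwd(), "ford_prices.csv")
write.csv(prices, csv_path, row.names = FALSE)

# Read the file back to confirm the round trip
check <- read.csv(csv_path, colClasses = c("Date", "numeric", "numeric"))
nrow(check)
```

Calling getwd() first shows where the file will land; setwd() changes that location if needed.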

Here are the first four lines of the downloaded data:

# A tibble: 4 x 7
        date   open   high    low close    volume adjusted
      <date>  <dbl>  <dbl>  <dbl> <dbl>     <dbl>    <dbl>
1 2010-01-04 12.747 12.885 12.597 10.28  60855800 8.201456
2 2010-01-05 13.098 14.089 13.036 10.96 215620200 8.743967
3 2010-01-06 14.051 14.364 13.951 11.37 200070600 9.071067
4 2010-01-07 14.364 14.653 14.189 11.66 130201700 9.302429

We take the data and

Return Analysis

Now we analyze returns and volatility.
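The text does not say whether returns are simple or continuously compounded, so the following sketch is an assumption on that point. It computes daily log returns from the adjusted prices in the four rows printed in this document, along with the corresponding simple returns for comparison.

```r
# Adjusted closing prices from the four rows printed in this document
adjusted <- c(8.201456, 8.743967, 9.071067, 9.302429)

# Continuously compounded (log) returns: r_t = log(P_t / P_{t-1})
log_ret <- diff(log(adjusted))

# Simple returns for comparison: R_t = P_t / P_{t-1} - 1
simple_ret <- adjusted[-1] / adjusted[-length(adjusted)] - 1

cbind(log_ret, simple_ret)
```

Using the adjusted column rather than close accounts for dividends and splits, which matters when returns are measured over several years.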

Plots

Historical Volatility

Compute historical volatility by year.

Volatility by year

Year   Volatility (%)
2010           38.105
2011           39.982
2012           25.368
2013           23.952
2014           21.021
2015           22.155
2016           26.164
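One common way to obtain a table like this is to take the standard deviation of daily returns within each year and annualize it by the square root of 252 trading days. The sketch below assumes that convention, which the text does not state, and uses synthetic returns (not Ford data) so it runs on its own.

```r
set.seed(1)

# Synthetic weekday dates and returns, for illustration only
dates <- seq(as.Date("2010-01-04"), as.Date("2011-12-30"), by = "day")
dates <- dates[!format(dates, "%u") %in% c("6", "7")]  # drop weekends
ret <- rnorm(length(dates), mean = 0, sd = 0.02)       # assumed daily sd of 2%

# Group by calendar year, then annualize and express in percent
year <- format(dates, "%Y")
vol_by_year <- tapply(ret, year, sd) * sqrt(252) * 100
round(vol_by_year, 3)
```

The factor of 252 is an assumed count of trading days per year; a different count rescales every row by the same proportion.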

Absolute and squared returns regressed on year dummies

                                       Dependent variable:
                                   abs(return)    Squared return
                                       (1)             (2)
year2011                             -0.001          0.0001
                                     (0.001)        (0.0001)
year2012                             -0.007***      -0.0003***
                                     (0.001)        (0.0001)
year2013                             -0.007***      -0.0004***
                                     (0.001)        (0.0001)
year2014                             -0.009***      -0.0004***
                                     (0.001)        (0.0001)
year2015                             -0.008***      -0.0004***
                                     (0.001)        (0.0001)
year2016                             -0.007***      -0.0003***
                                     (0.001)        (0.0001)
Constant                              0.019***       0.001***
                                     (0.001)        (0.0001)
Observations                          1,761          1,761
R2                                    0.070          0.045
Adjusted R2                           0.067          0.041
Residual Std. Error (df = 1754)       0.012          0.001
F Statistic (df = 6; 1754)           21.941***      13.645***
Note: *p<0.1; **p<0.05; ***p<0.01
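Regressions of this form can be sketched with lm() and a factor for the year, which R expands into dummy variables with the first year as the omitted baseline. The data below are synthetic, and the variable names and the per-year standard deviations are assumptions chosen only to make the example run; they are not the Ford returns.

```r
set.seed(1)

# Synthetic daily returns for three years, for illustration only
n_per_year <- 250
year <- factor(rep(2010:2012, each = n_per_year))
sd_by_year <- c(0.024, 0.024, 0.016)   # assumed daily sd in each year
ret <- rnorm(length(year), sd = rep(sd_by_year, each = n_per_year))

# The intercept is the 2010 mean; each year coefficient is the
# difference from 2010
fit_abs <- lm(abs(ret) ~ year)
fit_sq  <- lm(I(ret^2) ~ year)

coef(summary(fit_abs))
coef(summary(fit_sq))
```

Read this way, the constant in each column of the table is the 2010 average of the dependent variable, and a negative year coefficient indicates smaller average absolute or squared returns than in 2010.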