Getting round some Yahoo Finance problems using Google Finance and R

Yahoo Finance is probably the most widely used free source of historic share price data. Normally it works really well and the integration into R packages like quantmod makes it an easy to use and powerful solution. However, it is not without problems. Here are a couple of them and some solutions using Google Finance. I’m only going to talk about London traded shares and UK based mutual funds. I will be using the UK version of Google finance https://www.google.co.uk/finance throughout.

Obviously spurious data

This normally only happens for shares with small trading volumes or with ETFs and not for FTSE350 companies. Here are charts for the same ETF (Vanguard S&P 500, VUSA.L) from Yahoo Finance and Google Finance. Using quantmod you can easily change the source to use Google instead of Yahoo. The moral of the story is to always check the basic chart before attempting any more sophisticated analysis.

VUSA YahooVUSA Google

Missing data

It is often impossible to find mutual fund data or more exotic indicies on Yahoo Finance. Sometimes this data is available on Google and bringing up the page shows historic data but frequently with no download option. The reason, I understand, is to do with the licence that Google has with the data provider. In this case, calls to the API using quantmod or a manual solution seem to fail as well.
You could of course manually cut and paste the data into a spreadsheet. However as the data appears on several pages this becomes tedious if you want larger amounts of data. It is worth noting that although the drop down at the bottom of the page gives a maximum of 30 lines per page, manually typing in 200 in the appropriate place in the url gives you 200 lines.
It is very easy to set up R to do the data scraping for you. The script below works for me.
You will need to use the Google finance web page to find the stock code you are interested in. I’m going to use the FTSE100 minimum variance index (Google code is INDEXFTSE:UKXMV) in this example – an index I couldn’t find on Yahoo and where there is no option to download the data on Google. You will need to manually add symb, fname, startdate, and enddate to the script.  If there is less data available than you have asked for you may get an error when it tries to bind a non existent file to your data frame.
library(XML)
library(RCurl)
library(lubridate)
library(xts)


### Complete data required below

symb <- "INDEXFTSE:UKXMV"
fname <- "UKXMV"
startdate <- ymd("2014-01-31")
enddate <- ymd("2015-11-16")

## Construct basic URL 

sd <- day(startdate)
sm <- month(startdate, label = T)
sy <- year(startdate)

ed <- day(enddate)
em <- month(enddate, label = T)
ey <- year(enddate)

basicURL <- paste("https://www.google.co.uk/finance/historical?q=",symb,
                  "&startdate=",sm,"+",sd,"+",sy,"&enddate=",em,"+",ed,"+",ey,"&num=200&start=", sep ="")

nrows <- as.numeric(difftime( enddate, startdate, units = "days")/7*5)

### Download data and merge into a data frame called DF


start <- seq( 0, nrows, by = 200) 

DF <- NULL
for(i in 1:length(start)){

theurl <- getURL(paste(basicURL,start[i],sep = ""))  
temp <- readHTMLTable(theurl, which= 4, header = TRUE, stringsAsFactors = FALSE ) 
DF <- rbind(DF, temp)
}

## Tidy up DF
colnames(DF) <- gsub("\\s|\\\n",'',colnames(DF))

DF[DF == "-"] <- NA

head(DF)
tail(DF)

DF[,1] <- mdy(DF[,1])
row.names(DF) <- DF[,1]
DF[,1] <- NULL

id <- c(1:ncol(DF)) 
DF[,id] <- as.numeric(as.character(unlist(DF[,id])))

head(DF)

## Convert the data frame DF into an XTS object

DF.xts <- as.xts(DF)
assign(fname, DF.xts)


head(DF.xts)
plot(DF.xts)
Advertisements

One thought on “Getting round some Yahoo Finance problems using Google Finance and R

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s