In the last part of this series we looked at an R equivalent to VLOOKUP in Excel. Another use of VLOOKUP is to match data on a one-to-one basis. For example, inserting the price of a stock from a master prices table into a table of stock holdings. I can’t believe how easy this is to do in R!
Here are two data frames. One contains the phonetic alphabet and the other contains a message.
> head(phonetic) Letter Code 1 A Alpha 2 B Bravo 3 C Charlie 4 D Delta 5 E Echo 6 F Foxtrot
> message No Letter 1 1 R 2 2 I 3 3 S 4 4 A 5 5 M 6 6 A 7 7 Z 8 8 I 9 9 N 10 10 G
To find the phonetic pronunciation of each letter in the message, all we need to do is to combine the two data frames using merge, ensuring that there is a column by the same name in each data frame. Unfortunately you can either sort it by the column you are using to combine by (“Letter” in this case) or not at all (ie random, or an anagram for free!) To get round this, I sorted by an extra ID column (called “No”) and then deleted it afterwards.
> result <- merge(message, phonetic, by = "Letter") > result <- result[order(result$No),] > result[,2] = NULL > result Letter Code 8 R Romeo 4 I India 9 S Sierra 2 A Alpha 6 M Mike 1 A Alpha 10 Z Zulu 5 I India 7 N November 3 G Golf
This post from r-statistics.com helped me with the sorting idea and also describes an alternative approach using join from the {plyr} library. I know I need to get into {plyr} and it’s on my list of “known unknowns” for my journey to being competent in R.