How to install every R package your scripts require using Python


One of the annoyances many R users face is having to re-install all of the R packages they use.

This can be necessary for a variety of reasons. You might buy a new computer or upgrade (as I did recently) to a new version of R, or you might want to dual boot your Windows computer with Linux.

Either way, you can be in a situation where you need to re-install all of your packages. This can be a pain.

So, I have written a Python script that does this for you. In essence, it walks through every folder below a starting directory looking for R files and finds the lines that contain the words “require” or “library”. It then strips out the names of the packages required. Finally, it creates and then calls an R script that checks whether each of these packages is installed and installs any that are missing.
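
To make the extraction step concrete, here is a minimal sketch of it on its own. It uses a regular expression instead of the chain of string replacements in the full script further down, so treat it as an illustration of the idea rather than the script's actual method. The names `PACKAGE_CALL` and `extract_packages` are made up for this example.

```python
import re

# Hypothetical pattern: matches library(pkg) or require(pkg), with optional
# quotes around the package name and optional extra arguments after a comma.
PACKAGE_CALL = re.compile(
    r"""\b(?:library|require)\s*\(\s*["']?([A-Za-z][A-Za-z0-9.]*)["']?\s*[,)]"""
)

def extract_packages(r_source):
    """Return the package names found in a string of R code, without duplicates."""
    found = []
    for name in PACKAGE_CALL.findall(r_source):
        if name not in found:
            found.append(name)
    return found

example = 'library(ggplot2); require("dplyr", quietly = TRUE)\nlibrary(ggplot2)\n'
print(extract_packages(example))
```

Note that a sketch like this will also pick up calls that sit inside R comments, which is harmless here because the worst case is installing a package you no longer use.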

I wrote this script because I had installed R 3.2 and all of my packages were wiped, and I figured it would be easier to write this script than to put up with reinstalling each package as I needed it. I am posting the code here in case anyone else needs it.

If people find it useful I might recode it in R.

Here is the code. Just copy and paste it, save it as “InstallRPackages.py”, and run it from your home directory or wherever you keep your R scripts.


##############
import os

####### This is the root directory to start the search from. This may need to be modified ####
rootdir = '.'

#### Function to find the indices of all occurrences of something in a string
def find_all(a_str, sub):
    start = 0
    while True:
        start = a_str.find(sub, start)
        if start == -1: return
        yield start
        start += len(sub) # use start += 1 to find overlapping matches
#### List of packages that need to be checked. This will be populated as it runs through the R code
packagecheck = list()

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        # Only read R scripts; testing for ".R" anywhere in the name also matches .RData, .Rmd, etc.
        if not file.lower().endswith(".r"):
            continue
        with open(os.path.join(subdir, file)) as f:
            for line in f:
                if "library" not in line and "require" not in line:
                    continue
                # Several statements can share one line, separated by ";"
                for i in line.split(";"):
                    call = i.replace(" ", "").strip()
                    for keyword in ("library(", "require("):
                        if not call.startswith(keyword):
                            continue
                        # Keep the first argument only: cut at the first comma, then at the closing bracket
                        arg = call[len(keyword):]
                        commas = list(find_all(arg, ','))
                        if commas:
                            arg = arg[:commas[0]]
                        arg = arg.split(")")[0].strip("\"'")
                        ### Store the name in double quotes, ready to paste into the R script
                        pack2print = '"' + arg + '"'
                        ### Add the package to the package list if it's not there already
                        if arg and pack2print not in packagecheck:
                            packagecheck.append(pack2print)


#### Write an R script that installs every package in the list that is not already installed
with open("InstallPackages.R", "w") as text_file:
    text_file.write('options(repos=structure(c(CRAN="http://cran.cnr.berkeley.edu/")))\n')
    for pp in packagecheck:
        # pp already carries its double quotes, e.g. "ggplot2"
        text_file.write("if (!(" + pp + " %in% rownames(installed.packages()))) {\n")
        text_file.write('\tprint("Installing ' + pp.replace('"', '') + '")\n')
        text_file.write("\tinstall.packages(" + pp + ", quiet = TRUE)\n")
        text_file.write("}\n")

#### Run the generated R script (Rscript needs to be on your PATH)
os.system("Rscript InstallPackages.R")
print("finished")
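
If you would rather look at the generated R code before anything gets installed, the script-writing step can be pulled out into a function that only returns the text. This is a rough sketch under a few assumptions of my own: the function name `build_install_script` is made up, it takes package names without surrounding quotes, and it checks against `rownames(installed.packages())` rather than the whole matrix.

```python
def build_install_script(packages, repo="http://cran.cnr.berkeley.edu/"):
    """Return the text of an R script that installs any missing package in `packages`."""
    lines = ['options(repos = c(CRAN = "%s"))' % repo]
    # sorted(set(...)) removes duplicates and gives a stable order
    for name in sorted(set(packages)):
        # %% is a literal percent sign, so this writes R's %in% operator
        lines.append('if (!("%s" %%in%% rownames(installed.packages()))) {' % name)
        lines.append('\tprint("Installing %s")' % name)
        lines.append('\tinstall.packages("%s", quiet = TRUE)' % name)
        lines.append('}')
    return "\n".join(lines) + "\n"

# Dry run: print the R code instead of writing it to a file and calling Rscript
print(build_install_script(["ggplot2", "dplyr"]))
```

To use it with the list built by the script above, strip the quotes first, for example `[p.strip('"') for p in packagecheck]`.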