First, although not necessary, we recommend you use Chrome as your browser. You can freely download and install Chrome here.
In this section we will describe how to install the software tools that we recommend as the main productivity tools for data science. To motivate the use of these tools, we will also provide brief illustrations of how we use them. We will install the following three software tools: R, RStudio, and Git (with Git Bash on Windows).
We will also show how to open a GitHub account and sync it with RStudio.
RStudio is an interactive desktop environment, but it is not R, nor does it include R when you download and install it. Therefore, to use RStudio, we first need to install R.
Here we show screenshots for Windows, but the process is similar for the other platforms. When they differ we will also show screenshots for Mac OS X.
If you are using Chrome, at the bottom of your browser you should see a tab that shows you the progress of the download. Once the installer file downloads, you can go ahead and click on that tab to start the installation process. Other browsers may be different so you will have to find where they store downloaded files and click on them to get the process started.
If using Safari on a Mac, you can access the download here:
You can now click through the different choices to finish the installation. We recommend that you select all the default choices, even when you get an ominous warning.
When selecting the language, consider that it will be easier to follow this book if you select English.
Continue to select all the defaults:
On the Mac it looks different, but you are also accepting the defaults:
Congratulations! You have installed R.
Although we highly recommend that beginners use R through RStudio, you can use R without RStudio.
You can start it like any other program.
If you followed the default installation, on Windows a shortcut will appear on your desktop which you can click to start R.
On the Mac, R will be in the Applications folder.
If you start R without RStudio, you will see an R console in which you can start typing commands:
But we will be much more productive using an editor developed for coding, such as the one provided by RStudio. In the next section, we demonstrate how to install RStudio.
On the Mac, there are fewer clicks. You basically drag and drop the RStudio icon into the Applications folder icon here:
Congratulations! You have installed RStudio. You can now get started as you would with any other program on your computer.
On Windows you can open RStudio from the Start menu. If RStudio does not show, you can search for it:
On the Mac, it will be in the Applications folder:
Pro tip for the Mac: To avoid using the mouse to open RStudio, hit command+spacebar to open Spotlight Search and type RStudio into that search bar, then hit enter.
Another great advantage of RStudio projects is that one can share them with collaborators or the public through GitHub. To do this, we will need a piece of software named Git as well as access to a Unix terminal.
The installation process for Git is quite different for Mac and Windows. We include both below.
Git is what we refer to as a version control system. Such systems are useful for tracking changes to files as well as coordinating the editing of code by multiple collaborators. We will later learn how to use GitHub, which is a hosting system for code. You need Git to interact with GitHub. Having your code and, more generally, data science projects on GitHub is, among other things, a way to show employers what you can do.
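To give a sense of what tracking changes means in practice, here is a minimal sketch that you could run in a terminal once Git is installed. The file name notes.txt, the commit messages, and the example user name and email are made up for illustration, and the work happens in a temporary folder so none of your own files are touched.

```shell
# Work in a throwaway folder (created by mktemp) so nothing else is affected.
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Turn the folder into a Git repository.
git init --quiet

# Create a file and record a first snapshot (a "commit").
echo "first draft" > notes.txt
git add notes.txt
git -c user.name="Example User" -c user.email="user@example.com" \
  commit --quiet -m "Add notes"

# Change the file and record a second snapshot.
echo "second draft" >> notes.txt
git -c user.name="Example User" -c user.email="user@example.com" \
  commit --quiet -am "Revise notes"

# List the recorded snapshots, newest first.
git log --oneline
```

The last command should list two commits, one for each saved version of the file. This history is what Git later lets you share with collaborators through GitHub.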
Git is most effectively used with Unix, although one can also use it through RStudio. In the next section, we describe Unix in more detail. Here we show you how to install software that permits you to use Git and Unix. The installation process is quite different for Windows and Mac, so we include two different sections.
Warning: These instructions are not for Mac users
There are several pieces of software that will permit you to perform Unix commands on Windows. We will be using Git Bash as it interfaces with RStudio and it is automatically installed when we install Git for Windows.
Start by searching for Git for Windows in your browser and clicking on the link from git-scm.com.
This will take you to the Download Git page from which you can download the most recent maintained build:
You can now continue selecting the default options.
You have now installed Git on Windows.
A final and important step is to change a preference in RStudio so that Git Bash becomes the default Unix shell in RStudio:
To check that you are in fact using Git Bash in RStudio, you can open a New Terminal in RStudio:
It should look something like this:
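Another way to confirm which shell the terminal is running is to ask it directly. The sketch below is only an illustration: the helper name describe_shell is made up, and the exact text printed depends on your system. In Git Bash on Windows, the system name typically begins with MINGW.

```shell
# Print the current shell and the system name reported by uname.
describe_shell() {
  echo "Shell: ${SHELL:-unknown}"
  echo "System: $(uname -s)"
}

describe_shell
```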
Before we show you the installation process we introduce you to the Terminal. Macs already come with this terminal and it can be used to learn Unix. We can also use it to check if Git is already installed and, if not, start the installation process.
You might have Git installed already. One way to check is by asking for the version by typing:
git --version
If you get a version number back, it is already installed. If not, you will get the following message:
and you will be asked if you want to install it. You should click Install:
Congratulations. You have installed Git on your Mac.
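To confirm the installation, you can run the version check again in the Terminal. The small sketch below wraps that check so it prints a clear message either way. The function name check_git is made up for illustration, and the version number you see will depend on what was installed.

```shell
# Report whether the git command runs successfully on this machine.
check_git() {
  if version=$(git --version 2>/dev/null); then
    echo "installed: $version"
  else
    echo "not installed"
  fi
}

check_git
```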
Reminder: On Windows, we install Git Bash. We do not need to do this on the Mac, since Macs come with the Terminal pre-installed and we can use it to run Unix commands.