Becoming an R Programmer - A brief introduction to R programming

If you're reading this article, you've probably already heard of the R programming language, but perhaps you have been avoiding it. R is a statistical computing and graphics programming language that allows you to clean, analyze, and graph your data. It is widely used by researchers in various disciplines to estimate and present results, as well as by teachers of statistics and research methods. It's free, which makes it an attractive option, but it relies on programming code - rather than drop-down menus or buttons - to get the job done. Programming languages can be intimidating. Maybe you like the comfort and familiarity of the statistics program you've been working with. Maybe you don't have the time to learn a new skill. Maybe you just don't know where to start. These are all valid reasons to postpone the use of R. But we use R in research and teaching, and we believe the benefits far outweigh the time and effort required to get started. We're here not only to convince you to use R, but also to provide you with some resources to do so.

Reasons for using R

One of the strongest features of R is that it is open source, meaning that anyone can access the underlying code used to run the program and add their own code for free. This means that R:

  • will always be able to carry out the latest statistical analyses as soon as someone thinks of them;
  • will correct its errors quickly and transparently;
  • and has brought together a community of programming and statistics nerds (aka useRs) that you can turn to when you need help.

Anyone can write their own R code, which means anyone can add to the huge list of R tools. Programmers contribute their code to R in the form of "packages." Some packages specialize in specific types of analysis, while other packages are much broader. Stephane Champely's "pwr" package, for example, specializes in power analysis. In contrast, APS Fellow William R. Revelle's "psych" package can perform everything from descriptive statistics to item response theory to mediation analysis. At the beginning of 2017, almost 10,000 packages were available. And as soon as a new statistical approach is developed, someone creates a new package or adds new tools to an existing package.
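
As a rough sketch of what using a package looks like in practice (this example is ours, not the package author's): a package is installed once with install.packages() and then loaded or called in each session. The code below assumes the pwr package may or may not be installed; if it is missing, it falls back to power.t.test() from the stats package that ships with R, which answers the same question. The effect size (d = 0.5) and the 80% power target are illustrative values only.

```r
# Install once (requires an internet connection), then use in any session:
# install.packages("pwr")

# Illustrative question: how many participants per group does a two-sample
# t-test need to detect a medium effect (d = 0.5) with 80% power?
if (requireNamespace("pwr", quietly = TRUE)) {
  result <- pwr::pwr.t.test(d = 0.5, power = 0.80, sig.level = 0.05,
                            type = "two.sample")
} else {
  # Fallback from base R's stats package; delta / sd plays the role of d.
  result <- stats::power.t.test(delta = 0.5, sd = 1, power = 0.80,
                                sig.level = 0.05)
}

# Round up, because you cannot recruit a fraction of a participant.
n_per_group <- ceiling(result$n)
print(n_per_group)
```

Either branch should report a sample size of roughly 64 participants per group for these inputs.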

Additionally, anyone can see the code used in a package. And there are many users who know what they are doing and can spot programming errors when they occur. Package authors will tell you that their inboxes are flooded with emails from fellow R users who have encountered bugs in their code. This means that errors are found quickly and fixed quickly. As a user of R, you don't have to wait a year for a new version of a package to be released; new updates are available as authors make changes to their packages. And these updates are published, making the entire process transparent.

This dynamic between typical R users who want to explore data and package authors who want to provide new techniques is incredibly collaborative - so much so that R users find themselves in a community of researchers and programmers. For some, this interaction is limited to asking for help (often it's as simple as Googling a question). For those who believe their soulmate is another R user (of which there are many), there are meetup groups and entire conferences organized around R all over the country.

Now the question remains: what should you use R for? For everything. No, seriously, anything. Throw away SPSS, SAS, and Stata because R can do all the descriptive analysis, regression equations, (M)AN(C)OVA, and hierarchical linear modeling you want. You don't need to buy Mplus because R handles structural equation modeling. Save yourself the trouble of opening Excel because merging data sets, cleaning data, identifying important rows or columns, and even updating your grades can all be done in R. Save money on crayons because R creates any kind of chart or graph you can imagine, even if it's three-dimensional or interactive or both. R can be used alongside document preparation systems such as LaTeX, allowing you to integrate your results directly into the manuscript itself. Do you work with Microsoft Word because your collaborators like to track changes? R creates APA-formatted tables, complete with significance stars and horizontal lines, and exports them as .doc files for your convenience. R can perform both frequentist and Bayesian statistics. R can leverage your multi-core processor and run analyses in parallel. Search for "a little fun with R" and learn how to make a winking elephant. R can bootstrap, simulate, randomize, resample, multiply impute, and park your car. Well, R can't park your car - yet.
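
To give a flavor of a few of these tasks, here is a minimal sketch using only functions that ship with R and the built-in mtcars data set, so nothing needs to be installed. The choice of variables and the small lookup table (cyl_labels) are our own illustrative assumptions, not an analysis from the article.

```r
# mtcars ships with R, so this runs without downloading anything.
data(mtcars)

# Descriptive statistics for fuel economy (miles per gallon).
print(summary(mtcars$mpg))

# Linear regression: predict mpg from weight and horsepower.
fit <- lm(mpg ~ wt + hp, data = mtcars)
print(coef(fit))

# Merging: attach a small lookup table to the main data set by a shared key.
cyl_labels <- data.frame(cyl  = c(4, 6, 8),
                         size = c("small", "medium", "large"))
cars_labeled <- merge(mtcars, cyl_labels, by = "cyl")
print(table(cars_labeled$size))
```

Calling summary(fit) instead of coef(fit) would print the full regression table, including standard errors and significance stars.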

On a broader scale, R can address many of the challenges of conducting reproducible research. A particular study may not be reproducible for a variety of reasons, but one of the simplest is that we often forget what exactly we did with our data to obtain our results. How did you create scores from your items - through averaging, summation, reverse scoring, or item response theory? Did you center variable two? Which participants did you exclude and according to what criteria? We often come back to our own data and ask ourselves, "Wait, what did I do here?" R can solve these problems because you use scripting to perform your analysis. Scripting means that you write code that will later be executed to manipulate data, perform analysis, and create graphics. In other words, when using R, you write a document that contains everything you did in analyzing your data, in the order in which you did it. In theory, you can make your code and data available to literally anyone in the world, and they can use the code and data to reproduce your results, statistics, and charts without any extra work or guesswork. This ability to share your analyses has been expanded by online repositories such as the Open Science Framework, where you can make your analysis scripts and data from your research projects publicly available.
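
A short script makes this concrete. The sketch below uses simulated, entirely hypothetical survey data (the column names, the 1-5 response scale, and the age cutoff are all assumptions made for illustration) and records each decision, in order, as a line of code: who was excluded, which item was reverse-scored, how the scale score was built, and which variable was centered.

```r
# Hypothetical data: 6 participants answering three 1-5 Likert items.
# A fixed seed makes the simulated responses identical on every run.
set.seed(2017)
survey <- data.frame(
  id    = 1:6,
  item1 = sample(1:5, 6, replace = TRUE),
  item2 = sample(1:5, 6, replace = TRUE),
  item3 = sample(1:5, 6, replace = TRUE),
  age   = c(19, 22, 17, 30, 25, 21)
)

# Step 1: exclude participants under 18 (the exclusion rule is now on record).
survey <- subset(survey, age >= 18)

# Step 2: reverse-score item2 on the 1-5 scale.
survey$item2_r <- 6 - survey$item2

# Step 3: create a scale score by averaging the three items.
survey$score <- rowMeans(survey[, c("item1", "item2_r", "item3")])

# Step 4: mean-center age.
survey$age_c <- survey$age - mean(survey$age)

print(survey)
```

Anyone who runs this file gets the same data frame, and anyone who reads it can see exactly what was done and in what order.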

A final reason why you should useR is that R is increasingly being used as an industry standard in the field of data analysis, also known as "data science." Many companies (e.g., Facebook, Merck, Pfizer) that hire psychology PhDs hire candidates who have both solid statistics and programming skills. Learning R makes you a more attractive candidate when applying for non-academic positions, and teaching R opens up more career options for your students.

How to actually become a useR

You might be thinking, "R sounds great, but I have absolutely no programming experience. How can I even get started with R?" Fear not! Here are some concrete tips to help you on your way to becoming an expert useR:

Install R and RStudio. The first step to becoming a useR is installing the right software on your computer. In the old days (technically, pre-2012), the learning curve for R was incredibly steep because the only graphical window you could work with was a big blank white console - the kind of blank slate that fills every psychologist's heart with fear. Some really great engineers decided that this was terribly inefficient and developed a graphical user interface (GUI) called RStudio. This made R more user-friendly even for people without programming knowledge. We strongly recommend that you install RStudio in addition to R as it will make your life exponentially easier.

Learn the basics. There are some great tutorials freely available on the internet for getting started with R. We searched far and wide (all over the internet) and found a handful of useful resources, such as "Learning Statistics with R" by Dan Navarro and "YaRrr: A Pirate's Guide to R" by Nathaniel D. Phillips (see page 22 for full article). You can even learn R with accompanying cat GIFs. All of these tutorials can be found in our extensive list of R resources available online.

Explore the advanced techniques. At this point, how you use R depends on your research program and your own teaching needs. In our resource list, we have pointed out some packages that we use regularly, and we have included some packages that are useful for advanced statistical and graphical techniques. Start exploring these packages and dive into topics and tools that sound interesting to you. After a while you will come across new packages on your own. Keep an eye on the R-Bloggers.com website to stay up to date on new trends (such as the new package fivethirtyeight from Andrew Flowers, the quantitative editor of FiveThirtyEight.com). The more you use R, the more you'll get out of it. If you know the R language well enough, you can even write your own functions and packages and make them available to the public for general use.
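
Writing your own function is less daunting than it sounds. As a minimal sketch, the helper below (standard_error is a hypothetical name we made up, not a function from any package) computes the standard error of the mean and quietly drops missing values first.

```r
# Hypothetical helper: standard error of the mean, ignoring missing values.
standard_error <- function(x) {
  x <- x[!is.na(x)]          # drop missing observations
  sd(x) / sqrt(length(x))    # sample SD divided by the square root of n
}

# Illustrative scores with one missing value.
scores <- c(4, 8, 6, NA, 2)
print(standard_error(scores))
```

Once a function like this lives in a script, it can be reused across projects, and a collection of such functions is the starting point for a package.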

We hope this brief introduction has given you the tools and momentum to start using R for your analysis. R is an incredibly flexible and complex research tool, but once you master it, you can do (almost) anything.
