For loop: Run the same code multiple times, for different values of a variable.
Loop = the whole thing, one loop of many iterations.
Iteration = one run through of the loop.
for ( index_variable in all_values ) {
  # the CODE BLOCK goes here, within the curly braces { }
  # index_variable will iteratively take each value of all_values
  # here you write the code that you want to run in one iteration, with whatever the current value of index_variable is
  code code code
  more code
  code code
}
for (i in 1:5) {
# i is the index variable
print(i)
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
For the first iteration, i = 1:
i = 1
print(1)
## [1] 1
Second iteration, i=2:
i=2
print(2)
## [1] 2
…until the fifth and last iteration, i=5:
i=5
print(5)
## [1] 5
cat() instead of print()…
all_my_favourite_things = c("bikes", "coffee", "brains")
for( one_thing in all_my_favourite_things ) {
cat("\nI love", one_thing)
}
##
## I love bikes
## I love coffee
## I love brains
for( one_thing in all_my_favourite_things ) {
if ( one_thing == all_my_favourite_things[ length(all_my_favourite_things) ] ) {
cat("\nFinally, I also love", one_thing)
} else {
cat("\nI love", one_thing)
}
}
##
## I love bikes
## I love coffee
## Finally, I also love brains
# we have to initialize an empty vector, so it exists in the environment. Otherwise, R will not know where to store the output of the iteration!
parent_vector = NULL
all_values = c(10, 12, 28, 34)
for (i in all_values){
new_value = i*2 # run the operation on the current value of i
parent_vector = c(parent_vector, new_value) # add the new value to the end of the parent_vector
cat("\nThis is the parent vector:", parent_vector) # print something out to yourself
}
##
## This is the parent vector: 20
## This is the parent vector: 20 24
## This is the parent vector: 20 24 56
## This is the parent vector: 20 24 56 68
Use a loop when you need to run the same code on many different people/samples/conditions/etc., and you find yourself copy-pasting the same code over and over again.
Loops can take a long time to run, and lots of people caution against using them. Yes, if you have a massive data set (1000s of observations of 100s of variables), you will want to find ways to do things using “vectorized” code or the apply family (sapply(), lapply(), etc.)…
However, in my experience, I use loops as needed. They are human-readable and logical to me, and I’m okay waiting ~2 min if needed for my loop to run.
“If the loop isn’t the bottleneck, it’s almost always more readable that way to me, so I do it.” - Cory on Stack Overflow
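As a rough illustration of the trade-off, here is the doubling example from earlier written three ways: as a loop, as vectorized arithmetic, and with sapply(). This is only a sketch; the names doubled_loop, doubled_vectorized and doubled_sapply are made up for this example.

```r
all_values = c(10, 12, 28, 34)

# 1. the loop version: build up the result one iteration at a time
doubled_loop = NULL
for (i in all_values) {
  doubled_loop = c(doubled_loop, i * 2)
}

# 2. vectorized arithmetic: R multiplies every element at once
doubled_vectorized = all_values * 2

# 3. sapply() applies a function to each element and returns a vector
doubled_sapply = sapply(all_values, function(x) x * 2)

# all three approaches give the same result
identical(doubled_loop, doubled_vectorized)
identical(doubled_loop, doubled_sapply)
```

For four values the speed difference is irrelevant, so pick whichever version you find easiest to read.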
Before putting any code in the loop, make sure you are looping through your variable correctly. You can do this by first just printing out the value at each iteration, with no other code, e.g. for(i in 1:5) { print(i) }
When building, test the internal code on just one iteration, without running the whole loop. You can do this by assigning i=1 in your environment, and running all the code within the loop line-by-line on i=1. If this works, then you can let the loop go on all values of i (i.e. i=1:5)!
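A minimal sketch of these two checks, reusing the doubling example from earlier. Here the index variable takes the values themselves, so the "first iteration" is set by hand with all_values[1] rather than i=1.

```r
all_values = c(10, 12, 28, 34)

# step 1: check the looping itself, with nothing but print() inside
for (i in all_values) {
  print(i)
}

# step 2: set i by hand to its first value, then run the body line-by-line
i = all_values[1]
new_value = i * 2
print(new_value)
```

Once the body behaves as expected for this one value, wrap it back inside the for loop and run it on everything.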
Within the loop, have code that prints out messages to yourself. This is so you know where the loop is at while it’s going. If it has an error, these messages can also help you notice what step is causing the error.
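One possible way to write such messages, sketched on the doubling example. Looping over seq_along(all_values) instead of the values themselves is a choice made for this example, so that the message can report the iteration number; n is a made-up name.

```r
all_values = c(10, 12, 28, 34)
parent_vector = NULL

for (n in seq_along(all_values)) {
  i = all_values[n]
  # message at the start of each iteration: where are we?
  cat("\nStarting iteration", n, "of", length(all_values), "with i =", i)

  new_value = i * 2
  # message after each step: if the loop errors, the last message shows how far it got
  cat("\n  doubling done, new_value =", new_value)

  parent_vector = c(parent_vector, new_value)
}
```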
Add in some conditionals within the loop as error detection… for example, if you know something should be length == 5, then add an if statement that checks for length != 5. Inside it, break will stop the loop, or next will move to the next iteration.
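A small sketch of this kind of check. The list all_samples and the names one_sample and sample_means are hypothetical, invented for this example; every element is supposed to have length 5, and one of them does not.

```r
# hypothetical input: each element should have length 5
all_samples = list(a = 1:5, b = 1:3, c = 6:10)
sample_means = NULL

for (sample_name in names(all_samples)) {
  one_sample = all_samples[[sample_name]]

  if (length(one_sample) != 5) {
    cat("\nSkipping", sample_name, "because its length is", length(one_sample))
    next   # move on to the next iteration; write break here instead to stop the whole loop
  }

  # this line only runs for samples that passed the check
  sample_means = c(sample_means, mean(one_sample))
}
```

With next, sample b is skipped and the loop carries on to c. With break, the loop would stop at b and c would never be processed.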
Notice how when you run the loop, things will save to your environment… with each iteration it overwrites these values, so what you see in your environment after the loop has finished is only the LAST iteration.
How do you save things within the loop? CONCATENATE WITH A PARENT VECTOR/DATA FRAME so you can save the results from each iteration.
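The parent vector version appears earlier; the same idea with a parent data frame might look like the sketch below. The names one_row and parent_df and the column names are made up for this example.

```r
all_values = c(10, 12, 28, 34)
parent_df = NULL   # initialize empty, same idea as parent_vector

for (i in all_values) {
  # a one-row data frame holding the results of this iteration
  one_row = data.frame(value = i, doubled = i * 2)
  # add it to the bottom of the parent data frame
  parent_df = rbind(parent_df, one_row)
}

one_row     # only the LAST iteration is left in this variable
parent_df   # one row per iteration
```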
You can SAVE files/plots to a directory within each loop, using commands such as write.csv() or ggsave(). Remember, you’ll have to create a file_name variable so you can save it with an informative name for each iteration…
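A hedged sketch of saving one file per iteration with write.csv(). The output directory is tempdir() only so that the example runs anywhere, and the "result_" file name prefix and the contents of one_result are assumptions made for illustration.

```r
all_my_favourite_things = c("bikes", "coffee", "brains")
output_dir = tempdir()   # assumption: swap in your own directory

for (one_thing in all_my_favourite_things) {
  # whatever result this iteration produces (made up for this example)
  one_result = data.frame(thing = one_thing, message = paste("I love", one_thing))

  # build an informative file name for this iteration, e.g. "result_bikes.csv"
  file_name = file.path(output_dir, paste0("result_", one_thing, ".csv"))
  write.csv(one_result, file_name, row.names = FALSE)
  cat("\nSaved", file_name)
}
```

Because file_name changes with one_thing, each iteration writes its own file instead of overwriting the previous one.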
Loop through the following character vector, calculate the number of characters in each word, and print out the following sentence: "There are __ characters in the word __."
all_words = c("bikes", "biology", "coffee", "serendipity")
Loop through the following numeric vector. If it is an even number, store it in an “even number” parent vector. If it is an odd number, store it in an “odd number” parent vector. If it is NA, move on to the next iteration. ONLY print a message to yourself showing you what iteration you’re on, but nothing else.
all_nums = c(10,12,28,34,NA,NA,11,11)