Control flow

## Prologue

As we discussed in Intro to Coding, programming is the process of writing sequential steps, which are executed by your machine's interpreter, to achieve some desired outcome. Hence programming is all about controlling what happens with your inputs, depending on what those inputs are.

Let's say you need code to tell whether a numeric input is even or odd. To do that, your code has to divide the number by 2. If there is no remainder, it should tell you that the input is even, and if there is a remainder, it should tell you that the input is odd. Your code has to execute the same steps in the same order for every input it ever receives. So, the idea is that your code has to be able to decide how to proceed depending on the input it has, i.e., it has to be able to test whether there is a remainder or not, and depending on the result of that test it has to declare the input odd or even. And you, the programmer, have to design your code, with the detailed recipe, so that the code can do its job on its own, as long as it receives the type of input it is supposed to receive.

This is the core idea of programming, and the tools of a programming language that enable users to design the 'flow' of the algorithm are generally called control flow. Control flow has two main families: conditional statements and loops. We shall discuss them in detail on this page.

## Conditional statements

Conditional statements are the statements that dictate the conditions under which a chunk of code is run. In R, as in many other languages such as Python, C and Matlab, conditional statements consist of `if` and `else`. We shall discuss both in the following sections.

### Stand-alone `if` statements

Okay, so first the syntax, then everything else.
This is what a simple (or stand-alone) `if` statement looks like:

```r
if (condition) {
  # Some code here
  # The code within the curly brackets is
  # executed only if the condition is met
}
```

When you are writing an `if` statement, type `if () {}` first. After that, fill in the condition in the round brackets immediately after the `if` keyword, and the executable code within the curly brackets after the closing round bracket. In our experience, doing this has some advantages. They should become clear by the end of this page, and we shall also discuss the benefits of this practice explicitly after a few sections.

Let's take a look at a simple example. Let's say we need our code to add 1 to a numeric input if it is below 10. If it is equal to or above 10, the code should not do anything. Let's try to write this code. To write it, we need a way for our R code to test whether one numeric variable is larger than another. For that, we have to be familiar with the relational operators in R.

#### Relational operators

Relational operators compare two values and determine whether a comparison expression is true or false. Let's say we set `a <- 3`. Then a comparison expression would be `a < 1`, where `<` is a relational operator. The output of such a comparison is a logical: `TRUE` or `FALSE`. The relational operators in R are listed in the following table:

| Relational operator | Description |
| ---- | ---- |
| `==` | Equal to |
| `!=` | Not equal to |
| `<` | Less than |
| `>` | Greater than |
| `<=` | Less than or equal to |
| `>=` | Greater than or equal to |

Let's test some of these.

```r
4 > 9
```

```
> [1] FALSE
```

So, the statement that 4 is greater than 9 is false. Another example:

```r
j <- 4 < 9
print(j)
```

```
> [1] TRUE
```

The statement that 4 is less than 9 is true, and the output logical can be assigned to another variable. Similar operations can also be run on vectors.
Check the next example:

```r
v <- c(4, 6, 8)
u <- 5
v < u
```

```
> [1] TRUE FALSE FALSE
```

The comparison can also be done element-wise between two vectors:

```r
v <- c(4, 6, 8)
u <- c(5, 7, 9)
v < u
```

```
> [1] TRUE TRUE TRUE
```

Make sure you run such an operation on two vectors of the same length. If the length of the longer vector is not a multiple of the length of the shorter one, you will receive a warning.

```r
v <- c(4, 2, 8, 7, 5)
u <- c(5, 1, 9, 3)
v < u
```

```
> [1] TRUE FALSE TRUE FALSE FALSE
Warning message:
In v < u : longer object length is not a multiple of shorter object length
```

The length of the output logical vector will be the same as that of the longer vector in the comparison, i.e., `v` in the above example. The question is, if `u` has only four elements, what is the fifth element of `v` being compared with? It is being compared with the first element of `u`, because R recycles the shorter vector from its beginning. If you change the relational operator in the comparison from `<` to `<=`, you will be able to check how the fifth element is compared:

```r
v <- c(4, 2, 8, 7, 5)
u <- c(5, 1, 9, 3)
v <= u
```

```
> [1] TRUE FALSE TRUE FALSE TRUE
Warning message:
In v <= u : longer object length is not a multiple of shorter object length
```

Do you see it? The fifth element in the logical output is `TRUE` now, because the fifth element of `v` and the first element of `u` are equal. If the length of the longer vector is a multiple of the length of the shorter one, you will stop getting the warning:

```r
v <- c(4, 2, 8, 7, 5, 0, 10, 3)
u <- c(5, 1, 9, 3)
v <= u
```

```
> [1] TRUE FALSE TRUE FALSE TRUE TRUE FALSE TRUE
```

Do you now see what is going on here? The bottom line is that when you are comparing vectors of unequal lengths, you have to be very careful and you should know what you are doing.

#### `if` statement example

Equipped with the relational operators, we can now come back to the example we planned in the previous section. We wanted a code snippet that adds 1 to our numeric input if it is below 10. Let's build it now. Let's say our input numeric variable is `k`.
```r
k <- 5
```

Now, the `if` statement. First the basic syntax:

```
if () {}
```

Then we put the comparison statement in the round brackets:

```
if (k < 10) {}
```

Then we add the executable code in the curly brackets, and our `if` statement is ready.

```r
if (k < 10) {
  k <- k + 1
}
print(k)
```

So, what do we see when we run this?

```
> [1] 6
```

`k` had the value of 5. Hence `(k < 10)` is `TRUE`. Hence the code `k <- k + 1` is executed, and `k` now becomes 6. What happens if `k` is 16? Check for yourself!

What if we need to test multiple conditions? Say we need one variable, `k`, to be more than 10 and another variable, `l`, to be less than 33, and only then do we execute our code, which adds 1 to both. We can achieve this in two ways:

- Nested conditional statements
- Using logical operators

We shall discuss these next.

#### Nested conditional statements

Nesting is a general idea in writing code. It means you are putting one statement within another statement of a similar type. Let's say you have one `if` statement, shown here with the basic syntax:

```
if () {}
```

Then we add another `if` statement within it:

```
if () {
  if () {
    # Code is executed if both if conditions are true
  }
}
```

Now, let's go back to our planned example, where we wanted two conditions: `k > 10` and `l < 33`:

```r
k <- 15
l <- 23
if (k > 10) {
  if (l < 33) {
    k <- k + 1
    l <- l + 1
  }
}
print(k)
print(l)
```

```
> [1] 16
> [1] 24
```

What if `k <- 9`? Try it yourself!

There are smarter ways of combining two conditions, using a class of operators known as logical operators. We discuss those next.

#### Logical operators

Logical operators are operators in R that work on logical inputs and give logical outputs. These are classically known as Boolean operators in computer science parlance.
Check out the following table for a description of the different logical operators in R:

| Logical operator | Name | Description |
| ---- | ---- | ---- |
| `&` | AND | Operates element-wise on two logicals `(X & Y)`, where `X` and `Y` can be single logicals or logical vectors. Where both elements are `TRUE`, the output is `TRUE`; otherwise the output is `FALSE`. |
| `\|` | OR | Operates element-wise on two logicals `(X \| Y)`, where `X` and `Y` can be single logicals or logical vectors. Where at least one element is `TRUE`, the output is `TRUE`; otherwise the output is `FALSE`. |
| `!` | NOT | Operates on one logical `(!X)` and flips `TRUE` to `FALSE` and `FALSE` to `TRUE`. |
| `&&` | AND | Operates on two logicals `(X && Y)`, where `X` and `Y` are single-element logical inputs, and returns a single logical. |
| `\|\|` | OR | Operates on two logicals `(X \|\| Y)`, where `X` and `Y` are single-element logical inputs, and returns a single logical. |

Why are there different operators for single-element logical inputs and multiple-element logical vectors? Let's clear that up first.

With `&`:

```r
X <- TRUE
Y <- TRUE
Z <- X & Y
print(Z)
```

```
> [1] TRUE
```

With `&&`:

```r
X <- TRUE
Y <- TRUE
Z <- X && Y
print(Z)
```

```
> [1] TRUE
```

Wait, what?! `&` and `&&` both worked on single-element logical inputs and gave the same output. What is the point of having two operators then? Let's try them on multiple-element logical vectors.

With `&`:

```r
X <- c(TRUE, FALSE, TRUE)
Y <- c(TRUE, TRUE, FALSE)
Z <- X & Y
print(Z)
```

```
> [1] TRUE FALSE FALSE
```

With `&&` (output from an older version of R; since R 4.3.0 this call is an error, because `&&` accepts only single-element inputs):

```r
X <- c(TRUE, FALSE, TRUE)
Y <- c(TRUE, TRUE, FALSE)
Z <- X && Y
print(Z)
```

```
> [1] TRUE
```

So, the difference becomes clearer when we try multiple-element logical vectors. `&` operated on each pair of elements of the two vectors and returned one output per pair, but `&&`, in the older R version shown here, operated only on the first element of each vector. If `&` can work on both single-element and multiple-element vectors, why have `&&` at all? Let's try these in the context of `if` statements; that might clear things up a bit more.
`if` statement with `&&` (output from an older version of R, in which `&&` used only the first element of each vector):

```r
X <- c(TRUE, FALSE, TRUE)
Y <- c(TRUE, TRUE, FALSE)
if (X && Y) {
  print("X and Y are true")
}
```

```
> [1] "X and Y are true"
```

`if` statement with `&`:

```r
X <- c(TRUE, FALSE, TRUE)
Y <- c(TRUE, TRUE, FALSE)
if (X & Y) {
  print("X and Y are true")
}
```

```
> [1] "X and Y are true"
Warning message:
In if (X & Y) { :
  the condition has length > 1 and only the first element will be used
```

Here, you receive a warning, which says that the condition has a length greater than one. What does it mean? `if` statements are meant for one-element logicals, either `TRUE` or `FALSE`. If an `if` statement receives a vector of logicals, older versions of R evaluate the first element and ignore the rest, while posting a warning for the user. Since R 4.2.0 a condition of length greater than one is an error rather than a warning, and since R 4.3.0 `&&` and `||` also raise an error when given inputs with more than one element.

In older versions of R, `&&` could be useful for rare cases where you have a vector of logicals but know that only the first element matters; in current versions you would pick out that element yourself, for example `X[1] && Y[1]`. For element-wise work on vectors you should be using `&`, but knowing what `&&` is helps, because it always returns the single logical that an `if` statement expects. Note that `!` does not have a `!!` form; `!!` simply applies the NOT operation twice. When using the NOT operator within an `if` statement, stay alert.

Now that we know what logical operators are, we can come back to our planned example, where we wanted two conditions: `k > 10` and `l < 33`:

```r
k <- 15
l <- 23
if (k > 10 & l < 33) {
  k <- k + 1
  l <- l + 1
}
print(k)
print(l)
```

```
> [1] 16
> [1] 24
```

You can imagine now that a condition can be a mixture of arithmetic, relational and logical operators. For instance, in the last code chunk, we have both relational and logical operators. It can get much more complex than that, and the `if` statement in R allows such flexibility. To write complex conditions we need to know the order of precedence of arithmetic, relational and logical operators in R, and that is what we discuss next.
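As a small illustration of a condition that mixes arithmetic, relational and logical operators, consider the sketch below. The starting value of `k` and the halving step are illustrative assumptions, not part of the earlier examples. The parentheses make the intended grouping explicit, so the result does not depend on operator precedence.

```r
k <- 14

# Arithmetic (%%), relational (==, >) and logical (&) operators in one condition.
# %% gives the remainder of a division, so k %% 2 == 0 tests whether k is even.
if ((k %% 2 == 0) & (k > 10)) {
  # Reached only when k is even AND greater than 10
  k <- k / 2
}

print(k)  # prints [1] 7
```

Try `k <- 8` or `k <- 15`: in both cases one of the two tests is `FALSE`, so `k` is left unchanged.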
#### Order of precedence

When your conditional expression has a mixture of arithmetic, logical and relational operators, the output depends on the order of precedence of the operations. Here we provide a chart of the precedence of operators in R, where 1 has the highest priority:

| Precedence | Operation |
| ---- | ---- |
| 1 | Parentheses |
| 2 | Arithmetic operators (`^`, then unary minus, then `*` and `/`, then `+` and `-`) |
| 3 | Relational operators (`<`, `>`, `<=`, `>=`, `==`, `!=`) |
| 4 | `!` (NOT) |
| 5 | `&` and `&&` (AND) |
| 6 | `\|` and `\|\|` (OR) |

Provide examples of complex conditional statements.

#### Logical functions in R

Apart from the logical operators, there is an army of logical functions in R. Discuss `is.numeric`, `is.character`, `is.na`, `any`, `all` etc. Show examples with if statements.

#### Relational operators for indexing vectors

### `if-else` statements

`if` statements do something only when the condition is `TRUE`. What if you need something to execute when the condition is `FALSE`? Of course, we can stack up two `if` statements and reverse the condition in the second one. But there exists a cleaner way of doing that, and it is called an `if-else` statement. This is what the statement looks like:

```
if (condition) {
  # Execute code if condition is true
} else {
  # Execute code if condition is false
}
```

Let's pick up one of our previous examples of `if` statements and extend it to `if-else`:

```r
k <- 10
if (k < 10) {
  k <- k + 1
} else {
  k <- k - 1
}
print(k)
```

So, if `k` is less than 10 we add 1 to it, but if `k` is greater than or equal to 10 we subtract 1 from it.

```
> [1] 9
```

You see what happened? `k` is not less than 10, so the code under the `else` statement was executed.

The `else` statement can be written without the curly braces. R supports this syntax to make code more concise, but we do not generally recommend it, since we believe it reduces the readability of the code.

### Stacks and nests

You can stack your `if` and `else` statements, or nest statements within each other, to achieve complex decision processes.
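One possible sketch of stacking and nesting is shown below. It assumes a hypothetical numeric input `x` that we want to describe by its sign and, for positive values, by whether it is even or odd; the variable names `x` and `label` are illustrative only.

```r
x <- 7

# Stacked: if / else if / else are tested in order, and only one branch runs.
if (x < 0) {
  label <- "negative"
} else if (x == 0) {
  label <- "zero"
} else {
  # Nested: this if-else is reached only when x is positive.
  if (x %% 2 == 0) {
    label <- "positive and even"
  } else {
    label <- "positive and odd"
  }
}

print(label)  # prints [1] "positive and odd"
```

Changing `x` to `-3`, `0` or `8` walks through the other three branches.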
```r
# generate an example of nesting and stacking here
```

### `ifelse` function

### The `switch` function

## Loops

Another major tool for controlling the flow of your algorithm, apart from conditional statements, is the loop. Loops tell your interpreter to execute some code repeatedly. Each consecutive repeat is called a 'pass'. Each pass updates some variables and/or collects freshly calculated values in some data structure. R gives us two main types of loops: `for` loops and `while` loops. We discuss those in detail now. There are a few explicit commands to control these loops: `break` and `next`. There also exists another loop called `repeat`, which is used with the `break` command. We discuss those at the end of this page.

### `for` loops

### `while` loops

### `break` command

### `next` command

### `repeat` loops
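As a minimal sketch of how `repeat` is paired with `break`, as mentioned in the overview of loops above, the code below keeps running passes until a stopping condition is met. The variable name `counter` and the stopping value of 3 are illustrative assumptions.

```r
counter <- 0

repeat {
  # Each pass increases the counter by one
  counter <- counter + 1

  # Without this break, the repeat loop would never stop
  if (counter >= 3) {
    break
  }
}

print(counter)  # prints [1] 3
```

A `repeat` loop has no condition of its own, so the `if` statement containing `break` is what ends it.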