############################ Brisk introduction to Python ############################ This is an introduction designed for those of us who already know a `dynamic programming language `_ fairly well. MATLAB and the R language are `examples of dynamic programming languages `_. `Kiyotaka Nemoto `_ kindly made a `Japanese translation of this page `_. .. testsetup:: import os .. nbplot:: :include-source: false from __future__ import print_function, division ********************* How to read this page ********************* Read this page in one shot from beginning to end. We go through all the basic Python data types, stopping from time to time to talk about relevant features of the Python language. While you are reading, I suggest you open an IPython_ console, and type or copy / paste the code fragments, to run them. To copy / paste, click the ``>>>`` symbol at the right of each code cell to remove the leading characters and output, before selecting the cell contents. By the end of the page you should have an idea of the Python landscape to orient you, as you start to learn the language. ******* Numbers ******* There are two main types of numbers in Python: integer and floating point. In Python, an integer is an *object* of type ``int``, and a float is an object of type ``float``. .. nbplot:: >>> a = 99 >>> type(a) <class 'int'> >>> b = 99.0 >>> type(b) <class 'float'> You can create ints and floats by using ``int`` and ``float`` like this: .. nbplot:: >>> float('1') 1.0 >>> float(1) 1.0 >>> int('1') 1 >>> int(1) 1 Using ``+``, ``-``, ``*`` or ``/`` on a mix of floats and ints gives a float: .. nbplot:: >>> a + b 198.0 >>> a * b 9801.0 Dividing an int by an int also gives a float |--| but this is only true by default for Python >= 3 (see [#py2-division]_): .. nbplot:: >>> 1 / 2 0.5 If you only want the integer part of the division, use ``//``: .. nbplot:: >>> 1 // 2 0 >>> 1.0 // 2.0 0.0 Python has a built-in function called ``round``. In Python 3, ``round`` sends a value exactly halfway between two integers to the nearest even integer: .. 
nbplot:: >>> round(5.0 / 2.0) 2 The ``%`` operator on numbers gives you the remainder of integer division (also known as the modulus): .. nbplot:: >>> 5 % 2 1 .. nbplot:: >>> 5.0 % 2.0 1.0 ************** True and False ************** ``True`` and ``False`` are special objects in Python. They are of type ``bool`` (for Boolean). .. nbplot:: >>> type(True) <class 'bool'> >>> type(False) <class 'bool'> >>> True == False False >>> True == True True >>> False == False True You can use the logical operators ``and``, ``or`` and ``not`` to express logic about Boolean values: .. nbplot:: >>> True and True True >>> True and False False >>> True or False True >>> False or False False >>> not True False >>> True and not False True **** None **** ``None`` is also a special object in Python. By convention, Python often uses ``None`` to mean that no valid value resulted from an operation, or to signal that we don't have a value for a parameter. .. nbplot:: >>> type(None) <class 'NoneType'> Unlike most other values in Python, the default display output from ``None`` is nothing: .. nbplot:: >>> None ****** Equals ****** As in MATLAB and R, ``=`` is for assignment, ``==`` is for testing equality. .. nbplot:: >>> a = 1 >>> a 1 >>> a == 1 True Like R, Python uses ``!=`` for testing that objects are *not* equal. This is different from MATLAB, which uses ``~=``: .. nbplot:: >>> a != 1 False *************************************** "If" statements, blocks and indentation *************************************** A conditional statement in Python looks like this: .. nbplot:: >>> my_var = 10 >>> if my_var == 10: ... print("The conditional is True!") ... print("my_var does equal 10") ... The conditional is True! my_var does equal 10 The first line of the conditional statement, which contains the conditional test, ends in a colon. Call this the *if test*. There follow some lines *indented* relative to the ``if`` test. Call these indented lines the *if block*. 
Python executes the statements in the ``if`` block only when the ``if`` test evaluates to True. For example, in this case, the ``if`` test evaluates to False, and the block does not execute: .. nbplot:: >>> my_var = 11 >>> # This time the conditional evaluates to False >>> if my_var == 10: # the "if test" ... # The indented lines are the "if block" ... print("The conditional is True!") ... print("my_var does equal 10") ... The first line that returns to the same level of indentation as the ``if`` test line, closes the ``if`` block. Unless the ``if`` block has a further indented block (for example, another ``if`` block), then all the lines in the block must have the same indentation. See note [#if-statement-equivalents]_ for equivalent ``if`` statements in R and MATLAB. The ``if`` block may be followed by another block where the conditional is ``else:``. This block will only run if the initial conditional test evaluates to False. .. nbplot:: >>> my_var = 11 >>> if my_var == 10: ... print("The conditional is True!") ... print("my_var does equal 10") ... else: ... print("The conditional is False!") ... print("my_var does not equal 10") ... The conditional is False! my_var does not equal 10 There may be other conditional tests, with associated conditional blocks. These tests use the contraction ``elif conditional_test``, where ``elif`` is a contraction for ``else if``: .. nbplot:: >>> my_var = 12 >>> if my_var == 10: ... print("The conditional is True!") ... print("my_var does equal 10") ... elif my_var == 11: ... print("The second conditional is True!") ... print("my_var does equal 11") ... elif my_var == 12: ... print("The third conditional is True!") ... print("my_var does equal 12") ... else: ... print("All conditionals are False!") ... print("my_var does not equal 10, 11 or 12") ... The third conditional is True! 
my_var does equal 12 ****************** "While" statements ****************** ``while`` statements are another example with an initial test followed by an indented block. Here's an example where we find the largest `Fibonacci number `_ less than 1000: .. nbplot:: >>> last_but_1 = 0 >>> fibonacci = 1 >>> while fibonacci < 1000: ... last_but_2 = last_but_1 ... last_but_1 = fibonacci ... fibonacci = last_but_2 + last_but_1 ... >>> print("Largest Fibonacci < 1000 is", last_but_1) Largest Fibonacci < 1000 is 987 Notice the initial *while test*: ``while fibonacci < 1000:``, followed by the indented *while block*. Unlike an ``if`` block, which runs at most once, Python keeps running the statements in the ``while`` block until the conditional in the ``while`` test evaluates to False. ***** Lists ***** Make a list like this: .. nbplot:: >>> my_list = [9, 4, 7, 0, 8] >>> my_list [9, 4, 7, 0, 8] .. nbplot:: >>> type(my_list) <class 'list'> A list element can be any type of object, including another list: .. nbplot:: >>> mixed_list = [9, 3.0, True, my_list] >>> mixed_list [9, 3.0, True, [9, 4, 7, 0, 8]] .. nbplot:: >>> type(mixed_list) <class 'list'> A Python list is like a cell array in MATLAB, or a ``list`` in R. .. _for-loops: "for" loops and iteration ========================= We can *iterate* over a list. To iterate means to fetch one element after another from some container, such as a list. We can use a ``for`` loop to iterate over a list: .. nbplot:: >>> for e in my_list: ... print(e) ... 9 4 7 0 8 The ``for`` loop has the same form as ``if`` statements and ``while`` loops, with a first line ending in a colon, followed by an indented block. The first line in the ``for`` loop is of form: ``for loop_variable in container:``. The *container* is the container from which we will fetch the elements. At each iteration of the ``for`` loop, Python gets a new element from the container to put into the *loop variable*. For each element in the container, Python executes the *for block*. 
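
As a further small illustration of the loop variable and the *for block*, the sketch below adds up the elements of a list and counts the elements greater than 5. The names ``total`` and ``n_large`` are illustrative choices for this example, not Python built-ins:

.. code-block:: python

    my_list = [9, 4, 7, 0, 8]
    total = 0    # running sum of the elements (illustrative name)
    n_large = 0  # count of elements greater than 5 (illustrative name)
    for e in my_list:
        # On each pass through the for block, e is the next element
        total = total + e
        if e > 5:
            # An if block nested inside the for block is indented further
            n_large = n_large + 1
    print(total)    # 28
    print(n_large)  # 3

The ``if`` block sits inside the ``for`` block, so it has one more level of indentation than the other lines in the ``for`` block.
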
Note [#for-loop-equivalents]_ shows equivalent ``for`` loops in Python, R and MATLAB. See :ref:`ranges` for a common way of writing a ``for`` loop that iterates over a sequence of integers. Lists are sequences =================== A `sequence `_ is a category of Python objects that have a defined element order, have a length, are iterable, can be indexed with integers, and *sliced* (see below). If object ``s`` is a sequence, then: * ``s`` has a length that can be found with ``len(s)``; * we can iterate over the elements in ``s`` with ``for element in s: # do something with element``; * we can return the element at position ``n`` with ``s[n]``; * we can get another sequence by *slicing* ``s``. For example, ``s[0:n]`` will give a new sequence containing the first ``n`` elements of ``s``. .. nbplot:: :include-source: false >>> # We check that our list is an instance of the type Sequence. >>> import collections.abc >>> isinstance(my_list, collections.abc.Sequence) True .. nbplot:: >>> # Has a length >>> len(my_list) 5 .. nbplot:: >>> # Is iterable >>> for e in my_list: ... print(e) 9 4 7 0 8 .. nbplot:: >>> # Can be indexed >>> my_list[1] 4 >>> # Can be sliced >>> my_list[0:2] [9, 4] Python indices are 0-based ========================== Indices for Python sequences start at 0. For Python, the first element is at index 0, the second element is at index 1, and so on: .. nbplot:: >>> my_list[0] 9 >>> my_list[1] 4 Negative indices ================ Negative numbers as indices count back from the end of the list. For example, use index ``-1`` to return the last element in the list: .. nbplot:: >>> my_list [9, 4, 7, 0, 8] >>> my_list[-1] 8 The -1 index above is therefore equivalent to: .. nbplot:: >>> my_list[len(my_list) - 1] 8 Here is the third from last element: .. nbplot:: >>> my_list[-3] 7 .. _everything-is-a-pointer: Every Python variable is a pointer ================================== In Python, variable names point to the memory location of an object. 
Therefore, Python variables can be called *pointers*. If you are running standard Python (CPython), you can see the memory location that a variable points to with the ``id()`` function. The following returns a long integer giving the memory location on your computer: .. nbplot:: >>> id(my_list) # doctest: +SKIP 4467820488 When you do ``another_variable = a_variable``, you are telling the name ``another_variable`` to point to the same object as the name ``a_variable``. Both names therefore point to the same memory location: .. nbplot:: >>> another_list = my_list >>> another_list [9, 4, 7, 0, 8] >>> id(another_list) # doctest: +SKIP 4467820488 >>> id(another_list) == id(my_list) True Lists are mutable ================= A list is a *mutable* object. Mutable means that we can change the elements in the list, without creating a new list. .. nbplot:: >>> my_list[1] = 99 >>> my_list [9, 99, 7, 0, 8] Because lists are mutable, you need to keep in mind that :ref:`everything-is-a-pointer`: .. nbplot:: >>> another_list = my_list >>> another_list [9, 99, 7, 0, 8] >>> id(another_list) == id(my_list) True Because ``my_list`` and ``another_list`` point at the same list, when you modify (the object pointed to by) ``my_list``, you also modify the value of ``another_list``: .. nbplot:: >>> my_list[1] = 101 >>> another_list [9, 101, 7, 0, 8] Adding lists ============ Adding two lists with ``+`` returns a new list that is the concatenation of the two lists: .. nbplot:: >>> new_list = my_list + [False, 1, 2] >>> new_list [9, 101, 7, 0, 8, False, 1, 2] Appending and removing elements =============================== You can append elements with the ``append`` method. A method is a function attached to the object. See :ref:`functions` for more on functions in Python. We can see that ``append`` is a method by displaying the value of ``my_list.append``: .. 
nbplot:: >>> my_list.append To call the method, we add parentheses, surrounding any arguments we want to pass into the method. In this case we want to pass in the element to append: .. nbplot:: >>> my_list.append(20) >>> my_list [9, 101, 7, 0, 8, 20] Note that the ``append`` method does *not* return the list, it just changes the list in-place. Python returns ``None`` from the ``append`` method: .. nbplot:: >>> result = my_list.append(42) >>> result == None True This is also true for some other methods that modify the list in-place, such as the ``sort`` method: .. nbplot:: >>> new_list = [10, 1, 3] >>> result = new_list.sort() >>> # Return value is None >>> result == None True >>> # But the original list now in ascending order from sort >>> new_list [1, 3, 10] You can remove elements from the list with the ``pop`` method: .. nbplot:: >>> # Remove and return the last element of the list >>> my_list.pop() 42 >>> my_list [9, 101, 7, 0, 8, 20] >>> # Remove and return the third element of the list >>> my_list.pop(2) 7 >>> my_list [9, 101, 0, 8, 20] .. _slicing: Slicing ======= You can return slices from any sequence, including lists, by putting a slice specifier in square brackets. For example, this returns the first 3 elements of the list: .. nbplot:: >>> my_list[0:3] [9, 101, 0] The first number after the square bracket and before the colon is the *start* index. In this case we start at the first element (element at index 0). The second number, after the colon, is the *stop* index. This is the end index *plus one*. So we return elements at index 0, 1 and 2. That is, elements *up to, but not including* 3. If you omit the first number (the start index) Python assumes 0: .. nbplot:: >>> my_list[:3] [9, 101, 0] If you omit the second number, Python assumes the length of the list as the stop index. .. nbplot:: >>> my_list[2:] [0, 8, 20] >>> my_list[2:len(my_list)] [0, 8, 20] You can omit both numbers, in which case you return all the elements of the list. 
This can be useful if you want to make a new list that contains the same elements as the first: .. nbplot:: >>> another_list = my_list[:] >>> another_list [9, 101, 0, 8, 20] Because this is a new list object, you can change the original list without changing the new list: .. nbplot:: >>> my_list[1] = 999 >>> another_list [9, 101, 0, 8, 20] You can also specify a second colon, and a third number. This third number is the *step size*. For example, to get every second element of the list: .. nbplot:: >>> my_list[0:len(my_list):2] [9, 0, 20] >>> # Length of list assumed as stop index if omitted >>> my_list[0::2] [9, 0, 20] You can use negative numbers for the ``start`` and ``stop`` indices. As for indexing, negative ``start`` and ``stop`` values count back from the end of the list: .. nbplot:: >>> my_list [9, 999, 0, 8, 20] >>> my_list[-4:-2] [999, 0] Negative numbers for the ``step`` count backwards from the ``start`` to the ``stop`` index: .. nbplot:: >>> my_list[4:1:-1] [20, 8, 0] If you have a negative step size, and you don't specify the start index, then the start index defaults to the last element in the list. If you don't specify the stop index, it defaults to one prior to index 0: .. nbplot:: >>> my_list [9, 999, 0, 8, 20] >>> my_list[-1:1:-1] [20, 8, 0] >>> my_list[:1:-1] [20, 8, 0] >>> my_list[-2::-1] [8, 0, 999, 9] One consequence that is worth remembering is that the following idiom gives you a reversed copy of the list: .. nbplot:: >>> my_list[::-1] [20, 8, 0, 999, 9] ****** Tuples ****** Tuples are almost the same as lists, except they are not mutable. That is, you cannot change the elements of a tuple, or change the number of elements. .. nbplot:: >>> my_tuple = (9, 4, 7, 0, 8) >>> my_tuple (9, 4, 7, 0, 8) .. nbplot:: :raises: TypeError >>> my_tuple[1] = 99 Traceback (most recent call last): ... TypeError: 'tuple' object does not support item assignment .. 
nbplot:: :raises: AttributeError >>> # This raises an AttributeError, because tuples have no append method >>> my_tuple.append(20) Traceback (most recent call last): ... AttributeError: 'tuple' object has no attribute 'append' Here's an empty tuple: .. nbplot:: >>> empty_tuple = () >>> empty_tuple () A tuple with two elements: .. nbplot:: >>> two_tuple = (1, 5) >>> two_tuple (1, 5) There is a little complication when making a tuple with one element: .. nbplot:: >>> not_a_tuple = (1) >>> not_a_tuple 1 This is because Python can't tell that you meant this to be a tuple, rather than an expression with parentheses round it: .. nbplot:: >>> not_a_tuple = (1 + 5 + 3) >>> not_a_tuple 9 To tell Python that you mean this to be a length-one tuple, add a comma after the element, and before the closing parenthesis: .. nbplot:: >>> one_tuple = (1,) >>> one_tuple (1,) ******* Strings ******* Make a string like this: .. nbplot:: >>> my_string = 'interesting text' >>> my_string 'interesting text' You can use single quotes or double quotes for your string, the two strings are the same: .. nbplot:: >>> another_string = "interesting text" >>> another_string 'interesting text' >>> my_string == another_string True Convert other objects to strings using ``str``: .. nbplot:: >>> # Convert integer to string >>> str(9) '9' >>> # Convert floating point value to string >>> str(1.2) '1.2' Strings are sequences ===================== Like lists, strings are sequences (have length, can be iterated, can index, can slice). .. nbplot:: >>> # Length >>> len(my_string) 16 >>> # Iterable >>> for c in my_string: ... print(c) i n t e r e s t i n g t e x t >>> # Can index >>> my_string[1] 'n' >>> # Can slice >>> my_string[1:5] 'nter' Strings are immutable ===================== Unlike lists, strings are immutable. You cannot change the characters within a string: .. nbplot:: :raises: TypeError >>> my_string[1] = 'N' Traceback (most recent call last): ... 
TypeError: 'str' object does not support item assignment Adding strings ============== .. nbplot:: >>> my_string + ' with added insight' 'interesting text with added insight' String methods ============== Strings have lots of interesting methods. In IPython, try tab-complete on a string variable name, followed by a period |--| e.g. type ``my_string.``, followed by the tab key. See also the `list of string methods in the Python docs `_. One interesting method is ``replace``. It returns a new string that is a copy of the input, but replacing instances of one string with another: .. nbplot:: >>> another_string = my_string.replace('interesting', 'extraordinary') >>> another_string 'extraordinary text' Notice that the original string has not changed (it's immutable): .. nbplot:: >>> my_string 'interesting text' Use the ``split`` method to break a string into a list of strings. By default, ``split`` will split the string at any white space (spaces, tab characters or line breaks): .. nbplot:: >>> my_string.split() ['interesting', 'text'] Pass a character to ``split`` to split the string at that character: .. nbplot:: >>> another_example = 'one:two:three' >>> another_example.split(":") ['one', 'two', 'three'] The ``strip`` method returns a new string with spaces, tabs and end of line characters removed from the beginning and end of the string: .. nbplot:: >>> # A string with a newline character at the end >>> my_string = ' a string\n' >>> my_string ' a string\n' >>> my_string.strip() 'a string' Inserting values into strings ============================= See: :doc:`string_formatting`. .. _ranges: ***** Range ***** ``range`` in Python 3 returns a *range object*. It is a sequence, and so it is rather like a list [#py2-range]_. When you use ``range`` with one argument, the argument value is the ``stop`` index. For example, to make a ``range`` object generating the numbers from 0 up to *but not including* 5: .. 
nbplot:: >>> my_range = range(5) >>> my_range range(0, 5) You can make a range object into a list by using ``list``: .. nbplot:: >>> list(range(5)) [0, 1, 2, 3, 4] A ``range`` object is a sequence: .. nbplot:: >>> # Has a length >>> len(my_range) 5 >>> # Is iterable >>> for e in my_range: ... print(e) 0 1 2 3 4 >>> # Can be indexed >>> my_range[1] 1 >>> # Can be sliced >>> my_range[0:2] range(0, 2) Set the start element for ``range`` by passing two arguments: .. nbplot:: >>> my_range = range(1, 7) >>> my_range range(1, 7) >>> list(my_range) [1, 2, 3, 4, 5, 6] Set the step size with a third argument: .. nbplot:: >>> my_range = range(1, 7, 2) >>> my_range range(1, 7, 2) >>> list(my_range) [1, 3, 5] One common use of ``range`` is to iterate over a sequence of numbers in a ``for`` loop: .. nbplot:: >>> for i in range(5): ... print(i) ... 0 1 2 3 4 **** Sets **** Sets are collections of unique elements, with no defined order. Python reserves the right to order set elements in any way it chooses: .. nbplot:: >>> # Only unique elements collected in the set >>> my_set = set((5, 3, 1, 3)) >>> my_set # doctest: +SKIP {1, 5, 3} Because there is no defined order, you cannot index into a set: .. nbplot:: :raises: TypeError >>> my_set[1] Traceback (most recent call last): ... TypeError: 'set' object is not subscriptable You can add elements to a set with the ``add`` method: .. nbplot:: >>> my_set.add(10) >>> my_set # doctest: +SKIP {1, 3, 5, 10} Because set elements must be unique, if you add an element already in the set, this does not change the set: .. nbplot:: >>> my_set.add(5) >>> my_set # doctest: +SKIP {1, 3, 5, 10} You can iterate over a set, but the order of elements is arbitrary. You cannot rely on the same order in any two runs of your program: .. nbplot:: >>> # The order of elements is arbitrary >>> for element in my_set: # doctest: +SKIP ... 
print(element) 1 3 5 10 Look at the `methods of the set object `_ for interesting operations such as ``difference``, ``union``, ``intersection`` etc. ************************************* Sets, lists and tuples are containers ************************************* A *container* is a Python object for which you can test an element for membership. So, if an object ``c`` is a container then we can test if an element is in the container with ``true_or_false = element in c``. Be careful |--| the word ``in`` has different meanings in ``for element in c:`` and ``true_or_false = element in c``. With ``for element in c:``, ``in`` is a part of the ``for`` loop syntax. With ``true_or_false = element in c``, ``in`` triggers a test of membership, returning True or False. .. nbplot:: >>> 5 in my_set True >>> 11 in my_set False You can use ``not in`` to test if an element is not in a container: .. nbplot:: >>> 11 not in my_set True Lists and tuples are also containers: .. nbplot:: >>> 9 in [9, 4, 7, 0, 8] True >>> 3 in (1, 3, 5) True ************ Dictionaries ************ A dictionary is a collection of key / value pairs. From Python 3.7, a dictionary keeps its pairs in the order you inserted them; earlier versions of Python give no guarantee about the order. The *key* is something that identifies the element, and the *value* is the value corresponding to the particular key. .. nbplot:: >>> # This is an empty dictionary >>> software = {} Here we insert a new key / value mapping into the dictionary. The key is a string |--| ``'MATLAB'`` |--| and the corresponding value is an integer 50: .. nbplot:: >>> software['MATLAB'] = 50 >>> software {'MATLAB': 50} Now we insert another key / value mapping: .. nbplot:: >>> software['Python'] = 100 >>> software #doctest: +SKIP {'Python': 100, 'MATLAB': 50} Get the value corresponding to a key by indexing the dictionary with the key: .. nbplot:: >>> software['Python'] 100 We can iterate over the keys in the dictionary, but the order of the keys depends on the Python version. 
Before Python 3.7, Python returns the keys in any order it chooses, and we can't rely on the order being the same in any two runs of our program; from Python 3.7, the keys come back in insertion order: .. nbplot:: >>> for key in software.keys(): #doctest: +SKIP ... print(key) MATLAB Python We can also iterate over the values, with the same caveat about order: .. nbplot:: >>> for value in software.values(): #doctest: +SKIP ... print(value) ... 50 100 We can use the ``items`` method to iterate over the key / value pairs. In this case each element is a tuple of length two, where the first element is the key and the second element is the value: .. nbplot:: >>> for key_value in software.items(): #doctest: +SKIP ... print(key_value) ('MATLAB', 50) ('Python', 100) One way to construct a dictionary is with curly brackets, using colons to separate the key and value, and commas to separate the pairs: .. nbplot:: >>> software = {'MATLAB': 50, 'Python': 100} >>> software #doctest: +SKIP {'Python': 100, 'MATLAB': 50} Keys must be unique. A later key / value pair will overwrite an earlier key / value pair that had the same key: .. nbplot:: >>> software = {'MATLAB': 50, 'Python': 100, 'MATLAB': 45} >>> software # doctest: +SKIP {'Python': 100, 'MATLAB': 45} Dictionaries are containers =========================== Dictionaries are also containers. Python takes the elements in the container to be the dictionary keys. This is a convenient way to test if you already have a key in a dictionary: .. nbplot:: >>> 'MATLAB' in software True >>> 'happiness' in software False ************************************** "for", "while", "continue" and "break" ************************************** ``for`` statements and ``while`` statements are *loops*, because Python keeps executing the ``for`` or ``while`` block until the ``for`` runs out of elements or the ``while`` condition is False. You can break out of a loop using the ``break`` statement: .. nbplot:: >>> for i in range(10): ... if i == 6: ... break ... print(i) ... 
0 1 2 3 4 5 The ``continue`` statement short-circuits execution of the current iteration of the ``for`` or ``while`` block, to continue with the next iteration: .. nbplot:: >>> for i in range(10): ... if i == 6: ... continue ... print(i) 0 1 2 3 4 5 7 8 9 See :doc:`on_loops` for more on loops and ``break``. .. _functions: ********* Functions ********* Here we define our first function in Python: .. nbplot:: >>> def my_function(an_argument): ... return an_argument + 1 The function definition begins with the ``def`` keyword followed by a space. There follows the name of the function ``my_function``. Next we have an open parenthesis, followed by a specification of the arguments that the function expects to be passed to it. In this case, the function expects a single argument. For this function, the value of the input argument will be attached to the name ``an_argument`` when the function starts to execute. Last, we have an indented block, with code that will run when the function is called. We can return a value from the function using the ``return`` statement. .. nbplot:: >>> my_function(10) 11 We called ``my_function`` by appending the opening parenthesis, and the arguments, followed by the closing parenthesis. The function began to execute with the variable ``an_argument`` set to 10. It returned 10 + 1 = 11. A function need not accept any arguments: .. nbplot:: >>> def my_second_function(): ... return 42 ... >>> my_second_function() 42 A function does not need to have a ``return`` statement. If there is no return statement, the function returns ``None``: .. nbplot:: >>> def function_with_no_return(): ... # Function with no return statement ... a = 1 ... >>> function_with_no_return() == None True A function can have more than one argument: .. nbplot:: >>> def my_third_function(first_argument, second_argument): ... return first_argument + second_argument ... 
>>> my_third_function(10, 42) 52 Default values for function arguments ===================================== The function definition can give a default value for a function argument: .. nbplot:: >>> def my_fourth_function(first_argument, extra_argument=101): ... return first_argument + extra_argument This function, like ``my_third_function``, has two arguments, and we can call it the same way that we call ``my_third_function``: .. nbplot:: >>> my_fourth_function(10, 42) 52 But we can also omit the second argument, because it has a default value. In that case the argument will get its default value: .. nbplot:: >>> my_fourth_function(10) # Pass one argument, get default for second 111 So far we have passed in arguments by position, the first argument in our call becoming the first argument in the function, and so on. We can also pass in arguments by name. For example, we could pass in ``extra_argument`` by giving the parameter name and value, like this: .. nbplot:: >>> my_fourth_function(10, extra_argument=202) 212 Passing arguments this way can make the code easier to read, because the name of the argument often gives a good clue as to its purpose in the function. It can also be useful with functions having many parameters with default values; in that case using the argument name makes it easier to pass in one or a few values that are different from the defaults. Functions are objects too ========================= Remember that everything in Python is an object. The function is itself an object, where the name of the function is a variable that refers to the function: .. nbplot:: >>> my_fourth_function .. nbplot:: >>> type(my_fourth_function) <class 'function'> We call the function by adding the open parenthesis followed by the arguments and the close parenthesis: .. nbplot:: >>> my_fourth_function(10) 111 We can make a new name to point to this same function as easily as we could with any other Python variable: .. 
nbplot:: >>> another_reference_to_func4 = my_fourth_function >>> type(another_reference_to_func4) >>> # We call this function using the new name >>> another_reference_to_func4(10) 111 ******* Sorting ******* The Python function ``sorted`` returns a sorted list from something that Python can iterate over: .. nbplot:: >>> sorted('adcea') ['a', 'a', 'c', 'd', 'e'] .. nbplot:: >>> sorted((1, 5, 3, 2)) [1, 2, 3, 5] In order to do the sorting, Python compares the elements with ``one_element < another_element``. For example, to do the sort above, Python needed results like: .. nbplot:: >>> 3 < 5 True Sometimes you want to order the objects in some other way than simply comparing the elements. If so, then you can define a *sort function*. A sort function is a function that accepts an element as its argument, and returns a *sort value* for that element. Python does the sorting, not on the elements themselves, but on the returned sort value for each element. For example, let's say we have first and last names stored as tuples: .. nbplot:: >>> people = [('JB', 'Poline'), ('Matthew', 'Brett'), ('Mark', 'DEsposito')] By default, Python compares tuples by comparing the first value first, then the second value, and so on. This means for our case that we are sorting on the first name: .. nbplot:: >>> ('Matthew', 'Brett') > ('Mark', 'DEsposito') True .. nbplot:: >>> sorted(people) [('JB', 'Poline'), ('Mark', 'DEsposito'), ('Matthew', 'Brett')] That may not be what you want. You might want to sort by the last name, which is the second value in the tuple. In that case you can make a sort function, that accepts the element as an input (the tuple in this case), and returns a value: .. nbplot:: >>> def get_last_name(person): ... return person[1] # The last name Remember everything in Python is an object. The function we have just defined is also an object, with name ``get_last_name``: .. nbplot:: >>> get_last_name We can pass this value to the ``sorted`` function as a sort function. 
We will pass this in using the sort function parameter name, which is ``key``: .. nbplot:: >>> sorted(people, key=get_last_name) [('Matthew', 'Brett'), ('Mark', 'DEsposito'), ('JB', 'Poline')] ***** Files ***** You can open a file in several different *modes*. The mode specifies whether you want to read or write the file, and whether the data in the file is, or will be, text data (string) or binary data (bytes). For example, here we open a file for Writing Text (``wt``): .. nbplot:: >>> my_file = open("a_text_file.txt", "wt") If we had wanted to write binary (byte) data, we would have used ``wb`` for the mode (Write Binary). As usual, you can explore this new file object in IPython by following the object name with a period, and pressing the tab key to get a list of attributes and methods |--| e.g. ``my_file.`` followed by tab. To write to a file, use the ``write`` method. .. nbplot:: >>> # Write a line of text with a newline character at the end >>> # The method returns the number of characters written >>> my_file.write("MATLAB is good for matrices\n") 28 >>> # Another line >>> my_file.write("Python is good for coding\n") 26 You should close the file when you've finished with it: .. nbplot:: >>> my_file.close() To read a file, open the file in read mode: .. nbplot:: >>> # Open file in Read Text mode >>> my_file2 = open("a_text_file.txt", "rt") You can read all the contents in one shot by calling the ``read`` method without arguments: .. nbplot:: >>> contents = my_file2.read() >>> print(contents) MATLAB is good for matrices Python is good for coding Remember to close the file afterwards: .. nbplot:: >>> my_file2.close() An open text file object is also *iterable*, meaning that you can ask the file object to return its contents line by line, in a ``for`` loop. Let's open the file again to show this in action: .. nbplot:: >>> my_file2 = open("a_text_file.txt", "rt") >>> for line in my_file2: # iterating over the file object ... print("Line is:", line) ... 
Line is: MATLAB is good for matrices Line is: Python is good for coding >>> my_file2.close() .. testcleanup:: os.unlink('a_text_file.txt') .. rubric:: Footnotes .. [#py2-division] Python 3 returns a floating point value from dividing two integers, but the default for Python 2 is to return the integer part of the division. Thus, in Python 2 ``1 / 2`` returns the same result as ``1 // 2`` i.e. ``0``. If your code may run on Python 2, remember to add the statement ``from __future__ import division`` at the top of your code files, to make sure you get the Python 3 behavior when dividing integers. .. [#if-statement-equivalents] Here is an ``if`` statement in Python: .. code-block:: python # Python my_var = 10 if my_var == 10: print("The conditional is True!") print("my_var does equal 10") The equivalent in R is: .. code-block:: R # R my_var = 10 if (my_var == 10) { print("The conditional is True!") print("my_var does equal 10") } For MATLAB: .. code-block:: MATLAB % MATLAB my_var = 10; if my_var == 10 disp('The conditional is True!'); disp('my_var does equal 10'); end For Python, the indentation defines the block. For R and MATLAB, indentation is an optional way of formatting the code to make it look nicer. In R and MATLAB, the same code without indentation will run just as well, but most people find the code harder to read: .. code-block:: R # R my_var = 10 if (my_var == 10) { # Indentation is optional print("The conditional is True!") print("my_var does equal 10") } .. code-block:: MATLAB % MATLAB my_var = 10; if my_var == 10 % Indentation is optional disp('The conditional is True!'); disp('my_var does equal 10'); end .. [#for-loop-equivalents] Here is a ``for`` loop in Python: .. code-block:: python # Python for element in [9, 4, 7, 0, 8]: print(element) The equivalents in R and MATLAB are: .. code-block:: R # R for (element in list(9, 4, 7, 0, 8)) { print(element) } .. code-block:: MATLAB % MATLAB for element = {9, 4, 7, 0, 8} disp(element); end .. 
[#py2-range] In Python 2, ``range`` returns a list. You can often use a Python 3 range object in the same way you could use a list, so this difference between Python 2 and 3 may not matter for the person using the code. There are things that you can do with lists that you cannot do with ranges, such as adding. For example ``range(4) + range(5)`` will work in Python 2 (adding lists), but fail in Python 3 (you cannot add range objects). .. include:: links_names.inc