Where does Python look for modules?
Let’s say we have written a Python module and saved it as a_module.py, in a directory called code.
We would normally do this with a text editor, but, for illustration, here we write out the module file using the Jupyter / IPython %%file magic command:
%%file code/a_module.py
""" This is a_module
"""
def a_func():
    return 99
print('Finished importing a_module.py')
Writing code/a_module.py
We also have a script called a_script.py in a directory called scripts:
%%file scripts/a_script.py
""" This is a_script
"""
import a_module
print('Result of a_func is:', a_module.a_func())
Writing scripts/a_script.py
At the moment, a_script.py will fail with:
run scripts/a_script.py
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
File ~/work/dipy-textbook/dipy-textbook/scripts/a_script.py:4
1 """ This is a_script
2 """
----> 4 import a_module
6 print('Result of a_func is:', a_module.a_func())
ModuleNotFoundError: No module named 'a_module'
Above we ran the script within the Python process of the notebook, but we can also run the script in the terminal. Here we are using the %%bash magic command at the top of the cell to run the contents of the cell in a terminal shell on Linux or Mac. This may not work on Windows.
Notice that running the script this way gives the same error, for the same reason:
%%bash
python3 scripts/a_script.py
Traceback (most recent call last):
File "/home/runner/work/dipy-textbook/dipy-textbook/scripts/a_script.py", line 4, in <module>
import a_module
ModuleNotFoundError: No module named 'a_module'
---------------------------------------------------------------------------
CalledProcessError Traceback (most recent call last)
Cell In[4], line 1
----> 1 get_ipython().run_cell_magic('bash', '', 'python3 scripts/a_script.py\n')
File /opt/hostedtoolcache/Python/3.10.11/x64/lib/python3.10/site-packages/IPython/core/interactiveshell.py:2475, in InteractiveShell.run_cell_magic(self, magic_name, line, cell)
2473 with self.builtin_trap:
2474 args = (magic_arg_s, cell)
-> 2475 result = fn(*args, **kwargs)
2477 # The code below prevents the output from being displayed
2478 # when using magics with decodator @output_can_be_silenced
2479 # when the last Python token in the expression is a ';'.
2480 if getattr(fn, magic.MAGIC_OUTPUT_CAN_BE_SILENCED, False):
File /opt/hostedtoolcache/Python/3.10.11/x64/lib/python3.10/site-packages/IPython/core/magics/script.py:153, in ScriptMagics._make_script_magic.<locals>.named_script_magic(line, cell)
151 else:
152 line = script
--> 153 return self.shebang(line, cell)
File /opt/hostedtoolcache/Python/3.10.11/x64/lib/python3.10/site-packages/IPython/core/magics/script.py:305, in ScriptMagics.shebang(self, line, cell)
300 if args.raise_error and p.returncode != 0:
301 # If we get here and p.returncode is still None, we must have
302 # killed it but not yet seen its return code. We don't wait for it,
303 # in case it's stuck in uninterruptible sleep. -9 = SIGKILL
304 rc = p.returncode or -9
--> 305 raise CalledProcessError(rc, cell)
CalledProcessError: Command 'b'python3 scripts/a_script.py\n'' returned non-zero exit status 1.
When Python hits the line import a_module, it tries to find a package or a module called a_module. A package is a directory containing modules, but we will only consider modules for now. A module is a file with a matching extension, such as .py. So, Python is looking for a file a_module.py, and not finding it.
We will see the same effect at the interactive Python console, or in Jupyter or IPython:
import a_module
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
Cell In[5], line 1
----> 1 import a_module
ModuleNotFoundError: No module named 'a_module'
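If you want to check whether Python can find a module without triggering the error, one option is importlib.util.find_spec from the standard library, which returns None when a top-level module cannot be found. The following is a minimal sketch; the helper name can_find_module is ours, for illustration only:

```python
import importlib.util


def can_find_module(name):
    """Return True if Python can currently find a top-level module `name`.

    `can_find_module` is a hypothetical helper name for this sketch.
    """
    # For a top-level name, find_spec searches without importing the module,
    # and returns None if nothing matches.
    return importlib.util.find_spec(name) is not None


# 'sys' is always available; 'a_module' depends on the current sys.path.
print('sys findable:', can_find_module('sys'))
print('a_module findable:', can_find_module('a_module'))
```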
Python looks for modules in ‘sys.path’
Python has a simple algorithm for finding a module with a given name, such as a_module. It looks for a file called a_module.py in the directories listed in the variable sys.path.
import sys
# Show sys.path
sys.path
['/home/runner/work/dipy-textbook/dipy-textbook',
'/opt/hostedtoolcache/Python/3.10.11/x64/lib/python310.zip',
'/opt/hostedtoolcache/Python/3.10.11/x64/lib/python3.10',
'/opt/hostedtoolcache/Python/3.10.11/x64/lib/python3.10/lib-dynload',
'',
'/opt/hostedtoolcache/Python/3.10.11/x64/lib/python3.10/site-packages']
The a_module.py file is in the code directory, and this directory is not in the sys.path list.
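To make the idea concrete, here is a rough sketch of that search for the simple case of a single .py file. This is not Python's actual import machinery, which also handles packages, compiled extensions, zip files and more. The function name find_module_file is made up for this illustration:

```python
import sys
from pathlib import Path


def find_module_file(name, search_path=None):
    """Return the first `name`.py file found on the search path, or None.

    A simplified illustration only; `find_module_file` is a made-up name.
    """
    if search_path is None:
        search_path = sys.path
    for entry in search_path:
        # An empty string in sys.path stands for the current working directory.
        directory = Path(entry) if entry else Path.cwd()
        candidate = directory / (name + '.py')
        if candidate.is_file():
            # The first match wins, so the order of the entries matters.
            return candidate
    return None


# Prints None unless some directory on sys.path contains a_module.py.
print(find_module_file('a_module'))
```

Passing an explicit list as search_path, such as ['code'], lets you try the search against directories that are not on sys.path.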
sys.path is just a Python list, like any other:
type(sys.path)
list
That means we can make the import work in our notebook by appending the code directory to the sys.path list:
sys.path.append('code')
# Now the import will work
import a_module
Finished importing a_module.py
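If a cell like the one above runs more than once, the same directory ends up in sys.path several times. That is harmless, but a small guard avoids it. A sketch, with add_to_sys_path as a hypothetical helper name:

```python
import sys


def add_to_sys_path(directory):
    """Append `directory` to sys.path unless it is already listed.

    `add_to_sys_path` is a hypothetical helper name for this sketch.
    """
    # sys.path expects strings, so convert Path objects and the like.
    directory = str(directory)
    if directory not in sys.path:
        sys.path.append(directory)


add_to_sys_path('code')
add_to_sys_path('code')  # The second call changes nothing.
print('Entries for code:', sys.path.count('code'))
```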
There are various ways of making sure a directory is always on the Python sys.path list when you run Python. One of them is to make the module part of an installable package, and install it (see: making a Python package), but we don’t cover that here.
Now that we have imported the module into this Python process, the import will also work in the script, as long as we execute the script within this same Python process:
run scripts/a_script.py
Result of a_func is: 99
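You may notice that the Finished importing a_module.py message did not appear again. Python keeps already-imported modules in the sys.modules dictionary, and a later import of the same name in the same process reuses that entry rather than running the file again. A small check of this behavior, using the standard library json module as a stand-in for a_module:

```python
import sys

import json  # First import of json in this example.
first_module_object = sys.modules['json']

import json  # Second import: Python reuses the cached entry.
print('Same module object:', sys.modules['json'] is first_module_object)
print('Name json refers to it too:', json is first_module_object)
```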
However, if we run the script in a new terminal, we still get the error. The terminal starts a new Python process, separate from the notebook process, so our change to sys.path does not apply there, and the import of a_module fails:
%%bash
python3 scripts/a_script.py
Traceback (most recent call last):
File "/home/runner/work/dipy-textbook/dipy-textbook/scripts/a_script.py", line 4, in <module>
import a_module
ModuleNotFoundError: No module named 'a_module'
CalledProcessError: Command 'b'python3 scripts/a_script.py\n'' returned non-zero exit status 1.
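Each new Python process builds its own sys.path at startup, so changes made in one process are not visible in another. The sketch below illustrates this by starting a child Python process with subprocess. The directory name a_hypothetical_dir is an invented placeholder:

```python
import subprocess
import sys

# Change sys.path in this process only.
# 'a_hypothetical_dir' is a placeholder name; it need not exist.
sys.path.append('a_hypothetical_dir')

# Ask a brand-new Python process whether it sees the same entry.
check_code = "import sys; print('a_hypothetical_dir' in sys.path)"
result = subprocess.run(
    [sys.executable, '-c', check_code],
    capture_output=True,
    text=True,
    check=True,
)
seen_here = 'a_hypothetical_dir' in sys.path
seen_in_child = result.stdout.strip() == 'True'
print('Entry present in this process:', seen_here)
print('Entry present in child process:', seen_in_child)
```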
As a crude solution to the problem above, you can do what we’ve done here, and put the directory containing the module into the Python sys.path list, at the top of the files that need it:
%%file scripts/a_script.py
""" This is a_script
We've made sure a_module is on the Python path this time.
"""
import sys
sys.path.append('code')
import a_module
print('Result of a_func is:', a_module.a_func())
Overwriting scripts/a_script.py
Then:
%%bash
python3 scripts/a_script.py
Finished importing a_module.py
Result of a_func is: 99
The simple append above will only work when running the script from a directory containing the code subdirectory. For example, here we are running a few commands in the terminal, to show that the script fails if we run it from another directory:
%%bash
mkdir another_dir
cd another_dir
# Run the script, but from the new directory.
python3 ../scripts/a_script.py
Traceback (most recent call last):
File "/home/runner/work/dipy-textbook/dipy-textbook/another_dir/../scripts/a_script.py", line 9, in <module>
import a_module
ModuleNotFoundError: No module named 'a_module'
CalledProcessError: Command 'b'mkdir another_dir\ncd another_dir\n# Run the script, but from the new directory.\npython3 ../scripts/a_script.py\n'' returned non-zero exit status 1.
This is because the directory code that we specified is a relative path, and therefore Python looks for the code directory in the current working directory.
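A relative path has no fixed meaning of its own. It is interpreted against whatever the current working directory is at the time. A short sketch of this, using a temporary directory as the second working directory:

```python
import os
import tempfile
from pathlib import Path

relative = Path('code')
start_dir = Path.cwd()

# Where 'code' points when resolved from the starting directory.
resolved_from_start = relative.resolve()
print('From', start_dir, '->', resolved_from_start)

with tempfile.TemporaryDirectory() as tmp_dir:
    os.chdir(tmp_dir)
    try:
        # The same relative path now points somewhere else.
        resolved_from_tmp = relative.resolve()
        print('From', Path.cwd(), '->', resolved_from_tmp)
    finally:
        # Always go back, so the rest of the session is unaffected.
        os.chdir(start_dir)
```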
To make the hack work when running the code from any directory, you could use some path manipulation on the __file__ variable:
%%file scripts/a_script.py
""" This is a_script
Another more general way of making sure the code directory is on the Python
path.
"""
from pathlib import Path
# Directory containing this script.
MY_DIRECTORY = Path(__file__).parent
# Code directory is in the directory above the one containing the script.
CODE_DIRECTORY = MY_DIRECTORY / '..' / 'code'
print('code directory is', str(CODE_DIRECTORY))
# Put this directory on the path.
# sys.path expects strings, not Path objects.
import sys
sys.path.append(str(CODE_DIRECTORY))
import a_module
print('Result of a_func is:', a_module.a_func())
Overwriting scripts/a_script.py
Now the module import does work from this directory, or from another_dir:
%%bash
# Running from this directory
python3 scripts/a_script.py
code directory is /home/runner/work/dipy-textbook/dipy-textbook/scripts/../code
Finished importing a_module.py
Result of a_func is: 99
%%bash
# From another_dir
cd another_dir
python3 ../scripts/a_script.py
code directory is /home/runner/work/dipy-textbook/dipy-textbook/another_dir/../scripts/../code
Finished importing a_module.py
Result of a_func is: 99
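The printed paths above still contain .. components. That works, but if you prefer a normalized absolute path, one option is to call Path.resolve() before taking the parent directories. A sketch, where code_dir_for is a hypothetical helper that takes the script path as an argument, so it can be tried outside a script:

```python
from pathlib import Path


def code_dir_for(script_path):
    """Return the absolute 'code' directory beside the script's directory.

    `code_dir_for` is a hypothetical helper; the layout (scripts/ and code/
    side by side) follows the example in the text.
    """
    # resolve() makes the path absolute and removes any '..' components.
    script_dir = Path(script_path).resolve().parent
    return script_dir.parent / 'code'


# In a script you would write: code_dir_for(__file__)
print(code_dir_for('scripts/a_script.py'))
print(code_dir_for('another_dir/../scripts/a_script.py'))
```

As before, sys.path expects strings, so you would pass str(code_dir_for(__file__)) to sys.path.append.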