You may have seen the line (usually at the end of a Python script):
if __name__ == '__main__':
    # some other code
and wondered what it does and why it is there. After reading this, you will never have to wonder again!
What does it do?
In short, if __name__ == '__main__': only runs the body of the if block when the file is the one that was run directly (e.g. python my_script.py), not when it is imported (e.g. import my_script).
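To see both cases side by side, here is a minimal sketch. It assumes a throwaway my_script.py written to a temporary directory; the greet function and the printed messages are made up for illustration. The file is first run directly, then imported from a second interpreter process.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical module used only for this demonstration.
SOURCE = '''\
def greet():
    return "hello"

if __name__ == '__main__':
    print("running as a script")
'''

# One-liner that imports the module instead of running it.
# sys.argv[1] carries the directory that holds my_script.py.
IMPORTER = (
    "import sys; sys.path.insert(0, sys.argv[1]); "
    "import my_script; print(my_script.greet())"
)

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "my_script.py"
    script.write_text(SOURCE)

    # Case 1: run the file directly, like `python my_script.py`.
    direct = subprocess.run(
        [sys.executable, str(script)],
        capture_output=True, text=True, check=True,
    )
    direct_output = direct.stdout.strip()

    # Case 2: import the file; the guarded block should stay silent.
    imported = subprocess.run(
        [sys.executable, "-c", IMPORTER, tmp],
        capture_output=True, text=True, check=True,
    )
    import_output = imported.stdout.strip()

print(direct_output)  # running as a script
print(import_output)  # hello
```

Only the direct run prints the message from inside the if block. The import still gets a working greet function, but the guarded code never executes.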
When would I use this?
Let’s say you have a script that processes some large datasets. You have broken the parts up into separate functions for a clean design, so it may look something like this (let’s call it my_model.py):
import numpy as np
import pandas as pd

def read_dataset(dataset_path):
    return pd.read_csv(dataset_path)

def train_model(dataset):
    # training code here...
    return model

def test_model(model):
    # test the model
    pass

# Top-level code: this runs every time the file is executed or imported
dataset = read_dataset('my_data.csv')
model = train_model(dataset)
test_model(model)
This works great when it is all in a single file, but let’s say you want to use the read_dataset function in a different file. In your new file you write from my_model import read_dataset, and your new file takes a long time to run. This is because importing a module executes all of its top-level code, so importing the function also reads the dataset, trains the model, and tests the model!
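The effect is easy to reproduce without any real data. This is a rough sketch, assuming a stand-in module named slow_model_demo.py (a hypothetical name) in which the expensive work is replaced by a single print call. The "new file" asks only for read_dataset, yet the top-level print still fires.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for my_model.py; the slow work is replaced by a print.
SOURCE = '''\
def read_dataset(dataset_path):
    return "data from " + dataset_path

print("top-level code ran: read, train, test")
'''

# Code for the "new file": it only asks for one function.
# sys.argv[1] carries the directory that holds slow_model_demo.py.
NEW_FILE = (
    "import sys; sys.path.insert(0, sys.argv[1]); "
    "from slow_model_demo import read_dataset; "
    "print(read_dataset('my_data.csv'))"
)

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "slow_model_demo.py").write_text(SOURCE)
    result = subprocess.run(
        [sys.executable, "-c", NEW_FILE, tmp],
        capture_output=True, text=True, check=True,
    )

output_lines = result.stdout.splitlines()
for line in output_lines:
    print(line)
# top-level code ran: read, train, test
# data from my_data.csv
```

The first output line comes from the module itself, before the new file has called anything.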
You can modify my_model.py so that this doesn’t happen, like this:
import numpy as np
import pandas as pd

def read_dataset(dataset_path):
    return pd.read_csv(dataset_path)

def train_model(dataset):
    # training code here...
    return model

def test_model(model):
    # test the model
    pass

if __name__ == '__main__':  # added this line and put the following 3 lines in the block
    dataset = read_dataset('my_data.csv')
    model = train_model(dataset)
    test_model(model)
See the difference? Now, when you run python my_model.py the script will read the dataset, train the model, and test the model. But when you import this file, those three steps won’t run, and you can still use the functions!
Why does it work this way?
Every module (i.e. file) in Python has the __name__ attribute set. If the module is the entry point (what you typed in the command prompt), then its __name__ is going to be '__main__'. Otherwise, it is going to be the name of the file (without the .py extension), prefixed by any packages that it is a part of.
For example, if you have the following directory structure:
my_python_package/
    __init__.py
    cool_subpackage/
        __init__.py
        super_cool_module.py
and you write import my_python_package.cool_subpackage.super_cool_module, then the __name__ inside super_cool_module will be 'my_python_package.cool_subpackage.super_cool_module'.
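You can check this yourself with a small sketch that builds the same directory structure in a temporary folder and imports the module. The RECORDED_NAME variable is a hypothetical name added here so the module can report the __name__ it was given.

```python
import importlib
import sys
import tempfile
from pathlib import Path

FULL_NAME = "my_python_package.cool_subpackage.super_cool_module"

with tempfile.TemporaryDirectory() as tmp:
    # Recreate the package layout from the example above.
    subpackage = Path(tmp) / "my_python_package" / "cool_subpackage"
    subpackage.mkdir(parents=True)
    (subpackage.parent / "__init__.py").write_text("")
    (subpackage / "__init__.py").write_text("")
    # The module simply records the name it was given at import time.
    (subpackage / "super_cool_module.py").write_text("RECORDED_NAME = __name__\n")

    # Make the temporary folder importable just for this demonstration.
    sys.path.insert(0, tmp)
    try:
        module = importlib.import_module(FULL_NAME)
    finally:
        sys.path.remove(tmp)

print(module.RECORDED_NAME)
# my_python_package.cool_subpackage.super_cool_module
```

Because the module was imported rather than run, its __name__ is the full dotted path and not '__main__'.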