You are a conscientious developer and want to ensure that your Python program isn’t using too much memory, because, ya know, it is the right thing to do. Or perhaps you are analyzing a large dataset and have already run out of RAM on your laptop… so you need to figure out where all of the memory is being used so you can optimize your objects.
If you find yourself in one of the cases above (or something else entirely!), then you are in luck, because Python has a built-in function to retrieve the memory usage (in bytes) of an object. It is called sys.getsizeof (official Python reference) and can be used like so:
import sys

a = 42
sys.getsizeof(a)  # 28 bytes
b = 2 ** 32
sys.getsizeof(b)  # 32 bytes
c = 2 ** 64
sys.getsizeof(c)  # 40 bytes (exact sizes vary slightly by CPython version and platform)
That was easy! Not so fast…
sys.getsizeof works great for primitive objects and built-in types, but what about a custom class that you wrote? Let’s look at an example of a simple, silly class:
class HiddenDataClass:
    def __init__(self):
        self.data = []
Try instantiating this class and checking its size:
import sys

class HiddenDataClass:
    def __init__(self):
        self.data = []

hidden_data = HiddenDataClass()
sys.getsizeof(hidden_data)  # 48 bytes
Then if you actually add items to hidden_data.data and check the size again, it doesn’t change! Check it out:
hidden_data.data = [a for a in range(2 ** 12)]
sys.getsizeof(hidden_data)  # still 48 bytes
But this is misleading, because if you run sys.getsizeof(hidden_data.data) it will report about 33,048 bytes, which is way more than the 48 bytes that sys.getsizeof(hidden_data) returns!
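To see the gap side by side, you can compare the two calls directly. Here is a minimal sketch, assuming the same HiddenDataClass as above (the byte counts in the comments are illustrative and will vary across Python versions and platforms):
import sys

class HiddenDataClass:
    def __init__(self):
        self.data = []

hidden_data = HiddenDataClass()
hidden_data.data = [a for a in range(2 ** 12)]  # 4,096 integers

print(sys.getsizeof(hidden_data))       # ~48 bytes: just the instance wrapper
print(sys.getsizeof(hidden_data.data))  # ~33,000 bytes: the list's own footprint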
What do you do about it? Introducing the __sizeof__ method
You can fix this by adding a __sizeof__ method to the class that accounts for the data the instance actually holds. sys.getsizeof calls an object’s __sizeof__ method under the hood, so overriding it lets you control what gets counted. You can do this like so:
import sys

class HiddenDataClass:
    def __init__(self):
        self.data = []

    def __sizeof__(self):
        # Count the list stored on the instance
        return sys.getsizeof(self.data)
Now when you run sys.getsizeof(hidden_data), it will reflect the memory used by the data stored on the instance! (Note that sys.getsizeof adds a small garbage-collector overhead on top of whatever __sizeof__ returns, so the number may be a few bytes larger than you expect.)
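If you also want to include the instance’s own base size, plus the sizes of the individual items inside the list (sys.getsizeof of a list only counts the list’s header and pointer array, not the objects it points to), one possible refinement is sketched below. This is a sketch, not a definitive recipe; for example, it will over-count small integers that CPython shares between objects:
import sys

class HiddenDataClass:
    def __init__(self):
        self.data = []

    def __sizeof__(self):
        # Start from the default size of the instance itself...
        size = super().__sizeof__()
        # ...add the list's own footprint (header + pointer array)...
        size += sys.getsizeof(self.data)
        # ...and add each element the list references.
        size += sum(sys.getsizeof(item) for item in self.data)
        return size

hidden_data = HiddenDataClass()
hidden_data.data = [a for a in range(2 ** 12)]
print(sys.getsizeof(hidden_data))  # now tens of kilobytes, not 48 bytes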