Understand Python Data Model Through Coding and Debugging

Tetsuya Hirata
6 min read · Jun 5, 2021


First, let's quickly summarize the Python data model based on the Python documentation. The quote below, taken from the documentation, is the core idea of the Python data model and is worth remembering.

Objects are Python’s abstraction for data. All data in a Python program is represented by objects or by relations between objects. (In a sense, and in conformance to Von Neumann’s model of a “stored program computer”, code is also represented by objects.)

Keep the following three key points in mind.

1)Python Data Model:
Objects are the abstract representations of the data.
Every object has an identity (id), a type, and a value.
- id is an integer that identifies the object and never changes during the object's lifetime. In CPython it is the object's address in memory.
- type is the data type, in other words the category of the data. It determines which operations the object supports.
- value is the data itself.
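
As a minimal sketch of the three parts listed above, the built-in functions id() and type() can inspect any object. The helper name describe is hypothetical and exists only for illustration; the printed id differs from run to run.

```python
def describe(obj):
    """Return the (identity, type, value) triple of an object.

    `describe` is a hypothetical helper for illustration only.
    """
    return id(obj), type(obj), obj


numbers = [1, 2, 3]
identity, kind, value = describe(numbers)

print(identity)  # an integer; in CPython it is the object's memory address
print(kind)      # <class 'list'>
print(value)     # [1, 2, 3]

# The identity stays the same for the whole lifetime of the object,
# even when the value of a mutable object changes.
numbers.append(4)
print(id(numbers) == identity)  # True
```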

2)Object Types:
Immutable: the value cannot be changed after the object is created; any "change" produces another object
Mutable: the value can be changed in place while the object keeps the same id
Iterable: the values can be extracted one at a time, for example with a for loop
Sequence: the values can be extracted in order by integer index

3)Summary of the Data Types and the Object Types:
— — —
int, bool, float: immutable
str: immutable, iterable, sequence
tuple: immutable, iterable, sequence
— — —
list: mutable, iterable, sequence
dict: mutable, iterable
set: mutable, iterable
— — —
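
The summary above can be checked, at least roughly, with the abstract base classes in collections.abc. This is a sketch under a simplifying assumption: the hypothetical helper classify treats anything that is not a built-in mutable container as immutable, which holds for the types in the summary but not for arbitrary classes.

```python
from collections.abc import (
    Iterable,
    MutableMapping,
    MutableSequence,
    MutableSet,
    Sequence,
)


def classify(obj):
    """Return the labels from the summary that apply to obj.

    Hypothetical helper; the mutable/immutable test is only valid
    for the built-in types listed in the summary.
    """
    labels = set()
    mutable_abcs = (MutableSequence, MutableMapping, MutableSet)
    labels.add("mutable" if isinstance(obj, mutable_abcs) else "immutable")
    if isinstance(obj, Iterable):
        labels.add("iterable")
    if isinstance(obj, Sequence):
        labels.add("sequence")
    return labels


for sample in (1, True, 1.5, "abc", (1, 2), [1, 2], {"a": 1}, {1, 2}):
    print(type(sample).__name__, sorted(classify(sample)))
```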

Before starting to write code, I suggest installing JupyterLab for coding and debugging.

$ pip install jupyterlab

Mutable vs. Immutable Default Arguments: Coding to Understand the Python Data Model

Notice1: drawbacks of a mutable object as a default argument
- The default object is created only once, so its value can change every time the function is executed.
- The object can keep growing and unconsciously use extra memory.
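
The behavior behind this notice can be observed directly: a default value is created once, when the def statement runs, and is stored on the function object in its __defaults__ attribute. The function name collect below is hypothetical; this is only a small sketch.

```python
def collect(v=10, bucket=[]):
    """Append v to bucket; the default list is shared between calls."""
    bucket.append(v)
    return bucket


print(collect.__defaults__)  # (10, [])

first = collect()
second = collect()
print(first is second)       # True: both calls returned the same list object
print(collect.__defaults__)  # (10, [10, 10])
```

Because the list lives as long as the function does, every call without an explicit bucket keeps extending the same object.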

Notice2: features of set and dict
- A set cannot hold the same value twice, and a dict cannot hold the same key twice, so set elements and dict keys stay unique.
- To store the same value more than once in a dict, store it under different keys.
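
A short sketch of both points, using hypothetical variable names:

```python
tags = set()
tags.add(10)
tags.add(10)          # adding an existing element changes nothing
print(tags)           # {10}

scores = {}
scores["v"] = 10
scores["v"] = 20      # assigning to an existing key overwrites its value
print(scores)         # {'v': 20}

scores["w"] = 20      # the same value may appear under a different key
print(scores)         # {'v': 20, 'w': 20}
```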

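Notice3: memory size of each data type
For the small objects in the examples below, sys.getsizeof reports int < str < tuple < list < set < dict. The exact numbers depend on the Python version and platform.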
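
The sizes can be measured with sys.getsizeof, as sketched below with the same small values that the later examples use. No expected numbers are shown because they vary between Python versions and platforms, and the ordering may differ on your interpreter.

```python
import sys

samples = {
    "int": 10,
    "str": "10",
    "tuple": ("1", "1", "1"),
    "list": [10],
    "set": {10},
    "dict": {"v": 10},
}

# Measure the shallow size of each sample object in bytes.
sizes = {name: sys.getsizeof(obj) for name, obj in samples.items()}
for name, size in sizes.items():
    print(f"{name:5} {size} bytes")

# getsizeof is shallow: a list stores only references to its elements,
# so the size of the elements themselves is not included.
small, large = "x", "x" * 10_000
print(sys.getsizeof([small]), sys.getsizeof([large]))
```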

Notice4: combining a non-iterable object with tuple
- tuple() accepts only an iterable argument, so tuple(10) raises a TypeError while tuple('111') works.
- '+=' on a tuple needs another tuple on the right-hand side.
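
A minimal sketch of how tuple() and '+=' behave with iterable and non-iterable values:

```python
try:
    tuple(10)                    # an int is not iterable
except TypeError as error:
    print(error)                 # 'int' object is not iterable

print(tuple("111"))              # ('1', '1', '1'): a str is iterated character by character
print((10,))                     # (10,): the trailing comma makes a one-element tuple

k = ()
k += (10,)                       # += on a tuple works with another tuple
print(k)                         # (10,)

try:
    k += [20]                    # a list is iterable, but it is not a tuple
except TypeError as error:
    print(type(error).__name__)  # TypeError
```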

Coding Examples of the Immutable Objects:

import sys

def immutable1(k='', v='10'):
    # k += v rebinds the local name k; the default '' is never modified
    k += v
    print('object id: ' + str(id(k)))
    print('memory: ' + str(sys.getsizeof(k)))
    print('output: ' + str(k))

immutable1()
immutable1()
object id: 4444478960
memory: 51
output: 10
object id: 4444478960
memory: 51
output: 10
-----
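def immutable2(k=0, v=10):
    # k += v rebinds the local name k; the default 0 is never modified
    k += v
    print('object id: ' + str(id(k)))
    print('memory: ' + str(sys.getsizeof(k)))
    print('output: ' + str(k))

immutable2()
immutable2()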
object id: 4437834320
memory: 28
output: 10
object id: 4437834320
memory: 28
output: 10
-----
def tuple_immutable3(k=(), v=tuple(10)):
    k+=v
    print('object id: '+ str(id(k)))
    print('memory: ' + str(sys.getsizeof(k)))
    print('output: '+ str(k))
---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)
<ipython-input-8-b7fcf44f6867> in <module>
----> 1 def tuple_immutable3(k=(), v=tuple(10)):
      2     k+=v
      3     print('object id: '+ str(id(k)))
      4     print('memory: ' + str(sys.getsizeof(k)))
      5     print('output: '+ str(k))

TypeError: 'int' object is not iterable
-----
def tuple_immutable4(k=(), v=tuple('111')):
    # tuple('111') is iterable input, so the definition succeeds
    k += v
    print('object id: ' + str(id(k)))
    print('memory: ' + str(sys.getsizeof(k)))
    print('output: ' + str(k))

tuple_immutable4()
tuple_immutable4()
object id: 4477693248
memory: 64
output: ('1', '1', '1')
object id: 4477693248
memory: 64
output: ('1', '1', '1')

Coding Examples of the Mutable Objects:

def mutable1(k=[], v=10):
    # the default list is created once and shared by every call
    k.append(v)
    print('object id: ' + str(id(k)))
    print('memory: ' + str(sys.getsizeof(k)))
    print('output: ' + str(k))

mutable1()
mutable1()
mutable1()
mutable1()
mutable1()
object id: 4506089856
memory: 88
output: [10]
object id: 4506089856
memory: 88
output: [10, 10]
object id: 4506089856
memory: 88
output: [10, 10, 10]

object id: 4506089856
memory: 88
output: [10, 10, 10, 10]
object id: 4506089856
memory: 120
output: [10, 10, 10, 10, 10]

-----
def mutable2(k=None, v=10):
    # create a fresh list only when the caller did not pass one
    if k is None:
        k = []
    k.append(v)
    print('object id: ' + str(id(k)))
    print('memory: ' + str(sys.getsizeof(k)))
    print('output: ' + str(k))

mutable2()
mutable2()
mutable2()
mutable2()
mutable2()
object id: 4506095680
memory: 88
output: [10]
object id: 4506038976
memory: 88
output: [10]
object id: 4506095808
memory: 88
output: [10]
object id: 4506090880
memory: 88
output: [10]
object id: 4506095680
memory: 88
output: [10]
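
The None-default idea above can be sketched so that a list supplied by the caller is still honored. The function name append_value and its parameter order are hypothetical and chosen only for this illustration.

```python
def append_value(v=10, k=None):
    """Append v to k, creating a fresh list when the caller passes none."""
    if k is None:      # only replace the sentinel, not a caller-supplied list
        k = []
    k.append(v)
    return k


print(append_value())            # [10]
print(append_value())            # [10]  (a new list on every call)

shared = [1, 2]
print(append_value(3, shared))   # [1, 2, 3]
print(shared)                    # [1, 2, 3]  (the caller's list was extended)

empty = []
append_value(5, empty)
print(empty)                     # [5]  (a test such as `if not k` would have discarded this list)
```

Testing with `is None` rather than truthiness matters here: an empty list passed by the caller is falsy, yet it is still the object the caller expects to be filled.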
-----
def mutable3(k=set(), v=10):
    # the default set is shared, but adding an existing value changes nothing
    k.add(v)
    print('object id: ' + str(id(k)))
    print('memory: ' + str(sys.getsizeof(k)))
    print('output: ' + str(k))

mutable3()
mutable3()
mutable3()
mutable3()
mutable3()
object id: 4507180384
memory: 216
output: {10}
object id: 4507180384
memory: 216
output: {10}
object id: 4507180384
memory: 216
output: {10}
object id: 4507180384
memory: 216
output: {10}
object id: 4507180384
memory: 216
output: {10}
-----
def mutable4(k={}, v=10):
    # the default dict is shared, but the same key is overwritten each time
    k["v"] = v
    print('object id: ' + str(id(k)))
    print('memory: ' + str(sys.getsizeof(k)))
    print('output: ' + str(k))

mutable4()
mutable4()
mutable4()
mutable4()
mutable4()
object id: 4507220928
memory: 232
output: {'v': 10}
object id: 4507220928
memory: 232
output: {'v': 10}
object id: 4507220928
memory: 232
output: {'v': 10}
object id: 4507220928
memory: 232
output: {'v': 10}
object id: 4507220928
memory: 232
output: {'v': 10}

Mutable vs. Immutable and Early Return: Coding to Understand the Python Data Model

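Notice1: None
- The data type of the None object is 'NoneType'.
- None is falsy like the empty values of the other data types ('', 0, [], and so on), but it is a distinct object and is not equal to any of them.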
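
A minimal sketch of how None relates to the empty values of the other types:

```python
print(type(None).__name__)       # NoneType
print(bool(None))                # False

empties = ["", 0, False, [], (), set(), {}]

# Every empty value is falsy, but none of them is None or equal to None.
print(all(not value for value in empties))       # True
print(any(value == None for value in empties))   # False
print(any(value is None for value in empties))   # False
```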

Notice2: '==' vs 'is'
- '==' compares only values
- 'is' compares only identity (id)
- Mutable objects with equal values are still separate objects with different ids, so be cautious when you compare them with 'is'
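
The difference can be sketched as follows. The class AlwaysEqual is hypothetical; it only illustrates that the result of '==' is decided by the object, while 'is' cannot be overridden, which is why a test for None is usually written with 'is'.

```python
a = [1, 2, 3]
b = [1, 2, 3]
c = a                      # a second name for the same object

print(a == b)              # True: equal values
print(a is b)              # False: two separate list objects
print(a is c)              # True: one object, two names


class AlwaysEqual:
    """Hypothetical class whose == answers True for anything."""

    def __eq__(self, other):
        return True


odd = AlwaysEqual()
print(odd == None)         # True, although odd is clearly not None
print(odd is None)         # False: identity cannot be overridden
```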

Coding examples of None and the empty value of each data type:

A = 'A'
B = ''
C = False
D = 0
E = []
F = set()
G = tuple()
H = {}
Nothing = None

# `not x` is True for every empty (falsy) value as well as for None
if not A:
    print('A is empty') # No output
if not B:
    print('B is empty') # B is empty
if not C:
    print('C is empty') # C is empty
if not D:
    print('D is empty') # D is empty
if not E:
    print('E is empty') # E is empty
if not F:
    print('F is empty') # F is empty
if not G:
    print('G is empty') # G is empty
if not H:
    print('H is empty') # H is empty
if not Nothing:
    print('Nothing is none') # Nothing is none
------
# `x is None` is True only for None itself
if A is None:
    print('A is not empty') # No output
if B is None:
    print('B is empty') # No output
if C is None:
    print('C is empty') # No output
if D is None:
    print('D is empty') # No output
if E is None:
    print('E is empty') # No output
if F is None:
    print('F is empty') # No output
if G is None:
    print('G is empty') # No output
if H is None:
    print('H is empty') # No output
if Nothing is None:
    print('Nothing is none') # Nothing is none
------
# `x == None` gives the same results here, but `is None` is the usual idiom
if A == None:
    print('A is not empty') # No output
if B == None:
    print('B is empty') # No output
if C == None:
    print('C is empty') # No output
if D == None:
    print('D is empty') # No output
if E == None:
    print('E is empty') # No output
if F == None:
    print('F is empty') # No output
if G == None:
    print('G is empty') # No output
if H == None:
    print('H is empty') # No output
if Nothing == None:
    print('Nothing is none') # Nothing is none
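
The checks above are what make an early return reliable: a function can first return for "nothing was passed" and only then for "something empty was passed". The following is a sketch with a hypothetical helper named summarize that expects a container or None.

```python
def summarize(items=None):
    """Return a short description of items (hypothetical helper)."""
    if items is None:          # nothing was passed at all
        return "missing"
    if not items:              # passed, but empty ('', [], {}, ...)
        return "empty"
    return f"{len(items)} item(s)"


print(summarize())             # missing
print(summarize([]))           # empty
print(summarize([1, 2]))       # 2 item(s)
```

If the first test were written as `if not items`, the empty list and None would take the same branch and could no longer be told apart.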

Coding examples comparing the "==" and "is" operations for each data type:

# Mutable: even if two objects have the same value, their ids are different.
list1 = [1, 2, 3]
list2 = [1, 2, 3]
list3 = None
list4 = []
print(id(list1)) # 4375933568
print(id(list2)) # 4375934016

print(id(list3)) # 4335214224
print(id(list4)) # 4375933760
# Same values and different id
print(list1 == list2) # [1, 2, 3] == [1, 2, 3] -> True
print(list1 is list2) # [1, 2, 3] is [1, 2, 3] -> False
# Same values and same id
print(list3 == None) # None == None -> True
print(list3 is None) # None is None -> True
# Same values and different id
print(list4 == []) # [] == [] -> True
print(list4 is []) # [] is [] -> False
------
# Immutable: objects with the same value may share one id.
# CPython caches small ints such as 1, but this is an implementation detail and is not guaranteed for every immutable object.
int1 = 1
int2 = 1
int3 = None
int4 = 0
print(id(int1)) # 4335356608
print(id(int2)) # 4335356608

print(id(int3)) # 4335214224
print(id(int4)) # 4335356576
# Same values and id
print(int1 == int2) # 1 == 1 -> True
print(int1 is int2) # 1 is 1 -> True
# Same values and id
print(int3 == None) # None == None -> True
print(int3 is None) # None is None -> True
# Different value and id
print(int3 == int4) # None == 0 -> False
print(int3 is int4) # None is 0 -> False
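
The shared id of small ints is a CPython caching detail rather than a rule of the language. The sketch below builds equal immutable values at run time; the results of 'is' are what CPython typically shows and should not be relied on in real code.

```python
# Built at run time so that the compiler cannot merge the constants.
big_a = int("100000")
big_b = int("100000")
print(big_a == big_b)      # True: equal values
print(big_a is big_b)      # implementation detail; do not rely on the result

tuple_a = tuple([1, 2])
tuple_b = tuple([1, 2])
print(tuple_a == tuple_b)  # True: equal values
print(tuple_a is tuple_b)  # False in CPython: two separate tuple objects
```

For immutable objects, compare values with '==' and reserve 'is' for singletons such as None.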

I recommend that you write the above code yourself to gain a deeper understanding. These samples give insight into the principles of Python.


Tetsuya Hirata

Software engineer working mostly at the intersection of data science and engineering. @JesseTetsuya