Sequences Grouping
Container Sequences
Container sequences can hold items of different types
list, tuple, collections.depue
Flat Sequences
Flat Sequences hold items of one type
str, bytes, bytearray, memoryview, array.array
- Container sequences hold references to the objects they contain, which may be of any type
- Flat sequences physically store the value of each item, which has its own memory space
- Flat sequences are more compact than Container sequences, while it’s limited to holding primitive values like string, number, bytes
Mutable Sequences
list, bytearray, array.array, collection.deque, memoryview
Immutable Sequences
tuple, str, bytes
List
Listcomps
listcomps are short of list comprehesions, which is used to build sequences quickly. listcomps can enhance the readability of the python code.
1 | # example one |
example two is more readable than example one, because its intent is explicit and its length is short
listcomps do everything the map and filter functions do, without using the python lambda.1
2
3
4
5
6# use listcomps
symbols = 'feifeiyu'
ascii = [ord(sy) for sy in symbols if ord(sy) > 105]
# use map filter
ascii2 = list(filter(lambda sy: sy > 105, map(ord, symbols)))
Cartesian Products
Listcomps can generate lists from the cartesian products of two or more iterables. The items that make up the catesian products are tuples made from items from every input iterable.
1 | # use listcomps |
Generator Expressions
Tuples, arrays and other types of sequences can be initialized by listcomp, but a genexp (short of generator expressions) saves the memory because it yields items one by one using the iterator protocol instead of building a whole list just to feed another constructor.
Genexp use the same syntax as listcomps, but are closed in parenthesis rather than brackets.
1 | # example one |
Tuples
Tuples are not just immutable lists, they do double-duty: they can be used ad immutable lists and also as records with field namses.
Tuples as records: each item in the tuple holds the data for one field and the postion of the item gives its meaning.
If a tuple is just used ad immutable list, the quantity and the order of the items may or may not be important. But when using a tuple as collection of fields, the number of items is often fixed and their order is always vital.
1 | # examples of tuples used as records |
Tuple Unpacking
Parallel assignment is the most visible form of tuple unpacking, that is assigning items from an iterable to a tuple of variables.1
2lax_coord = (33.9425, -118.4080)
latitude, longitude = lax_coord #tuple unpacking
Swapping the values of variables without using temporary variable is also a application of tuple unpacking1
2# swap values of two variable
b, a = a, b
Another example of tuple unpacking is prefixing an argument with a star when calling a function1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16divmod(20, 8) # output (2, 4)
# use tuple
t = (20, 8)
divmod(*t) # use * to upack tuple, output (2, 4)
# output upacking
quo, rem = divmod(*t) # quo = 2, rem = 4
# Defining function parameters with *args to grab arbitrary excess arguments is a classic python feature
# dummy variable _
_, rem = divmod(*t)
# rem = 4,
# nested tuple unpacking
name, cc, pop, (lat, long) = ('Tokyo', 'JP', 36.933, (35.689, 139.691))
# restult: name = 'Tokyo', cc = 'JP', pop = 36.933, lat = 35.689, long 139.691
Sometimes we only care about certain parts of a tuple when upacking, a dummy variable like _ is used ad placeholder.
In python 3, use * to parallel assignment as well1
2
3
4
5
6a, b, *rest = range(5)
# output a = 0, b = 1, rest = [2,3,4]
a, b, *rest = range(3)
# output a = 0, b = 1, rest = [2]
a, b, *rest = range(2)
# output a = 0, b = 1, rest = []
Named tuples
The collections.namedtuple function is a factory that produces subclasses of tuple enhanced with field names and a class name1
2
3
4
5
6
7
8
9
10
11
12
13
14
15from collections import namedtuple
# Two paramters are required to create a named tuple: a class name and a list of field names
City = namedtuple('City', 'name country population coordinates')
# or City = namedtuple('City', ['name', 'country', 'population', 'coordinates'])
# field names can be given as an iterable of strings or as a single string that space-delimited string
# _fields is a tuple with the field names of the named tuple
# City._fields => ('name', 'country', 'population', 'coordinates')
tokyo = City('Tokyo', 'JP', 36.933, (35.689, 139.691))
# tokyo value: City(name='Tokyo', country='JP', population=36.933, coordinates=(35,689, 139.691))
# tokyo.name => 'Tokyo', tokyo.coordinates => (35,689, 139.691)
# _make() lets you instantiate a named tuple from a iterable; City(*tokyo)
# _asdict() returns a collections.OrderedDict built from the named tuple instance
Slicing
A common feature of list, tuple, str and all sequence types in python is the support of slicing operations.
1 | l = [1,2,3,4,5,6] |
Slice objects
To specify a stride or step in slice: s[a:b:c] a is the start postion, b is the stop postion, c is the step1
2
3
4
5
6s = 'feifeiyu'
# step = 3,
s[::3] # output => ffy
# if the step is negative, returned items is reversed
s[::-3] # output => uee
s[::-1] # output => uyiefief, reverse the string
The built-in sequence types in python are one-dimensional, so they support only one index or slice, and not a tuple of them.
The [] operator can take multiple indexes or slices separated by commonas. For instance, in the external NumPy package, where items of a 2-dimensional numpy.ndarray can be fetched using the syntax a[i, j] and a 2-dimensional slice obtained with an expresssion like a[m:n, k:l].
Assigning to Slices
Mutable sequences can be grafted, excised and otherwise modified in-place using slice notation on the left side of an assignment statement or as the target of a del statement.1
2
3
4
5l = range(10) # output => [0,1,2,3,4,5,6,7,8,9]
l[2:5] = [20, 30] # output => [0,1,20,30,5,6,7,8,9]
del l[5:7] # output => [0,1,20,30,5,8,9]
l[3::2] = [11, 22] # output => [0,1,20,11,5,22,9]
l[2:5] = [100] # outpu => [0,1,100,22,9]
Using + and * with sequences
Operator + is used to concatenate two operands that have same sequence type. And the two oprands are not modified but a new sequnces is created as result of the concatenation.
Operator * is used to concatenate multiple copies of the same sequence, multiply it by an integer. A new sequence is created as result.
1 | # operator + |
Building lists of lists
Sometimes we need to initialize a list with a certain number of nested list. The best way of doing this is with a listcomp
1 | # create a list of with 3 lists of 3 items each. |
A tempting but wrong shortcut is doing like below.1
2
3
4
5
6
7
8
9# The outer list is made of three references to the same inner list.
board = [['-'] * 3] * 3
# board => [['_', '_', '_'], ['_', '_', '_'], ['_', '_', '_']]
# before board is changed, all seems right
# modify board
board[1][2] = 'x'
# board => [['_', '_', 'x'], ['_', '_', 'x'], ['_', '_', 'x']]
# all rows are aliases referring to the same object.
Augmented assignment with sequences
The augmented assignment operators: +=, *=, -= …
The sepcial method that makes += work is iadd(for “in-place addition”), is iadd is not implemented, python fall back to calling add.
1 | # number |
Note: Repeated concatenation of immutable sequence is inefficient, because instead of just appending new items, the interpreter has to copy the whole target sequence to create a new one with the new items concatenated.
1 | # example for augmented assignment for a mutable item in tuples |
Note:
- Putting mutable items in tuple is not a good idea
- Augmented assignment is not an atomic operation
list.sort and the sorted built-in function
The list.sort method sorts a list in-place, without making a copy. It returns None to remind us that it changes the target object, and does not create a new list.
In contrast, the built-in function sorted creates a new list and returns it. sorted accepts any iterable objects as arguments, include immutable sequences and generators.
Both list.sort and sorted take two optional, keyword-only arguments: key and reverse.
- reverse: if True, the items are returned in descending order. default is False
- key: A one-argument function that will be applied to each item to produce its sorting key.
1 | # examples of sorted |
Managing ordered sequences with bisect
The bisect module offers two main functions bisect and insort, they use the binary search algorithm to quickly find an insert items in any sorted sequence
bisect
bisect(haystack, needle) does a binary search for needle in haystack which must be a sorted sequence, to locate the position where needle can be inserted while maintaining haystack in ascending order. In other words, all items appearing up to that position are less or equal to the needle.
1 | # usage of bisect |
Inserting with bisect.insort
Sorting is expensive, so once a sequence has sorted, it’s good to keep it that way.
insort(seq, item) inserts item in to seq so as to keep seq in ascending order.
1 | # usage of |
Arrays
For numbbers, array.array is more efficient than a list, it supports all mutable sequence operations(include .pop, .insert, and .extend), and additional methods for fast loading and saving such as .frombytes and .tofile.
When creating an array a typecode must be provided which determine the underlying C type used to store each item in the array.
1 | from array import array |
array.tofile and array.fromfile are easy to write or read numbers from files, and they are very fast.1
2
3
4
5
6
7
8fp = open('float.bin', 'wb')
floats.tofile(fp)
fp.close()
floats2 = array('d')
fp = open('floats.bin', 'rb')
# read 50 numbers from binary file
floats.fromfile(fp, 50)
fp.close()
Memory Views
The built-in memoryview class is a shared-memory sequence type that lets you handle slices of arrays without copying bytes.
Using notation similar to the array module, the memoryview.cast method lets you change the way multiple bytes are read or written as units without moving bits around.
1 | # build a array of 5 short signed integers (typecode 'h') |
Deques and other queues
The .append and .pop methods make a list usable as stack or a queue. But inserting and removing from the left of a list is costly because the entire list must be shifted.
The class cllections.deque is a thread-safe double-ended queue designed for fast inserting and removing from both ends.
1 | from collections import deque |
Besides deques, other python standard library packages implement queues
queue
Provides the synchronized classes Queue, LifoQueue and PriorityQueue. These are used for safe communication between threads.
multiprocessing
Implements its own bounded Queue, very similar to queue.Queue but designed for inter-process communictaion.
asyncio
Newly added to python 3.4, asyncio provides Queue, LifoQueue, PriorityQueue and JoinableQueue with APIs inspired by the class queue and multiprocessing, but adapted for managing tasks in synchronous programming
heapq
In contrast to the previous three modules, heapq does not implement a queue class, but provides functions like heappush and heappop that let you use a mutable sequence as a heap queue or priority queue