
Python comes with a collection of built-in data types that make common data-wrangling operations easy. Among them is the list, a simple but versatile collection type.
With a Python list, you can group Python objects together in a one-dimensional row that allows objects to be accessed by position, added, removed, sorted, and subdivided.
Python list basics
Defining a list in Python is easy — just use the bracket syntax to indicate items in a list.
list_of_ints = [1, 2, 3]
Items in a list do not all have to be the same type, either. They can be any Python object. (Here, assume TWO is a previously defined variable and Three is a function.)
list_of_objects = ["One", TWO, Three, {"Four":4}, None]
Note that having mixed objects in a list can have implications for sorting the list. We’ll go into this later.
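For readers who want to run the mixed-type example, here is a minimal sketch. The definitions of TWO and Three are hypothetical stand-ins, since the article only says that Three is a function.

```python
# Hypothetical stand-ins: the article does not define TWO or Three.
TWO = 2

def Three():
    return 3

list_of_objects = ["One", TWO, Three, {"Four": 4}, None]

# Collect the type name of each item to show that the list holds mixed objects.
type_names = [type(item).__name__ for item in list_of_objects]
print(type_names)  # ['str', 'int', 'function', 'dict', 'NoneType']
```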
The biggest reason to use a list is to be able to find objects by their position in the list. To do this, you use Python’s index notation: a number in brackets, starting at 0, that indicates the position of the item in the list.
For the above example, list_of_ints[0] yields 1. list_of_ints[1] yields 2. list_of_objects[4] would be the None object.
Python list indexing
If you use a positive integer for the index, the integer indicates the position of the item to look for. But if you use a negative integer, then the integer indicates the position starting from the end of the list. For example, using an index of -1 is a handy way to grab the last item from a list no matter the size of the list.
list_of_ints[-1] yields 3. list_of_objects[-1] yields None.
You can also use an integer variable as your index. If x=0, list_of_ints[x] yields 1, and so on.
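The indexing rules above can be sketched in a few lines. This reuses list_of_ints from earlier; the out-of-range check at the end is an extra illustration that the article does not cover.

```python
list_of_ints = [1, 2, 3]

first = list_of_ints[0]            # position 0 is the first item
last = list_of_ints[-1]            # -1 counts back from the end
second_to_last = list_of_ints[-2]

x = 0
via_variable = list_of_ints[x]     # an integer variable works as an index

# An index past either end of the list raises IndexError.
try:
    list_of_ints[3]
    out_of_range = False
except IndexError:
    out_of_range = True

print(first, last, second_to_last, via_variable, out_of_range)  # 1 3 2 1 True
```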
Adding and removing Python list items
Python has several ways you can add or remove items from a list.
- .append() inserts an item at the end of the list. For example, list_of_ints.append(4) would turn list_of_ints into the list [1,2,3,4]. Appends are fast and efficient; it takes about the same amount of time to append one item to a list no matter how long the list is.
- .pop() removes and returns the last item from the list. If we ran x = list_of_ints.pop() on the original list_of_ints, x would contain the value 3. (You don’t have to assign the results of .pop() to a value, though, if you don’t need it.) .pop() operations are also fast and efficient.
- .insert() inserts an item at some arbitrary position in the list. For example, list_of_ints.insert(0,10) would turn list_of_ints into [10,1,2,3]. Note that the closer you insert to the front of the list, the slower this operation will be, though you won’t see much of a slowdown unless your list has many thousands of elements or you’re doing the inserts in a tight loop.
- .pop(x) removes the item at the index x. So list_of_ints.pop(0) would remove the item at index 0. Again, the closer you are to the front of the list, the slower this operation can be.
- .remove(item) removes an item from a list, but not based on its index. Rather, .remove() removes the first occurrence of the object you specify, searching from the top of the list down. For [3,7,7,9,8].remove(7), the first 7 would be removed, resulting in the list [3,7,9,8]. This operation too can slow down for a large list, since it theoretically has to traverse the entire list to work.
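The methods above can be tried together in a short sketch. The list named numbers and the ValueError check are illustrative additions; .remove() raises ValueError when the item is not in the list.

```python
list_of_ints = [1, 2, 3]

list_of_ints.append(4)         # list is now [1, 2, 3, 4]
x = list_of_ints.pop()         # removes and returns 4; list is [1, 2, 3] again

list_of_ints.insert(0, 10)     # list is now [10, 1, 2, 3]
front = list_of_ints.pop(0)    # removes and returns 10; list is [1, 2, 3]

numbers = [3, 7, 7, 9, 8]
numbers.remove(7)              # only the first 7 is removed: [3, 7, 9, 8]

# Asking .remove() for an item that is not present raises ValueError.
try:
    numbers.remove(100)
    missing = False
except ValueError:
    missing = True

print(list_of_ints, x, front, numbers, missing)
```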
Slicing a Python list
Lists can be divided up into new lists, a process called slicing. Python’s slice syntax lets you specify which part of a list to carve off, and how to manipulate the carved-off portion.
You saw above how to use the bracket notation to get a single item from a list: my_list[2], for example. Slices use a variant of the same index notation (and following the same indexing rules): list_object[start:stop:step].
- start is the position in the list to start the slice.
- stop is the position in the list where we stop slicing. In other words, that position and everything after it is omitted.
- step is an optional “every nth element” indicator for the slice. By default this is 1, so the slice retains every element from the list it’s slicing from. Set step to 2, and you’ll select every second element, and so on.
Here are some examples. Consider this list:
slice_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
slice_list[0:5]  # yields [1, 2, 3, 4, 5]
(Note that we’re stopping at index 4, not index 5!)
slice_list[0:5:2]  # yields [1, 3, 5]
If you omit a particular slice index, Python assumes a default. Leave off the start index, and Python assumes the start of the list:
slice_list[:5]  # yields [1, 2, 3, 4, 5]
Leave off the stop index, and Python assumes the end of the list:
slice_list[4:]  # yields [5, 6, 7, 8, 9, 0]
The step element can also be negative. This lets us take slices that are reversed copies of the original:
slice_list[::-1]  # yields [0, 9, 8, 7, 6, 5, 4, 3, 2, 1]
Note that you can slice in reverse by using start and stop indexes that go backwards, not forwards:
slice_list[5:2:-1]  # yields [6, 5, 4]
Also keep in mind that slices of lists are shallow copies of the original list. The original list remains unchanged.
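To see the copy behaviour in practice, here is a small sketch. One caveat the example makes visible: the copy is shallow, so objects nested inside the list are shared rather than duplicated. The nested list is a made-up example.

```python
slice_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]

first_five = slice_list[0:5]
first_five[0] = 100             # changes only the copy
print(first_five)               # [100, 2, 3, 4, 5]
print(slice_list[0])            # 1, so the original is untouched

last_three = slice_list[-3:]    # negative indexes work in slices too
print(last_three)               # [8, 9, 0]

# Shallow copy: the inner lists are the same objects in both outer lists.
nested = [[1, 2], [3, 4]]       # hypothetical nested list
nested_copy = nested[:]
nested_copy[0].append(99)
print(nested[0])                # [1, 2, 99]
```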
Sorting a Python list
Python provides two ways to sort lists: You can generate a new, sorted list from the old one, or you can sort an existing list in-place. These options have different behaviours and different usage scenarios.
To create a new, sorted list, use the sorted() function on the old list:
new_list = sorted(old_list)
This will sort the contents of the list using Python’s default sorting methods. For strings, the default is alphabetical order (strictly, Unicode code point order, so uppercase letters sort before lowercase ones); for numbers, it’s ascending values. Note that the contents of the list need to be comparable with each other for this to work. For instance, you can sort a list that is all integers or all strings, but if you try to sort a mix of integers and strings, you’ll get a TypeError in the sort operation.
If you want to sort a list in reverse, pass the reverse parameter:
new_list = sorted(old_list, reverse=True)
The other way to sort, in-place sorting, performs the sort operation directly on the original list. To do this, use the list’s .sort() method:
old_list.sort()
.sort() also takes reverse as a parameter, allowing you to sort in reverse.
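The difference between the two approaches is easy to miss, so here is a brief sketch. It also shows that .sort() returns None, and that sorting mixed integers and strings raises TypeError. The sample values are made up.

```python
old_list = [3, 1, 2]

new_list = sorted(old_list)                   # builds a new list
descending = sorted(old_list, reverse=True)
print(old_list)                               # [3, 1, 2], still in original order

result = old_list.sort()                      # sorts in place
print(old_list, result)                       # [1, 2, 3] None

# Integers and strings cannot be compared with each other by default.
try:
    sorted([1, "2", 3])
    mixed_failed = False
except TypeError:
    mixed_failed = True
```

Because .sort() returns None, writing old_list = old_list.sort() would discard the list, which is a common slip.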
Both sorted() and .sort() also take a key parameter. The key parameter lets you provide a function that can be used to perform a custom sorting operation. When the list is sorted, each element is passed to the key function, and the resulting value is used for sorting. For instance, if we had a mix of integers and strings, and we wanted to sort them, we could use key like this:
mixed_list = [1, "2", 3, "4", None]

def sort_mixed(item):
    try:
        return int(item)
    except (TypeError, ValueError):
        # Items with no integer value sort as 0
        return 0

sorted_list = sorted(mixed_list, key=sort_mixed)
print(sorted_list)
Note that this code wouldn’t convert each element of the list into an integer! Rather, it would use the integer value of each item as its sort value. Also note how we use a try/except block to trap any values that don’t translate cleanly into an integer, and return 0 for them by default.
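A key function does not have to be one you write yourself. As a further sketch, built-in callables such as len and str.lower can serve as keys; the list named words is a made-up example.

```python
words = ["banana", "Cherry", "apple", "fig"]

by_default = sorted(words)                # uppercase sorts before lowercase
by_lower = sorted(words, key=str.lower)   # case-insensitive ordering
by_length = sorted(words, key=len)        # shortest first; ties keep original order

print(by_default)   # ['Cherry', 'apple', 'banana', 'fig']
print(by_lower)     # ['apple', 'banana', 'Cherry', 'fig']
print(by_length)    # ['fig', 'apple', 'banana', 'Cherry']
```

In each case the items come back unchanged; the key only decides their order.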
Python lists are not arrays
One important thing to know about lists in Python is that they aren’t “arrays.” Other languages, like C, have one-dimensional or multi-dimensional constructions called arrays that accept values of a single type. Lists are heterogeneous; they can accept objects of any type.
What’s more, there is a separate array type in Python. The Python array is designed to emulate the behaviour of an array in C, and it’s meant chiefly to allow Python to work with C arrays. The array type is useful in those cases, but in almost every pure-Python case you’ll want to use lists.
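For comparison, here is a minimal sketch of the array type from the standard library’s array module. The type code "i" (signed C int) and the sample values are choices made for this illustration.

```python
from array import array

int_array = array("i", [1, 2, 3])   # "i" means every item is a signed C int
int_array.append(4)

# Unlike a list, an array rejects values of the wrong type.
try:
    int_array.append("five")
    rejected = False
except TypeError:
    rejected = True

as_list = int_array.tolist()        # convert back to an ordinary list
print(as_list, rejected)            # [1, 2, 3, 4] True
```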
When to use Python lists (and when not to)
So when are Python lists most useful? A list is best when:
- You need to find things quickly by their position in a collection. Accessing any position in a list takes the same amount of time, so there is no performance penalty for looking up even the millionth item in a list.
- You’re adding and removing to the collection mainly by appending to the end or removing from the end, in the manner of a stack. Again, these operations take the same amount of time regardless of the length of the list.
A Python list is less suitable when:
- You want to find an item in a list, but you don’t know its position. You can do this with the .index() method. For instance, you could use list_of_ints.index(1) to find the index of the first occurrence of the number 1 in list_of_ints. Speed should not be an issue if your list is only a few items long, but for lists thousands of items long, it means Python may have to search the entire list. For a scenario like this, use a dictionary, where each item can be found using a key, and where the lookup time will be the same for each value.
- You want to add or remove items from any position but the end. Each time you do this, Python must move every other item after the added or removed item. The longer the list, the greater the performance issue this becomes. Python’s deque object (from the collections module) is a better fit if you want to add or remove objects freely from either the start or the end of the list.
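The two alternatives mentioned above might look like this in practice. This is only a sketch: the positions dictionary is a hypothetical lookup table, and it assumes the values in the list are unique.

```python
from collections import deque

list_of_ints = [1, 2, 3]

# .index() scans from the start of the list until it finds a match.
position = list_of_ints.index(1)

# Hypothetical lookup table: build it once, then look up positions by key.
# Assumes the values are unique; duplicates would keep only the last index.
positions = {value: i for i, value in enumerate(list_of_ints)}
position_of_three = positions[3]

# A deque supports cheap additions and removals at both ends.
queue = deque(list_of_ints)
queue.appendleft(0)       # deque([0, 1, 2, 3])
queue.append(4)           # deque([0, 1, 2, 3, 4])
left = queue.popleft()    # removes and returns 0
right = queue.pop()       # removes and returns 4

print(position, position_of_three, left, right, list(queue))  # 0 2 0 4 [1, 2, 3]
```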