Up to this point we have been concerned mainly with tools to access and operate on array data with NumPy. This section covers algorithms related to sorting values in NumPy arrays. These algorithms are a favorite topic in introductory computer science courses: if you’ve ever taken one, you probably have had dreams (or, depending on your temperament, nightmares) about insertion sorts, selection sorts, merge sorts, quick sorts, bubble sorts, and many, many more. All are means of accomplishing a similar task: sorting the values in a list or array.
For example, a simple selection sort repeatedly finds the minimum value from a list, and makes swaps until the list is sorted. We can code this in just a few lines of Python:
As any first-year computer science major will tell you, the selection sort is useful for its simplicity, but is much too slow to be useful for larger arrays. For a list of
N values, it requires
N loops, each of which does on order
∼N comparisons to find the swap value. In terms of the “big-O” notation often used to characterize these algorithms (see Big-O Notation [prossimamente]), selection sort averages
O[N2]: if you double the number of items in the list, the execution time will go up by about a factor of four.
Even selection sort, though, is much better than my all-time favorite sorting algorithms, the
This silly sorting method relies on pure chance: it repeatedly applies a random shuffling of the array until the result happens to be sorted. With an average scaling of
N factorial) this should–quite obviously–never be used for any real computation.
Fortunately, Python contains built-in sorting algorithms that are much more efficient than either of the simplistic algorithms just shown. We’ll start by looking at the Python built-ins, and then take a look at the routines included in NumPy and optimized for NumPy arrays.
Ordinamenti veloci in NumPy:
Although Python has built-in sort and sorted functions to work with lists, we won’t discuss them here because NumPy’s
np.sort function turns out to be much more efficient and useful for our purposes. By default
np.sort uses an
quicksort algorithm, though
heapsort are also available. For most applications, the default
quicksort is more than sufficient.
To return a sorted version of the array without modifying the input, you can use
If you prefer to sort the array in-place, you can instead use the
sort method of arrays:
A related function is
argsort, which instead returns the indices of the sorted elements:
The first element of this result gives the index of the smallest element, the second value gives the index of the second smallest, and so on. These indices can then be used (via fancy indexing) to construct the sorted array if desired:
Ordinare per righe o colonne
A useful feature of NumPy’s sorting algorithms is the ability to sort along specific rows or columns of a multidimensional array using the axis argument. For example:
Keep in mind that this treats each row or column as an independent array, and any relationships between the row or column values will be lost!
Pausa 😊 poi continerà.