Creating structured arrays
Structured array data types can be specified in a number of ways. Earlier, we saw the dictionary method:
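As a minimal sketch of the dictionary form: the field names (name, age, weight) and the format codes below are assumed for illustration and are not taken from this section.

```python
import numpy as np

# Dictionary form: one tuple of field names, one tuple of format codes.
# 'U10' = Unicode string of at most 10 characters,
# 'i4'  = 4-byte signed integer, 'f8' = 8-byte float.
# The field names are illustrative assumptions.
dt_dict = np.dtype({'names': ('name', 'age', 'weight'),
                    'formats': ('U10', 'i4', 'f8')})

print(dt_dict)
print(dt_dict.names)
```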
For clarity, numerical types can be specified using Python types or NumPy dtypes instead:
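A sketch of the same idea with Python types and NumPy scalar types in place of string codes. The field names are again assumed. Note that the size of a plain Python int depends on the platform and NumPy version, so the example does not rely on it.

```python
import numpy as np

# Same dictionary form, but the formats are Python/NumPy types.
# (np.str_, 10) means "Unicode string of at most 10 characters".
# Field names are illustrative assumptions.
dt_types = np.dtype({'names': ('name', 'age', 'weight'),
                     'formats': ((np.str_, 10), int, np.float32)})

print(dt_types)
```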
A compound type can also be specified as a list of tuples:
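A sketch of the list-of-tuples form, where each tuple pairs a field name with its type. The names and codes are assumed for illustration.

```python
import numpy as np

# Each tuple is (field name, format code).
# 'S10' = byte string of at most 10 bytes.
dt_list = np.dtype([('name', 'S10'), ('age', 'i4'), ('weight', 'f8')])

print(dt_list)
```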
If the names of the types do not matter to you, you can specify the types alone in a comma-separated string:
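A sketch of the comma-separated string form. Because no names are given, NumPy assigns default field names f0, f1, f2, and so on. The particular format codes are assumed for illustration.

```python
import numpy as np

# Types only; NumPy generates the field names automatically.
dt_str = np.dtype('S10,i4,f8')

print(dt_str)
print(dt_str.names)
```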
The shortened string format codes may seem confusing, but they are built on simple principles. The first (optional) character is < or >, which means “little endian” or “big endian,” respectively, and specifies the byte-ordering convention for multi-byte values. The next character specifies the type of data: characters, bytes, ints, floating points, and so on (see the table below). The last character or characters represent the size of the object in bytes.
| Character | Description | Example |
|---|---|---|
| 'b' | Byte | np.dtype('b') |
| 'i' | Signed integer | np.dtype('i4') == np.int32 |
| 'u' | Unsigned integer | np.dtype('u1') == np.uint8 |
| 'f' | Floating point | np.dtype('f8') == np.float64 |
| 'c' | Complex floating point | np.dtype('c16') == np.complex128 |
| 'S', 'a' | String | np.dtype('S5') |
| 'U' | Unicode string | np.dtype('U') == np.str_ |
| 'V' | Raw data (void) | np.dtype('V') == np.void |
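To make the format codes concrete, here is a small sketch that checks the byte order and the sizes. The value 1 and the particular codes are chosen only for illustration. One caveat: for 'U' the trailing number counts characters rather than bytes, and NumPy stores each character in 4 bytes.

```python
import numpy as np

# The same integer stored with the two byte orders.
one_le = np.array([1], dtype='<i4').tobytes()  # least significant byte first
one_be = np.array([1], dtype='>i4').tobytes()  # most significant byte first
print(one_le)
print(one_be)

# For these codes the trailing number is the size in bytes.
for code in ('i4', 'u1', 'f8', 'c16', 'S5'):
    print(code, np.dtype(code).itemsize)

# For 'U' the number is a character count; each character takes 4 bytes.
print('U5', np.dtype('U5').itemsize)
```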
More on advanced compound types
It is possible to define even more advanced compound types. For example, you can create a type where each element contains an array or matrix of values. Here, we’ll create a data type with a mat component consisting of a 3×3 floating-point matrix:
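A sketch of such a type, with an id field and a 3×3 mat field. The array length of 1 and the zero initialization are assumptions made to keep the example short.

```python
import numpy as np

# Third tuple element gives the shape of the field within each record.
tp = np.dtype([('id', 'i8'), ('mat', 'f8', (3, 3))])

# One record, zero-initialized.
X = np.zeros(1, dtype=tp)

print(X[0])         # the whole record: (id, 3x3 matrix)
print(X['mat'][0])  # just the matrix of the first record
```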
Now each element in the X array consists of an id and a 3×3 matrix. Why would you use this rather than a simple multidimensional array, or perhaps a Python dictionary? The reason is that this NumPy dtype directly maps onto a C structure definition, so the buffer containing the array content can be accessed directly within an appropriately written C program. If you find yourself writing a Python interface to a legacy C or Fortran library that manipulates structured data, you’ll probably find structured arrays quite useful!
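To see what that mapping looks like, this sketch inspects the memory layout of the id/mat type. The C struct in the comment is only an assumed counterpart for illustration. By default NumPy packs fields without padding; passing align=True to np.dtype asks for C-compiler-style alignment instead.

```python
import numpy as np

# Assumed C counterpart (illustrative only):
#   struct record { int64_t id; double mat[3][3]; };
tp = np.dtype([('id', 'i8'), ('mat', 'f8', (3, 3))])

print(tp.itemsize)  # bytes per record

# Byte offset of each field within a record.
for field_name in tp.names:
    field_dtype, offset = tp.fields[field_name][:2]
    print(field_name, field_dtype, offset)

# The raw buffer a C program would receive for two records.
raw = np.zeros(2, dtype=tp).tobytes()
print(len(raw))
```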
RecordArrays: turbocharged structured arrays
NumPy also provides the np.recarray class, which is almost identical to the structured arrays just described, but with one additional feature: fields can be accessed as attributes rather than as dictionary keys. Recall that we previously accessed the ages by writing:
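A sketch of that access by key. The sample names, ages, and weights are hypothetical values made up for this example, since the original array is not shown in this section.

```python
import numpy as np

# Hypothetical sample data, for illustration only.
name = ['Alice', 'Bob', 'Cathy', 'Doug']
age = [25, 45, 37, 19]
weight = [55.0, 85.5, 68.0, 61.5]

# Build an empty structured array, then fill it field by field.
data = np.zeros(4, dtype={'names': ('name', 'age', 'weight'),
                          'formats': ('U10', 'i4', 'f8')})
data['name'] = name
data['age'] = age
data['weight'] = weight

# Field access by dictionary-style key.
print(data['age'])
```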
I had to rebuild the array, of course 😉
If we view our data as a record array instead, we can access this with slightly fewer keystrokes:
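A sketch of the record-array view, using a small hypothetical data array so the example runs on its own.

```python
import numpy as np

# Hypothetical sample data, for illustration only.
data = np.array([('Alice', 25, 55.0), ('Bob', 45, 85.5)],
                dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f8')])

# View the same memory as a record array (no copy is made).
data_rec = data.view(np.recarray)

# Fields are now available as attributes.
print(data_rec.age)
```

The key-based syntax still works on the view, so data_rec['age'] returns the same values.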
The downside is that for record arrays, there is some extra overhead involved in accessing the fields, even when using the same syntax. We can see this here:
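A rough way to measure this with the standard timeit module. The data array is hypothetical and the repeat count is arbitrary. Absolute timings vary by machine and NumPy version, so no particular numbers are claimed here.

```python
import timeit
import numpy as np

# Hypothetical sample data, for illustration only.
data = np.array([('Alice', 25, 55.0), ('Bob', 45, 85.5)],
                dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f8')])
data_rec = data.view(np.recarray)

n = 20_000  # arbitrary repeat count

# Total seconds for n field accesses of each kind.
t_struct = timeit.timeit(lambda: data['age'], number=n)
t_rec_key = timeit.timeit(lambda: data_rec['age'], number=n)
t_rec_attr = timeit.timeit(lambda: data_rec.age, number=n)

print(f"structured, key access:  {t_struct / n * 1e6:.3f} µs per access")
print(f"recarray, key access:    {t_rec_key / n * 1e6:.3f} µs per access")
print(f"recarray, attr access:   {t_rec_attr / n * 1e6:.3f} µs per access")
```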
Whether the more convenient notation is worth the additional overhead will depend on your own application.
But there's Pandas
This section on structured and record arrays is purposely at the end of this chapter, because it leads so well into the next package we will cover: Pandas. Structured arrays like the ones discussed here are good to know about for certain situations, especially in case you’re using NumPy arrays to map onto binary data formats in C, Fortran, or another language. For day-to-day use of structured data, the Pandas package is a much better choice, and we’ll dive into a full discussion of it in the chapter that follows.
OK Jake; we'll wait for Pandas 😀