
NumPy – 14 – The basics of NumPy arrays – 3


Continuing from here, copying from here.

Reshaping arrays
Another useful type of operation is reshaping of arrays. The most flexible way of doing this is with the reshape method. For example, if you want to put the numbers 1 through 9 in a 3×3 grid, you can do the following:
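A minimal sketch of the reshape just described; the variable name grid is my own choice, not necessarily the one used in the book.

```python
import numpy as np

# Put the numbers 1 through 9 into a 3x3 grid
grid = np.arange(1, 10).reshape((3, 3))
print(grid)
# [[1 2 3]
#  [4 5 6]
#  [7 8 9]]
```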


Note that for this to work, the size of the initial array must match the size of the reshaped array. Where possible, the reshape method will use a no-copy view of the initial array, but with non-contiguous memory buffers this is not always the case.

Another common reshaping pattern is the conversion of a one-dimensional array into a two-dimensional row or column matrix. This can be done with the reshape method, or more easily done by making use of the newaxis keyword within a slice operation:
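Here is a small sketch comparing the two approaches on a three-element array; the names row_a, row_b, col_a and col_b are only illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])

row_a = x.reshape((1, 3))   # row vector via reshape
row_b = x[np.newaxis, :]    # row vector via newaxis
col_a = x.reshape((3, 1))   # column vector via reshape
col_b = x[:, np.newaxis]    # column vector via newaxis

print(row_b.shape)  # (1, 3)
print(col_b.shape)  # (3, 1)
```

Both spellings give the same result, so the choice is mostly a matter of taste.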


We will see this type of transformation often throughout the remainder of the book. Worth remembering; personally I'd lean toward the other version.

Concatenating and splitting arrays
All of the preceding routines worked on single arrays. It’s also possible to combine multiple arrays into one, and to conversely split a single array into multiple arrays. We’ll take a look at those operations here.

Concatenating arrays
Concatenation, or joining of two arrays in NumPy, is primarily accomplished using the routines np.concatenate, np.vstack, and np.hstack. np.concatenate takes a tuple or list of arrays as its first argument, as we can see here:


You can also concatenate more than two arrays at once:


It can also be used for two-dimensional arrays:
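The three uses of np.concatenate mentioned above could look roughly like this; the sample values are made up for illustration.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([3, 2, 1])
z = np.array([99, 99, 99])

two = np.concatenate([x, y])       # two one-dimensional arrays
print(two)                         # [1 2 3 3 2 1]
three = np.concatenate([x, y, z])  # more than two arrays at once

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
rows = np.concatenate([grid, grid])          # along axis 0: shape (4, 3)
cols = np.concatenate([grid, grid], axis=1)  # along axis 1: shape (2, 6)
```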


For working with arrays of mixed dimensions, it can be clearer to use the np.vstack (vertical stack) and np.hstack (horizontal stack) functions:
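A short sketch of both stacking functions on arrays of mixed dimensions; the values are arbitrary.

```python
import numpy as np

x = np.array([1, 2, 3])
grid = np.array([[9, 8, 7],
                 [6, 5, 4]])

# Vertical stack: the 1-D array becomes a new first row
stacked_v = np.vstack([x, grid])

# Horizontal stack: the column array is appended on the right
y = np.array([[99],
              [99]])
stacked_h = np.hstack([grid, y])

print(stacked_v.shape)  # (3, 3)
print(stacked_h.shape)  # (2, 4)
```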


Similarly, np.dstack will stack arrays along the third axis.

Splitting arrays
The opposite of concatenation is splitting, which is implemented by the functions np.split, np.hsplit, and np.vsplit. For each of these, we can pass a list of indices giving the split points:


Notice that N split points lead to N + 1 subarrays. The related functions np.hsplit and np.vsplit are similar:
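A possible sketch of the three splitting functions; the input values and the result names are my own.

```python
import numpy as np

x = [1, 2, 3, 99, 99, 3, 2, 1]
# Two split points (indices 3 and 5) give three subarrays
x1, x2, x3 = np.split(x, [3, 5])
print(x1, x2, x3)  # [1 2 3] [99 99] [3 2 1]

grid = np.arange(16).reshape((4, 4))
upper, lower = np.vsplit(grid, [2])  # rows 0-1 and rows 2-3
left, right = np.hsplit(grid, [2])   # columns 0-1 and columns 2-3
```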


Similarly, np.dsplit will split arrays along the third axis.


NumPy – 13 – The basics of NumPy arrays – 2


Continuing from the previous post, still copying from here.

Slicing arrays: building subarrays
Just as we can use square brackets to access individual array elements, we can also use them to access subarrays with the slice notation, marked by the colon (:) character. The NumPy slicing syntax follows that of the standard Python list; to access a slice of an array x, use this:
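The general form is x[start:stop:step]; any value left out falls back to start=0, stop=size of the dimension, step=1. A few illustrative slices on a small array:

```python
import numpy as np

x = np.arange(10)

first_five = x[:5]     # [0 1 2 3 4]
after_five = x[5:]     # [5 6 7 8 9]
middle = x[4:7]        # [4 5 6]
every_other = x[::2]   # [0 2 4 6 8]
from_one = x[1::2]     # [1 3 5 7 9]
```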



A potentially confusing case is when the step value is negative. In this case, the defaults for start and stop are swapped. This becomes a convenient way to reverse an array:
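For instance, with a negative step (a sketch on the same kind of small array):

```python
import numpy as np

x = np.arange(10)

reversed_x = x[::-1]     # all elements, reversed
from_five = x[5::-2]     # every other element, going backwards from index 5

print(reversed_x)  # [9 8 7 6 5 4 3 2 1 0]
print(from_five)   # [5 3 1]
```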


Multidimensional subarrays
Multi-dimensional slices work in the same way, with multiple slices separated by commas.


Finally, subarray dimensions can even be reversed together:
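A sketch of multidimensional slicing, including the reversal of both dimensions at once; the array x2 is a made-up example.

```python
import numpy as np

x2 = np.array([[12, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

top_left = x2[:2, :3]      # two rows, three columns
alt_cols = x2[:3, ::2]     # all rows, every other column
flipped = x2[::-1, ::-1]   # rows and columns both reversed

print(flipped)
# [[ 7  7  6  1]
#  [ 8  8  6  7]
#  [ 4  2  5 12]]
```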


Accessing array rows and columns
One commonly needed routine is accessing of single rows or columns of an array. This can be done by combining indexing and slicing, using an empty slice marked by a single colon (:):


In the case of row access, the empty slice can be omitted for a more compact syntax:
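Roughly, on a made-up two-dimensional array:

```python
import numpy as np

x2 = np.array([[12, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

first_col = x2[:, 0]     # first column: [12  7  1]
first_row = x2[0, :]     # first row:    [12  5  2  4]
first_row_short = x2[0]  # same row, with the empty slice omitted
```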


Subarrays as no-copy views
One important –and extremely useful– thing to know about array slices is that they return views rather than copies of the array data. This is one area in which NumPy array slicing differs from Python list slicing: in lists, slices will be copies. Consider our two-dimensional array from before:


Now if we modify this subarray, we’ll see that the original array is changed! Observe:
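A small sketch of this behavior; the array and the value 99 are arbitrary, and np.shares_memory is used here only to confirm that the slice is a view.

```python
import numpy as np

x2 = np.array([[12, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

x2_sub = x2[:2, :2]   # a 2x2 view, not a copy
x2_sub[0, 0] = 99     # modify the view...

print(x2[0, 0])                      # 99: ...and the original changes too
print(np.shares_memory(x2, x2_sub))  # True
```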


This default behavior is actually quite useful: it means that when we work with large datasets, we can access and process pieces of these datasets without the need to copy the underlying data buffer.

Creating copies of arrays
Despite the nice features of array views, it is sometimes useful to instead explicitly copy the data within an array or a subarray. This can be most easily done with the copy() method:


If we now modify this subarray, the original array is not touched:
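The same experiment with an explicit copy, again on a made-up array:

```python
import numpy as np

x2 = np.array([[12, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

x2_sub_copy = x2[:2, :2].copy()  # independent copy of the 2x2 block
x2_sub_copy[0, 0] = 42           # modify the copy only

print(x2[0, 0])  # 12: the original is untouched
```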


To be continued 😀

NumPy – 12 – The basics of NumPy arrays – 1


Copying from here, continuing from here.

Data manipulation in Python is nearly synonymous with NumPy array manipulation: even newer tools like Pandas (Chapter 3 [coming soon]) are built around the NumPy array. This section will present several examples of using NumPy array manipulation to access data and subarrays, and to split, reshape, and join the arrays. While the types of operations shown here may seem a bit dry and pedantic, they comprise the building blocks of many other examples used throughout the book. Get to know them well!

We’ll cover a few categories of basic array manipulations here:

  • Attributes of arrays: Determining the size, shape, memory consumption, and data types of arrays
  • Indexing of arrays: Getting and setting the value of individual array elements
  • Slicing of arrays: Getting and setting smaller subarrays within a larger array
  • Reshaping of arrays: Changing the shape of a given array
  • Joining and splitting of arrays: Combining multiple arrays into one, and splitting one array into many

NumPy array attributes
First let’s discuss some useful array attributes. We’ll start by defining three random arrays, a one-dimensional, two-dimensional, and three-dimensional array. We’ll use NumPy’s random number generator, which we will seed with a set value in order to ensure that the same random arrays are generated each time this code is run:


Each array has attributes ndim (the number of dimensions), shape (the size of each dimension), and size (the total size of the array):


Another useful attribute is the dtype, the data type of the array (which we discussed previously [previous post]):


Other attributes include itemsize, which lists the size (in bytes) of each array element, and nbytes, which lists the total size (in bytes) of the array:
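The attributes discussed in this section can be tried out with a sketch like the following. The seed value 0 and the array sizes are my own assumptions; the dtype (and therefore itemsize and nbytes) depends on the platform, so no fixed values are shown for those.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the same arrays come out on every run

x1 = np.random.randint(10, size=6)          # one-dimensional
x2 = np.random.randint(10, size=(3, 4))     # two-dimensional
x3 = np.random.randint(10, size=(3, 4, 5))  # three-dimensional

print("x3 ndim: ", x3.ndim)    # 3
print("x3 shape:", x3.shape)   # (3, 4, 5)
print("x3 size: ", x3.size)    # 60

print("dtype:   ", x3.dtype)               # platform-dependent integer type
print("itemsize:", x3.itemsize, "bytes")   # bytes per element
print("nbytes:  ", x3.nbytes, "bytes")     # equals itemsize * size
```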


Array indexing: accessing single elements
If you are familiar with Python’s standard list indexing, indexing in NumPy will feel quite familiar. In a one-dimensional array, the ith value (counting from zero) can be accessed by specifying the desired index in square brackets, just as with Python lists:


To index from the end of the array, you can use negative indices:


In a multi-dimensional array, items can be accessed using a comma-separated tuple of indices:


Values can also be modified using any of the above index notation:
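The indexing forms described in this section, sketched on two small made-up arrays:

```python
import numpy as np

x1 = np.array([5, 0, 3, 3, 7, 9])
print(x1[0])    # 5  (first element)
print(x1[4])    # 7
print(x1[-1])   # 9  (last element)
print(x1[-2])   # 7  (second to last)

x2 = np.array([[3, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])
print(x2[0, 0])   # 3
print(x2[2, 0])   # 1
print(x2[2, -1])  # 7

x2[0, 0] = 12     # the same notation also sets values
```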


Keep in mind that, unlike Python lists, NumPy arrays have a fixed type. This means, for example, that if you attempt to insert a floating-point value to an integer array, the value will be silently truncated. Don’t be caught unaware by this behavior!
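A quick sketch of the truncation, using an arbitrary integer array:

```python
import numpy as np

x1 = np.array([5, 0, 3, 3, 7, 9])  # integer array
x1[0] = 3.14159                    # the fractional part is silently dropped

print(x1[0])  # 3
```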


There's still a long way to go; time for a break 😉


NumPy – 11 – Understanding data types in Python


Continuing from here, copying from here.

A theoretical post; I suspect that to make any sense of it when I come back to it I'll have to copy everything, or at least a good deal 😙

Effective data-driven science and computation requires understanding how data is stored and manipulated. This section outlines and contrasts how arrays of data are handled in the Python language itself, and how NumPy improves on this. Understanding this difference is fundamental to understanding much of the material throughout the rest of the book.

Users of Python are often drawn in by its ease of use, one piece of which is dynamic typing. While a statically-typed language like C or Java requires each variable to be explicitly declared, a dynamically-typed language like Python skips this specification. For example, in C you might specify a particular operation as follows:

/* C code */
int result = 0;
for(int i=0; i<100; i++){
    result += i;
}

While in Python the equivalent operation could be written this way:

# Python code
result = 0
for i in range(100):
    result += i

Notice the main difference: in C, the data types of each variable are explicitly declared, while in Python the types are dynamically inferred. This means, for example, that we can assign any kind of data to any variable:

# Python code
x = 4
x = "four"

Here we’ve switched the contents of x from an integer to a string. The same thing in C would lead (depending on compiler settings) to a compilation error or other unintended consequences:

/* C code */
int x = 4;
x = "four";  // FAILS

This sort of flexibility is one piece that makes Python and other dynamically-typed languages convenient and easy to use. Understanding how this works is an important piece of learning to analyze data efficiently and effectively with Python. But what this type-flexibility also points to is the fact that Python variables are more than just their value; they also contain extra information about the type of the value. We’ll explore this more in the sections that follow.

A Python integer is more than just an integer
The standard Python implementation is written in C. This means that every Python object is simply a cleverly-disguised C structure, which contains not only its value, but other information as well. For example, when we define an integer in Python, such as x = 10000, x is not just a “raw” integer. It’s actually a pointer to a compound C structure, which contains several values. Looking through the Python 3.4 source code, we find that the integer (long) type definition effectively looks like this (once the C macros are expanded):

struct _longobject {
    long ob_refcnt;
    PyTypeObject *ob_type;
    size_t ob_size;
    long ob_digit[1];
};

A single integer in Python 3.4 actually contains four pieces:

  • ob_refcnt, a reference count that helps Python silently handle memory allocation and deallocation
  • ob_type, which encodes the type of the variable
  • ob_size, which specifies the size of the following data members
  • ob_digit, which contains the actual integer value that we expect the Python variable to represent.

This means that there is some overhead in storing an integer in Python as compared to an integer in a compiled language like C, as illustrated in the following figure:


Here PyObject_HEAD is the part of the structure containing the reference count, type code, and other pieces mentioned before.

Notice the difference here: a C integer is essentially a label for a position in memory whose bytes encode an integer value. A Python integer is a pointer to a position in memory containing all the Python object information, including the bytes that contain the integer value. This extra information in the Python integer structure is what allows Python to be coded so freely and dynamically. All this additional information in Python types comes at a cost, however, which becomes especially apparent in structures that combine many of these objects.

A Python list is more than just a list
Let’s consider now what happens when we use a Python data structure that holds many Python objects. The standard mutable multi-element container in Python is the list. We can create a list of integers as follows:


Because of Python’s dynamic typing, we can even create heterogeneous lists:
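Both kinds of list mentioned above, a list of integers and a heterogeneous list, might look like this; the names L, L2 and L3 are only illustrative.

```python
# A list of integers
L = list(range(10))
print(type(L[0]))   # <class 'int'>

# A list of strings built from it
L2 = [str(c) for c in L]
print(type(L2[0]))  # <class 'str'>

# A heterogeneous list: every item carries its own type information
L3 = [True, "2", 3.0, 4]
item_types = [type(item) for item in L3]
print(item_types)
# [<class 'bool'>, <class 'str'>, <class 'float'>, <class 'int'>]
```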


But this flexibility comes at a cost: to allow these flexible types, each item in the list must contain its own type info, reference count, and other information –that is, each item is a complete Python object. In the special case that all variables are of the same type, much of this information is redundant: it can be much more efficient to store data in a fixed-type array. The difference between a dynamic-type list and a fixed-type (NumPy-style) array is illustrated in the following figure:


At the implementation level, the array essentially contains a single pointer to one contiguous block of data. The Python list, on the other hand, contains a pointer to a block of pointers, each of which in turn points to a full Python object like the Python integer we saw earlier. Again, the advantage of the list is flexibility: because each list element is a full structure containing both data and type information, the list can be filled with data of any desired type. Fixed-type NumPy-style arrays lack this flexibility, but are much more efficient for storing and manipulating data.

Fixed-type arrays in Python
Python offers several different options for storing data in efficient, fixed-type data buffers. The built-in array module can be used to create dense arrays of a uniform type:
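A minimal sketch with the standard-library array module; the names L and A are my own.

```python
import array

L = list(range(10))
A = array.array('i', L)  # 'i' is the type code for signed integers
print(A)
# array('i', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
```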


Here 'i' is a type code indicating the contents are integers.
Much more useful, however, is the ndarray object of the NumPy package. While Python’s array object provides efficient storage of array-based data, NumPy adds to this efficient operations on that data. We will explore these operations in later sections; here we’ll demonstrate several ways of creating a NumPy array.
We’ll start with the standard NumPy import, under the alias np:


Creating arrays from Python lists
First, we can use np.array to create arrays from Python lists:


Remember that unlike Python lists, NumPy is constrained to arrays that all contain the same type. If types do not match, NumPy will upcast if possible (here, integers are up-cast to floating point):


If we want to explicitly set the data type of the resulting array, we can use the dtype keyword:


Finally, unlike Python lists, NumPy arrays can explicitly be multi-dimensional; here’s one way of initializing a multidimensional array using a list of lists:
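The four cases described in this section (plain list, upcasting, explicit dtype, list of lists) can be sketched as follows; the values are arbitrary.

```python
import numpy as np

ints = np.array([1, 4, 2, 5, 3])                      # integer array
mixed = np.array([3.14, 4, 2, 3])                     # integers upcast to float
floats32 = np.array([1, 2, 3, 4], dtype='float32')    # explicit data type

# Nested sequences give a multidimensional array
nested = np.array([range(i, i + 3) for i in [2, 4, 6]])
print(nested)
# [[2 3 4]
#  [4 5 6]
#  [6 7 8]]
```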


The inner lists are treated as rows of the resulting two-dimensional array.

Creating arrays from scratch
Especially for larger arrays, it is more efficient to create arrays from scratch using routines built into NumPy. Here are several examples:
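A sampler of creation routines that I know exist in NumPy; the shapes and fill values are arbitrary choices, and the random ones give different numbers on each run.

```python
import numpy as np

zeros = np.zeros(10, dtype=int)          # length-10 integer array of zeros
ones = np.ones((3, 5), dtype=float)      # 3x5 floating-point array of ones
full = np.full((3, 5), 3.14)             # 3x5 array filled with 3.14
stepped = np.arange(0, 20, 2)            # 0, 2, ..., 18 (like range)
spaced = np.linspace(0, 1, 5)            # five evenly spaced values in [0, 1]

uniform = np.random.random((3, 3))       # uniform values in [0, 1)
normal = np.random.normal(0, 1, (3, 3))  # mean 0, standard deviation 1
integers = np.random.randint(0, 10, (3, 3))  # integers in [0, 10)

identity = np.eye(3)                     # 3x3 identity matrix
uninit = np.empty(3)                     # uninitialized: contents are arbitrary
```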


and we're far from done 😉


and more


and finally


NumPy standard data types
NumPy arrays contain values of a single type, so it is important to have detailed knowledge of those types and their limitations. Because NumPy is built in C, the types will be familiar to users of C, Fortran, and other related languages.

The standard NumPy data types are listed in the following table. Note that when constructing an array, they can be specified using a string:

np.zeros(10, dtype='int16')

Or using the associated NumPy object:

np.zeros(10, dtype=np.int16)

Data type  Description
bool_      Boolean (True or False) stored as a byte
int_       Default integer type (same as C long; normally either int64 or int32)
intc       Identical to C int (normally int32 or int64)
intp       Integer used for indexing (same as C ssize_t; normally either int32 or int64)
int8       Byte (-128 to 127)
int16      Integer (-32768 to 32767)
int32      Integer (-2147483648 to 2147483647)
int64      Integer (-9223372036854775808 to 9223372036854775807)
uint8      Unsigned integer (0 to 255)
uint16     Unsigned integer (0 to 65535)
uint32     Unsigned integer (0 to 4294967295)
uint64     Unsigned integer (0 to 18446744073709551615)
float_     Shorthand for float64.
float16    Half precision float: sign bit, 5 bits exponent, 10 bits mantissa
float32    Single precision float: sign bit, 8 bits exponent, 23 bits mantissa
float64    Double precision float: sign bit, 11 bits exponent, 52 bits mantissa
complex_   Shorthand for complex128.
complex64  Complex number, represented by two 32-bit floats
complex128 Complex number, represented by two 64-bit floats

More advanced type specification is possible, such as specifying big or little endian numbers; for more information, refer to the NumPy documentation. NumPy also supports compound data types, which will be covered [coming soon].


NumPy – 10 – Introduction to NumPy


Continuing from here, today at last we are here.

This chapter […] outlines techniques for effectively loading, storing, and manipulating in-memory data in Python. The topic is very broad: datasets can come from a wide range of sources and in a wide range of formats, including collections of documents, collections of images, collections of sound clips, collections of numerical measurements, or nearly anything else. Despite this apparent heterogeneity, it will help us to think of all data fundamentally as arrays of numbers.

For example, images –particularly digital images– can be thought of as simply two-dimensional arrays of numbers representing pixel brightness across the area. Sound clips can be thought of as one-dimensional arrays of intensity versus time. Text can be converted in various ways into numerical representations, perhaps binary digits representing the frequency of certain words or pairs of words. No matter what the data are, the first step in making them analyzable will be to transform them into arrays of numbers.

For this reason, efficient storage and manipulation of numerical arrays is absolutely fundamental to the process of doing data science. We’ll now take a look at the specialized tools that Python has for handling such numerical arrays: the NumPy package, and the Pandas package [coming soon].

This chapter will cover NumPy in detail. NumPy (short for Numerical Python) provides an efficient interface to store and operate on dense data buffers. In some ways, NumPy arrays are like Python’s built-in list type, but NumPy arrays provide much more efficient storage and data operations as the arrays grow larger in size. NumPy arrays form the core of nearly the entire ecosystem of data science tools in Python, so time spent learning to use NumPy effectively will be valuable no matter what aspect of data science interests you.

Install Anaconda, says Jake. Hmmm… who knows, maybe I already have NumPy…


Uh-oh! I do have to install Anaconda 😊

[imagine a long pause here while I install]

OK, done, following the instructions found here: Download Anaconda Now.

It tinkers with the environment a bit, but for NumPy I'll put up with this and more 😉 and now


By convention, you’ll find that most people in the SciPy/PyData world will import NumPy using np as an alias:
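In practice the import looks like this; printing the version is just a quick way to check that the installation works, and the number you see depends on your setup.

```python
import numpy as np

# The version string depends on the installed NumPy release
print(np.__version__)
```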


Throughout this chapter, and indeed the rest of the book, you’ll find that this is the way we will import and use NumPy.

A reminder about the documentation
[D]on’t forget that IPython gives you the ability to quickly explore the contents of a package (by using the tab-completion feature), as well as the documentation of various functions using the ? character.

For example, to display all the contents of the numpy namespace, you can type this:


Note: after the dot you press Tab, mind you!
And to display NumPy’s built-in documentation, you can use np?; more detailed information is available here.


NumPy – 9 – Additional IPython resources

Continuing with IPython, today from here.

[W]e’ve just scratched the surface of using IPython to enable data science tasks. Much more information is available both in print and on the Web, and here we’ll list some other resources that you may find helpful.

Web resources
The IPython website links to documentation, examples, tutorials, and a variety of other resources.

The nbviewer website shows static renderings of any IPython notebook available on the internet.

A Gallery of Interesting IPython Notebooks: This ever-growing list of notebooks, powered by nbviewer, shows the depth and breadth of numerical analysis you can do with IPython. It includes everything from short examples and tutorials to full-blown courses and books composed in the notebook format!

Searching around also turns up videos and more.
There are books too; Jake doesn't mention any free ones, but the list is over there if you want it.
And then there's the help, as described here.

😊 The IPython chapter is done; now we really get started with NumPy 😊


NumPy – 8 – Profiling and timing


Still on IPython, copying from here and continuing from here.

One writes code, probably not in the most efficient way; you know how it is. As Knuth says, “We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil”.

But once you have your code working, it can be useful to dig into its efficiency a bit.
IPython provides access to a wide array of functionality for this kind of timing and profiling of code. Here we’ll discuss the following IPython magic commands:

  • %time: Time the execution of a single statement
  • %timeit: Time repeated execution of a single statement for more accuracy
  • %prun: Run code with the profiler
  • %lprun: Run code with the line-by-line profiler
  • %memit: Measure the memory use of a single statement
  • %mprun: Run code with the line-by-line memory profiler

The last four commands are not bundled with IPython–you’ll need to get the line_profiler and memory_profiler extensions, which we will discuss in the following sections.

Timing code snippets with %timeit and %time

For %time as with %timeit, using the double-percent-sign cell magic syntax allows timing of multiline scripts:


For more information on %time and %timeit, as well as their available options, use the IPython help functionality (i.e., type %time? at the IPython prompt).
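The magics themselves only work inside IPython. As a rough plain-Python counterpart (a swapped-in technique, not what the book shows), the standard-library timeit module can time repeated execution of a statement. The function sum_first_n is a hypothetical workload of my own.

```python
import timeit


def sum_first_n(n):
    """Hypothetical workload: sum the integers below n."""
    return sum(range(n))


# number = calls per measurement, repeat = how many measurements to take
calls = 1000
timings = timeit.repeat(lambda: sum_first_n(1000), number=calls, repeat=5)

# The minimum is usually the least noisy estimate
best = min(timings) / calls
print(f"best of 5: {best * 1e6:.2f} microseconds per call")
```

Inside IPython, %timeit picks the number of loops automatically, which is one reason it is more convenient than this sketch.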

Profiling full scripts with %prun
Python contains a built-in code profiler (which you can read about in the Python documentation), but IPython offers a much more convenient way to use this profiler, in the form of the magic function %prun.


The result is a table that indicates, in order of total time on each function call, where the execution is spending the most time. In this case, the bulk of execution time is in the list comprehension inside sum_of_lists. From here, we could start thinking about what changes we might make to improve the performance in the algorithm.
For more information on %prun, as well as its available options, use the IPython help functionality (i.e., type %prun? at the IPython prompt).
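Outside IPython, a similar table can be obtained directly from the built-in profiler. This is only a sketch: the name sum_of_lists comes from the text above, but its body here is an assumption made for illustration, and cProfile plus pstats stand in for the %prun magic.

```python
import cProfile
import io
import pstats


def sum_of_lists(N):
    """Hypothetical workload; the body is an assumption for illustration."""
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]  # list comprehension dominates
        total += sum(L)
    return total


profiler = cProfile.Profile()
profiler.enable()
result = sum_of_lists(100000)
profiler.disable()

# Sort by total time spent inside each function and show the top entries
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("tottime").print_stats(5)
print(report.getvalue())
```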

Individual lines can be profiled with %lprun. This requires installing line_profiler, which I'm not doing.
Memory use is profiled with %memit and %mprun, after installing memory_profiler.

These are all specialized operations, to dig into if they're ever needed; for now I'll put them on the to-do list 😉


NumPy – 7 – Errors and debugging


Copying from here, continuing from here.

Code development and data analysis always require a bit of trial and error, and IPython contains tools to streamline this process. This section will briefly cover some options for controlling Python’s exception reporting, followed by exploring tools for debugging errors in code.

Controlling exceptions with %xmode
Most of the time when a Python script fails, it will raise an Exception. When the interpreter hits one of these exceptions, information about the cause of the error can be found in the traceback, which can be accessed from within Python. With the %xmode magic function, IPython allows you to control the amount of information printed when the exception is raised. Consider the following code:
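A small sketch of code that fails in this way. The functions func1 and func2 are hypothetical, and the traceback module is used here only so that the example runs to completion as a plain script; in IPython you would simply call func2(1) and read the traceback.

```python
import traceback


def func1(a, b):
    return a / b


def func2(x):
    a = x
    b = x - 1
    return func1(a, b)  # with x = 1, b is 0 and the division fails


try:
    func2(1)
except ZeroDivisionError:
    # Capture the same traceback text the interpreter would print
    tb_text = traceback.format_exc()
    print(tb_text)
```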


Using the %xmode magic function (short for Exception mode), we can change what information is printed.

%xmode takes a single argument, the mode, and there are three possibilities: Plain, Context, and Verbose. The default is Context, and gives output like that just shown before. Plain is more compact and gives less information:




The Verbose mode adds some extra information, including the arguments to the functions that are called. This extra information can help narrow in on why the exception is being raised. So why not use the Verbose mode all the time? As code gets complicated, this kind of traceback can get extremely long. Depending on the context, sometimes the brevity of the default Context mode is easier to work with.

Debugging: when reading tracebacks is not enough
IPython, with its %debug magic command, offers perhaps the most convenient interface to debugging. If you call it after hitting an exception, it will automatically open an interactive debugging prompt at the point of the exception. The ipdb prompt lets you explore the current state of the stack, explore the available variables, and even run Python commands!


And this is just the beginning; it lets you go further:


This allows you to quickly find out not only what caused the error, but what function calls led up to the error.
If you’d like the debugger to launch automatically whenever an exception is raised, you can use the %pdb magic function to turn on this automatic behavior:


Finally, if you have a script that you’d like to run from the beginning in interactive mode, you can run it with the command %run -d, and use the next command to step through the lines of code interactively.

Partial list of debugging commands
There are many more available commands for interactive debugging than we’ve listed here; the following table contains a description of some of the more common and useful ones:

Command     Description
list        Show the current location in the file
h(elp)      Show a list of commands, or find help on a specific command
q(uit)      Quit the debugger and the program
c(ontinue)  Quit the debugger, continue in the program
n(ext)      Go to the next step of the program
<enter>     Repeat the previous command
p(rint)     Print variables
s(tep)      Step into a subroutine
r(eturn)    Return out of a subroutine

For more information, use the help command in the debugger, or take a look at ipdb’s online documentation.

I could start reminiscing about my battles with debuggers, but then I'd get boring 😡 Life is good instead, come on 😄


NumPy – 6 – IPython and shell commands


Continuing from here, copying from here.

When working interactively with the standard Python interpreter, one of the frustrations is the need to switch between multiple windows to access Python tools and system command-line tools. He means ordinary users; I, on the other hand, always have at least two terminals open. Have I ever told you about the days when the terminal was text-only, with no windows, so you used & (and ph, for phantom, on the Pr1me)? Awkward, but good times; I was young. Anyway… IPython bridges this gap, and gives you a syntax for executing shell commands directly from within the IPython terminal. The magic happens with the exclamation point: anything appearing after ! on a line will be executed not by the Python kernel, but by the system command-line.

A quick introduction to the shell
I think I'll skip this; nothing new, quite the opposite… as I was saying 😊

Shell commands in IPython
Shell commands can not only be called from IPython, but can also be made to interact with the IPython namespace. For example, you can save the output of any shell command to a Python list using the assignment operator:


Note that these results are not returned as lists, but as a special shell return type defined in IPython:


It looks like a list but has extra functionality, as you can discover in IPython's help.

Communication in the other direction–passing Python variables into the shell–is possible using the {varname} syntax:


The curly braces contain the variable name, which is replaced by the variable’s contents in the shell command.
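The ! and {varname} syntax is specific to IPython. As a loose plain-Python analogue (a swapped-in technique, not the IPython mechanism), the standard-library subprocess module can pass a Python value to an external command and collect its output as a list of lines. The variable message is hypothetical, and sys.executable is used as the external command only to keep the sketch portable.

```python
import subprocess
import sys

message = "hello from Python"  # hypothetical value to hand to the command

# Run a child process, passing the variable as an argument,
# and capture whatever it writes to standard output
completed = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", message],
    capture_output=True,
    text=True,
    check=True,
)

lines = completed.stdout.splitlines()  # output as a list of lines
print(lines)  # ['hello from Python']
```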

Shell-related magic commands
You can't use !cd because the commands are executed in a subshell. But if you really want to, there's %cd:


In fact, by default you can even use this without the % sign:


This is known as an automagic function, and this behavior can be toggled with the %automagic magic function.

Besides %cd, other available shell-like magic functions are %cat, %cp, %env, %ls, %man, %mkdir, %more, %mv, %pwd, %rm, and %rmdir, any of which can be used without the % sign if automagic is on. This makes it so that you can almost treat the IPython prompt as if it’s a normal shell. This access to the shell from within the same terminal window as your Python session means that there is a lot less switching back and forth between interpreter and shell as you write your Python code.

I wonder whether it works for aliases too? No, not quite the way I'd like: it ignores the ones I've defined myself. And anyway, come on, it's enough to have another terminal open; I always have at least two, not counting the tilda one 😊


Oh! It works with shell scripts 😄


NumPy – 5 – IPython input, output and history


Continuing from here, I'm getting more practice with the IPython REPL, copying from here. I still haven't figured out how to tell it that I want Python 3 and not 2.x, but I don't think it's up to me. Time will fix everything, I hope 😊

We've seen that previous commands can be reached with the arrow keys or C-p and C-n (from now on I'll use the Emacs abbreviations), but there are other ways.

IPython's In and Out objects
Previous commands are stored in In and their outputs in Out; for example:


yes, version 2.7; the same goes for Out:


Note that not all operations have outputs: for example, import statements and print statements don’t affect the output. The latter may be surprising, but makes sense if you consider that print is a function that returns None; for brevity, any command that returns None is not added to Out.

This can come in handy, for example:


Using _ for previous outputs
Unlike the native Python REPL, which lets you recall only the last output with _, IPython also reaches further back as you add underscores (up to three: _, __, ___):


Suppressing output
Sometimes you might wish to suppress the output of a statement (this is perhaps most common with the plotting commands that we’ll explore in Introduction to Matplotlib). Or maybe the command you’re executing produces a result that you’d prefer not to store in your output history, perhaps so that it can be deallocated when other references are removed. The easiest way to suppress the output of a command is to add a semicolon to the end of the line:


it took offense; I didn't mean it. Still friends? 🌷 😊

Related magic commands


Note that what was said above about commands whose output isn't stored applies only to Python 3.x.

Other similar magic commands are %rerun (which will re-execute some portion of the command history) and %save (which saves some set of the command history to a file).

OK 😊 but when do we get to the sexy stuff? 😀