NumPy – 45 – indicizzazione gerarchica – 2

Continuo da qui, copio qui.

Metodi di creazione di MultiIndex
The most straightforward way to construct a multiply indexed Series or DataFrame is to simply pass a list of two or more index arrays to the constructor. For example:

The work of creating the MultiIndex is done in the background.

Similarly, if you pass a dictionary with appropriate tuples as keys, Pandas will automatically recognize this and use a MultiIndex by default:

Nevertheless, it is sometimes useful to explicitly create a MultiIndex; we’ll see a couple of these methods here.

MultiIndex via costruttori espliciti
For more flexibility in how the index is constructed, you can instead use the class method constructors available in the pd.MultiIndex. For example, as we did before, you can construct the MultiIndex from a simple list of arrays giving the index values within each level:

You can construct it from a list of tuples giving the multiple index values of each point:

You can even construct it from a Cartesian product of single indices:

Similarly, you can construct the MultiIndex directly using its internal encoding by passing levels (a list of lists containing available index values for each level) and labels (a list of lists that reference these labels):

Any of these objects can be passed as the index argument when creating a Series or Dataframe, or be passed to the reindex method of an existing Series or DataFrame.

nomi dei livelli del MultiIndex
Sometimes it is convenient to name the levels of the MultiIndex. This can be accomplished by passing the names argument to any of the above MultiIndex constructors, or by setting the names attribute of the index after the fact:

With more involved datasets, this can be a useful way to keep track of the meaning of various index values.

MultiIndex per colonne
In a DataFrame, the rows and columns are completely symmetric, and just as the rows can have multiple levels of indices, the columns can have multiple levels as well. Consider the following, which is a mock-up of some (somewhat realistic) medical data:

Here we see where the multi-indexing for both rows and columns can come in very handy. This is fundamentally four-dimensional data, where the dimensions are the subject, the measurement type, the year, and the visit number. With this in place we can, for example, index the top-level column by the person’s name and get a full DataFrame containing just that person’s information:

For complicated records containing multiple labeled measurements across multiple times for many subjects (people, countries, cities, etc.) use of hierarchical rows and columns can be extremely convenient!


Posta un commento o usa questo indirizzo per il trackback.



Inserisci i tuoi dati qui sotto o clicca su un'icona per effettuare l'accesso:

Logo di

Stai commentando usando il tuo account Chiudi sessione /  Modifica )

Google photo

Stai commentando usando il tuo account Google. Chiudi sessione /  Modifica )

Foto Twitter

Stai commentando usando il tuo account Twitter. Chiudi sessione /  Modifica )

Foto di Facebook

Stai commentando usando il tuo account Facebook. Chiudi sessione /  Modifica )

Connessione a %s...

Questo sito utilizza Akismet per ridurre lo spam. Scopri come vengono elaborati i dati derivati dai commenti.

%d blogger hanno fatto clic su Mi Piace per questo: