JavaScript 25 – higher-order functions – 1

Continuing from here, copying here.

A new chapter, which Marijn opens with two gorgeous quotes, worth reading (over there). Then off he goes…

A large program is a costly program, and not just because of the time it takes to build. Size almost always involves complexity, and complexity confuses programmers. Confused programmers, in turn, tend to introduce mistakes (bugs) into programs. A large program also provides a lot of space for these bugs to hide, making them hard to find.

Let’s briefly go back to the final two example programs in the introduction. The first is self-contained and six lines long.

var total = 0, count = 1;
while (count <= 10) {
  total += count;
  count += 1;
}
console.log(total);

The second relies on two external functions and is one line long.

console.log(sum(range(1, 10)));

Which one is more likely to contain a bug?

If we count the size of the definitions of sum and range, the second program is also big—even bigger than the first. But still, I’d argue that it is more likely to be correct.

It is more likely to be correct because the solution is expressed in a vocabulary that corresponds to the problem being solved. Summing a range of numbers isn’t about loops and counters. It is about ranges and sums.

The definitions of this vocabulary (the functions sum and range) will still involve loops, counters, and other incidental details. But because they are expressing simpler concepts than the program as a whole, they are easier to get right.
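Marijn defines sum and range later in the book; just to make the one-liner above concrete, here is a minimal sketch of what such definitions might look like (the bodies are my guess, not his exact code):

```javascript
// range(start, end) builds the array [start, start+1, ..., end].
function range(start, end) {
  var result = [];
  for (var i = start; i <= end; i++)
    result.push(i);
  return result;
}

// sum(numbers) adds up every element of an array.
function sum(numbers) {
  var total = 0;
  for (var i = 0; i < numbers.length; i++)
    total += numbers[i];
  return total;
}

console.log(sum(range(1, 10)));
// → 55
```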

For beginners it's hard to grasp the full importance of these few lines. And it's not enough to read and memorize them; you have to learn it the hard way, through experience.

Abstraction
In the context of programming, these kinds of vocabularies are usually called abstractions. Abstractions hide details and give us the ability to talk about problems at a higher (or more abstract) level.

As an analogy, compare these two recipes for pea soup:

Put 1 cup of dried peas per person into a container. Add water until the peas are well covered. Leave the peas in water for at least 12 hours. Take the peas out of the water and put them in a cooking pan. Add 4 cups of water per person. Cover the pan and keep the peas simmering for two hours. Take half an onion per person. Cut it into pieces with a knife. Add it to the peas. Take a stalk of celery per person. Cut it into pieces with a knife. Add it to the peas. Take a carrot per person. Cut it into pieces. With a knife! Add it to the peas. Cook for 10 more minutes.

And the second recipe:

Per person: 1 cup dried split peas, half a chopped onion, a stalk of celery, and a carrot.

Soak peas for 12 hours. Simmer for 2 hours in 4 cups of water (per person). Chop and add vegetables. Cook for 10 more minutes.

The second is shorter and easier to interpret. But you do need to understand a few more cooking-related words—soak, simmer, chop, and, I guess, vegetable.

When programming, we can’t rely on all the words we need to be waiting for us in the dictionary. Thus, you might fall into the pattern of the first recipe—work out the precise steps the computer has to perform, one by one, blind to the higher-level concepts that they express.

It has to become second nature, for a programmer, to notice when a concept is begging to be abstracted into a new word.

Abstracting array traversal
Plain functions, as we’ve seen them so far, are a good way to build abstractions. But sometimes they fall short.

In the previous chapter, this type of for loop made several appearances (file ar1.js):

var array = [1, 2, 3];
for (var i = 0; i < array.length; i++) {
  var current = array[i];
  console.log(current);
}

It’s trying to say, “For each element in the array, log it to the console”. But it uses a roundabout way that involves a counter variable i, a check against the array’s length, and an extra variable declaration to pick out the current element. Apart from being a bit of an eyesore, this provides a lot of space for potential mistakes. We might accidentally reuse the i variable, misspell length as lenght, confuse the i and current variables, and so on.

So let’s try to abstract this into a function. Can you think of a way?

Well, it’s easy to write a function that goes over an array and calls console.log on every element (ar2.js).

function logEach(array) {
  for (var i = 0; i < array.length; i++)
    console.log(array[i]);
}

var arr = [1, 2, 3];
logEach(arr);

But what if we want to do something other than logging the elements? Since “doing something” can be represented as a function and functions are just values, we can pass our action as a function value (ar3.js).

function forEach(array, action) {
  for (var i = 0; i < array.length; i++)
    action(array[i]);
}

forEach(["Wampeter", "Foma", "Granfalloon"], console.log);

In some browsers, calling console.log in this way does not work. You can use alert instead of console.log if this example fails to work.

Often, you don’t pass a predefined function to forEach but create a function value on the spot instead (ar4.js).

var numbers = [1, 2, 3, 4, 5];
var sum = 0;
numbers.forEach(function(number) {
  sum += number;
});
console.log(sum);

Note: I modified Marijn's code; probably due to some change that came along as the versions progressed.

This looks quite a lot like the classical for loop, with its body written as a block below it. However, now the body is inside the function value, as well as inside the parentheses of the call to forEach. This is why it has to be closed with the closing brace and closing parenthesis.

Using this pattern, we can specify a variable name for the current element (number), rather than having to pick it out of the array manually.

In fact, we don’t need to write forEach ourselves. It is available as a standard method on arrays. Since the array is already provided as the thing the method acts on, forEach takes only one required argument: the function to be executed for each element.

To illustrate how helpful this is, let’s look back at a function from the previous chapter [here]. It contains two array-traversing loops.

function gatherCorrelations(journal) {
  var phis = {};
  for (var entry = 0; entry < journal.length; entry++) {
    var events = journal[entry].events;
    for (var i = 0; i < events.length; i++) {
      var event = events[i];
      if (!(event in phis))
        phis[event] = phi(tableFor(event, journal));
    }
  }
  return phis;
}

Working with forEach makes it slightly shorter and quite a bit cleaner.

function gatherCorrelations(journal) {
  var phis = {};
  journal.forEach(function(entry) {
    entry.events.forEach(function(event) {
      if (!(event in phis))
        phis[event] = phi(tableFor(event, journal));
    });
  });
  return phis;
}

Note: to be verified against the current version.

:mrgreen:

NumPy – 57 – pivot tables – 2

Continuing from here, copying here.

Example: birthrate data
As a more interesting example, let’s take a look at the freely available data on births in the United States, provided by the Centers for Disease Control (CDC). This data can be found here  (this dataset has been analyzed rather extensively by Andrew Gelman and his group; see, for example, this blog post):


and then…

We can start to understand this data a bit more by using a pivot table. Let’s add a decade column, and take a look at male and female births as a function of decade:
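The notebook's code is shown as images here; a self-contained sketch of the decade pivot, using a toy frame with the same columns as the CDC table (the numbers are made up):

```python
import pandas as pd

# Toy stand-in for the CDC births table (same column names as the real one).
births = pd.DataFrame({
    'year':   [1969, 1969, 1975, 1975, 1985, 1985],
    'gender': ['F', 'M', 'F', 'M', 'F', 'M'],
    'births': [4000, 4200, 3900, 4100, 4300, 4500],
})

# Add a decade column and pivot: total births per decade, split by gender.
births['decade'] = 10 * (births['year'] // 10)
table = births.pivot_table('births', index='decade', columns='gender', aggfunc='sum')
print(table)
```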

We immediately see that male births outnumber female births in every decade. To see this trend a bit more clearly, we can use the built-in plotting tools in Pandas to visualize the total number of births by year (see Introduction to Matplotlib for a discussion of plotting with Matplotlib [coming soon]):

and here it is:

With a simple pivot table and plot() method, we can immediately see the annual trend in births by gender. By eye, it appears that over the past 50 years male births have outnumbered female births by around 5%.

Further exploration of the data
Though this doesn’t necessarily relate to the pivot table, there are a few more interesting features we can pull out of this dataset using the Pandas tools covered up to this point. We must start by cleaning the data a bit, removing outliers caused by mistyped dates (e.g., June 31st) or missing values (e.g., June 99th). One easy way to remove these all at once is to cut outliers; we’ll do this via a robust sigma-clipping operation:
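A sketch of the sigma-clipping setup (the notebook computes this on births['births']; here a plain array with an obvious outlier stands in):

```python
import numpy as np

values = np.array([10.0, 12.0, 11.0, 13.0, 12.0, 500.0])  # 500 is an outlier

quartiles = np.percentile(values, [25, 50, 75])
mu = quartiles[1]                            # median: robust estimate of the mean
sig = 0.74 * (quartiles[2] - quartiles[0])   # robust estimate of the std dev
```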

This final line is a robust estimate of the sample standard deviation, where the 0.74 comes from the interquartile range of a Gaussian distribution (You can learn more about sigma-clipping operations in a book I coauthored with Željko Ivezić, Andrew J. Connolly, and Alexander Gray: “Statistics, Data Mining, and Machine Learning in Astronomy” (Princeton University Press, 2014)).

With this we can use the query() method (discussed further in High-Performance Pandas: eval() and query()) [coming soon] to filter out rows with births outside these values:
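Putting the clipping bounds into query(), again on a toy frame (the column name matches the real dataset; the numbers are made up):

```python
import pandas as pd
import numpy as np

births = pd.DataFrame({'births': [10.0, 12.0, 11.0, 13.0, 12.0, 500.0]})

quartiles = np.percentile(births['births'], [25, 50, 75])
mu, sig = quartiles[1], 0.74 * (quartiles[2] - quartiles[0])

# Keep only rows within 5 sigma of the robust mean.
births = births.query('(births > @mu - 5 * @sig) & (births < @mu + 5 * @sig)')
```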

Next we set the day column to integers; previously it had been a string because some columns in the dataset contained the value 'null':

Finally, we can combine the day, month, and year to create a Date index (see Working with Time Series [coming soon]). This allows us to quickly compute the weekday corresponding to each row:
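The yyyy*10000 + mm*100 + dd trick from the notebook parses as %Y%m%d; a sketch on a tiny frame:

```python
import pandas as pd

births = pd.DataFrame({'year': [2012, 2012], 'month': [1, 1], 'day': [2, 7]})

# Build a datetime index from the three integer columns, then get the weekday.
births.index = pd.to_datetime(10000 * births.year + 100 * births.month + births.day,
                              format='%Y%m%d')
births['dayofweek'] = births.index.dayofweek   # Monday == 0
```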

Using this we can plot births by weekday for several decades:

I get

Note: I forgot the plt.ylabel('mean births by day') instruction.

Apparently births are slightly less common on weekends than on weekdays! Note that the 1990s and 2000s are missing because the CDC data contains only the month of birth starting in 1989.

Another interesting view is to plot the mean number of births by the day of the year. Let’s first group the data by month and day separately:

Focusing on the month and day only, we now have a time series reflecting the average number of births by date of the year. From this, we can use the plot method to plot the data. It reveals some interesting trends:

and here it is:

In particular, the striking feature of this graph is the dip in birthrate on US holidays (e.g., Independence Day, Labor Day, Thanksgiving, Christmas, New Year’s Day) although this likely reflects trends in scheduled/induced births rather than some deep psychosomatic effect on natural births. For more discussion on this trend, see the analysis and links in [same link as before] on the subject. We’ll return to this figure in Example:-Effect-of-Holidays-on-US-Births [coming soon], where we will use Matplotlib’s tools to annotate this plot.

Looking at this short example, you can see that many of the Python and Pandas tools we’ve seen to this point can be combined and used to gain insight from a variety of datasets. We will see some more sophisticated applications of these data manipulations in future sections!

:mrgreen:

JavaScript 24 – data structures – objects and arrays – 9

Continuing from here, copying here.

Wrapping up with the exercises 😎

Deep comparison
The == operator compares objects by identity. But sometimes, you would prefer to compare the values of their actual properties.

Write a function, deepEqual, that takes two values and returns true only if they are the same value or are objects with the same properties whose values are also equal when compared with a recursive call to deepEqual.

To find out whether to compare two things by identity (use the === operator for that) or by looking at their properties, you can use the typeof operator. If it produces "object" for both values, you should do a deep comparison. But you have to take one silly exception into account: by a historical accident, typeof null also produces "object".

I can cheat, right?
This is one of the harder exercises. It can be done, of course, especially if you look at the hints. It's a matter of checking that both values are non-null objects and that they are equal all the way down. I'm not posting my version but the clearer one found on StackOverflow.

I'm not saying you should always copy from SO, but sometimes… 😯

deepEqual

function deepEqual(a, b) {
  if (a === b) return true;

  if (a == null || typeof a != "object" ||
      b == null || typeof b != "object")
    return false;

  var propsInA = 0, propsInB = 0;

  for (var prop in a)
    propsInA += 1;

  for (var prop in b) {
    propsInB += 1;
    if (!(prop in a) || !deepEqual(a[prop], b[prop]))
      return false;
  }

  return propsInA == propsInB;
}

var obj = {here: {is: "an"}, object: 2};
console.log(deepEqual(obj, obj));
console.log(deepEqual(obj, {here: 1, object: 2}));
console.log(deepEqual(obj, {here: {is: "an"}, object: 2}));

:mrgreen:

NumPy – 56 – pivot tables – 1

Continuing from here, copying here.

We have seen how the GroupBy abstraction lets us explore relationships within a dataset. A pivot table is a similar operation that is commonly seen in spreadsheets and other programs that operate on tabular data. The pivot table takes simple column-wise data as input, and groups the entries into a two-dimensional table that provides a multidimensional summarization of the data. The difference between pivot tables and GroupBy can sometimes cause confusion; it helps me to think of pivot tables as essentially a multidimensional version of GroupBy aggregation. That is, you split-apply-combine, but both the split and the combine happen across not a one-dimensional index, but across a two-dimensional grid.

Illustrating pivot tables
For the examples in this section, we’ll use the database of passengers on the Titanic, available through the Seaborn library (see Visualization With Seaborn [coming soon]):

This contains a wealth of information on each passenger of that ill-fated voyage, including gender, age, class, fare paid, and much more.

Pivot tables by hand
To start learning more about this data, we might begin by grouping according to gender, survival status, or some combination thereof. If you have read the previous section, you might be tempted to apply a GroupBy operation–for example, let’s look at survival rate by gender:
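The notebook's code is missing here; a self-contained sketch with a tiny made-up stand-in for the Seaborn titanic table (same column names, so the rates below will not match the real dataset's):

```python
import pandas as pd

# Toy titanic-like frame; rows are invented for illustration.
titanic = pd.DataFrame({
    'sex':      ['female', 'female', 'female', 'male', 'male', 'male'],
    'class':    ['First', 'Third', 'First', 'First', 'Third', 'Third'],
    'survived': [1, 0, 1, 0, 0, 1],
})

# Survival rate by gender.
rate = titanic.groupby('sex')['survived'].mean()
print(rate)
```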

This immediately gives us some insight: overall, three of every four females on board survived, while only one in five males survived!

This is useful, but we might like to go one step deeper and look at survival by both sex and, say, class. Using the vocabulary of GroupBy, we might proceed using something like this: we group by class and gender, select survival, apply a mean aggregate, combine the resulting groups, and then unstack the hierarchical index to reveal the hidden multidimensionality. In code:
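The group-select-aggregate-unstack pipeline, sketched on the same kind of toy frame (made-up rows, so the numbers differ from the real data):

```python
import pandas as pd

titanic = pd.DataFrame({
    'sex':      ['female', 'female', 'female', 'male', 'male', 'male'],
    'class':    ['First', 'Third', 'First', 'First', 'Third', 'Third'],
    'survived': [1, 0, 1, 0, 0, 1],
})

# Group by two keys, aggregate, then unstack the hierarchical index.
by_both = (titanic.groupby(['sex', 'class'])['survived']
                  .mean()
                  .unstack())
print(by_both)
```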

This gives us a better idea of how both gender and class affected survival, but the code is starting to look a bit garbled. While each step of this pipeline makes sense in light of the tools we’ve previously discussed, the long string of code is not particularly easy to read or use. This two-dimensional GroupBy is common enough that Pandas includes a convenience routine, pivot_table, which succinctly handles this type of multi-dimensional aggregation.

Pivot table syntax
Here is the equivalent to the preceding operation using the pivot_table method of DataFrames:
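A sketch of the pivot_table call on the toy stand-in (the default aggfunc is the mean, so this matches the groupby/unstack pipeline; the values are invented and won't match the real survival rates quoted below):

```python
import pandas as pd

titanic = pd.DataFrame({
    'sex':      ['female', 'female', 'female', 'male', 'male', 'male'],
    'class':    ['First', 'Third', 'First', 'First', 'Third', 'Third'],
    'survived': [1, 0, 1, 0, 0, 1],
})

# One call instead of the whole group/select/aggregate/unstack chain.
pivot = titanic.pivot_table('survived', index='sex', columns='class')
print(pivot)
```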

This is eminently more readable than the groupby approach, and produces the same result. As you might expect of an early 20th-century transatlantic cruise, the survival gradient favors both women and higher classes. First-class women survived with near certainty (hi, Rose!), while only one in ten third-class men survived (sorry, Jack!).

Multi-level pivot tables
Just as in the GroupBy, the grouping in pivot tables can be specified with multiple levels, and via a number of options. For example, we might be interested in looking at age as a third dimension. We’ll bin the age using the pd.cut function:
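A sketch of the age binning on the toy frame (an age column is invented here; the bin edges 0/18/80 are a plausible choice, not necessarily the notebook's):

```python
import pandas as pd

titanic = pd.DataFrame({
    'sex':      ['female', 'female', 'female', 'male', 'male', 'male'],
    'class':    ['First', 'Third', 'First', 'First', 'Third', 'Third'],
    'survived': [1, 0, 1, 0, 0, 1],
    'age':      [25, 2, 60, 30, 40, 15],
})

# Bin the ages, then use the binned series as a second index level.
age = pd.cut(titanic['age'], [0, 18, 80])
table = titanic.pivot_table('survived', index=['sex', age], columns='class')
print(table)
```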

We can apply the same strategy when working with the columns as well; let’s add info on the fare paid using pd.qcut to automatically compute quantiles:

The result is a four-dimensional aggregation with hierarchical indices (see Hierarchical Indexing [qui]), shown in a grid demonstrating the relationship between the values.

Additional pivot table options
The full call signature of the pivot_table method of DataFrames is as follows:

# call signature as of Pandas 0.18
DataFrame.pivot_table(data, values=None, index=None, 
                      columns=None, aggfunc='mean', 
                      fill_value=None, margins=False,
                      dropna=True, margins_name='All')

hmmm… I get the error name 'DataFrame' is not defined. (Which makes sense: this is just the call signature, meant to be read, not executed as-is.)

We’ve already seen examples of the first three arguments; here we’ll take a quick look at the remaining ones. Two of the options, fill_value and dropna, have to do with missing data and are fairly straightforward; we will not show examples of them here.

The aggfunc keyword controls what type of aggregation is applied, which is a mean by default. As in the GroupBy, the aggregation specification can be a string representing one of several common choices (e.g., ‘sum’, ‘mean’, ‘count’, ‘min’, ‘max’, etc.) or a function that implements an aggregation (e.g., np.sum(), min(), sum(), etc.). Additionally, it can be specified as a dictionary mapping a column to any of the above desired options:

Notice also here that we’ve omitted the values keyword; when specifying a mapping for aggfunc, this is determined automatically.

At times it’s useful to compute totals along each grouping. This can be done via the margins keyword:
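The margins keyword, sketched on the toy frame (invented rows, so the overall rate here is 50% rather than the real dataset's 38%):

```python
import pandas as pd

titanic = pd.DataFrame({
    'sex':      ['female', 'female', 'female', 'male', 'male', 'male'],
    'class':    ['First', 'Third', 'First', 'First', 'Third', 'Third'],
    'survived': [1, 0, 1, 0, 0, 1],
})

# margins=True adds an "All" row and column with the grand totals.
table = titanic.pivot_table('survived', index='sex', columns='class',
                            margins=True)
print(table)
```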

Here this automatically gives us information about the class-agnostic survival rate by gender, the gender-agnostic survival rate by class, and the overall survival rate of 38%. The margin label can be specified with the margins_name keyword, which defaults to "All".

:mrgreen:

JavaScript 23 – data structures – objects and arrays – 8

Continuing from here, copying here.

Still on the exercises 😎

A list
Objects, as generic blobs of values, can be used to build all sorts of data structures. A common data structure is the list (not to be confused with the array). A list is a nested set of objects, with the first object holding a reference to the second, the second to the third, and so on.

var list = {
  value: 1,
  rest: {
    value: 2,
    rest: {
      value: 3,
      rest: null
    }
  }
};

The resulting objects form a chain, like this:

A nice thing about lists is that they can share parts of their structure. For example, if I create two new values {value: 0, rest: list} and {value: -1, rest: list} (with list referring to the variable defined earlier), they are both independent lists, but they share the structure that makes up their last three elements. In addition, the original list is also still a valid three-element list (file ls0.js).

var list = {
  value: 1,
  rest: {
    value: 2,
    rest: {
      value: 3,
      rest: null
    }
  }
};

console.log('list (orig.):', list);

list = {value: 0, rest: list}
console.log('\nlist (con 0):', list);

list = {value: -1, rest: list}
console.log('\nlist (con -1):', list);

it remains to figure out why it doesn't display the whole thing; coming up, maybe 😯

Write a function arrayToList that builds up a data structure like the previous one when given [1, 2, 3] as argument, and write a listToArray function that produces an array from a list. Also write the helper functions prepend, which takes an element and a list and creates a new list that adds the element to the front of the input list, and nth, which takes a list and a number and returns the element at the given position in the list, or undefined when there is no such element.

If you haven’t already, also write a recursive version of nth.

arrayToList

function arrayToList(arr) {
    var list = null;

    var arev = arr.slice().reverse();  // slice() copies, so arr stays intact
    for (var c = 0; c < arev.length; c++) {
        list = {value: arev[c], rest: list};
    }
    return list;
}

var array = ['A', 'B', 'C'];
console.log(array);
console.log(arrayToList(array));

If I didn't copy arr into arev, the original would get overwritten globally; as usual, whether that matters depends on what you need.

listToArray

function listToArray(ls) {
    var arr = [];
    for (var node = ls; node; node = node.rest) {
        arr.push(node.value);
    }
    return arr;
}

var lst = { value: 'A', 
            rest: { value: 'B', 
                    rest: { value: 'C', 
                            rest: null } } }

console.log(listToArray(lst));

prepend

function prepend(ele, lst) {

    return {value: ele, rest: lst};
}

console.log(prepend(10, prepend(20, null)));

nth

function nth(list, n) {
    var ls = list;
    for (var c = 1; c < n; c++) {
        if (!ls) return undefined;
        ls = ls.rest;
    }
    return ls ? ls.value : undefined;
}

list = { value: 'A', 
            rest: { value: 'B', 
                    rest: { value: 'C', 
                            rest: null } } };

console.log(list);
console.log(nth(list, 1));
console.log(nth(list, 2));
console.log(nth(list, 3));
console.log(nth(list, 10));

nth made me sweat and –I confess– I resorted to StackOverflow for clarification 😡

:mrgreen:

SICP – ch. 2 – hierarchical structures – 30 – exercises

Continuing from here, copying here.

Exercise 2.24: Suppose we evaluate the expression (list 1 (list 2 (list 3 4))). Give the result printed by the interpreter, the corresponding box-and-pointer structure, and the interpretation of this as a tree (as in Figure 2.6).

It should be simple; for convenience I copy the figures here:

The interpreter prints (1 (2 (3 4))); the expression is evaluated like this:

and therefore

Le Corbusier sketched terribly (cit.), me too 😎
Here are Bill the Lizard and sicp-ex.
But today DreWiki wins: I'm too lazy to do the graphs for the other two parts.

:mrgreen:

NumPy – 55 – aggregating and grouping – 2

Continuing from here, copying here.

I had to redo the processing from the previous post; done, I won't reproduce it here. (The interactivity of the REPL is handy, but sometimes, like now…)

The GroupBy object
The GroupBy object is a very flexible abstraction. In many ways, you can simply treat it as if it’s a collection of DataFrames, and it does the difficult things under the hood. Let’s see some examples using the Planets data.

Perhaps the most important operations made available by a GroupBy are aggregate, filter, transform, and apply. We’ll discuss each of these more fully in “Aggregate, Filter, Transform, Apply” [below], but before that let’s introduce some of the other functionality that can be used with the basic GroupBy operation.

column indexing
The GroupBy object supports column indexing in the same way as the DataFrame, and returns a modified GroupBy object. For example:

Here we’ve selected a particular Series group from the original DataFrame group by reference to its column name. As with the GroupBy object, no computation is done until we call some aggregate on the object:
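The real example selects planets.groupby('method')['orbital_period'] and takes its median; a self-contained sketch with a toy frame standing in for the Planets data:

```python
import pandas as pd

df = pd.DataFrame({'key':   ['A', 'B', 'C', 'A', 'B', 'C'],
                   'data1': range(6),
                   'data2': [5, 0, 3, 3, 7, 9]})

grouped = df.groupby('key')['data2']   # a SeriesGroupBy: nothing computed yet
medians = grouped.median()             # the aggregate triggers the computation
print(medians)
```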

This gives an idea of the general scale of orbital periods (in days) that each method is sensitive to.

iterating over groups
The GroupBy object supports direct iteration over the groups, returning each group as a Series or DataFrame:

This can be useful for doing certain things manually, though it is often much faster to use the built-in apply functionality, which we will discuss momentarily.

dispatch methods
Through some Python class magic, any method not explicitly implemented by the GroupBy object will be passed through and called on the groups, whether they are DataFrame or Series objects. For example, you can use the describe() method of DataFrames to perform a set of aggregations that describe each group in the data:

Looking at this table helps us to better understand the data: for example, the vast majority of planets have been discovered by the Radial Velocity and Transit methods, though the latter only became common (due to new, more accurate telescopes) in the last decade. The newest methods seem to be Transit Timing Variation and Orbital Brightness Modulation, which were not used to discover a new planet until 2011.

This is just one example of the utility of dispatch methods. Notice that they are applied to each individual group, and the results are then combined within GroupBy and returned. Again, any valid DataFrame/Series method can be used on the corresponding GroupBy object, which allows for some very flexible and powerful operations!

aggregate, filter, transform, and apply
The preceding discussion focused on aggregation for the combine operation, but there are more options available. In particular, GroupBy objects have aggregate(), filter(), transform(), and apply() methods that efficiently implement a variety of useful operations before combining the grouped data.

For the purpose of the following subsections, we’ll use this DataFrame:
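A toy frame like the book's (the book draws data2 from a seeded random generator; the values are hardcoded here so the sketch is reproducible):

```python
import pandas as pd

df = pd.DataFrame({'key':   ['A', 'B', 'C', 'A', 'B', 'C'],
                   'data1': range(6),
                   'data2': [5, 0, 3, 3, 7, 9]})
print(df)
```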

aggregation
We’re now familiar with GroupBy aggregations with sum(), median(), and the like, but the aggregate() method allows for even more flexibility. It can take a string, a function, or a list thereof, and compute all the aggregates at once. Here is a quick example combining all these:
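A sketch of a combined aggregate() call, repeating the toy frame so the snippet is self-contained:

```python
import pandas as pd

df = pd.DataFrame({'key':   ['A', 'B', 'C', 'A', 'B', 'C'],
                   'data1': range(6),
                   'data2': [5, 0, 3, 3, 7, 9]})

# One call computing several statistics per group and per column.
result = df.groupby('key').aggregate(['min', 'median', 'max'])
print(result)
```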

Another useful pattern is to pass a dictionary mapping column names to operations to be applied on that column:
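The dictionary form, sketched on the same toy frame:

```python
import pandas as pd

df = pd.DataFrame({'key':   ['A', 'B', 'C', 'A', 'B', 'C'],
                   'data1': range(6),
                   'data2': [5, 0, 3, 3, 7, 9]})

# Different aggregate per column: min of data1, max of data2.
result = df.groupby('key').aggregate({'data1': 'min', 'data2': 'max'})
print(result)
```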

filtering
A filtering operation allows you to drop data based on the group properties. For example, we might want to keep all groups in which the standard deviation is larger than some critical value:
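A sketch of filter() on the toy frame (the threshold of 4 matches the discussion below; group A's data2 standard deviation is below it):

```python
import pandas as pd

df = pd.DataFrame({'key':   ['A', 'B', 'C', 'A', 'B', 'C'],
                   'data1': range(6),
                   'data2': [5, 0, 3, 3, 7, 9]})

def filter_func(x):
    # keep a group only if its data2 standard deviation exceeds 4
    return x['data2'].std() > 4

filtered = df.groupby('key').filter(filter_func)
print(filtered)
```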

The filter function should return a Boolean value specifying whether the group passes the filtering. Here because group A does not have a standard deviation greater than 4, it is dropped from the result.

transformation
While aggregation must return a reduced version of the data, transformation can return some transformed version of the full data to recombine. For such a transformation, the output is the same shape as the input. A common example is to center the data by subtracting the group-wise mean:
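Centering by the group-wise mean, sketched on the toy frame; the output has the same shape as the input:

```python
import pandas as pd

df = pd.DataFrame({'key':   ['A', 'B', 'C', 'A', 'B', 'C'],
                   'data1': range(6),
                   'data2': [5, 0, 3, 3, 7, 9]})

# Subtract each group's mean from its members (the grouping column is excluded).
centered = df.groupby('key').transform(lambda x: x - x.mean())
print(centered)
```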

the apply() method
The apply() method lets you apply an arbitrary function to the group results. The function should take a DataFrame, and return either a Pandas object (e.g., DataFrame, Series) or a scalar; the combine operation will be tailored to the type of output returned.

For example, here is an apply() that normalizes the first column by the sum of the second:
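A sketch of that normalization on the toy frame (selecting the two data columns first keeps the grouping column out of the function's input):

```python
import pandas as pd

df = pd.DataFrame({'key':   ['A', 'B', 'C', 'A', 'B', 'C'],
                   'data1': range(6),
                   'data2': [5, 0, 3, 3, 7, 9]})

def norm_by_data2(x):
    # x is the DataFrame of one group's values
    x = x.copy()
    x['data1'] = x['data1'] / x['data2'].sum()
    return x

normalized = df.groupby('key')[['data1', 'data2']].apply(norm_by_data2)
print(normalized)
```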

apply() within a GroupBy is quite flexible: the only criterion is that the function takes a DataFrame and returns a Pandas object or scalar; what you do in the middle is up to you!

Specifying the split key
In the simple examples presented before, we split the DataFrame on a single column name. This is just one of many options by which the groups can be defined, and we’ll go through some other options for group specification here.

a list, array, series, or index providing the grouping keys
The key can be any series or list with a length matching that of the DataFrame. For example:

Of course, this means there’s another, more verbose way of accomplishing the df.groupby('key') from before:

a dictionary or series mapping index to group
Another method is to provide a dictionary that maps index values to the group keys:
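A sketch on the toy frame: set the key as the index, then group by a dictionary that maps each index value to a coarser label:

```python
import pandas as pd

df = pd.DataFrame({'key':   ['A', 'B', 'C', 'A', 'B', 'C'],
                   'data1': range(6),
                   'data2': [5, 0, 3, 3, 7, 9]})

df2 = df.set_index('key')
mapping = {'A': 'vowel', 'B': 'consonant', 'C': 'consonant'}
sums = df2.groupby(mapping).sum()
print(sums)
```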

any Python function
Similar to mapping, you can pass any Python function that will input the index value and output the group:

a list of valid keys
Further, any of the preceding key choices can be combined to group on a multi-index:

Grouping example
As an example of this, in a couple lines of Python code we can put all these together and count discovered planets by method and by decade:

This shows the power of combining many of the operations we’ve discussed up to this point when looking at realistic datasets. We immediately gain a coarse understanding of when and how planets have been discovered over the past several decades!

Here I would suggest digging into these few lines of code, and evaluating the individual steps to make sure you understand exactly what they are doing to the result. It’s certainly a somewhat complicated example, but understanding these pieces will give you the means to similarly explore your own data.

Have I already mentioned that Jake 🚀 rockz!?

:mrgreen:

cit. & loll – 39

Uh! Thursday 😯 so here we go 💥

Nodejs is costing millions per year to naive companies who are adopting it
::: gignico

Wow, there are still official Windows 95(!!!) drivers for our office printer
::: lunaryorn

blog post for kids: Fun at the UNIX Terminal
::: johnregehr

A pretty hot topic
::: BryanMMathers

Daniel Bobrow tells the story of a #lisp version of Eliza and a BBN vice president
::: RainerJoswig

As a general rule, the quality and depth of a blog post
Uh! first 😜
::: lunaryorn

This guy is a software engineer
::: rvagg

How people cannot be fascinated by lisp languages
::: thek3nger

C is always faster than Python
::: jakevdp

1956. Workers moving a 5MB hard disk
::: gabrieligm

I don’t even know what this is in reference to but I agree
::: speakman

There are 10 types of people in the world
::: biorhythmist

You are awesome
::: Google+

Let’s camera like it’s 1999! Sony Mavica MVC-FD51, 640×480 JPGs on HD floppy disks
if someone (hey it's me!) remembers it, are they old?
::: paulrickards

Computing is abstraction engineering
::: Jose_A_Alonso

Almost every time I talk to someone in a bad job
::: danluu

When drones go wrong
::: MIT_CSAIL

we have to stop holding each other’s beers
::: joss

I love Windows updates
::: Flavio_MfM

I don’t get how this belittling nonsense (not to mention projection) goes unchallenged at #Google
::: Symbo1ics

Left unchecked, an experienced programmer’s productivity approaches infinity
::: StephanTLavavej

#LispAMovieTitle
::: josecalderon ::: josh_triplett ::: joeginder ::: joeginder ::: pt ::: peterseibel
::: jacobrothstein ::: martin_svanberg

Users don’t “fail to understand
::: MarinaMartin

Today I pass on my legacy
Peter rockz! 🚀 his Practical Common Lisp is the best out there
::: peterseibel

I wish someone showed me this figure when I was in my teens / early twenties
::: jiforrest

Debugged
::: ThePracticalDev

:mrgreen:

JavaScript 22 – data structures – objects and arrays – 7

Continuing from here, copying here.

More exercises 😯

Reversing an array
Arrays have a method reverse, which changes the array by inverting the order in which its elements appear. For this exercise, write two functions, reverseArray and reverseArrayInPlace. The first, reverseArray, takes an array as argument and produces a new array that has the same elements in the inverse order. The second, reverseArrayInPlace, does what the reverse method does: it modifies the array given as argument in order to reverse its elements. Neither may use the standard reverse method.

Thinking back to the notes about side effects and pure functions in the previous chapter, which variant do you expect to be useful in more situations? Which one is more efficient?

function reverseArray(arr) {
    var r_arr = [];
    var la = arr.length;
    for (var c = 0; c < la; c++) 
        r_arr.push(arr.pop());
    return r_arr;
}

console.log(reverseArray(["A", "B", "C"]));

Note: you need to determine the length of the array arr (variable la) before entering the loop, because the loop modifies it.

In the hints Marijn suggests using unshift(); in my opinion my code is simpler.

Reversing the array in place is harder. Oh yes, it took me quite a while 😡 then I gave up 😯 I thought I could use some built-in function, but no, a loop is needed. You swap the first and last elements of the array, then the second and the second-to-last, and so on up to the middle of the array. If the number of elements is odd, the middle one stays put; that's why floor() is used, truncating the division. Luckily, in practice you don't have to do these things 😄

function reverseArrayInPlace(arr) {
   for (var c = 0; c < Math.floor(arr.length / 2); c++) {
      var t = arr[c];
      arr[c] = arr[arr.length - 1 - c];
      arr[arr.length - 1 - c] = t;
   }
   return arr;
}

var array = [1, 2, 3, 4, 5];
reverseArrayInPlace(array);
console.log(array);

The fastest method is the first, but to save memory you should use the second, in my opinion, though not always, because the original array gets overwritten.

:mrgreen:

NumPy – 54 – aggregating and grouping – 1

Continuing from here, copying here.

An example of using Pandas on a topic Jake knows inside out 🚀. Since it's long, I'll split it into several posts.

An essential piece of analysis of large data is efficient summarization: computing aggregations like sum(), mean(), median(), min(), and max(), in which a single number gives insight into the nature of a potentially large dataset. In this section, we’ll explore aggregations in Pandas, from simple operations akin to what we’ve seen on NumPy arrays, to more sophisticated operations based on the concept of a groupby.

For convenience, we’ll use the same display magic function that we’ve seen in previous sections – I won't copy it again, it's always the same.

Planets data
Here we will use the Planets dataset, available via the Seaborn package (see Visualization With Seaborn [coming soon]). It gives information on planets that astronomers have discovered around other stars (known as extrasolar planets or exoplanets for short). It can be downloaded with a simple Seaborn command:
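The Seaborn command in question should be load_dataset (the snippet didn't survive the copy, so this is my reconstruction); note that it fetches the data over the network:

```python
import seaborn as sns

# Downloads the dataset from the online seaborn-data repository,
# so a network connection is required.
planets = sns.load_dataset('planets')
print(planets.shape)
```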

This has some details on the 1,000+ extrasolar planets discovered up to 2014.

Simple aggregation with Pandas
Earlier, we explored some of the data aggregations available for NumPy arrays (“Aggregations: Min, Max, and Everything In Between” [here]). As with a one-dimensional NumPy array, for a Pandas Series the aggregates return a single value:
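A minimal sketch with a hand-made Series (the numbers are mine, not the book's):

```python
import pandas as pd

ser = pd.Series([0.25, 0.5, 0.75, 1.0])

# Each aggregate collapses the whole Series to a single number.
print(ser.sum())   # 2.5
print(ser.mean())  # 0.625
```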


For a DataFrame, by default the aggregates return results within each column:

By specifying the axis argument, you can instead aggregate within each row:
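For instance, on a small illustrative DataFrame of my own:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': [10, 20, 30, 40]})

print(df.mean())                # one value per column
print(df.mean(axis='columns'))  # one value per row
```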

Pandas Series and DataFrames include all of the common aggregates mentioned in Aggregations: Min, Max, and Everything In Between [same link as above]; in addition, there is a convenience method describe() that computes several common aggregates for each column and returns the result. Let’s use this on the Planets data, for now dropping rows with missing values:
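Not having the Planets table at hand here, a toy stand-in shows the shape of the result; dropna() removes the rows with missing values first:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the Planets data, with one missing value.
df = pd.DataFrame({'mass': [0.5, 1.0, np.nan, 2.0],
                   'year': [1995, 2009, 2010, 2014]})

# count, mean, std, min, quartiles and max for each column
print(df.dropna().describe())
```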

This can be a useful way to begin understanding the overall properties of a dataset. For example, we see in the year column that although exoplanets were discovered as far back as 1989, half of all known exoplanets were not discovered until 2010 or after. This is largely thanks to the Kepler mission, which is a space-based telescope specifically designed for finding eclipsing planets around other stars.

The following table summarizes some other built-in Pandas aggregations:

Aggregation       Description
count()           Total number of items
first(), last()   First and last item
mean(), median()  Mean and median
min(), max()      Minimum and maximum
std(), var()      Standard deviation and variance
mad()             Mean absolute deviation
prod()            Product of all items
sum()             Sum of all items

These are all methods of DataFrame and Series objects.

To go deeper into the data, however, simple aggregates are often not enough. The next level of data summarization is the groupby operation, which allows you to quickly and efficiently compute aggregates on subsets of data.

Grouping, GroupBy: split, apply, combine
Simple aggregations can give you a flavor of your dataset, but often we would prefer to aggregate conditionally on some label or index: this is implemented in the so-called groupby operation. The name “group by” comes from a command in the SQL database language, but it is perhaps more illuminative to think of it in the terms first coined by Hadley Wickham of Rstats fame: split, apply, combine.

split, apply, combine
A canonical example of this split-apply-combine operation, where the “apply” is a summation aggregation, is illustrated in this figure:

This makes clear what the groupby accomplishes:

  • The split step involves breaking up and grouping a DataFrame depending on the value of the specified key.
  • The apply step involves computing some function, usually an aggregate, transformation, or filtering, within the individual groups.
  • The combine step merges the results of these operations into an output array.

While this could certainly be done manually using some combination of the masking, aggregation, and merging commands covered earlier, an important realization is that the intermediate splits do not need to be explicitly instantiated. Rather, the GroupBy can (often) do this in a single pass over the data, updating the sum, mean, count, min, or other aggregate for each group along the way. The power of the GroupBy is that it abstracts away these steps: the user need not think about how the computation is done under the hood, but rather thinks about the operation as a whole.
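Spelled out by hand on a small DataFrame of my own, the three steps look like this — which is exactly the bookkeeping that groupby hides:

```python
import pandas as pd

df = pd.DataFrame({'key': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'data': range(6)})

# split with boolean masks, apply sum(), combine into a Series
manual = pd.Series({k: df[df['key'] == k]['data'].sum()
                    for k in ['A', 'B', 'C']})
print(manual)  # A 3, B 5, C 7
```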

As a concrete example, let’s take a look at using Pandas for the computation shown in this diagram. We’ll start by creating the input DataFrame:
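The input in the diagram is a two-column key/data table; something along these lines:

```python
import pandas as pd

# Two-column table matching the split-apply-combine diagram.
df = pd.DataFrame({'key': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'data': range(6)})
print(df)
```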

The most basic split-apply-combine operation can be computed with the groupby() method of DataFrames, passing the name of the desired key column:

Notice that what is returned is not a set of DataFrames, but a DataFrameGroupBy object. This object is where the magic is: you can think of it as a special view of the DataFrame, which is poised to dig into the groups but does no actual computation until the aggregation is applied. This “lazy evaluation” approach means that common aggregates can be implemented very efficiently in a way that is almost transparent to the user.

To produce a result, we can apply an aggregate to this DataFrameGroupBy object, which will perform the appropriate apply/combine steps to produce the desired result:
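Sticking with a toy key/data table, the sum within each group comes out as:

```python
import pandas as pd

df = pd.DataFrame({'key': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'data': range(6)})

# sum() triggers the apply and combine steps at once
print(df.groupby('key').sum())
```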

The sum() method is just one possibility here; you can apply virtually any common Pandas or NumPy aggregation function, as well as virtually any valid DataFrame operation, as we will see in the following discussion.

To be continued 😉

:mrgreen: