Mathematics

Vote migration calculation

A statistical method for estimating the migration of votes between parties in an election

When a political election takes place, the results are usually widely publicized: one party gained so many votes, another lost so many, and so on. From both a political and a mathematical point of view, however, it is interesting to know not only the final results, i.e. the percentages of votes obtained by the various parties, but also the flows of votes from one party to another. For example, if party A had $x_A$ votes in the previous election and gained $\delta x_A$ in the current one, it may be useful to know which parties those votes came from or, if it lost votes, which parties they went to. The purpose of this article is to derive the formulae for calculating the flows of votes between parties in an election.

Let $N$ be the number of polling stations, $n$ the number of registered parties, and let $p_{ij}$, with $1 \le i, j \le n$, be the migration probabilities, i.e. the probability that a voter who voted for the $i$-th party in the previous election votes for the $j$-th party in the current one. Finally, let $u_{k/i}$ be the percentage of votes obtained by the $i$-th party at the $k$-th polling station in the previous election, and $v_{k/i}$ the percentage obtained in the current election. With these definitions, at the $k$-th polling station the $i$-th party loses on average a percentage $u_{k/i} \, p_{ij}$ of votes to the $j$-th party in the current election. Summing over all parties gives the most probable percentage of votes obtained by the $j$-th party at the $k$-th polling station in the current election:

$$\sum_{i=1}^{n} u_{k/i} \, p_{ij} = \underline{v}_{k/j}.$$

Here the symbol $\underline{v}_{k/j}$ is underlined to indicate that it is not the actual percentage but the value predicted by the migration probabilities. Note that the sum also includes the contribution of voters who remained faithful to their previous party, i.e. the term containing the diagonal probability $p_{jj}$. For the election as a whole, covering all polling stations, the following relations hold:

$$u_i = \sum_{k=1}^{N} u_{k/i}, \qquad v_i = \sum_{k=1}^{N} v_{k/i}.$$
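As an illustration of the prediction formula and of the totals just defined, here is a minimal Python sketch. The numbers are hypothetical toy data, not results from any real election, and the function names `predicted_shares` and `party_totals` are introduced only for this example. The shares are assumed to be fractions of the whole electorate, so that summing over polling stations gives the party totals.

```python
# Hypothetical toy data: 3 polling stations (rows), 2 parties (columns).
# u[k][i] is the share of party i at station k in the previous election,
# expressed as a fraction of the whole electorate (an assumption).
u = [[0.20, 0.10],
     [0.10, 0.30],
     [0.10, 0.20]]

# Assumed migration probabilities p[i][j]; each row sums to 1.
p = [[0.9, 0.1],
     [0.2, 0.8]]


def predicted_shares(u, p):
    """Return v_est[k][j] = sum over i of u[k][i] * p[i][j]."""
    n = len(p)
    return [[sum(row[i] * p[i][j] for i in range(n)) for j in range(n)]
            for row in u]


def party_totals(shares):
    """Sum the per-station shares of each party over all stations."""
    n = len(shares[0])
    return [sum(row[i] for row in shares) for i in range(n)]


v_est = predicted_shares(u, p)   # predicted (underlined) shares per station
u_tot = party_totals(u)          # totals in the previous election
v_tot = party_totals(v_est)      # predicted totals in the current election

print(v_est)
print(u_tot, v_tot)
```

The diagonal entries of `p` carry the loyal voters, so they contribute to `v_est` exactly like the off-diagonal ones.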

The problem now is to determine the migration probabilities $p_{ij}$ which, multiplied by the percentages of the previous election and summed over all polling stations, give the estimated percentage flow of votes from party $i$ to party $j$:

$$\sum_{k=1}^{N} u_{k/i} \, p_{ij} = v_{i \to j}.$$
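Once the migration probabilities are known, the flow formula is a one-line computation. The sketch below uses hypothetical toy data and an illustrative function name, `migration_flows`; the probabilities are simply assumed here rather than estimated.

```python
# Hypothetical toy data: shares of 2 parties at 3 polling stations in the
# previous election, as fractions of the whole electorate (an assumption).
u = [[0.20, 0.10],
     [0.10, 0.30],
     [0.10, 0.20]]

# Assumed migration probabilities p[i][j]; each row sums to 1.
p = [[0.9, 0.1],
     [0.2, 0.8]]


def migration_flows(u, p):
    """Return flows[i][j] = (sum over k of u[k][i]) * p[i][j]."""
    n = len(p)
    return [[sum(row[i] for row in u) * p[i][j] for j in range(n)]
            for i in range(n)]


flows = migration_flows(u, p)
for i, row in enumerate(flows):
    for j, flow in enumerate(row):
        print(f"party {i} -> party {j}: {flow:.2%}")
```

The diagonal entries `flows[i][i]` are the voters who stayed with party `i`, and each column of `flows` adds up to the predicted total of the receiving party.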

To this end, consider for each party the half-sum $A_i(p_{ji})$, taken over all polling stations, of the squared differences between the percentages predicted for the current election by the migration probabilities, $\underline{v}_{k/i}$, and those actually obtained, $v_{k/i}$:

$$A_i(p_{ji}) = \frac{1}{2} \sum_{k=1}^{N} \left( \underline{v}_{k/i} - v_{k/i} \right)^2 = \frac{1}{2} \sum_{k=1}^{N} \Big( \sum_{j=1}^{n} u_{k/j} \, p_{ji} - v_{k/i} \Big)^2.$$
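The half-sum can be evaluated directly for any candidate set of probabilities. The following sketch uses hypothetical toy data for both elections; the function name `half_sum_of_squares` is an illustrative choice, not part of the original method.

```python
# Hypothetical toy data: 3 polling stations, 2 parties.
u = [[0.20, 0.10],   # shares in the previous election
     [0.10, 0.30],
     [0.10, 0.20]]
v = [[0.21, 0.09],   # shares actually observed in the current election
     [0.15, 0.25],
     [0.12, 0.18]]

# Candidate migration probabilities p[j][i] (assumed, rows sum to 1).
p = [[0.9, 0.1],
     [0.2, 0.8]]


def half_sum_of_squares(u, v, p):
    """Return [A_0, ..., A_{n-1}], one half-sum of squared residuals per party."""
    n = len(p)
    a = [0.0] * n
    for uk, vk in zip(u, v):
        for i in range(n):
            # Predicted share of party i at this station.
            predicted = sum(uk[j] * p[j][i] for j in range(n))
            a[i] += 0.5 * (predicted - vk[i]) ** 2
    return a


a = half_sum_of_squares(u, v, p)
print(a)
```

Each `a[i]` depends only on column `i` of `p`, which is why the minimization can be carried out separately for each party.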

The statistically most likely migration probabilities are then obtained by requiring that each $A_i(p_{ji})$ be a minimum, i.e. that $\partial A_i / \partial p_{ji} = 0$ for all $i, j$:

$$\frac{\partial A_i}{\partial p_{ji}} = \sum_{k=1}^{N} \Big( \sum_{l=1}^{n} u_{k/l} \, p_{li} - v_{k/i} \Big) u_{k/j} = 0. \tag{1}$$
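Taken on its own, condition (1) is an ordinary least-squares problem for each party $i$: the matrix with entries $\sum_k u_{k/j} u_{k/l}$ multiplies the unknowns $p_{li}$, and the right-hand side is $\sum_k u_{k/j} v_{k/i}$. The sketch below solves it with a small hand-written Gaussian elimination so that it needs no external library. The data are hypothetical, constructed so that the observed shares are reproduced exactly by known probabilities, and the function names are illustrative only.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(a[r]) + [b[r]] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def unconstrained_migration(u, v):
    """Solve condition (1) for every party i, ignoring any other constraint."""
    n = len(u[0])
    # gram[j][l] = sum over k of u[k][j] * u[k][l]
    gram = [[sum(row[j] * row[l] for row in u) for l in range(n)]
            for j in range(n)]
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        rhs = [sum(uk[j] * vk[i] for uk, vk in zip(u, v)) for j in range(n)]
        column = solve_linear(gram, rhs)
        for l in range(n):
            p[l][i] = column[l]
    return p


# Hypothetical data generated from p_true = [[0.9, 0.1], [0.2, 0.8]].
u = [[0.20, 0.10], [0.10, 0.30], [0.10, 0.20]]
v = [[0.20, 0.10], [0.15, 0.25], [0.13, 0.17]]

p = unconstrained_migration(u, v)
print(p)
```

With noise-free data the generating probabilities are recovered. With real data nothing in this step forces the rows of `p` to sum to 1 or its entries to lie between 0 and 1.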

Condition (1) gives a system of $n^2$ non-homogeneous linear equations in the $n^2$ unknowns $p_{ij}$. We must also require that the $p_{ij}$ actually represent probabilities, which means that for each $i$ the following relation must hold:

$$\sum_{j=1}^{n} p_{ij} = 1. \tag{2}$$

The sum runs over the second index, since it must account for all the votes leaving the $i$-th party. Condition (2) adds $n$ further linear equations in the unknowns $p_{ij}$ to system (1), so the total number of equations to be satisfied becomes $n(n+1)$. Assuming that these equations are linearly independent, the system can be solved only if $n$ further unknowns are introduced. Which ones? The meaning of the system itself suggests the answer. Since the quantities we want to determine are estimates subject to statistical fluctuations, we should not expect the migration probabilities $p_{ij}$ to satisfy the $n^2$ equations (1) exactly. It therefore makes sense to introduce new unknowns $\delta_i$ that account for these discrepancies. With this addition, equations (1) become:

$$\sum_{k=1}^{N} \Big( \sum_{l=1}^{n} u_{k/l} \, p_{li} - v_{k/i} \Big) u_{k/j} = \delta_i. \tag{3}$$

The system formed by equations (2) and (3), in the unknowns $p_{li}$ and $\delta_i$, now consists of $n(n+1)$ linear equations in $n(n+1)$ unknowns, and is therefore solvable provided the equations are linearly independent, as assumed above. At this point one might wonder why the index $i$, rather than $j$, was used for $\delta$ in equation (3). The definition of the $A_i$ shows that each index $i$ is associated with a subsystem of $n$ equations. The choice made therefore amounts to adding one independent unknown $\delta_i$ to each of these subsystems.
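For implementation purposes it may help to restate equations (2) and (3) in matrix form. The symbols below are notation introduced here for convenience and do not appear in the derivation above: $U$ and $V$ are the $N \times n$ matrices with entries $u_{k/l}$ and $v_{k/i}$, $P$ is the $n \times n$ matrix with entries $p_{li}$, $\delta$ is the column vector of the $\delta_i$, and $\mathbf{1}$ is the column vector of $n$ ones.

$$U^{\mathsf{T}} \left( U P - V \right) = \mathbf{1} \, \delta^{\mathsf{T}}, \qquad P \, \mathbf{1} = \mathbf{1}.$$

The entry in row $j$ and column $i$ of the first relation is equation (3), and row $i$ of the second relation is equation (2). In this form the coefficient matrix of the combined system can be assembled for a given $U$ and its rank checked numerically, which is one way of testing the linear-independence assumption on actual data.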