Friday, April 30, 2010

The Pedestrian's Guide to Electoral Mathematics - Surveys and Slovin / Sloven's Formula Applications

Written by: RJ Nieto

Surveys. Surveys. Surveys.

I can't believe how clueless a lot of us are, and those that seem to be the smartest turn out to be on the other side of the fence. Here and there we see surveys, "counter-surveys", trending, and lawsuits arising from them, but do we really know what they are talking about in the first place?

If we don't, then we should shut up. But we don't wanna shut up, right? So let's try to get the big picture.

In a TV interview, Loren Legarda said:

"How can we believe the credibility of surveys, can 3000 respondents represent a voting population of 50 million people?"


I was shocked. At face value, what she said sounds reasonable, as she merely uses "common sense" to come up with such a notion. She's clueless.

Then I did some research.

In an entry entitled "Social Weather Station Survey Methods" by Billy Almarinez on Senator Kit Tatad's website, the following were stated:

SWS/Pulse Asia Facts:

1. SWS and Pulse Asia (the main firms who do the surveys) do not reveal how they collect sample data.

2. The only things that they make public are the following:
a. the number of respondents
b. the margin of error
c. they "probably" use Slovin's Formula, which is "a generally accepted way of how to determine the size appropriate for a sample to ensure better representation of the population of a known size."


Furthermore, without clearly showing how he arrived at the result, Almarinez says the following in his paper:

Almarinez: "In the context of election-related surveys, that would indicate that the survey results may be representative of a population of 6,429 registered voters nationwide."

Now, as a UP Diliman Mathematics Major, I can't help but scratch my forehead.

His math is a little misleading.

Why? Let's do this systematically, from basic theory to computations to conclusion.

Question 1: What is Slovin's Formula and How is it applied in this situation?


Slovin's Formula
In Statistics, Slovin's formula is the generally accepted way to determine the appropriate sample size given the total population and a pre-specified margin of error. It is stated as:



n=[N]/[1+Ne^2]

where:

N = the total population, which in our context is 50 000 000, based on an estimate from the COMELEC.

e = the margin of error. A smaller e means a more accurate result.

n = needed sample size. This is the number of respondents who have to be interviewed/analyzed in order to make accurate predictions.


Now, let's do some basic math:

We want to know how many people we need to analyze (n) so that we can accurately predict what a population (N) of 50 million voters will do come the elections, with a margin of error (e) of 2%,

i.e.

n = 50 000 000 / [1 + 50 000 000 (0.02^2)] = 2 499.87501 ---> 2500 respondents.
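If you would rather let a computer do the arithmetic, here is a minimal Python sketch of the same computation. The function name slovin_sample_size is just a label I made up for illustration; the formula and the inputs are the ones given above.

```python
import math

def slovin_sample_size(population, margin_of_error):
    """Slovin's formula: n = N / (1 + N * e^2)."""
    return population / (1 + population * margin_of_error ** 2)

N = 50_000_000  # estimated number of voters used in this article
e = 0.02        # 2% margin of error

n = slovin_sample_size(N, e)
print(round(n, 5))   # the raw value from the formula
print(math.ceil(n))  # rounded up, since we cannot interview a fraction of a person
```

Rounding up rather than down is a choice on my part: a slightly larger sample never makes the margin of error worse.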


Question 2: How many people do the survey firms analyze?

Pulse Asia and SWS survey around 1800-2200 respondents for every release, which raises the question:


Question 3: For a country of 50 million voters, and considering that we need about 2500 to get an accurate prediction, why the heck did they interview so few?

I don't know.

Question 4: Does this make their results invalid?

Not necessarily. Let's do a little algebra:

n=[N]/[1+Ne^2] {Slovin's Formula}

n+nNe^2 = N {after cross-multiplying}

nNe^2 = N - n {after transposing n to the right}

e^2 = (N - n) / nN {after dividing both sides by nN}

e^2 = (1/n) - (1/N) {after simplifying}

e = sqrt [(1/n) - (1/N)]


What does this mean?

This gives the margin of error in terms of the number of respondents and the total population. If SWS and Pulse Asia interview a minimum of 1800 people, then with a population of 50 000 000 voters, we get:


e = sqrt(1/1800 - 1/50 000 000) = 0.0235698018 = 2.36% margin of error
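The rearranged formula can be checked the same way. This is only a sketch: slovin_margin_of_error is a hypothetical name of mine, and the inputs are the 1800 respondents and 50 million voters used above.

```python
import math

def slovin_margin_of_error(sample_size, population):
    """Slovin's formula solved for e: e = sqrt(1/n - 1/N)."""
    return math.sqrt(1 / sample_size - 1 / population)

e = slovin_margin_of_error(1800, 50_000_000)
print(f"{e:.10f}")  # the margin of error as a decimal
print(f"{e:.2%}")   # the same value as a percentage
```

Notice that the 1/N term is tiny next to 1/n here, so the size of the voting population barely moves the result.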


Which means that at worst, the margin of error is at 2.4%, which is still VERY LOW.

Question 5: Where did Loren Legarda go wrong?

Well, she's an incredibly smart woman, but apparently she's no smarter in Math than the average Juan. She says that SWS analyzes 3000 samples for a population of 50 million, and then, in the same breath, says that the results are invalid. Seemingly, she does not have ANY idea about what she is talking about, because all the mathematics is against her.
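To put a number on her own example, here is a rough sketch that applies the rearranged formula e = sqrt(1/n - 1/N) to the 3000 respondents she mentioned, alongside the 1800 and 2200 figures cited earlier. The helper name margin_for is hypothetical.

```python
import math

def margin_for(sample_size, population=50_000_000):
    """Margin of error implied by Slovin's formula for a given sample size."""
    return math.sqrt(1 / sample_size - 1 / population)

# 1800 and 2200 are the range cited for the survey firms; 3000 is Legarda's figure.
for n in (1800, 2200, 3000):
    print(f"n = {n}: e = {margin_for(n):.2%}")
```

By this formula, 3000 respondents would give a margin below the 2% we asked for earlier, so her figure would be more than enough.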

Question 6: Where did Professor Almarinez go wrong?


This is the saddest part of my article. Another round of Math please.

n=[N]/[1+Ne^2] {Slovin's Formula}

n+nNe^2 = N {after cross-multiplying}

nNe^2 - N = - n {transpose n to the right, N to the left}

N[ne^2 - 1] = - n {factor out N}

N = -n / [ne^2 - 1]


Now, let's see:

If SWS analyzes 1800 samples [worst-case scenario] with a proposed margin of e = 2% = 0.02,

N = -1800/(1800*.02^2-1) = 6 428.57143 ~ 6429 people, which is what Almarinez got.
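Here is a small sketch of that inversion, written as N = n / (1 - ne^2), which is the same expression with the signs flipped. The name implied_population and the choice to return None when no positive N exists are my own assumptions for illustration. It also shows how touchy the result is to the exact value of e.

```python
def implied_population(sample_size, margin_of_error):
    """Slovin's formula solved for N: N = n / (1 - n * e^2).

    Returns None when n * e^2 >= 1, because then no finite positive
    population satisfies the formula.
    """
    denominator = 1 - sample_size * margin_of_error ** 2
    if denominator <= 0:
        return None
    return sample_size / denominator

# Same 1800 respondents, slightly different values of e.
for e in (0.02, 0.0235, 0.0236):
    print(e, implied_population(1800, e))
```

At e = 0.02 this gives the 6429 figure, while nudging e up to 0.0235 pushes the implied population into the hundreds of thousands, and at 0.0236 the formula has no valid answer at all.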

But then, he did something wrong.

He assumed that the error is EXACTLY 2%, not putting in mind the extra decimal places after it.

Why? If I were someone who will publish news for public consumption, I would round figures to make it more readable.

Hence, "2%" may mean anything between 1.50% and 2.49%, meaning that e can be "as bad as 2.49%" and still be called 2% in the newspapers.

Through simple arithmetic, at the worst-case scenario of e = 2.49% = 0.0249:

n = 50 000 000/(1+50 000 000*0.0249^2) = 1 612.82519 ~ 1613 respondents.

This means that for a population of 50 million and an error of 2.49%, we need to interview only 1613 people.

With a little more tweaking, and referring back to the calculation above, e = 0.0235698018 gives n ≈ 1800.
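To see the whole range at a glance, here is a sketch that runs Slovin's formula for several margins that could all be reported as "2%" after rounding. The function name and the particular values of e in the loop are my own picks for illustration.

```python
def slovin_sample_size(population, margin_of_error):
    """Slovin's formula: n = N / (1 + N * e^2)."""
    return population / (1 + population * margin_of_error ** 2)

N = 50_000_000  # estimated number of voters used in this article

# Margins from 1.50% up to 2.49% all round to 2%.
for e in (0.0150, 0.0200, 0.0236, 0.0249):
    n = slovin_sample_size(N, e)
    print(f"e = {e:.2%} -> n = {n:,.0f}")
```

The required sample size swings from more than 4000 down to about 1600 inside that one rounded figure, which is why the exact margin matters.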

So how did he go wrong?

He took things literally, without even asking SWS or Pulse Asia about the EXACT figure for the error margin. He did all the math right, but lacked common sense. In the field of statistics, theory is not everything. Intuition also plays a big part.

Final Questions:



1. From all these, are the SWS/Pulse Asia Surveys invalid?

Not necessarily. You see, a margin of about 2.5% is not bad.

In short, the surveys still make sense as far as theory goes, and so Almarinez's point about the noncompliance of SWS and Pulse Asia with the "dogma" that is Slovin's formula is invalid. He did the basic math right, but the math that he used is too basic.

2. So the Math is right, where could the survey firms go wrong?

The Questionnaire.

Suppose you cannot read, or are in a hurry, or have not yet done your research. Then someone approaches you and asks who you will vote for, perhaps in front of a friend/boss/acquaintance who will see your answers. What would you do?

Some of the scenarios:

a. tick the first option, just to get it over with
b. tick a random option, again, just to get it over with.
c. tick on a candidate that is aligned to your companion's views, in order to avoid an argument
d. tick undecided to avoid an argument with your friend, but you know for yourself that your mind is already made up.

But since we have NO idea how SWS and Pulse Asia collect their data, all of this is left to speculation.


3. So what do I do with those survey results?

The reasonable way to utilize those results will be to use them as a guide on how to plan further action. If you feel that your candidate is lagging behind based on what the surveys say, then act accordingly. If your candidate is ahead then continue whatever you're doing because you are probably doing it right.

But to simply claim that surveys sway public opinion "just because they do" is something that should not come from someone who has had sufficient secondary education.

To simply shrug science away, just because it is inconvenient for us, is a very very bad idea. Just look at what happened in Europe's middle ages. Do not let history repeat itself.
