MANILA, Philippines — With just four days left until Election Day, some of the country’s top statisticians find themselves at loggerheads and wondering whether survey designs need to be updated to more accurately reflect the sentiment of the public in light of the survey results which indicate a victory for Ferdinand Marcos Jr.
Statistical experts like Romulo Virola, former secretary general of the National Statistical Coordination Board (NSCB), and Dr. Peter Cayton of the University of the Philippines believe that recent Pulse Asia surveys showing Marcos well ahead of his closest rival, Vice President Leni Robredo, had underrepresented and overrepresented certain sectors.
Both think that those in classes A and B, as well as the 18-41 age group, were underrepresented, while those in classes D and E were overrepresented. Virola also thinks those who reached college were underrepresented.
Cayton said overrepresentation or underrepresentation means that the proportion of a given group in a survey’s sample may be higher or lower than what is generally expected of the larger population.
Virola clarified that he did not believe Pulse Asia was using a bad sampling method, but that the overrepresentation and underrepresentation were the result of its post-stratification process, which focused on regional stratification over the profiles of socio-demographic groups (SDGs).
He noted that several studies and polls abroad have shown that age, class and level of education have greater impacts on voter preferences.
Virola attempted to address these “flaws” and reassessed the results of the Pulse Asia survey from March 16-21, 2022 showing a 56-24 gap between Marcos and Robredo.
He did this using three sources: the 2017 Socio-Economic Classification System (1SEC) developed by the UP School of Statistics, to adjust for the underrepresentation of the ABC classes; the Philippine Statistics Authority’s distribution of the voting-age population by educational attainment, to adjust for the underrepresentation of those who reached college; and Comelec data on registered voters by age, to adjust for the underrepresentation of young voters.
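The kind of adjustment described above can be illustrated with a minimal post-stratification sketch. This is not Virola's actual computation: the class shares and support levels below are hypothetical numbers chosen only to show the mechanics, and the function name `poststratify` is an assumption for illustration.

```python
def poststratify(sample_share, population_share, support_by_group):
    """Reweight group-level support so each group counts by its population share.

    All inputs are dicts keyed by group name. Shares are fractions that sum to 1.
    """
    # Weight > 1 means the group was underrepresented in the sample.
    weights = {g: population_share[g] / sample_share[g] for g in sample_share}
    # Unadjusted estimate: groups count by their share of the sample.
    raw = sum(sample_share[g] * support_by_group[g] for g in sample_share)
    # Adjusted estimate: groups count by their share of the population.
    adjusted = sum(
        sample_share[g] * weights[g] * support_by_group[g] for g in sample_share
    )
    return weights, raw, adjusted


# Hypothetical figures, not taken from any Pulse Asia survey.
sample_share = {"ABC": 0.10, "D": 0.70, "E": 0.20}
population_share = {"ABC": 0.20, "D": 0.60, "E": 0.20}
support_for_candidate = {"ABC": 0.40, "D": 0.60, "E": 0.55}

weights, raw, adjusted = poststratify(
    sample_share, population_share, support_for_candidate
)
print(f"weights: {weights}")
print(f"raw estimate: {raw:.3f}, adjusted estimate: {adjusted:.3f}")
```

In this made-up case the underrepresented group backs the candidate less than the others, so giving it its full population weight lowers the candidate's overall share. How far the estimate moves depends entirely on the assumed support within each group.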
Given that the numbers barely budged from the March 16-21, 2022 survey to the April 16-21, 2022 survey, and that Pulse Asia has not changed its methodology since the first poll, “whatever the issue since the very beginning was still there,” Virola told the Inquirer.
He admitted, however, that his calculations were based on an “arbitrary” split of votes (60-40 in favor of Robredo), resting on the assumption that there were relatively more Robredo supporters among young people as well as among those with higher educational and socio-economic backgrounds.
Those assumptions, he said, were based in part on Google Trends data showing massive interest in Robredo. “Even if these are arbitrary measures, I don’t think they are unreasonable given what is happening on the ground,” he said, referring to the massive rallies for Robredo.
His calculations show that Marcos would still be ahead even after adjusting the national tally by socioeconomic class (53.7% vs. 29.3%) and level of education (48.8% vs. 31.2%).
However, adjusting the vote among people aged 18-41 and 42-57 shows Robredo narrowly taking the lead with 40.4% to 39.6%.
Virola’s calculations were intended to address the gaps in Pulse Asia’s sampling. But opinions differ on whether oversampling or undersampling has meaningful implications for survey design.
Cayton said it could mean “some inherent deviation, at the very least.”
“If a group is underrepresented or overrepresented, the estimates tend to be a bit more biased in that they favor the overrepresented group over the underrepresented group,” he said. “If the deviation is very large, it could affect the results in terms of reliability and accuracy.”
Cayton also tried ensemble methods that combined the Pulse Asia survey with Google Trends data, on the assumption that big data could also be a reliable measure of public opinion.
His calculations also bring Marcos and Robredo to a statistical tie. But he is also the first to admit that “there are a lot of heavy assumptions under this model”.
Men Sta. Ana, coordinator of the think tank Action for Economic Reforms, said the sampling used by Pulse Asia was “close to the true distribution”, especially since the demographic profile of the respondents emerged only after the random survey was conducted.
“Random variation is not systematic bias. This happens precisely because the outcome is random,” he said. A well-designed random survey “will result in insignificant random variation”, he added.
Even without members of Classes A and B — who are notoriously difficult to interview and belong to the top 1% of households — in the mix, the variance would still be very low, Sta. Ana said.
Pulse Asia defended its methodology, which it had used for decades.
The margin of error for each SDG reflected the “variance for the SDG”, given its share in the total survey sample. It also corrected “to a large extent what Dr. Virola finds to be undersampling/oversampling of specific SDGs,” Pulse Asia President Ronald Holmes said in a statement.
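The link between a subgroup's share of the sample and its margin of error can be sketched with the textbook formula for a proportion under simple random sampling. This is an illustration only: the sample sizes below are assumed, the function `margin_of_error` is hypothetical, and the formula ignores the design effects of a multistage sample, so it does not reproduce Pulse Asia's own figures.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion.

    Assumes simple random sampling; p=0.5 is the worst case (largest variance).
    """
    return z * math.sqrt(p * (1 - p) / n)


# Assumed sizes for illustration: a national sample and a subgroup
# that makes up one-tenth of it.
national_n = 2400
subgroup_n = national_n // 10

print(f"national (n={national_n}): +/-{margin_of_error(national_n):.1%}")
print(f"subgroup (n={subgroup_n}): +/-{margin_of_error(subgroup_n):.1%}")
```

Under these assumptions the national estimate carries a margin of about 2 points, while the subgroup estimate carries about 6 points, which is why figures for small groups are read with more caution.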
He dismissed claims that Pulse Asia had been “bought out” and its work compromised. Creating such doubts about scientific polls “only deepens polarization and distrust and contributes to the continued erosion of an already extremely weak democratic order,” Holmes said.
“Those who make these unfair criticisms bear the responsibility for their baseless accusations that fuel the spiral of misinformation that plagues our society,” he said.
Research companies, like Pulse Asia, use location-based multistage probability sampling. Thus, data on socio-economic classes come after the survey when respondents are grouped into classes.
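Why class data comes only after the survey can be seen in a minimal sketch of multistage area sampling. Everything here is hypothetical: the sampling frame, the class mix, the sample sizes and the names `build_frame` and `multistage_sample` are assumptions for illustration, not a description of Pulse Asia's actual design.

```python
import random
from collections import Counter


def build_frame(seed=1):
    """Build a made-up frame: 3 regions, 10 areas each, 50 households per area."""
    rng = random.Random(seed)
    classes = ["ABC", "D", "E"]
    class_mix = [0.1, 0.7, 0.2]  # hypothetical population shares
    return {
        f"region_{r}": {
            f"area_{r}_{a}": [
                {"id": f"{r}-{a}-{h}", "class": rng.choices(classes, class_mix)[0]}
                for h in range(50)
            ]
            for a in range(10)
        }
        for r in range(3)
    }


def multistage_sample(frame, areas_per_region, households_per_area, seed=0):
    """Stage 1: draw areas within each region. Stage 2: draw households per area."""
    rng = random.Random(seed)
    sample = []
    for region, areas in frame.items():
        for area in rng.sample(sorted(areas), areas_per_region):
            sample.extend(rng.sample(areas[area], households_per_area))
    return sample


frame = build_frame()
sample = multistage_sample(frame, areas_per_region=4, households_per_area=5)

# Class is never used to select households; it is only tallied afterwards.
class_counts = Counter(household["class"] for household in sample)
print(len(sample), dict(class_counts))
```

Because selection runs on location alone, the class mix of the sample is whatever the draw happens to produce, and it can differ from the population mix from one survey to the next.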
According to Jose Ramon Albert, senior researcher at the public think tank Philippine Institute for Development Studies (PIDS), it is impossible to sample households across socioeconomic and income groups because no one has a complete list.
“The Pulse Asia and SWS (Social Weather Stations) tables that list [socioeconomic status] are ‘afterthoughts’ from the data collected through the survey, as are the tables on [PIDS] income groups. These are themselves data from the surveys,” Albert said.
Google rightly warns that the information available on its search trends page is not a substitute for polling data, as users may want to know more about a party or politician for a number of reasons, without intending to vote for them.
Online surveys and trends reflect public preference and interest at the time the survey was conducted or the data was collected, but people can and do change their minds until election day.
After the 2016 presidential elections, SWS exit polls showed that many voters settled on their choice for president fairly late: 18% of respondents said they made their choice only on Election Day itself; another 15% decided only during the May 1-8 period; 12% made their decision in April; 8% in March; and 46% in February or earlier.
—WITH AN INQUIRER RESEARCH REPORT