Data Critique

We worked with the “Norms and Decision-Making” dataset from the World Bank Gender Data Portal, focusing on the data covering women’s rights and capacities in 43 Sub-Saharan African countries split into five sub-regions. We covered 10 indicators on marriage and home life as they relate to social norms and legal limitations on gender equality in Sub-Saharan Africa.

What information, events, or phenomena does your dataset illuminate?

These indicators can be broken down into four main categories: (1) decision-making, (2) legal rights, (3) knowledge, and (4) beliefs. Decision-making indicators concern women’s agency in making choices. Legal rights indicators focus on regulations that limit women’s capacities relative to men’s. Knowledge-based indicators cover issues such as knowledge of contraception methods and of HIV/AIDS. Finally, beliefs-related indicators include questions about women’s beliefs regarding women’s rights. Data types within the set are primarily nominal (responses to yes/no questions), interval (collection years), and numeric (percentages). Different indicators correspond to different data types and to different sources.

 

How was the data generated?

The dataset was generated by the World Bank’s Gender Group and the Development Economics Data Group (DECDG) and is updated at least four times a year (April, July, September, and December). DECDG provides API documentation for several tools, including Python, R, and Stata, for retrieving the data. Stata automatically includes regions and income groups once the data is selected, while R and Python require users to merge (join) the region and income group data onto the indicator data frame. R was used for data visualizations as well as statistical analyses, while Stata was used for statistical analyses.
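For Python users, the merge step described above might look roughly like the following. This is only a minimal sketch using pandas: the column names (`country_code`, `region`, `income_group`), the country codes, and all values are placeholders we made up for illustration, not real World Bank fields or figures.

```python
import pandas as pd

# Placeholder indicator rows (NOT real World Bank values).
indicators = pd.DataFrame({
    "country_code": ["AAA", "BBB", "CCC"],
    "year": [2018, 2018, 2019],
    "value": [41.0, 57.5, 63.2],
})

# Placeholder country metadata; "CCC" is deliberately left out
# to show what an unmatched country looks like after the merge.
country_meta = pd.DataFrame({
    "country_code": ["AAA", "BBB"],
    "region": ["Region 1", "Region 2"],
    "income_group": ["Low income", "Lower middle income"],
})

# A left join keeps every indicator row, even when no metadata matches.
# validate="many_to_one" raises an error if the metadata table
# contains duplicate country codes.
merged = indicators.merge(
    country_meta,
    on="country_code",
    how="left",
    validate="many_to_one",
)

# Countries with no region assigned after the merge.
unmatched = merged.loc[merged["region"].isna(), "country_code"].tolist()
print(merged)
print("Unmatched countries:", unmatched)
```

A left join is used here so that a country missing from the metadata shows up as a gap to investigate rather than silently dropping out of the analysis.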

 

What are the original sources?

The dataset draws from three primary sources: (1) the World Bank’s Women, Business and the Law (WBL) project, (2) Demographic and Health Surveys (DHS) compiled by the United Nations Population Fund (UNFPA), and (3) Multiple Indicator Cluster Surveys (MICS) implemented under the United Nations Children’s Fund (UNICEF). WBL is the source for the legal-rights indicators within the dataset. Running since 2009, it collects data on laws and regulations that affect gender-based economic opportunities in 190 economies. Women make up 30 of the 33 members of WBL’s research and data collection team. Next, the DHS compiled by UNFPA are the source for the decision-making, knowledge, and beliefs indicators. UNFPA is the UN agency for sexual and reproductive health and aims to promote gender equality. It was established in 1969 and has partners in over 150 countries. The DHS it compiles are “nationally-representative household surveys that provide data… in the areas of population, health, and nutrition.” Finally, the MICS implemented under UNICEF are used as a source for beliefs indicators in the dataset. These surveys have been carried out since the 1990s in 118 countries, generating “data on key indicators on the well-being of children and women” with a “trained fieldwork [team that conducts] face-to-face interviews with household members on a variety of topics.” Overall, it is important to note that this dataset draws from a variety of sources, each supplying particular indicators. Funding for research and data collection also plays a large role in shaping the data and its presentation.


What organization funded the creation of the dataset?

Our dataset is primarily funded by the World Bank Group (WBG), which is composed of five organizations. The International Bank for Reconstruction and Development (IBRD) and the International Development Association (IDA) act as its foundation. The other members are the International Finance Corporation, the Multilateral Investment Guarantee Agency, and the International Centre for Settlement of Investment Disputes. The World Bank Group’s mission is to “end extreme poverty and boost shared prosperity on a livable planet.” Each data source is, in turn, funded by numerous groups. The WBL project has many contributors outside of the WBG, including the Bill & Melinda Gates Foundation, the Childcare Incentive Fund, the Jobs Umbrella Multi-Donor Trust Fund, the Knowledge for Change Program, the State and Peacebuilding Fund, the United States Agency for International Development, the William and Flora Hewlett Foundation, and the Human Rights, Inclusion and Empowerment Umbrella Trust Fund. Another source, the MICS, is funded by UNICEF, which receives voluntary contributions from governments, intergovernmental organizations, foundations, the private sector, and individuals. The WBG and UNICEF are two of the most notable contributors to the funding of our dataset; others include the United Nations Population Fund and the World Health Organization. While multiple funding sources support a well-rounded dataset and limit the influence of any single funder, the potential for bias is still important to consider.

 

What information is left out and cannot be revealed by the dataset?

We must also acknowledge the silences in our dataset. First, the data noticeably lacks demographic breakdowns of the women participating in the DHS and MICS surveys. Similarly, the World Bank acknowledges that it uses standardized assumptions in order to compare women’s statuses across different economies, such as assuming that a woman is located in a main business city and is in a marriage that is socially acceptable. These factors can affect the way women are treated by their community, so assuming that all women fall under the most ideal demographic, or ignoring which demographics they belong to, means we may not have accurate information about marginalized groups of women. Additionally, the World Bank data analyzes laws as they are written (de jure), not as they are actually implemented (de facto). Some women in more privileged positions may be able to circumvent laws or find loopholes, and the data does not account for that.

 

What is your dataset’s ontology? If your dataset were your only source, what information would be left out?

The mix of nominal, numeric, and interval data, collected by a number of international organizations using different methods, presents a diverse perspective drawn from several widely trusted sources. Had our dataset been our only source, we would have lost the historical and cultural context behind the results. We also would not have been able to acknowledge how societal changes around the globe affected or influenced the region. Though our dataset is not without gaps and potential bias, it paints a thorough picture of the norms women subscribe to across Sub-Saharan Africa.