Thursday, January 7, 2016

My Data

Data, Procedures and Measures

I have been using Gapminder (gapminder.org) data for my analysis so far.
Elements of interest in the datasets are the following:

Electricity consumption, per capita (kWh)

This is a measure of per capita electricity consumption in a given year, expressed in kilowatt-hours (kWh). Electric power consumption measures the combined output of power plants and heat plants, excluding transmission and transformation losses and the plants' own use. The data analytic sample is made up of 214 countries and spans the period 1981 - 2015.

Data Sources:
World Bank (http://data.worldbank.org/indicator/EG.USE.ELEC.KH.PC/countries).

This is the response variable of my study, "relectricperperson". It measures residential electricity consumption per person during a given year across 214 countries.

Gross Domestic Product per capita by Purchasing Power Parities (in US dollars at fixed 2011 prices)

These are aggregated data for countries and territories from a myriad of sources.
The goal is to include as many countries and territories as possible and to measure their economic outlook. Some of the data are rough estimates for countries and territories for which no reliable data were found.

The income per person component of the dataset is based on GDP per capita, adjusted for Purchasing Power Parity to reflect the 2011 round of the International Comparison Program (ICP).
The estimates are based on a regression analysis that uses the ICP's GDP per capita as the dependent variable. The independent variables were Gross National Income (GNI) per capita by exchange rates and gross enrollment in secondary school.

The predicted values from this model were used for countries that lacked official ICP data but had observations for the two independent variables. For countries lacking GNI per capita, an alternate model using GDP per capita by exchange rates was used to generate predicted values.

The data analytic sample consists of 201 countries and territories and spans the period 1820 - 2015.

This is one of my explanatory variables, a measure of per capita income adjusted for inflation using 2011 prices in US dollars.

========================================================================
Income group             Definition            Estimated Avg. Income
========================================================================
Low income               $875 or less          $580
Lower middle income      $876 to $3465         $1918
Upper middle income      $3466 to $10725       $5625
High income              $10726 or more        $35131
========================================================================

Using the IMF classification above, I categorize these data into the relevant "Income Group".
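
The grouping step can be sketched in a few lines of Python. This is only a minimal illustration, not the code actually used for the study: the function name income_group and the constants UPPER_BOUNDS and LABELS are hypothetical, and the cut-offs are simply copied from the table above. Values that fall between two whole-dollar thresholds (for example $875.50) are assigned to the higher group, which is an assumption on my part.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first three groups, copied from the table above.
# (Hypothetical constant names, used only for this sketch.)
UPPER_BOUNDS = [875, 3465, 10725]
LABELS = [
    "Low income",
    "Lower middle income",
    "Upper middle income",
    "High income",
]


def income_group(income_per_person):
    """Return the income group label for a per capita income in US dollars.

    Missing values (None) are passed through unchanged.
    """
    if income_per_person is None:
        return None
    # bisect_left counts how many upper bounds lie strictly below the value,
    # which is exactly the index of the group the value belongs to.
    return LABELS[bisect_left(UPPER_BOUNDS, income_per_person)]


if __name__ == "__main__":
    # Made-up incomes chosen to sit on either side of each threshold.
    for value in (580, 875, 876, 3466, 10726, None):
        print(value, "->", income_group(value))
```

Keeping the thresholds in one list makes it easy to swap in a different classification without touching the function.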

Data Sources:
World Bank, UNSTAT (http://unstats.un.org/)
Maddison on-line (http://www.ggdc.net/maddison/maddison-project/home.htm)
CIA World Fact Book (https://www.cia.gov/library/publications/the-world-factbook/)
IMF (http://www.imf.org/external/datamapper/index.php)

Urban Population

This refers to people living in urban areas as defined by national statistical offices.
These aggregate data are calculated using World Bank population estimates and urban ratios from the United Nations World Urbanization Prospects (http://data.worldbank.org/indicator/SP.URB.TOTL).

The data analytic sample consists of 214 countries and covers the period 1981 - 2015.

As a lurking variable, "urbanrate" measures the percentage of the population living in urban areas of the countries in my sample data. I created an additional variable, "ruralrate", and computed its value as 100 - "urbanrate". This variable refers to the percentage of the population living in rural areas as defined by national statistical offices.
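
The derived variable is simple enough to show as a short Python sketch. The function name rural_rate and the sample values below are hypothetical and serve only as an illustration; the range check is my own assumption about how out-of-range percentages should be handled, not something taken from the dataset.

```python
def rural_rate(urban_rate):
    """Return the rural percentage implied by an urban percentage.

    Missing values (None) are passed through unchanged; values outside
    the 0-100 range raise a ValueError (an assumption of this sketch).
    """
    if urban_rate is None:
        return None
    if not 0 <= urban_rate <= 100:
        raise ValueError("urbanrate must be a percentage between 0 and 100")
    return 100 - urban_rate


if __name__ == "__main__":
    # Made-up "urbanrate" values for illustration only, not Gapminder figures.
    sample = {"Country A": 75.0, "Country B": 30.5, "Country C": None}
    for country, urban in sample.items():
        print(country, urban, rural_rate(urban))
```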

The original goal of this study is to measure the economic development outcomes of these countries.

