The other day, I observed a colleague creating a country-year dataset by hand--using Excel to type out a list of countries and then manually add years. It took her eight or ten hours.
This is a little inefficient.
So I thought I'd give a very quick tutorial on how to do this in 10 seconds.
First, open Stata and create a new file. (For convenience, I'll refer to this as "country.dta".)
Create one new variable, called "country."
Populate this with some arbitrary number of country names--"Belgium","France","Germany", whatever. Since this is an example, four or five will be fine.
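If you'd rather type the names as commands than use the Data Editor, here is a minimal sketch using Stata's input command. The four countries and the str20 width are arbitrary choices for illustration, not anything the method depends on.

```stata
* Start from an empty dataset
clear

* Read country names typed inline into a string variable
input str20 country
"Belgium"
"France"
"Germany"
"Netherlands"
end

* Save under the file name used in this post
save country.dta, replace
```

Run this from a do-file, or paste it into the command window one line at a time.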
Next, create some number of years, like so:
gen year1960=1960
gen year1961=1961
gen year1962=1962
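Typing one gen line per year gets tedious for longer ranges. As a sketch, the same three variables can be created with a forvalues loop; the local macro name y is my own choice, and you can change 1960/1962 to whatever range you need.

```stata
* Create year1960, year1961, year1962, each holding its own year
forvalues y = 1960/1962 {
    gen year`y' = `y'
}
```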
You should now have four variables--"country", "year1960", "year1961", and "year1962"--of which the latter three are constants: each holds the same year in every row. To see your data, type
browse
Now, type
reshape long year, i(country)
drop _j
Once again, type
browse
to see your data.
You'll see that your data are now arrayed in country-year format: one row per country per year, with the year stored in a single variable called "year".
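For comparison, here is a sketch of a different route to the same layout that skips the wide year variables and uses expand instead of reshape. It assumes you are starting again from the saved country.dta with one row per country (a hypothetical fresh start, not a continuation of the steps above), and the numbers 3 and 1959 are specific to the 1960-1962 range.

```stata
* Reload the one-row-per-country file
use country.dta, clear

* Turn each country row into three copies
expand 3

* Number the copies within each country: 1960, 1961, 1962
bysort country: gen year = 1959 + _n

* Show the result grouped by country
sort country year
list, sepby(country)
```

Either way you end up with the same country-year rows; reshape is handier when the wide variables already exist, while expand saves a step when you are building from scratch.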
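This is a toy example, but the advantage is clear: the same handful of commands works whether you have five countries and three years or 200 countries and 60 years. For more on the tools that went into this, see the UCLA computing site or type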
help reshape
from the Stata command line.
Thanks, you just saved me 10 hours of my life :-)