It was no coincidence that our exemplary model was a
cross-sectional one. Such models are much simpler to build because cats, being
geographically dispersed, don’t influence each other’s behavior. Once we take a
time series into account, we have to deal with phenomena like trends or
seasonality. Granny’s cat, for example, gets harder to please as it gets older
and older – that’s a trend. Such phenomena distort a
model and hamper comparison of variables. We’ll dodge them (gretl will), but first
we need to comprehend a couple of things.
There are two types of time-series data. The one on
the left resembles a typical cross-sectional data graph. It’s called stationary,
meaning its expected value, variance and autocovariance stay the same over time
instead of drifting. The values scatter randomly around a fixed level, the way
they did in the cross-sectional model, so all of the
previous analysis tools will work properly. The trouble begins as we proceed to
the right.
That’s a non-stationary series. There is a trend
present, meaning that we can no longer use any analysis tool we’ve already
learned until we make some adjustments first. The difference is that even before
we check, we have a rough idea what the next value of the variable will be.
That’s not random.
We’ve got a ‘base’ and in each following period of
time we add ‘something’ to it. As time goes by this ‘something’ becomes part of
the ‘base’.
y0 = base
y1 = base + something 1
y2 = (base + something 1) + something 2
y3 = (base + something 1 + something 2) + something 3
The main idea is to separate the red from the blue. But
again there are two main types of base+something models we can encounter and
each has to be treated accordingly.
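The ‘base plus something’ build-up can be sketched in a few lines of Python. This is only an illustration: the starting value and the per-period additions below are made-up numbers, not data from our model.

```python
from itertools import accumulate

base = 10.0                      # hypothetical starting value y0
somethings = [1.5, -0.5, 2.0]    # hypothetical additions for periods 1, 2, 3

# Each y_t is the previous value plus the new 'something', so every earlier
# 'something' ends up inside the base that later periods start from.
ys = list(accumulate(somethings, initial=base))

print(ys)  # [10.0, 11.5, 11.0, 13.0]
```

Notice that y3 carries all three additions with it, which is exactly why the ‘something’ has to be pulled apart from the ‘base’ before the usual tools can be used.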
#1: Stochastic model
Stochastic means random. If the variables are so random, then
what’s the point of discussing them in the first place? Well, they are
random only to some extent. Again, let’s assume there’s a cat sleeping on a couch.
Chances are good that if we close our eyes, count to 10 and check again, it’ll
still be asleep. At least we expect it to be so. We know it’s highly
unlikely that within those 10 seconds the cat would depart to the moon or take up
a cooking class. At most he could have woken up and jumped off the couch. So as
you can see, the randomness is quite limited. The limits are set by the cat’s
last activity. If the last thing we saw the cat doing before closing our eyes
was entering a space shuttle at Cape Canaveral, well, then it would be highly
probable that he would actually be departing to the moon. Therefore:
activity 1 = activity 0 + something 1
y1 = y0 + ε1
what we expect to be the next activity = present activity
E(y1) = y0
So y1 depends directly on y0, and y2 depends directly on y1,
but y0 has no direct say in y2, y3, y4 etc.; it reaches them only through the
values in between. This is called a first-order autoregressive process. The more
past values directly determine the present one, the higher the order
of autoregression. In the previous model we analyzed a regression of one
variable on another. Now it is the same variable being regressed on its own past.
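To see why the expected next value equals the present one, we can repeat the ‘close your eyes and count to ten’ experiment many times in Python. This is a rough sketch under assumptions: the ‘something’ is drawn from a normal distribution with mean zero, and the starting level, spread and seed are made-up values.

```python
import random

def next_activity(y_prev, rng, sigma=1.0):
    """One step of y_t = y_(t-1) + eps_t, with eps_t assumed to be N(0, sigma^2)."""
    return y_prev + rng.gauss(0.0, sigma)

rng = random.Random(42)   # fixed seed so the run is repeatable
y0 = 5.0                  # hypothetical present activity level

# Start from the same y0 every time and look at where we land one step later.
draws = [next_activity(y0, rng) for _ in range(20_000)]
mean_next = sum(draws) / len(draws)

print(round(mean_next, 2))  # lands close to y0
```

Any single draw can land above or below y0, but the average of the draws settles near y0, which is what E(y1) = y0 says.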
#2: Deterministic model
As opposed to limited randomness, there’s continuity.
This is the cat getting older and older. We know that one day we’ll bury it in
the backyard. It’s not random; that’s something to be sure of. So we look at
the cat, close our eyes, count to ten, and there he is – 10 seconds older. No
matter how many times we do it, the cat is always heading in the same
direction. Let’s assume that the observed variable here is the cat’s
speed. Obviously, as time goes by he’s getting slower; that’s our trend. There
will of course be some random moments when the cat goes berserk, running around
the house, or quite the contrary, lies all day by the fire. Those are what we
cannot predict, the noise distorting our observations. Therefore:
speed t = constant (some average speed each non-dead cat performs)
          + time trend t
          + something t (random amok/laziness)
Generally speaking, the successive values are not related to one
another. They’re determined by time instead. Since time always goes in
one direction, we might get the wrong impression that y1 depends on y0
and so on. That’s why stochastic models and deterministic ones are frequently
confused. Unfortunately, we have to be sure which type of model we’re dealing with,
since a different approach is meant for each of them.
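The deterministic case can be sketched the same way. The constant, the slope of the trend and the size of the noise below are hypothetical numbers chosen for illustration, and the noise is assumed to be normal with mean zero.

```python
import random

rng = random.Random(0)   # fixed seed so the run is repeatable
constant = 8.0           # hypothetical average speed
slope = -0.05            # hypothetical loss of speed per period (the trend)

def speed(t):
    """speed_t = constant + slope * t + noise_t; no earlier speed enters the formula."""
    return constant + slope * t + rng.gauss(0.0, 0.3)

speeds = [speed(t) for t in range(100)]

# Averaging over a stretch of periods smooths out the noise and exposes the trend.
early = sum(speeds[:20]) / 20
late = sum(speeds[-20:]) / 20

print(round(early, 2), round(late, 2))
```

Each speed is computed from t alone, yet neighbouring values look alike simply because their t is nearly the same. That resemblance is what makes this kind of series easy to mistake for a stochastic one.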
#3: Stochastic model approach
While a stochastic model is not stationary, its deltas
are. Those are our ‘something’ separated from the base. So we can
substitute the variables in the model (F, C, H...) with the values of their deltas.
Given that the variable shows first-order autoregression,
instead of F1, F2, F3, F4 ... we would have:
Δ F1 = (F1 - F0)
Δ F2 = (F2 - F1)
Δ F3 = (F3 - F2) ...
Those are calculated in gretl from the Add menu in the menu bar
(the entry for first differences of the selected variables).
Differences of variables in gretl are prefixed with ‘d_’, so
instead of F we’ll have d_F. Those d_ values are the ones we put in the model
instead of the original ones. From there we can get back to the model creation
procedure that we’ve already learned.
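If you’d like to see what the differencing does to the numbers, here is a minimal Python sketch. The values of F are made up, and `first_differences` is a helper written for this illustration, not a gretl function.

```python
def first_differences(series):
    """Return [F1 - F0, F2 - F1, ...]; the result is one element shorter."""
    return [b - a for a, b in zip(series, series[1:])]

F = [10.0, 11.5, 11.0, 13.0]   # hypothetical values F0..F3
d_F = first_differences(F)

print(d_F)  # [1.5, -0.5, 2.0]
```

The first period has no predecessor to subtract, so one observation is lost along the way.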
#4: Deterministic model approach
We know that in this kind of model some part of
variable F is determined by the passing of time. What we would like to do is pull
out only the pure F values, omitting the time-contaminated part. To do so, we
first have to figure out the relation between time and F, so we create
a model F = time. A time variable can be added
to the data using Add > Time trend.
Once we get the model we save its residuals – the
differences between the real values and the estimated ones; those are our pure
Fs. Select Save > Residuals.
Then once more create the model you had in mind,
replacing F with the values of its residuals.
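The detrending step can also be sketched outside gretl. The Python snippet below is a rough illustration, not what gretl runs internally: it fits F = a + b * time by ordinary least squares and keeps the residuals. The speeds are made-up numbers, the time index is assumed to run 1, 2, 3, ..., and `detrend` is a hypothetical helper name.

```python
def detrend(series):
    """Regress the series on a time index 1..n (plus a constant); return the residuals."""
    n = len(series)
    time = list(range(1, n + 1))
    t_mean = sum(time) / n
    y_mean = sum(series) / n
    # Ordinary least squares estimates for y = a + b * time
    b = (sum((t - t_mean) * (y - y_mean) for t, y in zip(time, series))
         / sum((t - t_mean) ** 2 for t in time))
    a = y_mean - b * t_mean
    # Residual = real value minus the value the trend line predicts
    return [y - (a + b * t) for t, y in zip(time, series)]

F = [8.1, 7.7, 7.9, 7.2, 7.4, 6.8, 6.9, 6.3]   # hypothetical slowing-cat speeds
residuals = detrend(F)

print([round(r, 2) for r in residuals])
```

The residuals hover around zero with no upward or downward drift left in them, which is what makes them usable in place of the original F.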
Of course you can do all of that absentmindedly,
clicking where told. This is what you do at exams, but that’s not the case
here. Creating a model of your own requires understanding, even if it is done in blissful ignorance of the mathematical
dimension of the issue. Once you comprehend the simple things, you’ll be able to
explore the rest.