Author: Mike
Date: 2002/09/11 17:06
OK, let's talk a little bit about those mythical "web visits" then :-)
First of all, when you're working with standard webserver logfiles, any information you
might want to gather about "visits" or "visitors" is nothing more than an educated guess -
in some cases it might be reasonably accurate, in others it might be totally wrong (e.g.
when most of your visitors come from behind a battery of proxies).
When you add session IDs (either with a user tracking module in Apache, or by inserting
tracking images that send cookies, or with an ASP or PHP application like PostNuke, etc.),
your visitors lose the benefit of caching, but you gain a slightly better view of
what they're "doing" on your website.
Overall, since people don't generally log off, statistics packages work with the notion of
session time-outs: if there haven't been any new "hits" with that session cookie after x
minutes, they assume that the visitor has gone away.
That would allow you to create some of the statistics you mention, like the number of
times someone with a particular session ID has "come back" (logged on), or how long he
stayed (before his session timed out), etc.
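To make the time-out idea concrete, here is a minimal sketch in Python. It is not how any particular statistics package does it; the function name, the (session ID, timestamp) input format and the 30-minute default are all assumptions made up for the illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def split_into_visits(hits, timeout_minutes=30):
    """Group (session_id, timestamp) hits into visits.

    A new visit starts whenever the gap since the previous hit of the
    same session ID is longer than the time-out (hypothetical default: 30).
    Returns {session_id: [(first_hit, last_hit), ...]}.
    """
    by_session = defaultdict(list)
    for session_id, timestamp in hits:
        by_session[session_id].append(timestamp)

    timeout = timedelta(minutes=timeout_minutes)
    visits = defaultdict(list)
    for session_id, stamps in by_session.items():
        stamps.sort()
        start = last = stamps[0]
        for stamp in stamps[1:]:
            if stamp - last > timeout:
                # Gap too long: close the current visit, open a new one.
                visits[session_id].append((start, last))
                start = stamp
            last = stamp
        visits[session_id].append((start, last))
    return dict(visits)


# Made-up sample data: session "abc" comes back after a long break.
hits = [
    ("abc", datetime(2002, 9, 11, 10, 0)),
    ("abc", datetime(2002, 9, 11, 10, 10)),
    ("abc", datetime(2002, 9, 11, 14, 0)),
    ("xyz", datetime(2002, 9, 11, 10, 5)),
]
visits = split_into_visits(hits)
```

With this sample, "abc" ends up with two visits and "xyz" with one. Note that a visit consisting of a single hit has the same first and last timestamp, so its "duration" is zero even though the person may have read the page for ten minutes.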
Of course, there's no guarantee that a session ID will correspond to 1 person, or that 1
person will have only 1 session ID, but again, it's a "reasonable approximation" - how
closely this matches reality will depend on your visitor base.
In the case where you have actual "registered users" linked to session IDs (like in
PostNuke), you can go one step further in refining the notion of "visits", and hope that
your statistics will not be totally off-base.
But the inherent problem of session time-outs remains, so depending on whether you take 15
minutes, half an hour or an hour as time-out value, the figures you get in the end will differ.
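A small Python sketch of that effect, using three made-up hits from one session ID (the function name and the sample timestamps are purely illustrative):

```python
from datetime import datetime, timedelta


def count_visits(timestamps, timeout_minutes):
    """Count visits for one session ID under a given inactivity time-out."""
    stamps = sorted(timestamps)
    if not stamps:
        return 0
    timeout = timedelta(minutes=timeout_minutes)
    visits = 1
    for previous, current in zip(stamps, stamps[1:]):
        if current - previous > timeout:
            visits += 1  # gap longer than the time-out: count a new visit
    return visits


# Hypothetical hits with gaps of 20 and 45 minutes between them.
hits = [
    datetime(2002, 9, 11, 10, 0),
    datetime(2002, 9, 11, 10, 20),
    datetime(2002, 9, 11, 11, 5),
]
counts = {t: count_visits(hits, t) for t in (15, 30, 60)}
# counts == {15: 3, 30: 2, 60: 1}
```

Same logfile, same person, and you get 3, 2 or 1 "visits" depending only on the time-out you picked.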
So much for sessions and visitors, and their inherent limitations. Now let's look at the
other aspect, namely what you are going to *do* with those statistics.
For aWebVisit, I started from the premise that I wanted to get a better idea of how people
travelled through a website -in general-, without being overwhelmed by the 100,000
different paths people could follow during a "visit".
As part of that, I obviously had to analyse the different steps that "visitors" go
through, and so I could also extract some kind of timing information between steps as well.
But this becomes pretty tricky, because it doesn't take into account the page loading time
(including all related images, stylesheets, flash animations or whatever), or the fact
that a user might jump back and forth, or take a break of almost (time-out) minutes and visit some
other site linked on that page, and so on.
So the "time spent" on a particular page (or in a particular module in the case of
PostNuke) is probably the most unreliable of all, and you really need to correlate this
with other information in order to get *some* kind of semi-reasonable feeling about where
people spend their time.
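As a rough illustration of what "time between steps" amounts to, here is a hedged Python sketch. It is not aWebVisit's actual code; the function name and the sample visit are invented for the example.

```python
from datetime import datetime


def time_between_steps(steps):
    """steps: list of (timestamp, page) for one visit, sorted by time.

    Returns [(page, seconds_until_next_hit), ...]. The last page gets
    None, because the log says nothing about when the visitor left it.
    """
    result = []
    for (stamp, page), (next_stamp, _next_page) in zip(steps, steps[1:]):
        result.append((page, (next_stamp - stamp).total_seconds()))
    if steps:
        result.append((steps[-1][1], None))
    return result


# Made-up visit of three pages.
visit = [
    (datetime(2002, 9, 11, 10, 0, 0), "/index"),
    (datetime(2002, 9, 11, 10, 0, 40), "/news"),
    (datetime(2002, 9, 11, 10, 12, 40), "/forum"),
]
timings = time_between_steps(visit)
# timings == [("/index", 40.0), ("/news", 720.0), ("/forum", None)]
```

The 720 seconds on "/news" could be careful reading, a slow download, or a coffee break, and the log cannot tell you which. The final page never gets a figure at all.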
Somewhat more reliable is *how often* they hit a particular area/module/whatever of your website.
And even there, you should correlate that with how difficult it is to navigate inside a
particular area/module. If in one area you can get all the information you want on a single
page, and in another, you need to browse through 10 pages to get what you want, this
doesn't mean that area #1 is less interesting to people - on the contrary, it's better
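One way to do that correlation, sketched in Python: count hits per area next to the number of distinct visits that touched the area. The function name, the (visit ID, area) input and the sample data are assumptions for the illustration only.

```python
from collections import defaultdict


def area_summary(hits):
    """hits: iterable of (visit_id, area).

    Returns {area: (hit_count, visit_count, hits_per_visit)}.
    """
    hit_count = defaultdict(int)
    visit_ids = defaultdict(set)
    for visit_id, area in hits:
        hit_count[area] += 1
        visit_ids[area].add(visit_id)
    return {
        area: (
            hit_count[area],
            len(visit_ids[area]),
            hit_count[area] / len(visit_ids[area]),
        )
        for area in hit_count
    }


# Made-up data: three visits read the one-page "faq" area,
# one visit clicks through ten pages of the "docs" area.
hits = [("v1", "faq"), ("v2", "faq"), ("v3", "faq")] + [("v4", "docs")] * 10
summary = area_summary(hits)
# summary["faq"]  == (3, 3, 1.0)
# summary["docs"] == (10, 1, 10.0)
```

Going by raw hits, "docs" looks three times as popular as "faq", while three times as many visits actually went to "faq".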
So compared to BBS statistics, not only is it more difficult to obtain somewhat reliable
"visit" statistics, but an accurate *interpretation* of the results is even more ...
Specifically for PostNuke, I believe the pnTracking module intends to provide *some* user
& module statistics. But even there, you will have to deal with the fact that most likely,
the majority of your "visitors" will be anonymous, which brings us right back to the
original problem of reliability of the statistics you'll get.
Of course, if all you want is some nice meaningless graphs to show off to your friends,
manager or sponsors (and who cares about accuracy), I'm sure you'll have a great tool :-)