April 25, 2003

Log harvesting

Via Big Pink Cookie

The data from the logs can most certainly be analyzed to "tell" what's going on. Combine it with a bit of indexing and spidering of other sites and you can tell who's reading what. Not many people appear to be doing this... yet.

You post an article, that's one link. Pings are made, making new links. Timestamps exist for them so we've got a temporal fix as well. Do some geo-ip lookups and you can tell location. Now, spider the referral links your site provides and you can see who else linked into it. Ditto on comments and trackbacks. Extend that out to the linked sites. A very big picture starts to emerge.
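As a rough illustration of the first step, here is a minimal Python sketch that pulls referrers and timestamps out of a web server log and records who linked to which page, and when that link first showed up. It assumes the logs are in Apache's "combined" format, which is my assumption for the example and not something every server uses; the names `parse_line` and `inbound_links` are made up for this sketch. Geo-IP lookups and the spidering of the referring sites are left out, since they need an external database and a crawler.

```python
import re
from collections import defaultdict
from datetime import datetime

# Apache "combined" log format (an assumption; adjust for your server).
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it doesn't match."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    hit = m.groupdict()
    # Timestamps look like 25/Apr/2003:10:15:00 -0400
    hit["ts"] = datetime.strptime(hit["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return hit


def inbound_links(lines):
    """Map each local path to {referrer: earliest time that referrer was seen}."""
    links = defaultdict(dict)
    for line in lines:
        hit = parse_line(line)
        # Skip unparseable lines and hits that carry no referrer.
        if hit is None or hit["referrer"] in ("", "-"):
            continue
        seen = links[hit["path"]]
        ref = hit["referrer"]
        if ref not in seen or hit["ts"] < seen[ref]:
            seen[ref] = hit["ts"]
    return links
```

Feed it the lines of an access log (for example `inbound_links(open("access.log"))`, where the filename is whatever your server writes) and each referrer it returns is a candidate for the next round of spidering.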

Now, what does that picture mean? It would depend on the answer you want. On one level some idiots will try to take it out of context to support some harebrained perspective they espouse. RSS/XML has seen one particular vendor do this time and again.

So the question becomes: will people refrain from exposing data because of what the idiots will do with it? I sure hope not. Because once the data is out there it becomes possible to build a /true/ big picture. Right now we're at the mercy of things that take only short-sighted approaches based on short-term data. As more stuff comes online it becomes possible to build bigger pictures. As a result the short-sighted perspectives are exposed for the junk they espouse. But unless the good data gets shared the bad data will outnumber it.

Yes, this is a terrifyingly risky process. Sharing one's use of the net and links to data poses all sorts of exposure risks. As more people expose data the ones that don't will become obvious. Turning things against them will be a lot easier.

It's sort of like income tax records. They're all public. I can find out what you made last year simply by following the law and asking for it. We do this to politicians as a way to expose hidden agendas. Those that don't expose this info are made to look suspicious; what're they hiding?

This will certainly get worse before it gets better.
