April 21, 2003

Benchmarking log performance?

Has anyone done any comprehensive benchmarks of the various logging solutions for Apache? I'm after something that compares the relative CPU, RAM, and other resource consumption of the different logging techniques. The options include rotatelogs, cronolog, mod_sql, syslog, and I'm sure others. I'm curious how they stack up.

I'm most interested in quick access to per-URL usage. That is, show me quickly which URLs are being requested, how often, and how much data they account for.
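To make that concrete, here is a rough sketch of the kind of per-URL report I have in mind. It assumes the logs are in Apache's Common Log Format; the `summarize` function and the sample lines are made up for illustration and are not part of any of the tools mentioned above.

```python
import re
from collections import Counter

# Assumed input: Common Log Format, i.e.
#   host ident user [time] "METHOD url PROTO" status bytes
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] '
    r'"(?:\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def summarize(lines):
    """Return two Counters keyed by URL: request counts and bytes sent."""
    hits = Counter()
    bytes_sent = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip anything that is not a Common Log Format line
        url = match.group("url")
        size = match.group("bytes")
        hits[url] += 1
        # Apache logs "-" when no response body was sent
        bytes_sent[url] += 0 if size == "-" else int(size)
    return hits, bytes_sent


# Hypothetical sample lines, invented for this example
SAMPLE = [
    '192.0.2.1 - - [21/Apr/2003:16:05:00 -0400] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [21/Apr/2003:16:05:01 -0400] "GET /archives/ HTTP/1.1" 200 5120',
    '192.0.2.1 - - [21/Apr/2003:16:05:02 -0400] "GET /index.html HTTP/1.0" 304 -',
    'not a log line',
]

if __name__ == "__main__":
    hits, bytes_sent = summarize(SAMPLE)
    for url, count in hits.most_common():
        print(f"{count:6d} hits {bytes_sent[url]:10d} bytes  {url}")
```

This only covers reporting after the fact from a flat file. It says nothing about what each logging method costs the server while it runs, which is the part I'd like to see benchmarked.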

I'm sort of surprised that nobody seems to have done a thorough head-to-head comparison of them. Or maybe I'm just not searching for the right keywords...

Suggestions?


Posted at 04:05 PM
Comments

Can't point you to any specific benchmarks, but I like the logging solution provided by mod_log_spread [1].

I haven't actually used it yet, but it feels "right", especially after my experience working for a company that needed to collect logs from a few hundred servers distributed globally. The servers could see heavy loads at times (sometimes hundreds of thousands of simultaneous persistent connections feeding MS/Real video streams). We had a hierarchy of machines dedicated simply to collecting logs (with FTP) off the servers, parsing them, and getting the data into Oracle. The process was so costly that we usually couldn't process 24 hours' worth of logs in 24 hours. This meant we also had to develop parallel solutions using something like SNMP to provide a real-time view of the network.

Using mod_log_spread, the log info gets passed from httpd to the Spread daemon over a Unix domain socket (no disk access), from which it is broadcast using UDP to the log collectors. Because it's broadcast, you can have as many collectors as you want without adding any load on your network.

Nice side benefit: you can also use the Spread daemon as a software load-balancing solution for a farm of Apache servers using Wackamole and mod_backhand [2].

[1] http://www.lethargy.org/mod_log_spread/

[2] http://www.backhand.org/

Posted by: Van Gale on April 21, 2003 10:54 PM