Dave's Free Press: Journal

violence, pornography, and rude words for the web generation


Recent posts


Recently commented posts


Journals what I read

geeky politics rant silly religion meta music perl weird drinking culture london language transport sport olympics hacking media maths web photography etiquette spam amazon film bastards books bryar holidays palm telecoms cars travel yapc bbc clothes rsnapshot phone whisky security home radio lolcats deafness environment curry art work privacy iphone linux bramble unix go business engineering kindle gps economics latin anglo-saxon money cars environment electronics
Sun, 31 Jan 2010

CPANdeps upgrade

While you won't notice any changes, there have been biiiig upgrades at CPANdeps. Here's the diff.

Until now, it's used a SQLite database of test results that I downloaded every day and then mangled a bit to do things like add some necessary indices, figure out which reports are from dev versions of perl, and so on. That worked really well back in the summer of 2007, when there were only half a million reports in the database. I started worrying a bit at the beginning of 2009 when we hit 3 million, but the update happened overnight so I didn't care. But now that we've got over 6 million reports, the update would take anywhere between 8 and 14 hours. Not only is that not sustainable given the current growth rate, it also hurts the other users on that machine, because almost all of that time is spent waiting for disk I/O - which means that they're also waiting for the disk. On top of that, when you have big databases, a SQLite CGI ain't a great idea because indices have to be fetched from disk every time, so reads pound the disk too. Doubleplusungood!

Fun fact: SQLite is great for prototyping, but it doesn't scale :-)

So now it uses MySQL. Having a database daemon running all the time means that there's now some caching, so reads are quicker. In addition, given that I can't just simply fiddle with the structure of the database that I download to produce what I want, and instead have to import the data into MySQL, it now only imports new records, so the daily update takes only a few seconds.

I also re-jigged the structure of how it caches test results. Instead of being all in one directory with hundreds of thousands of files, they're split into a hierarchy. This probably won't have any significant effect on normal operations, but it will certainly make it faster for me to navigate around and see what's going on when people submit bug reports!

Posted at 12:18 by David Cantrell
keywords: geeky | perl
Permalink | 0 Comments

Sorry, this post is too old for you to comment on it.