The case for Open Citation Data, and why it is needed now more than ever, emerges from examining why researchers began to publish in journals and how the current trend of applied metrics has led to a ‘publish or perish’ mentality. How citation or reference lists are included to attribute preceding work in these journal […]
This is an update on my previous dabblings with chomping through log files. To summarise where I am now: I have a distributable workflow, loosely coordinated using Redis and Supervisord. Redis is used in two ways: first, its lists act as queues, buffering the communication between the workers; second, as a store, counting […]
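A minimal sketch of the two roles described above, a Redis list acting as a queue between workers and plain keys acting as counters. It is an illustration under stated assumptions, not the workflow from the post: the key names (`q:loglines`, `count:`), the function names, and the "first field is the path" line layout are all hypothetical. The functions take any redis-py style client `r` (for example `Redis()` from the `redis` package) and use only `lpush`, `rpop` and `incr`.

```python
def enqueue_line(r, line, queue="q:loglines"):
    """Producer side: push a raw log line onto a Redis list used as a queue."""
    r.lpush(queue, line)


def drain_queue(r, queue="q:loglines", counter_prefix="count:"):
    """Worker side: pop lines until the list is empty, counting hits per path.

    LPUSH on one end and RPOP on the other gives first-in, first-out order.
    """
    processed = 0
    while True:
        line = r.rpop(queue)              # returns None once the list is empty
        if line is None:
            break
        if isinstance(line, bytes):       # redis-py returns bytes by default
            line = line.decode("utf-8", "replace")
        fields = line.split()
        path = fields[0] if fields else ""   # hypothetical line layout
        r.incr(counter_prefix + path)     # Redis as a store: one counter per path
        processed += 1
    return processed
```

A long-running worker would more likely block on `brpop` with a timeout instead of polling with `rpop`; the non-blocking call is used here only to keep the sketch short.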
Redis has been such a massively useful tool to me. Recently, it has let me cut through access logs munging like a hot knife through butter, all with multiprocessing goodness. Key things:

Using sets to manage botlists:

>>> from redis import Redis
>>> r = Redis()
>>> for bot in r.smembers("botlist"):
...     print bot
...
lycos.txt
non_engines.txt
inktomi.txt
misc.txt
askjeeves.txt
oucs_bots
wisenut.txt
altavista.txt
msn.txt
googlebotlist.txt
>>> total = 0
>>> for […]
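One way set-based bot lists like this can be put to work, sketched under assumptions: the `botlist` set is treated as an index of list names, and each name is assumed to be the key of a second set holding that list's entries. The excerpt does not confirm that layout, and the function names are hypothetical. Only the redis-py calls `sadd`, `smembers`, `scard` and `sismember` are used, on whatever client `r` is passed in.

```python
def register_botlist(r, list_name, entries, index_key="botlist"):
    """Record a bot list by name and store its entries in a set of that name."""
    r.sadd(index_key, list_name)
    if entries:                      # SADD needs at least one member
        r.sadd(list_name, *entries)


def total_entries(r, index_key="botlist"):
    """Sum the sizes of every registered bot list (assumed layout, see above)."""
    return sum(r.scard(name) for name in r.smembers(index_key))


def is_bot(r, candidate, index_key="botlist"):
    """True if the candidate appears in any registered bot list."""
    return any(r.sismember(name, candidate) for name in r.smembers(index_key))
```

Set membership checks are constant time on the Redis side, so the cost of `is_bot` grows with the number of lists, not with the number of entries in them.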
July 2, 2011