aka “How I picture a simple REST mechanism for queuing tasks with external services, such as transformation of PDFs to images, Documents to text, or scanning, file format identification.”
Example: Document to .txt service utilisation
Step 1: send the URL for the document to the service (in this example, the request is automatically accepted – code […]
May 8, 2008
Peter Sefton wrote to me recently and noted that the basic Solr indexer I’ve written still uses the rather poor convention that the datastream with a DSID of FULLTEXT contains all the extracted text from the other binary datastreams. He went on to say that perhaps the connection might be able to be […]
May 7, 2008
If you’ve recently installed the shiny new Hardy Heron release of Ubuntu, or updated to the latest Debian, you may be surprised that a few old XML techniques in Python no longer work. For example, the following no longer exist:
from xml import xpath
from xml.dom.ext import Anything_Really
See for more details and a temporary workaround: […]
May 29, 2008
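As a rough illustration of the kind of code affected by the missing `xml.xpath` import, here is a minimal sketch that runs a simple path query using only the standard library's `xml.etree.ElementTree`. This is not the temporary workaround the post links to (that link is truncated in the excerpt); it is a swapped-in alternative that supports only a limited XPath subset. The sample document and the `titles` helper are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented purely for illustration.
SAMPLE = """<records>
  <record id="a1"><title>First</title></record>
  <record id="b2"><title>Second</title></record>
</records>"""


def titles(xml_text):
    """Return the text of every <title> directly under a <record>."""
    root = ET.fromstring(xml_text)
    # findall() accepts a limited XPath subset, not full XPath 1.0.
    return [el.text for el in root.findall("./record/title")]


print(titles(SAMPLE))
```

Queries that need full XPath (functions, complex predicates, axes) fall outside what `findall()` handles, so this only covers the simple cases.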