feeds Archives - Mondo 2000

by: The Doctor [412/724/301/703/415]

So, after going on for a good while about software agents you’re probably wondering why I have such an interest in them. I started experimenting with my own software agents in the fall of 1996 when I first started undergrad. When I went away to college I finally had an actual network connection for the first time in my life (where I grew up the only access I had was through dialup) and I wanted to abuse it. Not in the way that the rest of my classmates were but to do things I actually had an interest in. So, the first thing I did was set up my own e-mail server with Qmail and subscribed to a bunch of mailing lists because that’s where all of the action was at the time. I also rapidly developed a list of websites that I checked once or twice a day because they were often updated with articles that I found interesting. It was through those communication fora that I discovered the research papers on software agents that I mentioned in earlier posts in this series. I soon discovered that I’d bitten off more than I could chew, especially when some mailing lists went realtime (which is when everybody started replying to one another more or less the second they received a message) and I had to check my e-mail every hour or so to keep from running out of disk space. Rather than do the smart thing (unsubscribing from a few ‘lists) I decided to work smarter and not harder and see if I could use some of the programming languages I was playing with at the time to help. I’ve found over the years that it’s one thing to study a programming language academically, but to really learn one you need a toy project to learn the ins and outs. So, I wrote some software that would crawl my inbox, scan messages for certain keywords or phrases and move them into a folder so I’d see them immediately, and leave the rest for later. I wrote some shell scripts, and when those weren’t enough I wrote a few Perl scripts (say what you want about Perl, but it was designed first and foremost for efficiently chewing on data). Later, when that wasn’t enough I turned to C to implement some of the tasks I needed Leandra to carry out.

Due to the fact that Netscape Navigator was highly unreliable on my system for reasons I was never quite clear on (it used to throw bus errors all over the place) I wasn’t able to consistently keep up with my favorite websites at the time. While the idea of update feeds existed as far back as 1995 they didn’t actually exist until the publication of the RSS v0.9 specification in 1999, and ATOM didn’t exist until 2003, so I couldn’t just point a feed reader at them. So I wrote a bunch of scripts that used lynx -dump http://www.example.com/ > ~/websites/www.example.com/`date ‘+%Y%m%d-%H:%M:%S’`.txt and diff to detect changes and tell me what sites to look at when I got back from class.

That was one of the prettier sequences of commands I had put together, too. This kept going on for quite a few years, and the legion of software agents I had running on my home machine grew out of control. As opportunities presented themselves, I upgraded Leandra as best I could, from an 80486 to an 80586 to P-III, with corresponding increases in RAM and disk space. As one does. Around 2005 I discovered the novel Accelerando by Charles Stross and decided to call this system of scripts and daemons my exocortex, after the network of software agents that the protagonist of the first story arc had much of his mind running on through the story. In early 2014 I got tired of maintaining all of the C code (which was, to be frank, terrible because they were my first C projects), shell scripts (lynx and wget+awk+sed+grep+cut+uniq+wc+…) and Perl scripts to keep them going as I upgraded my hardware and software and had to keep on top of site redesign after site redesign… so I started rewriting Exocortex as an object-oriented framework in Python, in part as another toy project and in part because I really do find myself enjoying Python as a thing that I do. I worked on this project for a couple of months and put my code on Github because I figured that eventually someone might find it helpful. When I had a first cut more or less stable I showed it to a friend at work, who immediately said “Hey, that works just like Huginn!”

I took one look at Huginn, realized that it did everything I wanted plus more by at least three orders of magnitude, and scrapped my codebase in favor of a Huginn install on one of my servers.

Porting my existing legion of software agents over to Huginn took about four hours… it used to take me between two and twelve hours to get a new agent up and running due to the amount of new code I had to write, even if I used an existing agent as a template. Just looking at the effort/payoff tradeoff there really wasn’t any competition. So, since that time I’ve been a dedicated Huginn user and hacker, and it’s been more or less seamlessly integrated into my day to day life as an extension of myself. I find it strange, somewhat alarming when I don’t hear from any of them (which usually means that something locked up, which still happens from time to time) but I’ve been working with the core development team to make it more stable. I find it’s a lot more stable than my old system was simply due to the fact that it’s an integrated framework, and not a constantly mutating patchwork of code in several languages sharing a back-end. Additionally, my original exocortex network was comprised of many very complex agents: One was designed to monitor Slashdot, another Github, another a particular IRC network; Huginn offers a large number of agents, each of which carries out a specific task (like providing an interface to an IMAP account or scraping a website). By customizing the configurations of instances of each agent type and wiring them together, you can build an agent network which can carry out very complex tasks which would otherwise require significant amounts of programming time. Read more “Semi-autonomous software agents: A personal perspective.”