Semi-autonomous agents: What are they, exactly?

by: The Doctor [412/724/301/703/415]

This post is intended to be the first in a series of long form articles (how many, I don’t yet know) on the topic of semi-autonomous software agents, a technology that I’ve been using fairly heavily for just shy of twenty years in my everyday life. My goals are to explain what they are, go over the history of agents as a technology, discuss how I started working with them between 1996e.v. and 2000e.v., and explain a little of what I do with them in my everyday life. I will also, near the end of the series, discuss some of the software systems and devices I use in the nebula of software agents that comprises what I now call my Exocortex (which is also the name of the project), make available some of the software agents which help to expand my spheres of influence in everyday life, and talk a little bit about how it’s changed me as a person and what it means to my identity.

So, what are semi-autonomous agents?

One working definition is that they are utility software that acts on behalf of a user or another piece of software to carry out useful tasks, farming out busywork that one would otherwise have to do oneself to free up time and energy for more interesting things. A simple example of this might be the pop-up toaster notification in an e-mail client alerting you that you have a new message from someone; if you don’t know what I mean, play around with this page a little bit and it’ll demonstrate what a toaster notification is. Another possible working definition is that agents are software which observes a user-defined environment for changes, which are then reported to a user or a message queuing system. An example of this functionality might be Blogtrottr, which you plug the RSS feeds of one or more blogs into; whenever a new post goes up, you get an e-mail containing the article. Software agents may also be said to be utility software that observes a domain of the world and reports interesting things back to its user. A hypothetical software agent might scan the activity on one or more social networks for keywords which a statistically unusual number of users are posting and send alerts in response.
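
To make that second working definition concrete, here is a minimal sketch in Python of a Blogtrottr-style watcher. Treat it as a toy under stated assumptions: it uses the third-party feedparser module, the feed URL and polling interval are placeholders, and it just prints instead of sending e-mail.

    # A minimal sketch of a Blogtrottr-style agent: poll an RSS feed and
    # report any posts it hasn't seen before.
    import time
    import feedparser   # third-party: pip install feedparser

    FEED_URL = "https://example.com/blog/rss"   # placeholder feed
    POLL_INTERVAL = 15 * 60                     # seconds between polls

    seen = set()  # the agent's short-term memory of entry IDs

    while True:
        feed = feedparser.parse(FEED_URL)
        for entry in feed.entries:
            entry_id = entry.get("id", entry.get("link"))
            if entry_id not in seen:
                seen.add(entry_id)
                # Stand-in for e-mailing the article to the user.
                print(f"New post: {entry.title} <{entry.link}>")
        time.sleep(POLL_INTERVAL)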

I’ll go out on a limb a bit here and give a more fanciful example of what software agents can be compared to: the six robots from the Infocom game Suspended. In the game, you, the player, are unable to act on your own because your body is locked in a cryogenic suspension tank, but the six robots (Auda, Iris, Poet, Sensa, Waldo, and Whiz) carry out the orders given to them, subject to their inherent limitations, and are smart enough to figure out how to interpret those orders (Waldo, for example, doesn’t need to be told exactly how to pick up a microsurgical arm, he just knows how to do it).

So, now that we have some working definitions of software agents, what are some of the characteristics that make them different from other kinds of software? For starters, agents run autonomously after they’re started up. In some ways you can compare them to daemons running on a UNIX system or Windows services, but instead of carrying out system-level tasks (like sending and receiving e-mail) they carry out user-level tasks. Agents may act in a reactive fashion in response to something they encounter (like an e-mail from a particular person) or in a proactive fashion (on a schedule, when certain thresholds are reached, or when they see something that fits a set of programmed parameters). Software agents may be adaptive to new operational conditions if they are designed that way. There are software agents which use statistical analysis to fine-tune their operational parameters, sometimes in conjunction with feedback from their user, perhaps by turning down keywords or flagging certain things as false positives or false negatives. Highly sophisticated software agent systems may incorporate machine learning techniques, such as artificial neural networks, perceptrons, and Bayesian reasoning networks, to operate more effectively for their users over time. Software agent networks may under some circumstances be considered implementations of machine learning systems because they can exhibit the functional architectures and behaviors of machine learning mechanisms. I wouldn’t make this characterization of every semi-autonomous agent system out there, though.
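
To illustrate the reactive/proactive distinction sketched above, here is a rough Python example; the threshold, interval, and sender address are all invented for the sake of illustration.

    # Proactive: acts on a schedule and a threshold, without being asked.
    import shutil
    import time

    def proactive_disk_watcher(threshold=0.9, interval=300):
        """Checks disk usage periodically; speaks up when a threshold is crossed."""
        while True:
            usage = shutil.disk_usage("/")
            if usage.used / usage.total > threshold:
                print("Alert: the disk is over 90% full.")  # contact the user
            time.sleep(interval)

    # Reactive: does nothing until a matching event arrives.
    def reactive_mail_watcher(message):
        """Called for each incoming e-mail; reacts only to one sender."""
        if message.get("from") == "someone@example.com":
            print(f"Alert: mail from {message['from']}: {message['subject']}")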

Software agents may communicate with other agents in a network to complete tasks more effectively, or to solve complex problems which cannot easily be undertaken alone. This is one example of the UNIX philosophy – individual tools which do one thing very well and can be chained together to solve larger problems. One agent may be designed to scrape a web page to extract a certain part; another agent may be designed to diff two or more pieces of data to see what, if anything, is different about them; a third agent may do nothing but accept events from other agents in the network and package them into a single document; a fourth agent may speak the FTP protocol and PUT the document onto a server. It would be possible to write a single piece of software which does all of these things, but it would be extremely complex because there are so many different kinds of operations represented here. Debugging would be tricky and future code maintenance would be an exercise in patience, to say the least. Additionally, the different kinds of tasks in this example would overspecialize the agent, making it useful for only one thing. It’s not the sort of thing that J. Random Netizen would spend a lot of time assembling because of its limited utility. Having a bunch of smaller agents that can be patched together like Tinkertoys (say, with some sort of visual editor that lets you draw connections between the agents with a mouse) makes agent networks easier to maintain (the diff agent is self-contained, and messing with it doesn’t break everything, just that one thing), reusable (if you’ve got one diff agent you can use it everywhere you need a diff agent), and much faster to build (five minutes instead of a month to build, test, and debug).
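
Here is one way those four hypothetical agents might hang together, sketched in Python. The URLs, hostname, and credentials are placeholders, and in a real agent network each function would more likely be its own long-running process passing events over a queue.

    # Four single-purpose agents chained together, UNIX-philosophy style:
    # scrape, diff, package, upload.
    import difflib
    import io
    import json
    import urllib.request
    from ftplib import FTP

    def scrape_agent(url):
        """Fetch a web page and hand the text downstream."""
        with urllib.request.urlopen(url) as response:
            return response.read().decode("utf-8", errors="replace")

    def diff_agent(old_text, new_text):
        """Report what, if anything, is different between two snapshots."""
        return list(difflib.unified_diff(old_text.splitlines(),
                                         new_text.splitlines(), lineterm=""))

    def packaging_agent(events):
        """Accept events from other agents and package them into one document."""
        return json.dumps({"events": events}, indent=2)

    def ftp_agent(document, host, user, password):
        """Speak the FTP protocol and PUT the document onto a server."""
        with FTP(host, user, password) as ftp:
            ftp.storbinary("STOR report.json",
                           io.BytesIO(document.encode("utf-8")))

    # Wiring the network together by hand; in practice the previous snapshot
    # would come out of the agent's memory, and a visual editor might draw
    # these connections instead.
    previous_snapshot = ""   # placeholder for the last stored copy
    current = scrape_agent("https://example.com/page")
    changes = diff_agent(previous_snapshot, current)
    if changes:
        report = packaging_agent(changes)
        ftp_agent(report, "ftp.example.com", "user", "password")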

Software agents are not invoked on demand but always run in the background. To put it another way, when you need a word processor, you start up a word processor and work on a document. When you’re done, you save your work, close the document, and exit the word processor. Software agents, on the other hand, are meant to start up whenever the system they’re installed on starts up and run in the background all the time, doing their thing. The kind of system they’re supposed to run on more or less has to be one that you’d expect to be always on, like a desktop at work, a server in a data center, or a VPS running at a hosting provider like Amazon’s EC2 or Digital Ocean (which, when you get right down to it, isn’t any different from having a dedicated box in someone’s rack). This implies that the agents must have some way of contacting their user, be it via e-mail, instant messenger, over the phone, or what have you. Agents often need some way of communicating amongst themselves if they’re going to operate in a cooperative fashion (i.e., agent networks). This can take the form of message queues, tables in a database, individual feeds like RSS or Atom, or APIs running on top of another protocol like XMPP or HTTP.
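
As a toy illustration of inter-agent communication, here is a Python sketch using an in-process queue.Queue as a stand-in for a real message queue, a database table, or an XMPP channel; the event contents are made up for the example.

    # Two agents communicating through a message queue.
    import queue
    import threading

    events = queue.Queue()

    def watcher_agent():
        """Observes something and publishes events to the queue."""
        events.put({"type": "new_article", "url": "https://example.com/post/1"})

    def notifier_agent():
        """Consumes events and contacts the user."""
        while True:
            event = events.get()
            print(f"Heads up: {event['type']} at {event['url']}")
            events.task_done()

    threading.Thread(target=notifier_agent, daemon=True).start()
    watcher_agent()
    events.join()  # wait until the notifier has handled everything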

Software agents are meant to interact with their user only when they have something to report, such as an e-mail from a particular person. Otherwise, they stay out of your way and don’t pester you. Earlier in this post I called them “semi-autonomous software agents” for a reason: they go off and do their thing without any user interaction aside from starting them up. You don’t have to poke at them if you don’t want to, they don’t need to get any more information from you after you start them up (unless you make them interactive somehow), and they won’t send you anything that doesn’t explicitly come from what they’re doing or looking for. So, you can pretty much sit back, and life will be quiet until something happens. Of course, this also means that programming an agent to do something inherently noisy (like sending you a message every time an e-mail from a busy mailing list arrives, or reporting every tweet using a trending hashtag) will drive you nuts after a while, so it helps to be judicious about what you monitor, the criteria by which you monitor it, and what level of pokes you’re willing to accept.
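
One hypothetical way to keep the poke level acceptable is to rate-limit an agent’s notifications. Here is a minimal Python sketch of that idea; the class name and the one-hour cooldown are inventions for this example.

    # Keep a noisy agent polite: suppress pokes on a given topic that
    # arrive more often than once per cooldown window.
    import time

    class PoliteNotifier:
        def __init__(self, cooldown=3600):
            self.cooldown = cooldown   # seconds between pokes per topic
            self.last_poke = {}        # topic -> timestamp of last poke

        def poke(self, topic, message):
            now = time.time()
            if now - self.last_poke.get(topic, 0) >= self.cooldown:
                self.last_poke[topic] = now
                print(f"[{topic}] {message}")  # stand-in for e-mail/IM/SMS
            # otherwise, stay quiet and don't pester the user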

Software agents can also have memories of a sort – databases of things they’ve “seen” recently. Agent memories tend to be temporally based, with events stored for anywhere from a minute to a year, or possibly even permanently, subject to the limitations of the database on their back end, the amount of free storage space, and the amount of RAM available to a given agent, of course. For some species of software agent, having a memory is essential: if an agent is carrying out trend analysis, it needs a time series database of past data to compare new events against. Another example would be an agent that watches your inbox for messages from a certain e-mail address. It would be irritating, to say the least, if the agent rescanned your inbox every time it started up and sent you a text message for every matching message that it had already told you about. It is also possible to design specialized agents which act as personal archivists – agents which can be commanded to store copies of web pages locally for indexing and later reference, or to send URLs to a service like archive.is. A content index over a collection of files that allows them to be searched would also qualify as the memory for a software agent of some kind; Sphinx immediately comes to mind because I use it at work. Hypothetically speaking, collaboration software like etherpad-lite or a NoSQL database like RethinkDB could be used as the memory for one or more agents. Selecting the correct back-end database for a particular species of agent is highly specific to the intended use case, and unfortunately falls outside the scope of this series of articles.
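
As a concrete illustration of a very simple agent memory, here is a Python sketch that keeps a SQLite table of already-seen message IDs, so that a restarted agent doesn’t replay old alerts. The database path and the message ID are placeholders.

    # A minimal persistent agent memory backed by SQLite.
    import sqlite3

    class AgentMemory:
        def __init__(self, path="agent_memory.db"):
            self.db = sqlite3.connect(path)
            self.db.execute("CREATE TABLE IF NOT EXISTS seen (id TEXT PRIMARY KEY)")

        def already_seen(self, message_id):
            row = self.db.execute("SELECT 1 FROM seen WHERE id = ?",
                                  (message_id,)).fetchone()
            return row is not None

        def remember(self, message_id):
            self.db.execute("INSERT OR IGNORE INTO seen VALUES (?)", (message_id,))
            self.db.commit()

    memory = AgentMemory()
    if not memory.already_seen("<msg-1234@example.com>"):
        print("Alert the user about this message.")
        memory.remember("<msg-1234@example.com>")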

A related and equally important question to address is, “What aren’t semi-autonomous software agents?” It is an all-too-common occurrence that whenever something new and different appears, some people treat it as a panacea which will solve all of their problems, and they get upset when they discover that this new thing won’t. So, let’s define some limits of software agents as a technology and manage expectations.

For starters, software agents are not expert systems. An expert system is a software package which consists of a large database of knowledge specific to a particular domain (such as inorganic chemistry, structural engineering, or electrical troubleshooting), an engine that processes very long chains of if..then..else rules, and a user interface that lets someone input the parameters of a problem, ask questions, and (hopefully) get assistance solving it. Expert systems are not designed to be proactive or reactive; they just wait for questions to process. They might have some necessary back-end processes (like garbage collection or database compaction) but they’re by and large pretty solipsistic. True software agents, on the other hand, tend to be coupled to their environments; if they’re not sampling one or more information sources, they’re listening for events from other agents or processing what’s already in their memory fields, and are generally doing their own thing.
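
For contrast, here is a toy sketch in Python of the kind of rule chaining an expert system performs (not a real inference engine, and the troubleshooting rules are invented for the example). The point is that nothing happens until someone asks a question; the system is purely a passive oracle.

    # A toy flavor of expert-system rule chaining: it sits and waits to
    # be consulted, unlike an agent.
    RULES = [
        (lambda facts: "no_power" in facts, "check_breaker"),
        (lambda facts: "breaker_ok" in facts and "no_power" in facts,
         "check_wiring"),
    ]

    def consult(facts):
        """Given what the user reports, chain through the rules and
        suggest next steps.  Nothing runs until consult() is called."""
        for condition, advice in RULES:
            if condition(facts):
                yield advice

    print(list(consult({"no_power", "breaker_ok"})))
    # -> ['check_breaker', 'check_wiring']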

Software agents are not necessarily examples of artificial intelligence, but the argument can be made that they are a first step in that direction outside of the big companies that seem to specialize in this field. The distinction between AI and AGI aside, an individual software agent may not be particularly smart as we might think of it. One example of this would be a species of software construct that just pulls an RSS feed, parses it to determine correctness, pulls out individual posts, and packages them as separate events. Algorithmically speaking, at a high level that’s about as straightforward as it gets. Sliding farther up the scale of intelligence, the argument could be made that agents which carry out statistical analysis of streams of events might be smarter (detecting sudden changes on the order of several standard deviations, say, or sudden changes in the calculated Shannon entropy of a bitstream), but I suspect the argument experts in the field will make will be of the form “Sophisticated statistical analysis isn’t actually AI.” A little farther up that spectrum one might find software agents that really do incorporate technology that the field of AI research and development is working with, like pattern recognition, data mining, and classification. It’s also worth noting that there isn’t anything preventing an implementation of some AI technologies from being composed wholly or in part of individual software agents and the interactions between them.
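
To give a flavor of that kind of statistical watchfulness, here is a minimal Python sketch of an agent that flags any new sample landing several standard deviations away from a rolling mean; the window size and the three-sigma threshold are arbitrary example values.

    # Flag samples that deviate sharply from the recent baseline.
    from collections import deque
    from statistics import mean, stdev

    class AnomalyWatcher:
        def __init__(self, window=100, sigmas=3.0):
            self.samples = deque(maxlen=window)
            self.sigmas = sigmas

        def observe(self, value):
            if len(self.samples) >= 2:
                mu, sd = mean(self.samples), stdev(self.samples)
                if sd > 0 and abs(value - mu) > self.sigmas * sd:
                    print(f"Anomaly: {value} is {abs(value - mu) / sd:.1f} "
                          f"standard deviations from the recent mean.")
            self.samples.append(value)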

Software agents can be designed for goal-directed behavior, but this is hardly a prerequisite. First, let’s discuss in brief what goal-directed behavior is – a way of solving problems when you have a goal of some kind, some possible techniques and steps to accomplish that goal, and the capacity to look at the world around you before you start solving the problem to help you determine a likely best first step toward accomplishing it. Change the starting conditions of the environment or your internal state (say, you had too much coffee that morning, or a software agent has been renice’d to have much more CPU time than it ordinarily would) and that can affect the problem-solving strategy drastically. So, that said… what kinds of things would you need done that would necessitate designing agents with such an internal architecture? I don’t have an answer for that, and I’ll admit up front that I haven’t built any. Real-world environments (for whatever definition of ‘real world’ you want to use) are far more complex than simulations or game worlds. They’re messy, they have parameters that are highly difficult to characterize, and they don’t care about developers’ notions of “this can be” and “this should not be.” None of the tasks I’ve built agents to carry out require a goal-directed architecture, because they’re pretty specific and what “wiggle room” exists results in acceptable amounts of error or loss of precision. Maybe that’ll change in the future, and maybe your use cases are very different from mine, so don’t take this as my trying to dissuade you. However, take a bit of friendly advice for whatever it’s worth: don’t try to build something highly complex until you’ve built something simpler first. Simple things are easier to test and debug than modular things; modular things are easier to test and debug than complex things; complex things are… really hard to test and debug.
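
Purely as an illustration of the idea (and, as I said, not something out of my Exocortex), here is a toy goal-directed loop in Python: it surveys its available actions first, then greedily picks whichever one looks like the best next step toward the goal. Every name in it is invented for the example.

    # A toy goal-directed agent: survey the world, score each candidate
    # action against the goal, take the most promising step, repeat.
    def goal_directed_agent(state, goal, actions, heuristic):
        """actions: name -> function(state) returning a new state.
        heuristic: estimated distance from a state to the goal."""
        while state != goal:
            name, result = min(((n, act(state)) for n, act in actions.items()),
                               key=lambda pair: heuristic(pair[1], goal))
            print(f"Taking step: {name}")
            state = result
        return state

    # Toy problem: walk an integer toward a target value.
    actions = {"increment": lambda s: s + 1, "decrement": lambda s: s - 1}
    goal_directed_agent(3, 7, actions, lambda s, g: abs(s - g))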

A highly summarized version of this series of articles was previously given as a presentation at the invitation of Ripple in August of 2015.

This article was originally posted at Antarctica Starts Here.
