by Dr. Robert Buccigrossi, TCG CTO
Imagine going to a website, typing “Next Tuesday” into a date field, and it works. This magical capability already exists and is provided by a built-in PHP function called “strtotime”.
Why doesn’t everyone use strtotime? Because it’s hard to get right. Consider “11-10-09”. Does that mean November 10, 2009 (US convention), October 11, 2009 (European convention), or October 9, 2011 (computer timestamp)? PHP has rules for deciding between these choices (based on what separates the numbers), but then the user has to be trained to follow them. So we usually abandon text-based date entry and end up with the typical interface of “pick from this pop-up calendar and we’ll translate it to ‘yyyy-mm-dd’ for you”.
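To make the ambiguity concrete, here is a minimal sketch in Python (not PHP, and not how strtotime itself decides). It tries the three readings mentioned above against the same string. The `READINGS` table and the `interpretations` helper are names made up for this illustration.

```python
from datetime import datetime

# Three plausible readings of the same text, written as strptime formats.
# These labels and formats are illustrative assumptions, not PHP's rules.
READINGS = {
    "US (month-day-year)": "%m-%d-%y",
    "Europe (day-month-year)": "%d-%m-%y",
    "Timestamp (year-month-day)": "%y-%m-%d",
}

def interpretations(text):
    """Return every reading of `text` that yields a valid calendar date."""
    results = {}
    for label, fmt in READINGS.items():
        try:
            results[label] = datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            # This reading does not produce a real date, so skip it.
            pass
    return results

for label, iso_date in interpretations("11-10-09").items():
    print(f"{label}: {iso_date}")
```

All three readings are valid for “11-10-09”, which is exactly the problem: nothing in the text itself says which one the user meant.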
Why do I bring this up? Robotic Process Automation (RPA). Even financial trade magazines are now touting RPA as “the Most Important Megatrend You’re Not Hearing About”. While I think RPA is the closest thing to successful end-user development that I’ve seen, it still requires deep programming knowledge. (End-user development refers to tools that allow end users, rather than professional software developers, to program computers. I have been following it for years because I believe that end-user development will eventually disrupt our industry.)
Doesn’t RPA provide a simple point-and-click interface that creates a user-friendly workflow diagram? Yes, but the devil is in the details. Once you click on an edit field or a link, you are sucked into the world of “selectors”, using CSS or XPath to pin down exactly which element you picked. And then you need to grab data from the web pages and store it somewhere. That means… variables! What if you need to loop over data? Now you need to know arrays! So to do anything interesting with RPA, you need to know a fair amount about HTML, variables, and arrays.
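Here is a rough sketch of what those three concepts look like once they show up together. It uses Python’s standard-library XML parser and its limited XPath support on a tiny, well-formed page. The page contents, the `invoices` table id, and the `extract_invoices` helper are all invented for this example; a real RPA tool would run selectors against a live browser instead.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed page standing in for a real web page (made-up data).
PAGE = """
<html><body>
  <table id="invoices">
    <tr><td class="customer">Acme</td><td class="amount">120.50</td></tr>
    <tr><td class="customer">Globex</td><td class="amount">75.00</td></tr>
  </table>
</body></html>
"""

def extract_invoices(page):
    root = ET.fromstring(page.strip())
    # Selector: an XPath expression that picks out the rows of one table.
    rows = root.findall(".//table[@id='invoices']/tr")
    invoices = []  # Array: somewhere to accumulate what we scrape.
    for row in rows:  # Loop: visit each row in turn.
        # Variables: hold the pieces pulled out of the current row.
        customer = row.find("td[@class='customer']").text
        amount = float(row.find("td[@class='amount']").text)
        invoices.append((customer, amount))
    return invoices

invoices = extract_invoices(PAGE)
total = sum(amount for _, amount in invoices)
print(invoices, total)
```

Even this toy version assumes the reader is comfortable with markup, attribute selectors, lists, and type conversion, which is a lot to ask of a non-developer.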
So I mentally place RPA on par with complex spreadsheets. Yes, you will find end users who learn enough to do amazing things with spreadsheets (the fact that Google Sheets essentially has SQL built in also helps), but both spreadsheets and RPA are still tough for non-developers to use. Why can’t I simply tell RPA to go to a specific page and just use my credentials to log in? Instead, I need to walk through every step, telling it how to get the credentials, what elements to look for (with selectors), what data to enter where, what submit button to press, and then how to react if something goes wrong. As developers, we’re used to this painstakingly detailed coding process, but end users aren’t.
For RPA to really be magical, I think it needs to be much more intelligent (like “strtotime”): working hard behind the scenes to give users a simpler but more powerful set of commands. For example, RPA could provide a login command that requires only two things: a URL and credentials. To do this, however, the login command would need to be developed by studying a large number of pages and incorporating a litany of techniques (automatically identifying login state, login dialogues, password vs. OAuth, 2-factor authentication, etc.). Similarly, powerful commands could look at a page for lists or tables of data (including pagination) and automatically pull the data into a spreadsheet.
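As a small taste of what such a command might hide from the user, here is a sketch of a “pull the table into a spreadsheet” helper built only on Python’s standard library. The `TableGrabber` class and `table_to_csv` function are hypothetical names for this illustration. The sketch handles a single simple table and ignores pagination, nested tables, and pages with several tables, which is where the real difficulty lies.

```python
import csv
import io
from html.parser import HTMLParser

class TableGrabber(HTMLParser):
    """Collects the text of every table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None

def table_to_csv(html):
    """Return the table cells found in `html` as CSV text."""
    grabber = TableGrabber()
    grabber.feed(html)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(grabber.rows)
    return out.getvalue()

print(table_to_csv(
    "<table><tr><th>Name</th><th>Qty</th></tr>"
    "<tr><td>Bolt</td><td>4</td></tr></table>"
))
```

The user-facing command is one call with one argument; everything awkward lives behind it. That is the strtotime-style trade I have in mind.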
Why doesn’t RPA have these abilities? Like “strtotime”, it’s really hard to get right. We don’t yet have an established way of writing code that “tries” a bunch of things, asks for assistance when it fails, and then adds the new technique to its repertoire (true machine learning). Until we do, RPA is cool since it allows developers to quickly interact with Web pages, but it still requires developers, not regular end users.
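The control flow of that “try, ask, remember” loop is easy enough to sketch, even though the sketch is nowhere near true machine learning: here a person still has to supply the new technique. Everything in it (the `Repertoire` class, the `ask_human` callback, the two date parsers) is a hypothetical illustration, using date parsing as the task.

```python
from datetime import date, datetime

def parse_iso(text):
    # Known technique: ISO dates such as "2009-11-10".
    return date.fromisoformat(text)

def parse_us_slashes(text):
    # Technique a helper might contribute: "11/10/2009" as month/day/year.
    return datetime.strptime(text, "%m/%d/%Y").date()

class Repertoire:
    def __init__(self, techniques, ask_for_help):
        self.techniques = list(techniques)
        self.ask_for_help = ask_for_help  # callback returning a new technique

    def run(self, task):
        # Try every technique we already know.
        for technique in self.techniques:
            try:
                return technique(task)
            except Exception:
                continue
        # Nothing worked: ask for assistance, then remember the answer
        # only if the new technique actually handles the task.
        new_technique = self.ask_for_help(task)
        result = new_technique(task)
        self.techniques.append(new_technique)
        return result

help_requests = []

def ask_human(task):
    # Stand-in for a real prompt to a person; records that help was needed.
    help_requests.append(task)
    return parse_us_slashes

bot = Repertoire([parse_iso], ask_human)
print(bot.run("2009-11-10"))  # handled by a known technique
print(bot.run("11/10/2009"))  # needs help once
print(bot.run("12/25/2010"))  # now handled without asking again
```

The hard part is not this loop. It is getting the new technique from somewhere other than a developer, and knowing whether a result that did not raise an error is actually right.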