Very often you’ll find that any potential interest you’d have in Prolog is destroyed by the fact that all tutorials on the language involve solving some completely pointless and pedantic problem such as the map coloring problem or perhaps the eight queens problem.
Fortunately, Prolog is a 100% badass language that can be used to accomplish useful tasks. In this post I’ll show you how to yank some JSON data from the World Bank’s RESTful open data API and process the results. All in < 40 lines of Prolog. Not rocket science, but that's the whole point.
Let's go.
The Code
Here’s the whole program in all of its glory. I’ll break it down below. Feel free to chime in with a comment if something looks off.
Also, you can grab the code on github here.
:- use_module(library(http/http_open)).
:- use_module(library(http/json)).

% Main predicate: fetch the JSON, then process it.
fetch_and_process_data :-
    fetch_data(FetchedData),
    process_data(FetchedData).

% The quoted URL has to sit on a single line.
url_to_process('http://api.worldbank.org/countries/USA/indicators/AG.AGR.TRAC.NO?per_page=10&date=2005:2011&format=json').

fetch_data(FetchedData) :-
    url_to_process(URL),
    http_open(URL, DataStream, []),
    json_read(DataStream, FetchedData, []).

% The response is a two-element list: a header and a list of results.
process_data([ Header, Contents | [] ]) :-
    process_data_header(Header),
    process_data_contents(Contents).

process_data_header(_).

process_data_contents([]).
process_data_contents([JSONObject|Rest]) :-
    process_json_object(JSONObject),
    process_data_contents(Rest).

process_json_object(JSONObject) :-
    json_object_has_value(JSONObject, date, DateValue),
    json_object_has_value(JSONObject, value, IndicatorValue),
    print('Date: '), print(DateValue), nl,
    print('Value: '), print(IndicatorValue), nl, nl.

json_object_has_value(JSONObject, Name, Value) :-
    json(NameValueList) = JSONObject,
    member(NameValuePair, NameValueList),
    NameValuePair = (Name = Value).

:- fetch_and_process_data.
Use SWI Prolog
We’re going to be using SWI Prolog. It’s free, well-documented and has some useful web modules. In particular we’re going to be using SWI’s http_open and json libraries.
At the top of our program we’ll include the two modules as follows:
:- use_module(library(http/http_open)).
:- use_module(library(http/json)).
Prolog programs consist of facts and rules. The syntax of a rule is rule_head :- rule_body, which is read ‘rule_head if rule_body’. You can think of the rule_head as a goal and the rule_body as a sequence of subgoals that must be satisfied in order for the whole rule to be satisfied. Those ‘use_module’ statements are headless rules (usually called directives), so Prolog just executes them when it loads the file.
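To make the fact/rule distinction concrete, here's a tiny sketch. The plays/2 facts and the horn_player/1 rule are made up for illustration and aren't part of the program.

```prolog
% Facts: things that are simply true. (Made-up example data.)
plays(coltrane, tenor_sax).
plays(davis, trumpet).
plays(monk, piano).

% Rule: horn_player(Person) holds if Person plays one of these horns.
horn_player(Person) :-
    plays(Person, Instrument),
    member(Instrument, [tenor_sax, trumpet]).
```

With this loaded, the query ?- horn_player(coltrane). succeeds and ?- horn_player(monk). fails.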
Create a ‘main method’ and call it.
This is one of those things that you might still not know how to do after reading an entire overview of the language. What you want to do is create a main rule and then, at the bottom of your code, ‘call’ your main rule with the headless rule construct you just saw above.
Here’s a toy example:
main_rule :-
    print('Hello, world!').

:- main_rule.
And here’s what’s actually in our code:
fetch_and_process_data :-
    fetch_data(FetchedData),
    process_data(FetchedData).
...
:- fetch_and_process_data.
Once you’ve included a call to your main predicate in a source file, all you have to do is ‘consult’ the source file into your Prolog session, and the code runs. For example, here’s the session for the program we’re talking about:
?- consult('http_test.pro').
Date: 2010
Value: @null
Date: 2009
Value: @null
Date: 2008
Value: @null
Date: 2007
Value: 4389812
Date: 2006
Value: 4430359
Date: 2005
Value: 4470905
% http_test.pro compiled 0.00 sec, 264 bytes
true.
Prolog predicates are referred to with the naming convention name/arity, so our main rule is called fetch_and_process_data/0 because it takes no arguments. The rule has two subgoals which, surprise, fetch and process the data.
Notice that those two subgoals share the FetchedData variable. When we call fetch_data/1 the variable is uninstantiated. fetch_data/1 instantiates the variable to our JSON structure, and process_data/1 is then called with the instantiated variable. Let’s see what fetch_data/1 actually does.
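Here's a network-free sketch of the same variable-sharing idea. The names produce/1, consume/2 and pipeline/1 are hypothetical stand-ins, not part of the program.

```prolog
% produce/1 binds its argument (a stand-in for fetch_data/1).
produce(Data) :-
    Data = [1, 2, 3].

% consume/2 reads the binding (a stand-in for process_data/1).
consume(Data, Sum) :-
    sum_list(Data, Sum).

% pipeline/1 has arity 1; Data is shared by both subgoals.
pipeline(Sum) :-
    produce(Data),      % Data is unbound before this goal, bound after
    consume(Data, Sum).
```

The query ?- pipeline(Sum). binds Sum to 6. The intermediate Data variable never appears in the head, just as FetchedData never appears in the head of the main rule.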
Retrieve the JSON
fetch_data(FetchedData) :-
    url_to_process(URL),
    http_open(URL, DataStream, []),
    json_read(DataStream, FetchedData, []).
The three subgoals of fetch_data/1 correspond to the following actions:
- Get the URL for the World Bank API.
- Open a stream to the resource specified by the URL.
- Read the JSON from the stream into a Prolog structure.
We simply store the URL with the following fact in our Prolog program. The actual URI comes from the awesome World Bank API Query Builder. The query below asks for the number of tractors in the US for the years 2005-2011. Of course, the program could easily be modified to operate on a list of URLs or to read the URL from standard input or whatever.
url_to_process('http://api.worldbank.org/countries/USA/indicators/AG.AGR.TRAC.NO?per_page=10&date=2005:2011&format=json').
The second subgoal opens a stream for us to read the data from using the http_open/3 predicate, which comes from the first of the modules we imported above. After this subgoal is satisfied, the initially uninstantiated DataStream will be bound to a stream that we can read from. That empty list in the third slot of the predicate can hold any number of options. You can read up on http_open/3 in the SWI manual.
The third subgoal reads the JSON from the stream into a Prolog structure using the json_read/3 predicate which comes from the second of the modules we imported above. After this subgoal is satisfied FetchedData will be bound to a Prolog structure containing our JSON response data. Note that when json_read/3 binds the FetchedData variable it also binds the occurrence of that variable in the head of our rule! Again, you can read up on json_read/3 in the manual.
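One thing fetch_data/1 never does is close the stream. Below is a hedged sketch of a variant that does. The names fetch_json/2 and read_json_and_close/2 are my own, and the 10-second timeout is an arbitrary value chosen for illustration; timeout/1 is one of the options that the http_open/3 options list accepts.

```prolog
:- use_module(library(http/http_open)).
:- use_module(library(http/json)).

% fetch_json(+URL, -Data): hypothetical variant of fetch_data/1 that
% takes the URL as an argument and sets a timeout on the stream.
fetch_json(URL, Data) :-
    http_open(URL, Stream, [timeout(10)]),
    read_json_and_close(Stream, Data).

% read_json_and_close(+Stream, -Data): read one JSON value and close
% the stream whether json_read/3 succeeds, fails or throws.
read_json_and_close(Stream, Data) :-
    call_cleanup(json_read(Stream, Data, []), close(Stream)).
```

Splitting out read_json_and_close/2 means the reading half can be tried on any stream, not just one that came from the network.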
Process the JSON
If you’ve ever read JSON into a language like Python, Ruby or JavaScript you know that one of the format’s main virtues is that it maps quite neatly onto those languages’ native data structures. Unfortunately, this is not the case with Prolog. In particular, Prolog doesn’t really have a map/hash/dict type (though of course you can create such a type in the language). The module creator chose to represent JSON objects as lists of key=val pairs wrapped in a json() structure. So you get the following correspondence:
{
"first": "John",
"last": "Coltrane",
"plays": "Tenor Sax"
}
json([
first = "John",
last = "Coltrane",
plays = "Tenor Sax"
]).
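You can check the correspondence without touching the network. The sketch below uses atom_json_term/3 from the same json library to parse a literal; header_term/1 is a made-up helper name. It sticks to numeric values, because how JSON strings come back (atoms or strings) can depend on the library version and options.

```prolog
:- use_module(library(http/json)).

% header_term(-Term): parse a small JSON object shaped like the
% World Bank response header (sample values only).
header_term(Term) :-
    atom_json_term('{"page": 1, "pages": 3, "total": 6}', Term, []).
```

Here ?- header_term(T). binds T to json([page=1, pages=3, total=6]).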
The actual JSON that we’re getting from the World Bank looks like this. We’ll be interested in the top-level date and value fields of each object in the result list below:
[
{
"page": 1,
"pages": 3,
"per_page": "2",
"total": 6
},
[
{
"indicator": {
"id": "AG.AGR.TRAC.NO",
"value": "Agricultural machinery, tractors"
},
"country": {
"id": "US",
"value": "United States"
},
"value": "4430359",
"decimal": "0",
"date": "2006"
},
{
"indicator": {
"id": "AG.AGR.TRAC.NO",
"value": "Agricultural machinery, tractors"
},
"country": {
"id": "US",
"value": "United States"
},
"value": "4470905",
"decimal": "0",
"date": "2005"
}
]
]
You can read up more on the correspondence between JSON/Prolog types in that link to the json_read/3 predicate above. Let’s take a look at the implementation of process_data/1.
process_data([ Header, Contents | [] ]) :-
    process_data_header(Header),
    process_data_contents(Contents).

process_data_header(_).

process_data_contents([]).
process_data_contents([JSONObject|Rest]) :-
    process_json_object(JSONObject),
    process_data_contents(Rest).
process_data/1 takes a single list argument and has two subgoals: process_data_header/1 and process_data_contents/1. If you’re not used to Prolog (or FP languages such as Haskell/OCaml) that rule head probably looks pretty gnarly. Basically, the ‘formal parameter’ of the predicate is a pattern that the argument will get matched against. The matching has the effect of dissecting the data structure, assigning its innards to the variables in the pattern. In the present case we know in advance that we’ll be receiving a two-element list so we simply pluck off the first and second list elements, bind them to the Header and Contents variables and call the subgoals.
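A small sketch of that head pattern in isolation. split_response/3 and same_pattern/0 are hypothetical names used only here.

```prolog
% split_response(+Response, -Header, -Contents)
% Succeeds only when Response is a list of exactly two elements.
split_response([Header, Contents | []], Header, Contents).

% same_pattern/0 shows that [A, B | []] and [A, B] are the same term,
% so the "| []" tail in the head is optional.
same_pattern :-
    X = [A, B | []],
    Y = [A, B],
    X == Y.
```

The query ?- split_response([h, [a, b]], H, C). binds H to h and C to [a, b], while a list of any other length simply fails to match.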
Uh, Still Processing the Data
The header of the World Bank JSON is useful but boring so I’m not doing anything at all in the process_data_header/1 subgoal. It’s just there for completeness’ sake.
process_data_contents/1 recursively walks through a list of JSON objects and calls process_json_object/1 on each such object. In order to do just about anything useful in Prolog you have to do this type of recursive list processing. Our base case is the empty list, in which case we do nothing. Our recursive case splits the incoming list into its head and tail (here JSONObject and Rest). We then call process_json_object/1 on the JSONObject and recurse on Rest.
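For comparison, here's the same walk written twice: once with explicit recursion, and once with maplist/2, which does the recursion for you. show_item/1 is a made-up stand-in for process_json_object/1, and walk/1 and walk_with_maplist/1 are hypothetical names.

```prolog
% show_item(+Item): stand-in for process_json_object/1.
show_item(Item) :-
    print(Item), nl.

% Explicit recursion: base case, then head/tail split.
walk([]).
walk([Item|Rest]) :-
    show_item(Item),
    walk(Rest).

% The same traversal using maplist/2.
walk_with_maplist(Items) :-
    maplist(show_item, Items).
```

Both versions visit the elements in list order and fail as soon as the per-item goal fails.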
Finally, process_json_object/1 grabs the fields of interest from the given JSON object and writes out the values. It does so with the aid of json_object_has_value/3 which takes a JSON object and a field name as its first two arguments and binds its third argument to the object’s value for the given field. Remember that JSON objects in Prolog are lists of equals statements. That’s why we match the variable NameValuePair with Name = Value below.
process_json_object(JSONObject) :-
    json_object_has_value(JSONObject, date, DateValue),
    json_object_has_value(JSONObject, value, IndicatorValue),
    print('Date: '), print(DateValue), nl,
    print('Value: '), print(IndicatorValue), nl, nl.

json_object_has_value(JSONObject, Name, Value) :-
    json(NameValueList) = JSONObject,
    member(NameValuePair, NameValueList),
    NameValuePair = (Name = Value).
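If you only ever want the first value stored under a known name, a more compact variant is possible. This is a sketch under that assumption, with a hypothetical name json_field/3; it unpacks the json() wrapper in the head and uses memberchk/2, which commits to the first match instead of leaving choice points the way member/2 does.

```prolog
% json_field(+JSONObject, +Name, -Value)
% Value is the first value stored under Name; fails if Name is absent.
json_field(json(NameValueList), Name, Value) :-
    memberchk(Name = Value, NameValueList).
```

For example, ?- json_field(json([date = '2006', value = '4430359']), date, D). binds D to '2006'. The trade-off is that json_field/3 can no longer enumerate every pair on backtracking.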
Phew, that’s a lot of explaining, but go back to the top of the page and look at the program in its entirety again. It’s relatively succinct and straightforward once you get over the initial brainsmash of operating in the declarative paradigm.
Some Random Remarks
- I had no idea what I was in for when I started using SWI’s HTTP/JSON modules. Ultimately, I found them to be dead simple to use and relatively well-documented.
- Prolog programs can be viewed either through a declarative or an imperative lens. I’ve used lots of imperative language above. It can be interesting to look at a single predicate and think about it from both viewpoints: from the imperative standpoint, think of it in terms of goals being satisfied; from the declarative standpoint, think in terms of the predicate simply being the specification of a truth condition.
- Writing Prolog is so damn fun that I’m bewildered as to why more programmers (esp. ones that have FP hardons) aren’t knee deep in it.
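To see the declarative reading from the second remark in action, take json_object_has_value/3 exactly as it appears in the program and query it in different directions. The sample/1 fact is made-up data for illustration.

```prolog
% Copied from the program above.
json_object_has_value(JSONObject, Name, Value) :-
    json(NameValueList) = JSONObject,
    member(NameValuePair, NameValueList),
    NameValuePair = (Name = Value).

% sample(-JSONObject): hypothetical sample object.
sample(json([date = '2006', value = '4430359'])).

% all_pairs(-Pairs): every Name-Value pair the predicate holds for.
all_pairs(Pairs) :-
    sample(Object),
    findall(Name-Value, json_object_has_value(Object, Name, Value), Pairs).
```

The same clause answers "what is the value of date?", "which name holds '2006'?" (?- sample(O), json_object_has_value(O, Name, '2006'). binds Name to date) and "what are all the pairs?" (?- all_pairs(Ps).), because it only states when the relation is true.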