Kiva.org Team‎ > ‎Team Statistics‎ > ‎

How was the statistics page made?

(Work in progress)

Presentation

I advertised my team statistics page, and I've been asked me how they were done, so here is a short how-to.
The goal is to easily and dynamically display data exported by Kiva.org.
  • Easily means : minimum programming skills. In our case you just need to slightly modify some spreadsheet value or formula.
  • Dynamically means that the data will update itself, you don't have to do anything, any change on Kiva.org is reflected on the web page.

Requirement

You must own a Google account: http://www.google.com/accounts

Methodology

  1. Find the source of the information, in our case Kiva statistics. Kiva allows access to these in several format, JSON, XML, RSS, HTML, etc. In our case we'll use the XML format.

  2. Create a spreadsheet online using Google Documents. There are several functions (formulas) that allow you to import selected data from an external source and fill a spreadsheet table automatically.

  3. Visualization: several types of charts are available from google, like bar graphs, lines, pie graph, maps, etc. The data to be represented come from a spreadsheet.

Querying Kiva.org using the API


The general form for sending a query to Kiva.org using their Application Protocol Interface is normal http request which follows the following syntax:

http://api.kivaws.org/<version>/<category>/<method>.<result type>?<parameter1>=<value1>&<parameter2>=<value2> ...

For example :
http://api.kivaws.org/v1/teams/20633.xml [ version = "v1", category="teams", method=team id "." result type="xml" ]
http://api.kivaws.org/v1/teams/20633/loans.html [ method loans, applied to a team, result in HTML ]

The Kiva API is described at : http://build.kiva.org/api

Extracting your statistics

Getting your team id

We'll go from the name of the team (you're supposed to know it), to the team's id. Follow these steps.

Go to your team page. 
For example my team is at http://www.kiva.org/team/kathleen. The last part, after the last / is the "shortname" of the team, its name is what's displayed on top of the page "Kathleen Smith Memorial Fund" with the above URL.

To get your team id, do this query : http://api.kivaws.org/v1/teams/using_shortname/kathleen.html (replace the string "kathleen" by your own team short name).

Since the method ends with ".html" the results is in HTML. If you call "kathleen.xml" the result will be in XML:

<response>
<teams type="list">
<team>
<id>20633</id>
<shortname>kathleen</shortname>
<name>Kathleen Smith Memorial Fund</name>
<category>Memorials</category>
<image>
<id>829741</id>
<template_id>1</template_id>
</image>
<whereabouts>World</whereabouts>
<loan_because>we care.</loan_because>
<description>
Not only friends Kathleen can join, but anyone from all over the world : Kathleen visited a lot of countries, she was speaking many languages from French to Russian. After she left the USA, she was staying in Paris, France, for 15 years. She was a skilled biologist and was working to help fight cancer. If you know people with cancer or carcinomatous meningitis, this is the team you were looking for : international, open and friendly. Welcome !
</description>
<website_url>http://en.freedonne.org</website_url>
<team_since>2011-08-01T23:42:17Z</team_since>
<membership_type>open</membership_type>
<member_count>41</member_count>
<loan_count>23</loan_count>
<loaned_amount>625</loaned_amount>
</team>
</teams>
</response>

























We have highlighted the <id></id> element, which contains your team id. Remember this number, it'll be used for later queries.

Using your team id

You can query Kiva database using your team id: http://api.kivaws.org/v1/teams/<your team id>.xml
That is: http://api.kivaws.org/v1/teams/20633.xml for the team Kathleen.

A little bit of XML

You can notice that this <id> element is nested into a <team> element which is itself included in the <teams> element etc. The path to access the <id> element is:
/response/teams/team/id
the <loan_amount> path is :
/response/teams/team/loaned_amount
<response> is the root element.
XML is used to describe objects, there are no mandatory element names, you can use whatever you think is describing your data better (like <book>, <title>, <isbn>, etc. for books or <name>, <birth_date>, <height>, <weight> for a person etc.).

If a list of elements is returned they will appear sequentially, they can be accessed by adding an index :
/response/teams/team[2]/id specifies the ID of the second team from the list returned by this query:

From XML to Google Documents

Google Doc New Spreadsheet
Go to your Documents Google page : http://docs.google.com from the left menu create a new spreadsheet.
If you find it boring to follow the following step one by one, you can use this document.

What you want now is to take the XML description of your team and insert it in a spreadsheet. The ImportXML() formula just does that.

Select the cell in A2 type in this formula :
=ImportXML("http://api.kivaws.org/v1/teams/20633.xml", "/response/teams/team")
The second line should be automagically populated by your team info!

Select line 2 and click on the wrap text icon and the align top icons

The result should look like that :



You can (manually) populate the first row with the XML id element names:


About updates

The data on Google will update itself when the data on Kiva is modified. It can sometime take a few minutes or hours, but it works.
Most of the charts are dynamic, when the data on the underlying spreadsheet changes, the charts is updated. You can prevent this by using a snapshot  which will create a chart as a static image. The later is obviously faster and recommended for static data.

=ImportXML()

The first parameter to this function is an URL, which should return XML data.
The second is an XPATH, e.g. a path selection to find the data into the XML tree.
Instead of =ImportXML("http://api.kivaws.org/v1/teams/20633.xml", "/response/teams/team") you can try :
=ImportXML("http://api.kivaws.org/v1/teams/20633.xml", "//team")
"//team" means "search from the root element the element named "team" and return it. Since there is only one team element in that tree both results are the same.

Visualizing the data

To display the 3 last columns (member_count, loan_count, loaned_amount) choose Chart from the Insert menu and enter Sheet1!L1:N2 and select "Recommended charts" a vertical bar graph.


Inserting the chart on a web page

If the chart is to be inserted on a Google Site all you need to do is select Chart in the Insert format, the list of all your spreadsheets files will be displayed, select the one you want to get the data from, and the same form as above is displayed. Yes, it's that simple.

If you want to insert the chart in a hosted web page, click on the top left of the chart itself, there is a short menu with an "Publish chart..." item.
 which will show a javascript code that you should cute and paste into your web page:

 
Ïnteractive chart means that the chart will display more information when you move the mouse over the bars (or different areas of the chart).
Note that the spreadsheet must be saved before you can export a chart.

Demo spreadsheet view

From whch you can cut & paste values. Note the Sheet1..3 tab at the bottom.

Demo spreadsheet


Thank me ;-)

Did you like this little tutorial ?
You can either:
  1. Become a member of my team: http://kiva.org/invitedto/kathleen/by/stephanefr
  2. As above but you make a loan on behalf of the team
  3. Make a little donation which we'll be used on Kiva.
If you do any of the above you can have your website hosted under freedonne.org and/or be listed on http://sponsors.freedonne.org.
The gadget spec URL could not be found

Advanced hacks

Since we have populated the line 1 with the elements name, and this name is actually the last part of the XPATH, we can use the first line values in the formula. Don't panic, it's easy. The content of A2 is /response/teams/team/id and id is the value of A1, same for the second column, A2 is /response/teams/team/shortname and "shortname" it's what's written in B1.
So each value displayed on the second line is the one of the element /response/teams/team/<value of line 1 in the sheet>.

If you want to get the member_count in a cell, use the following formula:
=ImportXML("http://api.kivaws.org/v1/teams/20633.xml", "/response/teams/team/"&L1)
The operator "&" concatenates 2 strings, since L1 contains member_count the second parameter becomes /response/teams/team/member_count.

Getting the list of loans

Syntax: http://api.kivaws.org/v1/teams/<team id>/loans.xml
A graphic representation is on Sheet2 of our demo document.

Import data from the Category page

Each team category has its own page, here are some:

On the right side you see 2 tables under "TEAM LEADERBOARDS". If you inspect the HTML code, you'll find that each name is a list item, e.g. between <li></li> tags:

The list content is highligthed below:

There is a function on Google documents Spreadsheet that allows to import any table (eg. content between <table> and </table>) or list (<li>...</li>):
ImportHTML(URL, type, index) where:
URL is the page address (eg. "http://www.kiva.org/community?category=Memorials") type is either "table" or "list" and index tells which list or table of the page, they can be many, to import into the spreadsheet.

If you enter "=importHtml("http://www.kiva.org/community?category="&$D$2, "list", 11)" in A2 (and D2 contains the category name, "Memorial" in this example), here is the output:

This data can already be displayed in a table chart like above but with a better formatting.
Now we want to output the position of a given team, which exact name is stored in D4 during the 3 periods "All Time", "Last Month". "This Month".

What we did is duplicate the table above a little bit further (columns E - H) and removed the numbers at the end of each cell. This is done by RegexpReplace(text, regexp, replacement), which replace what is matched by regexp in text by the new text "replacement". Regexp is short for Regular Expression and it's a way of specifying a text pattern. In our case we want to remove the digits at the end of the team name. The regexp which will do that is "[0-9]*$" : [0-9] is a range definition matching any digit, * means "0 or more" and $ is end of line, in english it can be translated by "all the trailling digits". It seems there are spaces before the first digit, so we have added " *" : " *[0-9]*$".
Since we want to delete the digits, the last parameter of RegexpReplace will be the null string "".
Because we're lazy, the formula "=REGEXREPLACE(A:A," *[0-9]*$","")" has been used in every cell of F:H.
The column E just numbers each line (=row()).

To find the rank of your team, you just need to select the column E after keeping only the lines where the team name matches.
This is done on Sheet2 with the formula : =filter('Sheet1'!E:H, 'Sheet1'!F:F='Sheet1'!$D$4)
which means "take the array E:H from Sheet1 and keep only the lines where the value of column F = the team name".
The next line will do the same on column G then H.

The header of the first table on Sheet1 ( "All Time", "Last Month". "This Month" <) is a hack: it's the result of the formula =transpose(importHtml("http://www.kiva.org/community?category="&$D$2, "list", 10)) which takes the value of the 10th list defined on the http://www.kiva.org/community?category=<some category> page but transposed so it's made horizontal rather than vertical.

To display the data we just take the first 2 columns of sheet2.

Below, our spreadsheet.

And the chart

Personal stats

I've been ask if it was possible to use Google tools to display personal Kiva statistics. Yes it is ;-)

Please look at this spreadsheet.

NOTES

  1. All the requests to Kiva servers should end with app_id="your domain name" (in reverse DNS notation, eg. foo.com becomes com.foo).
  2. On the above example we don't do a good job at paging, eg. Kiva servers will return only X (usually 20) results at a time, you have to do multiple queries to pull all the data you need. The paging parameters are returned at the beginning of the response:
<paging>
<page>1</page>
<total>272</total>
<page_size>20</page_size>
<pages>14</pages>
</paging>
Pages are accessed by adding the page= parameter on the query.


BUGS

Sometimes the chart doesn't update, but only after refresh.
The charts range must be known in advance, which is not always possible to know. For example the number of countries covered by Kiva will change over time, so the # of rows will change but the chart gadget needs a fixed range, e.g. Sheet1:A1:B28.


ą
Stephane (webmaster),
Oct 23, 2011, 6:02 PM
Comments