At the weekend we opened our internal Multi-Academy Trust (MAT) database to public view on our site here. A number of people have asked why we did this, and a number have asked how. So here goes.

We provide online tools to MATs, so we need to know who our customers are likely to be. We need to understand the market (sorry, it is one, and to the businesses that supply schools it always has been) and how it is changing and growing. We need to know who is a part of it and who we need to contact. Now there are a number of third-party database tools to manage contacts, but we have found that they are not quite good enough to handle the complexity of the MAT data structure. So we built our own.

Firstly, the data itself. Where did it come from? That’s the easy bit. DfE regularly publish their list of Open Academies in spreadsheet format. Originally it was a fairly poor dataset, but more recently it has improved, and the latest incarnation has the added benefit of a column headed ‘Trust Id’. Anyone who has ever constructed a relational database will realise the benefits this brings. Suddenly it’s possible to create a database with links between the Trusts (the MATs) and the academies within them. All the other data about the schools themselves comes from the aptly named “Get Information about Schools” site. That site does hold much of the data we have presented, but there is far too much clicking involved for my liking.
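For anyone wondering why a single ‘Trust Id’ column makes such a difference, here is a minimal sketch in Python using the built-in sqlite3 module. It is not our actual schema: the table names, column names and sample rows are all made up for illustration. The point is simply that once each academy row carries a trust identifier, counting academies per trust becomes a one-line join.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical trust and academy tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE trusts (
            trust_id INTEGER PRIMARY KEY,
            name     TEXT NOT NULL
        );
        CREATE TABLE academies (
            urn      INTEGER PRIMARY KEY,   -- the school's unique reference number
            name     TEXT NOT NULL,
            trust_id INTEGER REFERENCES trusts(trust_id)
        );
        """
    )
    # Invented sample rows, not real DfE data.
    conn.executemany(
        "INSERT INTO trusts VALUES (?, ?)",
        [(1, "Example Trust A"), (2, "Example Trust B")],
    )
    conn.executemany(
        "INSERT INTO academies VALUES (?, ?, ?)",
        [
            (100001, "First Example Academy", 1),
            (100002, "Second Example Academy", 1),
            (100003, "Third Example Academy", 2),
        ],
    )
    return conn


def academies_per_trust(conn):
    """Return (trust name, number of academies) pairs, ordered by trust name."""
    return conn.execute(
        """
        SELECT t.name, COUNT(a.urn)
        FROM trusts AS t
        LEFT JOIN academies AS a ON a.trust_id = t.trust_id
        GROUP BY t.trust_id
        ORDER BY t.name
        """
    ).fetchall()


if __name__ == "__main__":
    for name, count in academies_per_trust(build_demo_db()):
        print(name, count)
```

The LEFT JOIN is a deliberate choice in this sketch: a trust with no academies listed yet would still show up, with a count of zero.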

So we wanted to know a number of things about MATs:

  • Who they were
  • Which academies were members (many of our existing customer schools are in MATs but we don’t really know it)
  • Where these MATs were located
  • Who to contact

and we wanted all the information on a single page with the minimum of clicking required. I hate clicking. Not as much as I hate scrolling, but I really don’t like having to click unnecessarily.

And the DfE had provided a number of datasets which, once we linked them together (Trust Id plus school URN is a powerful combination), made this easy. Here’s the interesting thing. Creating the database and the site to show it took an afternoon. Which is why I’m a little surprised that the DfE hasn’t done something like this themselves. But maybe they like clicking.
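To give a flavour of what linking on Trust Id plus URN means in practice, here is a rough sketch in Python. The two CSV extracts, their column headings and every value in them are invented for the example and are assumptions rather than the real DfE layouts. One extract says which trust each URN belongs to, the other holds the school details, and the URN ties them together.

```python
import csv
import io

# Made-up extract standing in for the Open Academies spreadsheet.
OPEN_ACADEMIES_CSV = """URN,Trust Id,Trust Name
100001,5001,Example Trust A
100002,5001,Example Trust A
100003,5002,Example Trust B
"""

# Made-up extract standing in for the school details download.
SCHOOL_DETAILS_CSV = """URN,School Name,Postcode
100001,First Example Academy,AB1 2CD
100002,Second Example Academy,AB3 4EF
100003,Third Example Academy,GH5 6IJ
"""


def link_datasets(academies_text, details_text):
    """Group school details under their trust, matching rows by URN."""
    details_by_urn = {
        row["URN"]: row for row in csv.DictReader(io.StringIO(details_text))
    }
    trusts = {}
    for row in csv.DictReader(io.StringIO(academies_text)):
        trust = trusts.setdefault(
            row["Trust Id"], {"name": row["Trust Name"], "schools": []}
        )
        detail = details_by_urn.get(row["URN"])
        if detail is not None:  # skip academies with no matching details row
            trust["schools"].append(
                {
                    "urn": row["URN"],
                    "name": detail["School Name"],
                    "postcode": detail["Postcode"],
                }
            )
    return trusts


if __name__ == "__main__":
    for trust_id, trust in link_datasets(OPEN_ACADEMIES_CSV, SCHOOL_DETAILS_CSV).items():
        print(trust_id, trust["name"], [s["name"] for s in trust["schools"]])
```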

WARNING – technical bit coming up. The tools we used are fairly straightforward. It’s a standard web set-up of Apache, MySQL and PHP code. The whizzy bits (the data loads for each trust) are AJAX. The mapping is a simple matter of providing the correct data to Google and they then send the map back to us. They do that for free, by the way.

So the data is very useful to us in this form. Why open it up? Well, for the past couple of years I have blogged on the MAT data, using the DfE spreadsheet to analyse how the formation of MATs was progressing. This I did mainly using Excel and, to be honest, it was a bit of a chore. So I thought that it would be easier to make our internal database publicly available, so that the kind of information I was providing in the blog could be there all the time for people to view and interrogate as they wish. We’ll add some of this analysis over time, but it’s an out-of-hours project so it’ll be bit by bit. It’s also a little bit of a hint to gov.uk that data is only really open if people can easily see it.

There are obviously some restrictions. We are limited to updating the data when DfE do so, which is about quarterly. They also still have this thing about keeping Free Schools separate from open academies, in a different spreadsheet with different headings. Which, frankly, is a bit of a pain. So we can show the free schools, but we can’t easily aggregate them into the school number lists (doing so slows down the page load time). A good example here is that it is difficult to show ‘Free School Only’ MATs. This is my way of saying “don’t blame us if the data isn’t 100% accurate”. All I can promise is that it is an accurate representation of the data that DfE provide.
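For the curious, the sort of tidying that two spreadsheets with different headings call for looks roughly like the Python sketch below. Everything in it is hypothetical: the heading names, the sample rows, and in particular the assumption that the free schools sheet carries a trust identifier at all. It illustrates the idea of mapping both sets of headings onto one shared set, and is not a description of how our site does it.

```python
# Hypothetical heading maps: each sheet's column names -> one shared set of names.
ACADEMY_HEADERS = {"URN": "urn", "Academy Name": "name", "Trust Id": "trust_id"}
FREE_SCHOOL_HEADERS = {"School URN": "urn", "Free School Name": "name", "Trust Id": "trust_id"}

# Invented sample rows, shaped the way csv.DictReader would return them.
ACADEMY_ROWS = [
    {"URN": "100001", "Academy Name": "First Example Academy", "Trust Id": "5001"},
    {"URN": "100002", "Academy Name": "Second Example Academy", "Trust Id": "5002"},
]
FREE_SCHOOL_ROWS = [
    {"School URN": "200001", "Free School Name": "First Example Free School", "Trust Id": "5001"},
    {"School URN": "200002", "Free School Name": "Second Example Free School", "Trust Id": "5003"},
]


def normalise(rows, header_map, kind):
    """Rename each row's columns to the shared names and tag it with its source."""
    records = []
    for row in rows:
        record = {new: row[old] for old, new in header_map.items()}
        record["kind"] = kind
        records.append(record)
    return records


def free_school_only_trusts(records):
    """Return the ids of trusts whose schools all came from the free schools sheet."""
    kinds_by_trust = {}
    for record in records:
        kinds_by_trust.setdefault(record["trust_id"], set()).add(record["kind"])
    return sorted(
        trust_id
        for trust_id, kinds in kinds_by_trust.items()
        if kinds == {"free_school"}
    )


if __name__ == "__main__":
    combined = normalise(ACADEMY_ROWS, ACADEMY_HEADERS, "academy") + normalise(
        FREE_SCHOOL_ROWS, FREE_SCHOOL_HEADERS, "free_school"
    )
    print(free_school_only_trusts(combined))
```

Doing this merge once, when the data is loaded, rather than on every page view is one way to keep the cost out of the page load time.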

One other thing. The site is presented as a neutral representation of a dataset. There is no statement intended here, either for or against MATs. They exist. The data about them exists. Here’s a better way of viewing the data. That’s all.

We hope you find the information useful and that it helps you in some way.