Where we provide APIs to our data, the data.parliament team has provided a method to download a pre-defined subset of data for each dataset as CSV. You can do this easily using the http://explore.data.parliament.uk website. However, you will notice that you can only download a maximum of 500 resources at a time, so if you need a complete dataset you will have to download it in bite-size sections. We have provided some aids to help you do so, including a handy progress indicator.
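As a rough illustration of what downloading in bite-size sections involves, the sketch below works out which ranges of resources each download would have to cover. Only the limit of 500 comes from the text above; the function name `plan_chunks` and the dataset size of 1234 are made up for the example, and the code does not call the website itself.

```python
MAX_PER_DOWNLOAD = 500  # the per-download limit described above


def plan_chunks(total_resources, chunk_size=MAX_PER_DOWNLOAD):
    """Return (start, end) index pairs, end exclusive, that cover every resource."""
    if total_resources < 0 or chunk_size <= 0:
        raise ValueError("total_resources must be >= 0 and chunk_size must be > 0")
    return [
        (start, min(start + chunk_size, total_resources))
        for start in range(0, total_resources, chunk_size)
    ]


if __name__ == "__main__":
    # Hypothetical dataset size, chosen only to show a partial final chunk.
    chunks = plan_chunks(1234)
    for number, (start, end) in enumerate(chunks, start=1):
        print(f"chunk {number} of {len(chunks)}: resources {start + 1} to {end}")
```

A dataset of 1234 resources would need three downloads under these assumptions: two full sections of 500 and a final one of 234.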
You may be wondering why we have imposed the 500 limit. The results returned by a download are currently always generated dynamically, even when you request the complete dataset without any filter. When you use the “Download” button you are actually downloading the metadata that has been defined as a result of modelling the data into RDF. These queries are therefore very resource intensive, which is why we have placed the limit: it allows our servers to keep serving pages even when demand on them is high.
The good news is that we are working on ways to improve the situation, and you should see it get much better within the next few months. We have been working with a new version of ELDA (our open source Linked Data Provider) to improve XML parsing (this is how we produce the CSV), and we are looking at ways of pre-processing the data to make it available as downloads. We are also looking at providing XML downloads as an alternative. Indeed, many if not all of our datasets are hierarchical in nature, and XML is a far better fit (although we do understand the difficulties that some users would experience in manipulating XML).
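To see why hierarchical data sits more comfortably in XML than in CSV, consider the minimal sketch below, which flattens a small nested record into a single CSV row. It is purely illustrative and is not how ELDA produces the CSV: the element names (`item`, `member`, `topic`) and the functions `flatten` and `xml_to_csv` are assumptions made for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical record: one nested element and one repeated element.
SAMPLE_XML = """<items>
  <item>
    <title>Example question</title>
    <member><name>A. Member</name><party>Example Party</party></member>
    <topic>Health</topic>
    <topic>Transport</topic>
  </item>
</items>"""


def flatten(element, prefix=""):
    """Map each leaf's dotted path to its text; repeated paths are joined with '; '."""
    row = {}
    for child in element:
        path = f"{prefix}{child.tag}"
        if len(child):  # the child has children of its own, so recurse
            nested = flatten(child, path + ".")
        else:
            nested = {path: (child.text or "").strip()}
        for key, value in nested.items():
            row[key] = f"{row[key]}; {value}" if key in row else value
    return row


def xml_to_csv(xml_text):
    """Flatten every child of the root element into one CSV row."""
    rows = [flatten(item) for item in ET.fromstring(xml_text)]
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(xml_to_csv(SAMPLE_XML))
```

The nested `member` element has to be spread across dotted column names, and the two `topic` elements have to be squeezed into one cell. That is the kind of compromise a flat CSV forces on hierarchical data.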
Not only do we wish to provide data, we wish to do so in a useful way. This is why we provided the CSV link, and we continue to strive for the right balance and something for everybody. Ultimately, we would like to hear from you on how we can better serve your needs for downloading data. Have a look at the blog post http://blog.data.parliament.uk/2015/01/just-give-us-the-data/ for more information.
Finally, for developers the best way of getting data in bulk is still to subscribe to our Atom feeds and query the resulting files/resources to download them one at a time.
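For developers who have not worked with Atom before, here is a minimal sketch of reading a feed and collecting the link for each entry, using only the Python standard library. The sample feed and its example.org URLs are invented for illustration, and the function name `entry_links` is hypothetical. The real feed addresses are not given here, so the sketch parses a string rather than fetching anything.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace (RFC 4287).
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Invented sample feed; a real feed would be fetched over HTTP instead.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example dataset feed</title>
  <entry>
    <id>urn:example:1</id>
    <title>Resource one</title>
    <updated>2015-01-01T00:00:00Z</updated>
    <link href="http://example.org/resources/1.xml"/>
  </entry>
  <entry>
    <id>urn:example:2</id>
    <title>Resource two</title>
    <updated>2015-01-02T00:00:00Z</updated>
    <link href="http://example.org/resources/2.xml"/>
  </entry>
</feed>"""


def entry_links(feed_xml):
    """Return the href of the first link in each Atom entry, in feed order."""
    root = ET.fromstring(feed_xml)
    links = []
    for entry in root.findall("atom:entry", ATOM_NS):
        link = entry.find("atom:link", ATOM_NS)
        if link is not None and link.get("href"):
            links.append(link.get("href"))
    return links


if __name__ == "__main__":
    for url in entry_links(SAMPLE_FEED):
        # Each URL could then be downloaded one at a time,
        # for example with urllib.request.urlopen(url).
        print(url)
```

Each returned link can then be requested individually, which matches the one-at-a-time approach described above.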