
FrontPage

Page history last edited by Dan Zambonini 15 years, 9 months ago

hoard.it

 

hoard.it is a prototype system for scraping granular, semantic data from existing (template-driven) HTML pages. To date, the prototype has been used to aggregate museum object data and 'museum listings' data, but it can be used for any other type of data.

 

The prototype can be found here.

 

Please check out the Frequently Asked Questions first. Any questions after that? Contact Us.

 

New - Application Gallery!

 

Latest Updates

 

2008-06-17     Added 'order by random', an (internal) ID in each record's XML, and the ability to specify 'id' when querying the API,

                     e.g. http://feeds.boxuk.com/museums/xmlfeed/id/70783

 

2008-05-23     Can now search some of the 'normalised' data (dates/country) using n.yearFrom, n.yearTo and n.country

                     e.g. http://feeds.boxuk.com/museums/xmlfeed/n.yearFrom/1975/n.yearTo/1985/n.country/japan/format/gallery

 

2008-05-23     Added 'order' and 'limit' parameters

                     e.g. http://feeds.boxuk.com/museums/xmlfeed/keyword/bakelite/format/gallery/order/dc.title/limit/10
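The example URLs on this page all follow the same pattern: name/value pairs appended to the feed path. As a rough sketch of that pattern (the helper name build_feed_url is hypothetical, and percent-encoding of values is an assumption, since the page does not say how the server treats special characters), a query could be assembled like this:

```python
from urllib.parse import quote

# Base path copied from the example URLs on this page.
BASE_URL = "http://feeds.boxuk.com/museums/xmlfeed"


def build_feed_url(params):
    """Join (name, value) pairs into a path-style feed URL.

    Hypothetical helper: it only mirrors the /name/value/... layout
    seen in the examples and is not part of the hoard.it API itself.
    """
    segments = []
    for name, value in params:
        # Assumption: values are percent-encoded so that a '/' inside
        # a value cannot be mistaken for a path separator.
        segments.append(quote(str(name), safe=""))
        segments.append(quote(str(value), safe=""))
    return BASE_URL + "/" + "/".join(segments)


url = build_feed_url([
    ("keyword", "bakelite"),
    ("format", "gallery"),
    ("order", "dc.title"),
    ("limit", 10),
])
print(url)
```

With these four pairs the helper reproduces the example URL given for the 'order' and 'limit' parameters.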

 

2008-05-23     Minor updates to main page: made the example search terms clickable; added the total number of object records.

 

2008-05-22     Added basic search form to the main page, plus a link back to this wiki. Removed the 'objects by country' graph from the home page and put the object map in its place.

 

2008-05-22     Pointed http://hoard.it domain to the main prototype page.

 

2008-05-22     Added 'gallery' output format, for a quick overview

                     (e.g. http://feeds.boxuk.com/museums/xmlfeed/keyword/parachute/format/gallery)

 

2008-05-22     Added a (temporary) limit of 1000 records returned by the API, to prevent anyone bringing down the server.

 

2008-05-22     Added 'keyword' parameter to feed; searches ALL fields

                     (e.g. http://feeds.boxuk.com/museums/xmlfeed/record.type/object/keyword/contraceptive/format/html)
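Going the other way, a feed URL can be split back into its parameters by reading the path segments after "xmlfeed" as alternating names and values. This is only a sketch based on the example URLs above; the function name parse_feed_url is hypothetical, and the strict name/value pairing is an assumption rather than documented behaviour.

```python
from urllib.parse import urlparse, unquote


def parse_feed_url(url, marker="xmlfeed"):
    """Return the name/value pairs that follow `marker` in the URL path.

    Hypothetical helper: assumes every parameter is written as
    /name/value, as in the examples on this page.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    rest = parts[parts.index(marker) + 1:]
    if len(rest) % 2 != 0:
        raise ValueError("expected name/value pairs after " + marker)
    return {
        unquote(rest[i]): unquote(rest[i + 1])
        for i in range(0, len(rest), 2)
    }


params = parse_feed_url(
    "http://feeds.boxuk.com/museums/xmlfeed/"
    "record.type/object/keyword/contraceptive/format/html"
)
print(params)
```

For the keyword example above, this gives record.type = object, keyword = contraceptive and format = html.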

 

2008-05-22     Added local caching of thumbnails to improve performance 

 

2008-05-21     Started crawling some National Portrait Gallery objects (about 2,000 to date)
