Welcome to the BSBI Distribution Database
The distribution database (DDB) is a new system developed by the Botanical Society of the British Isles to store and make accessible the ever-growing pool of biological records created or collated by the society.
The system contains over 30 million records, drawn principally from the VPDB and from the MapMate hub, along with some other smaller sets of data. The intention is that the system will be updated frequently with new and revised data, and that the content will exceed 30 million records by the end of the year.
The eventual objective is that the DDB will collate the large and disparate range of databases and data sets that the BSBI holds, in addition to newly created records. Although some of these records are already integrated into the BRC's Vascular Plants Database and accessible via the NBN, many other data sets are not currently accessible.
The DDB can provide county-recorders and research workers with on-line access to large, detailed data sets and, where appropriate, allow data to be edited to correct mistakes or to add annotations.
The system already provides fairly sophisticated search tools, and it is envisaged that further search and reporting capabilities will be added as required.
The DDB is primarily intended as a research tool for county-recorders, referees and other specialist users. It provides access to records in full detail, and data is fully editable by users. Because this is research-level, live data, it may not be appropriate for the general public or for consultancies. Filtered, interpreted data can be obtained from your local records centre, the NBN Gateway and other public sources.
Please register to request access to the DDB system. Access requests are welcomed and will be considered on a case-by-case basis. If we can't grant you full access to the system, we may instead be able to provide an export of only the data relevant to your work or suggest other sources for the information that you need.
The system allows for online editing of records and submission of new records. This capability may be made use of in the future, but there are no plans to replace other existing recording systems (e.g. MapMate) with an online system. Please continue to use whatever recording system you prefer - the DDB will accept and fully integrate records from a diverse range of sources and formats.
- Detailed records
- The DDB retains and makes accessible all available details for records imported from external databases.
- Editing and checking
- County-recorders and other users with editing rights can modify records online, or quickly mark records as confirmed or unconfirmed.
- Data validation
- Although not yet fully implemented, the system is intended to semi-automate the detection of erroneous records. Currently, the system detects malformed dates and grid references during data import, and online tools can be used to check grid references against county boundaries and to compare records against the expected distribution for a taxon (at hectad or tetrad level).
- The database allows users to set up separate workspaces where they can freely modify or create records without impacting the public database. This could allow use of the system for 'working-copy' data that is not ready for public release.
- Reporting tools
- The system allows fairly complex queries to be run, the results of which can be displayed on maps. The range of output formats and visualisations possible will be expanded over the coming year.
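The import-time format checks mentioned under data validation above could look something like the following Python sketch. It is an illustration only, not the DDB's actual code: the function names are hypothetical, and the accepted formats (one or two prefix letters followed by an even number of digits for grid references, day/month/year for dates) are assumptions. It checks form only, so a well-formed reference may still fall outside any real grid square or county.

```python
import re
from datetime import datetime

# Assumed grid-reference shape: 1 or 2 prefix letters (the letter I is not
# used) followed by 2, 4, 6, 8 or 10 digits, e.g. "SD59" for a hectad.
GRID_REF = re.compile(r"[A-HJ-Z]{1,2}(?:\d\d){1,5}")


def is_well_formed_grid_ref(text: str) -> bool:
    """Return True if text looks like a grid reference (format check only)."""
    cleaned = text.strip().upper().replace(" ", "")
    return GRID_REF.fullmatch(cleaned) is not None


def is_well_formed_date(text: str) -> bool:
    """Return True if text parses as a real day/month/year date."""
    try:
        datetime.strptime(text.strip(), "%d/%m/%Y")
    except ValueError:
        return False
    return True
```

Under these assumptions, `is_well_formed_grid_ref("SD59")` and `is_well_formed_date("5/6/2008")` are accepted, while `"SD592"` (odd number of digits) and `"31/02/2008"` (no such day) are rejected.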
For more details please click on the Capabilities tab.
To report any technical problems or to suggest improvements please email Tom Humphrey. Enquiries about data use or requests for access should be directed to Alex Lockton or Kevin Walker.
Using the search link (in the page's top 'menu bar'), the database can be queried using one or more fields (e.g. taxon name and grid reference). By default only a subset of the available search criteria is displayed; additional field types can be added to the search form using the drop-down list below the form ('add more constraints'). In addition, sub-queries can be joined to further restrict the search range or to modify the type of output. By combining search criteria or join fields, fairly complex searches can be created.
As an alternative to using this search form, you can display a distribution map for a taxon quickly by using the maps link in the top menu bar. By clicking on hectad squares on the resulting map you can drill down to a table view of the underlying data.
From the tabular record list you can display the full details uploaded for a record by clicking on the icon in the left-most column. If you are logged in, you can instead open an editable view of the record by clicking ''.
Having retrieved a set of search results, you can export the result set in various formats (currently only comma-separated, but more options will be added soon) or view the results as a map using the map tab (from here results can also be exported in a format suitable for import into DMAP).
It's strongly recommended that you register as a user of the website. This allows the system to retain a list of your recent searches, which you can quickly retrieve. As a registered user you can edit records, making changes to your own private view of the database. To register or log in, please use the links at the top right of the page.
Some searches can be quite slow. When trying to retrieve results from complex queries it can be useful to view the search history page, which shows a list of your past and currently running queries. From this page, cached results from your queries can quickly be retrieved once the search has completed. Consequently, you do not need to wait for a particular search to finish before starting another one or going to a different web page. Provided you have logged in, so that your search history is connected with your username, you could start a search, shut down your computer and come back later to view the results.
The following examples are intended to show some of the types of search that are possible.
- Strawberry-tree from Kerry
- Records of Arbutus unedo from Kerry (either VCH1 or VCH2).
- Woody plants from SD59
- This query uses two fields, to specify hectad SD59 and a match with the PLANTATT woodiness attribute list (specifying a value of 'w' to signify woody).
- Records collected by F. Druce prior to 1930
- To search for people efficiently, start typing their surname, then pick the name from the drop-down list.
- List of Carex sp. in VC60
- A search using taxa, rather than records, as the primary field. The linked records query dictates that only records from VC60 are wanted. If the record set wasn't linked, the search would just give a list of species names without frequencies.
- VC40 Bluebells from hectads which also have red campion.
- This is a join between two record queries, using hectad as the common factor. The result set returns the bluebell records from VC40 where there is also a record of S. dioica in the same hectad. I've had problems getting queries such as this to work consistently quickly - further work is needed, but as a workaround for now, try to restrict both queries as much as possible (e.g. the duplicate statement of VC40 for both queries, which technically shouldn't be necessary).
- Records from a specified literature source
- This type of search is possible, but improvements are needed.
- Poa annua recorded by John O'Reilly from VC65 between 2007 and 5/6/2008 imported from MapMate.
- (just for the sake of using many criteria)
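The hectad join used in the bluebell and red campion example above can be pictured with a small Python sketch. This is not how the DDB runs the query internally; the `hectad_join` function, the record layout and the sample records are all invented for illustration.

```python
def hectad_join(primary, secondary):
    """Keep the primary records whose hectad also occurs in the secondary set."""
    shared_hectads = {record["hectad"] for record in secondary}
    return [record for record in primary if record["hectad"] in shared_hectads]


# Invented sample data: both record sets are already restricted to VC40,
# mirroring the workaround of stating the vice-county in both queries.
bluebells = [
    {"taxon": "Hyacinthoides non-scripta", "vc": 40, "hectad": "SJ40"},
    {"taxon": "Hyacinthoides non-scripta", "vc": 40, "hectad": "SO59"},
]
red_campion = [
    {"taxon": "Silene dioica", "vc": 40, "hectad": "SJ40"},
]

matches = hectad_join(bluebells, red_campion)
```

Here `matches` holds only the bluebell record from SJ40, the one hectad that appears in both sets. Restricting each side before the join keeps the set of hectads to compare small, which is one plausible reason the workaround helps.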
Capabilities of the system
Although the data currently online is largely just a duplicate of the VPDB, the database is ultimately intended to include not only raw botanical distribution records but also, so far as possible, links to related metadata (including species checklists, attribute metadata, bibliographic references etc.).
As a token demonstration of linking data sets, a few sets of attributes from the CEH PLANTATT catalogue have been loaded, but the future possibilities for linking external data are quite extensive. Attribute sets (numerical data, text, codes, web URLs or checklists) can be attached to most types of primary object in the database (linked to almost any key field of the object). Examples could include tagging named localities with links to related Wikipedia articles, tagging hectads with minimum and maximum altitude, recording species protection status, or tagging recorders with BSBI membership.
Versioning and workspaces
Although the user interface support for this is not fully implemented yet, the system allows very extensive versioning of all records in the system. This allows multiple public or private views of the data, or separate working sets, to co-exist. Apart from the obvious utility of enabling changes to records to be tracked, this could also allow very open edit access to the database, allowing, for example, unvetted users to add or edit records without their changes impacting the main public dataset until checked and approved. This can also facilitate the import of new data from untrusted sources such as herb@home, or from museum catalogues, without disrupting the mainstream public view of the database. Versioning allows arbitrary historical snapshots, so for example a published report could include a link to a live set of search results, fixed in time (immune from subsequent revisions of the database).
Supporting this adds some overheads to the database in terms of storage and retrieval speed, but it appears to be feasible to retain the complete edit history for all objects in the database indefinitely.
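One way to picture versioning with fixed-in-time snapshots is an append-only history, sketched below in Python. This is an assumption-laden illustration rather than a description of the DDB's storage design: the `VersionedStore` class, its methods and the sample record are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedStore:
    """Append-only store: edits add new versions and never overwrite old ones."""

    revision: int = 0
    # record_id -> list of (revision, fields), in increasing revision order
    history: dict = field(default_factory=dict)

    def save(self, record_id, fields):
        """Store a new version of a record and return its revision number."""
        self.revision += 1
        self.history.setdefault(record_id, []).append((self.revision, dict(fields)))
        return self.revision

    def snapshot(self, at_revision):
        """Return every record as it stood at the given revision."""
        view = {}
        for record_id, versions in self.history.items():
            for revision, fields in versions:
                if revision <= at_revision:
                    view[record_id] = fields  # later versions replace earlier ones
        return view


store = VersionedStore()
first = store.save("rec-1", {"taxon": "Carex flacca"})
second = store.save("rec-1", {"taxon": "Carex panicea"})  # a later correction
```

A report that cites `store.snapshot(first)` keeps seeing the original determination, while `store.snapshot(second)` reflects the correction. The cost is the one noted above: every edit adds a stored version, and reads must pick out the right one.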
All records in the system are editable online. As this requires access to a fast internet connection, an offline editing alternative would be desirable. This could be done by allowing export of subsets of data in a spreadsheet format. If each row were tagged with a unique identifier, then subsequent reimport of the edited spreadsheet would allow the edited rows to update the corresponding records.
As external data sets can easily be reimported repeatedly, the alternative approach is for users to continue to use their preferred recording software and to reimport the data. Provided each record has a persistent identifier, this can be done seamlessly, without resulting in record duplication.
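The identifier-based reimport described above can be sketched in Python as follows. The `reimport` function, the `record_id` column name and the sample rows are assumptions made for illustration, and a plain dictionary stands in for the database.

```python
import csv
import io


def reimport(database, csv_text, id_column="record_id"):
    """Merge CSV rows into database, keyed on a persistent identifier.

    Rows whose identifier already exists replace the stored record; rows with
    a new identifier are added. Returns (updated_count, created_count).
    """
    updated = created = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        record_id = row.pop(id_column)
        if record_id in database:
            updated += 1
        else:
            created += 1
        database[record_id] = row
    return updated, created


# Invented sample data: one existing record and an edited export with one
# corrected row (r1) and one new row (r2).
database = {"r1": {"taxon": "Poa annua", "grid_ref": "SD59"}}
edited_export = (
    "record_id,taxon,grid_ref\n"
    "r1,Poa annua,SD5292\n"
    "r2,Poa trivialis,SD59\n"
)

first_pass = reimport(database, edited_export)
second_pass = reimport(database, edited_export)
```

Because rows are matched on the identifier rather than on their content, importing the same file a second time only updates existing records: the database still holds two records after both passes.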
Currently data export is quite limited: record sets and taxon lists can be exported as CSV files, and hectad sets can be exported in DMAP (.dis) format. My intention is to provide whatever more complete structured file formats are required. It's not currently possible to track which records have been exported, but this capability would be straightforward to add, should it be seen as desirable.