DSpace Basics

Training

https://www.youtube.com/channel/UCEtGUi5KbW6rlX0439gxUcg

Documentation

https://wiki.duraspace.org/display/DSDOC6x/DSpace+6.x+Documentation

Technical FAQ

https://wiki.duraspace.org/display/DSPACE/TechnicalFAQ

How to Guides

https://wiki.duraspace.org/display/DSPACE/How+To+Guides

3rd party Tutorials

http://wiki.lib.sun.ac.za/index.php?title=SUNScholar/Practical_guidelin…)

Objects Definition:

Object -- Example

Community -- Laboratory of Computer Science; Oceanographic Research Center

Collection -- LCS Technical Reports; ORC Statistical Data Sets

Item -- A technical report; a data set with accompanying description; a video recording of a lecture

Bundle -- A group of HTML and image bitstreams making up an HTML document

Bitstream -- A single HTML file; a single image file; a source code file

Bitstream Format -- Microsoft Word version 6.0; JPEG encoded image format

Authorization

DSpace's authorization system is based on associating actions with objects and the lists of EPeople who can perform them. The associations are called Resource Policies, and the lists of EPeople are called Groups. There are two built-in groups: 'Administrators', who can do anything in a site, and 'Anonymous', which is a list that contains all users. Assigning a policy for an action on an object to anonymous means giving everyone permission to do that action. (For example, most objects in DSpace sites have a policy of 'anonymous' READ.) Permissions must be explicit - lack of an explicit permission results in the default policy of 'deny'. Permissions also do not 'commute'; for example, if an e-person has READ permission on an item, they might not necessarily have READ permission on the bundles and bitstreams in that item. Currently Collections, Communities and Items are discoverable in the browse and search systems regardless of READ authorization.
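The rules in this paragraph (explicit policies, default deny, everyone in 'Anonymous', no carry-over from an item to its bitstreams) can be sketched in a few lines of Python. This is only an illustrative model, not DSpace's actual code: the `ResourcePolicy` class, the `is_authorized` function, the object identifiers and the 'Staff' group are all assumptions made up for the example.

```python
from dataclasses import dataclass

ANONYMOUS = "Anonymous"            # built-in group that contains all users
ADMINISTRATORS = "Administrators"  # built-in group that can do anything


@dataclass(frozen=True)
class ResourcePolicy:
    obj: str     # hypothetical identifier of the object the policy is attached to
    action: str  # e.g. "READ", "WRITE", "ADD", "REMOVE"
    group: str   # name of the Group allowed to perform the action


def is_authorized(policies, user_groups, obj, action):
    """Return True only if an explicit policy grants the action (default: deny)."""
    groups = set(user_groups) | {ANONYMOUS}  # every user belongs to Anonymous
    if ADMINISTRATORS in groups:
        return True
    return any(
        p.obj == obj and p.action == action and p.group in groups
        for p in policies
    )


# Hypothetical policies: the item is world-readable, its bitstream is not.
policies = [
    ResourcePolicy("item:1", "READ", ANONYMOUS),
    ResourcePolicy("bitstream:1", "READ", "Staff"),
]

print(is_authorized(policies, [], "item:1", "READ"))       # anonymous READ on the item
print(is_authorized(policies, [], "bitstream:1", "READ"))  # no policy for Anonymous here
```

The second call shows why READ on an item does not imply READ on its bitstreams: each object is checked against its own policies only.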

Collection

ADD/REMOVE -- add or remove items (ADD = permission to submit items)

DEFAULT_ITEM_READ -- inherited as READ by all submitted items

DEFAULT_BITSTREAM_READ -- inherited as READ by Bitstreams of all submitted items. Note: only affects Bitstreams of an item at the time it is initially submitted. If a Bitstream is added later, it does not get the same default read policy.

COLLECTION_ADMIN -- collection admins can edit items in a collection, withdraw items, map other items into this collection.

Item

ADD/REMOVE -- add or remove bundles

READ -- can view item (item metadata is always viewable)

WRITE -- can modify item

Bundle

ADD/REMOVE -- add or remove bitstreams to a bundle

Bitstream

READ -- view bitstream

WRITE -- modify bitstream
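The way a Collection's DEFAULT_ITEM_READ and DEFAULT_BITSTREAM_READ policies are inherited at submission time can be illustrated with a small Python model. It is a sketch under assumptions, not DSpace behaviour copied from source code: the dictionaries, function names and the 'Staff' group are hypothetical, and the notes above do not say what policy a later-added Bitstream receives, so the sketch simply leaves it without one.

```python
def submit_item(collection_policies, bitstream_names):
    """Copy the collection's default READ groups onto a newly submitted item."""
    item_read = list(collection_policies.get("DEFAULT_ITEM_READ", []))
    bitstream_read = {
        name: list(collection_policies.get("DEFAULT_BITSTREAM_READ", []))
        for name in bitstream_names
    }
    return {"READ": item_read, "bitstreams": bitstream_read}


def add_bitstream_later(item, name):
    """A bitstream added after submission does not inherit the default policy."""
    item["bitstreams"][name] = []  # assumption: no READ group until one is set explicitly


# Hypothetical collection defaults.
collection = {
    "DEFAULT_ITEM_READ": ["Anonymous"],
    "DEFAULT_BITSTREAM_READ": ["Staff"],
}

item = submit_item(collection, ["report.pdf"])
add_bitstream_later(item, "appendix.pdf")
print(item)
```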

DSpace Item State Definitions

https://wiki.duraspace.org/display/DSDOC5x/DSpace+Item+State+Definitions

Workspace item

An item that is under submission and active edit by an authorized user. The workspace item is visible only to the submitter and the system administrators. (Currently there is no simple way to find/browse such items other than with the direct item ID or by using the supervisor functionality.) Using the supervisor functionality, a system admin can allow other authorized users to see/edit the item in the workspace state.

Expected use cases:

  • Self deposit
  • Collaboration over an in-progress submission for a small group of researchers. (This use case is implemented only with major limitations, using the supervision feature – concurrency, lack of delegation: supervision must be defined by the system administrators, etc.)

Workflow Item

An item that is under review for quality control and policy compliance. The workflow item is visible to the original submitter (currently only basic metadata are visible out-of-box in the mydspace summary list), users assigned to the specific workflow step where the item resides, and system administrators. (Currently there is no simple way to find/browse such items other than with the direct item ID or to use the abort workflow functionality).

Expected use cases:

  • Quality control
  • Improvements to the bibliographic record (metadata available in workflow can be different than those asked of the submitter)
  • Check of policy / copyright

Withdrawn item

Withdrawal is a logical deletion. The item can be restored, and the withdrawn state can be used to keep track of what has been available for a while on the public site.

Expected use cases:

  • Staging area for item to be removed when copyright issues arise with publisher. If the copyright issue is confirmed, the item will be permanently deleted or kept in the withdrawn state for future reference.
  • Logical deletion delegated to community/collection admin, where permanent deletion is reserved to system administrators
  • Logical deletion, where permanent deletion is not an option for an organization
  • Removal of an old version of an item, forcing redirect to a new up-to-date version of the item (this use case is not currently implemented out-of-box in DSpace)

Private item

This state should only refer to the discoverable nature of the item. A private item will not be included in any system that aims to help users to find items. So it will not appear in:

  • Browse
  • Recent submission
  • Search result
  • OAI-PMH (at least for the ListRecords and ListIdentifiers verbs; the OAI-PMH specification is not clear about inconsistent implementations of the ListRecords and GetRecord verbs)
  • REST list and search methods

It should be accessible under the actual ACL rules of DSpace using direct URL or query method such as:

  • Splash page access (i.e. /handle/<xxxxx>/<yyyyy>)
  • OAI-PMH GetRecord verb
  • REST direct access /rest/item/<item-id> or equivalent

Expected use cases:

  • Provide a light rights awareness feature where discovery is not enabled for search and/or browse
  • Hide “special items” such as repository presentations, guides or support materials
  • Hide an old version of an Item in cases where real versioning is not appropriate or liked
  • Hide specific types of item such as “Item used to record Journal record: Journal Title, ISSN, Publisher etc.” used as authority file for metadata (dc.relation.ispartof) of “normal item”

Archived/Published item

An item that is in a stable state, available in the repository under the defined ACL rule. Changes to these items are possible only for a restricted group of users (administrators) and should produce versioning according to the Institution's policy.

Embargoed Item

https://wiki.duraspace.org/display/DSDOC6x/Embargo#Embargo-Private/Publ…

Embargoed items are a special case of Archived/Published items. The item has a time-based access policy attached to it and/or to the underlying bitstreams. Specifically, the policy grants read permission to an EPerson Group starting from a defined date. Typically embargo is applied to the bitstreams so that the "fulltext" initially has very limited access (normally administrators or other "repository staff" groups) and only after a defined date will the fulltext become visible to all users (Anonymous group). This scenario is used to implement typical "embargo requirements" from publishers -- see Delayed Open Access.

If the metadata of the item should be visible only to a specific group of users, it is also possible to define an embargo policy for the ITEM itself. A READ policy for a specific group means that only the users in that group will be able to access the item splash page. Note that currently only some UIs (JSPUI/XMLUI) are fully rights aware (see the Discovery documentation for more information, especially the section on "Access Rights Awareness"). This means that in other UIs, some metadata of a restricted item could be exposed to unauthorized users. When you need to work with UIs that are not fully rights aware, a workaround is to use the "Private Item" flag to make the item undiscoverable so that metadata will not be exposed to unauthorized users. Please note that this workaround has several major limitations:

  • No one, not even authorized users, is able to find the item by browsing or searching the repository.
  • You need to manage externally a schedule that alerts you when the embargo is expired so that you may re-enable the discoverable nature of the item.

Bulk Import Items

https://wiki.duraspace.org/display/DSDOC6x/Importing+and+Exporting+Item…

DSpace uses the Simple Archive Format to export items from, and bulk import items into, the repository. The basic concept behind DSpace's Simple Archive Format is to create an archive, which is a directory containing one subdirectory per item. Each item directory contains a file for the item's descriptive metadata, and the files that make up the item.

    archive_directory/
        item_000/
            dublin_core.xml -- qualified Dublin Core metadata for metadata fields belonging to the dc schema
            metadata_[prefix].xml -- metadata in another schema; the prefix is the name of the schema as registered with the metadata registry
            contents -- text file containing one line per filename
            collections -- (optional) text file that contains the handles of the collections the item will belong to, one handle per line; the collection on the first line will be the owning collection
            file_1.doc -- files to be added as bitstreams to the item
            file_2.pdf
        item_001/
            dublin_core.xml
            contents
            file_1.png
        ...

**** A sample zip archive is available to model after, also a sample can be obtained by exporting an item from DSpace. ****
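As an illustration of the layout, the following Python sketch writes one item directory with the files described above. It is only an assumption about how one might script this: `write_item` is a hypothetical helper, the file contents are placeholders, and the collection handle `123456789/2` is made up.

```python
from pathlib import Path
import tempfile


def write_item(archive_dir, name, dublin_core_xml, files, collections=None):
    """Create archive_dir/name with dublin_core.xml, contents and the bitstream files."""
    item_dir = Path(archive_dir) / name
    item_dir.mkdir(parents=True)
    (item_dir / "dublin_core.xml").write_text(dublin_core_xml, encoding="utf-8")
    for filename, data in files.items():
        (item_dir / filename).write_bytes(data)
    # contents: one bitstream file name per line
    (item_dir / "contents").write_text(
        "".join(f"{filename}\n" for filename in files), encoding="utf-8"
    )
    if collections:
        # collections: one handle per line, owning collection first
        (item_dir / "collections").write_text(
            "".join(f"{handle}\n" for handle in collections), encoding="utf-8"
        )
    return item_dir


archive = Path(tempfile.mkdtemp()) / "archive_directory"
dc_xml = (
    "<dublin_core>\n"
    '  <dcvalue element="title" qualifier="none">A Tale of Two Cities</dcvalue>\n'
    "</dublin_core>\n"
)
item = write_item(
    archive,
    "item_000",
    dc_xml,
    {"file_1.doc": b"placeholder", "file_2.pdf": b"placeholder"},
    collections=["123456789/2"],  # hypothetical handle
)
print(sorted(p.name for p in item.iterdir()))
```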

The dublin_core.xml or metadata_[prefix].xml file has the following format, where each metadata element has its own entry within a <dcvalue> tagset. There are currently three attributes available on the <dcvalue> tag:

  • element - the Dublin Core element
  • qualifier - the element's qualifier
  • language - (optional) ISO language code for the element

    <dublin_core>

    <dcvalue element="title" qualifier="none">A Tale of Two Cities</dcvalue>

    <dcvalue element="date" qualifier="issued">1990</dcvalue>

    <dcvalue element="title" qualifier="alternative" language="fr">J'aime les Printemps</dcvalue>

    </dublin_core>

    (Note the optional language tag attribute which notifies the system that the optional title is in French.)

Every metadata field used must first be registered in the metadata registry of the DSpace instance; see Metadata and Bitstream Format Registries.
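A file in this format can be generated with Python's standard xml.etree.ElementTree module, which also takes care of escaping special characters. The sketch below is one possible approach, not part of DSpace: `build_metadata` and its tuple layout are hypothetical, and the values are those of the sample above.

```python
import xml.etree.ElementTree as ET


def build_metadata(values, schema=None):
    """values: iterable of (element, qualifier, language, text) tuples."""
    root = ET.Element("dublin_core")
    if schema:
        root.set("schema", schema)  # only needed for metadata_[prefix].xml files
    for element, qualifier, language, text in values:
        attrs = {"element": element, "qualifier": qualifier or "none"}
        if language:
            attrs["language"] = language  # optional ISO language code
        ET.SubElement(root, "dcvalue", attrs).text = text
    return ET.tostring(root, encoding="unicode")


xml_text = build_metadata(
    [
        ("title", None, None, "A Tale of Two Cities"),
        ("date", "issued", None, "1990"),
        ("title", "alternative", "fr", "J'aime les Printemps"),
    ]
)
print(xml_text)
```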

Recommended Metadata

It is recommended to minimally provide "dc.title" and, where applicable, "dc.date.issued". Obviously you can (and should) provide much more detailed metadata about the Item. For more information see: Metadata Recommendations.

The contents file simply enumerates, one file per line, the bitstream file names. See the following example:

file_1.doc

file_2.pdf

license

Please notice that the license is optional, and if you wish to have one included, you can place the file in the .../item_001/ directory, for example.

The bitstream name may optionally be followed by any of the following:

  • \tbundle:BUNDLENAME
  • \tpermissions:PERMISSIONS
  • \tdescription:DESCRIPTION
  • \tprimary:true

Where '\t' is the tab character.

'BUNDLENAME' is the name of the bundle to which the bitstream should be added. If no bundle is specified, the bitstream goes into the default bundle, ORIGINAL.

'PERMISSIONS' is text with the following format: -[r|w] 'group name'

'DESCRIPTION' is the text of the file's description.

Primary is used to specify the primary bitstream.
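Because the options are tab-separated, it is easy to get a contents line wrong by hand. The following Python sketch builds such a line from the options listed above; `contents_line` is a hypothetical helper, and the 'Staff' group and description text are made-up values.

```python
def contents_line(filename, bundle=None, permissions=None,
                  description=None, primary=False):
    """Build one line of a contents file; options are separated by tab characters."""
    parts = [filename]
    if bundle:
        parts.append(f"bundle:{bundle}")
    if permissions:
        parts.append(f"permissions:{permissions}")  # format: -[r|w] 'group name'
    if description:
        parts.append(f"description:{description}")
    if primary:
        parts.append("primary:true")
    return "\t".join(parts)


lines = [
    contents_line("file_1.doc"),
    contents_line(
        "file_2.pdf",
        bundle="ORIGINAL",
        permissions="-r 'Staff'",  # hypothetical group name
        description="Full text",
        primary=True,
    ),
]
print("\n".join(lines))
```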

Configuring metadata_[prefix].xml for Different Schema

It is possible to use other schemas such as EAD, VRA Core, etc. Make sure you have defined the new schema in the DSpace Metadata Schema Registry.

  1. Create a separate file for the other schema named metadata_[prefix].xml, where the [prefix] is replaced with the schema's prefix.
  2. Inside the XML file use the same Dublin Core syntax, but on the <dublin_core> element include the attribute schema=[prefix].
  3. Here is an example for ETD metadata, which would be in the file metadata_etd.xml:

    <?xml version="1.0" encoding="UTF-8"?>

    <dublin_core schema="etd">

    <dcvalue element="degree" qualifier="department">Computer Science</dcvalue>

    <dcvalue element="degree" qualifier="level">Masters</dcvalue>

    <dcvalue element="degree" qualifier="grantor">Michigan Institute of Technology</dcvalue>

    </dublin_core>
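Since every field must exist in the metadata registry before import, it can help to list the fully qualified field names a metadata file uses and compare them with the registry by hand. The Python sketch below does the listing only. It is an assumption about a convenient workflow, not a DSpace tool: `field_names` is a hypothetical helper, and it assumes the schema defaults to dc when the schema attribute is absent and that a qualifier of "none" means no qualifier.

```python
import xml.etree.ElementTree as ET

# The ETD sample from above, as bytes (as it would be read from metadata_etd.xml).
ETD_SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<dublin_core schema="etd">
  <dcvalue element="degree" qualifier="department">Computer Science</dcvalue>
  <dcvalue element="degree" qualifier="level">Masters</dcvalue>
  <dcvalue element="degree" qualifier="grantor">Michigan Institute of Technology</dcvalue>
</dublin_core>
"""


def field_names(xml_data):
    """Return the distinct schema.element[.qualifier] names used in a metadata file."""
    root = ET.fromstring(xml_data)
    schema = root.get("schema", "dc")  # assumption: no schema attribute means dc
    names = []
    for dcvalue in root.iter("dcvalue"):
        name = f"{schema}.{dcvalue.get('element')}"
        qualifier = dcvalue.get("qualifier")
        if qualifier and qualifier != "none":
            name += f".{qualifier}"
        if name not in names:
            names.append(name)
    return names


print(field_names(ETD_SAMPLE))
```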