PID Generation in Fedora

Relevant classes and relations

  • fedora.server.management.DBPIDGenerator (which implements the interface fedora.server.management.PIDGenerator) reads the two columns namespace and highestID from the database table pidGen. The table consists of only these two columns; each row is the tuple that identifies the highest PID issued so far in one namespace.
  • When a new PID is to be created, the method generatePID(...) is called and the highest ID for the namespace is incremented by one. The new value is then written back to the pidGen table.
  • The class fedora.server.management.BasicPIDGenerator is registered as a module in the Fedora server configuration file by the following entry:
<module role="fedora.server.management.PIDGenerator" class="fedora.server.management.BasicPIDGenerator">
...
</module>
  • The PIDGenerator is used by the class fedora.server.storage.DefaultDOManager and by the rebuild mechanism.
  • The fedora.fcfg configuration file defines a list of PID namespaces to retain. It is a parameter of the DOManager module:
 <param name="retainPIDs" 
     value="escidoc demo test changeme fedora-bdef fedora-bmech tutorial">
 </param>
If an object already has a PID at ingest time, that PID is normally ignored and replaced with one generated by Fedora. However, if the PID's namespace appears in the retainPIDs list, Fedora keeps the PID as-is and does not generate a new one.
The namespaces whose PIDs Fedora leaves untouched are specified as a space-delimited list. If the list is empty, Fedora applies this treatment to the namespaces "demo" and "test". If the list contains a "*", PIDs in all namespaces are left untouched.
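
As a rough illustration of the counter behaviour described above, the following Java sketch keeps the highest ID per namespace in an in-memory map instead of the pidGen table. The class PidCounterSketch and its method names are hypothetical and are not Fedora's actual implementation; starting each namespace at 1 is likewise an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the pidGen table (namespace -> highestID).
// This is NOT Fedora's DBPIDGenerator, only a sketch of the same idea.
final class PidCounterSketch {
    private final Map<String, Integer> highestId = new HashMap<>();

    // Increments the highest ID of the namespace by one, stores it,
    // and returns the resulting PID. Assumption: a new namespace starts at 1.
    synchronized String generatePid(String namespace) {
        int next = highestId.getOrDefault(namespace, 0) + 1;
        highestId.put(namespace, next);
        return namespace + ":" + next;
    }

    // Records an externally supplied numeric ID so that the counter
    // never hands out the same ID in this namespace again.
    synchronized void neverGenerate(String namespace, int id) {
        if (id > highestId.getOrDefault(namespace, 0)) {
            highestId.put(namespace, id);
        }
    }
}
```

In a database-backed version, the read, increment, and write-back would have to happen atomically (for example within one transaction), which the synchronized methods only approximate here.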

PID Generation during ingest

  • When a new object is ingested, Fedora checks whether it already contains a valid PID. If it does, and its PID namespace is in the retain list or the retain list is empty (see above), the PID is validated by DefaultDOManager: the namespace must be valid and the PID must be unique. If the existing PID is greater than the current highest PID in its namespace and the namespace is valid, it passes validation and is recorded, which ensures that the PID will not be issued within this namespace again.
  • If the ingested object does not already have a valid PID, the current RecoveryContext is searched for a PID. If the RecoveryContext does not supply one, the PID is generated using the mechanism described above.
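
The decision order described in the two points above can be sketched in Java as follows. This is a simplified illustration, not the code of DefaultDOManager: the class IngestPidSketch, its methods, and the use of a Supplier as a stand-in for the PID generator are all hypothetical, and the validity and uniqueness checks are left out.

```java
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch of the PID decision at ingest time.
// Validation and uniqueness checks are deliberately omitted.
final class IngestPidSketch {

    // True if the PID's namespace is in the retain list,
    // or if the retain list contains the wildcard "*".
    static boolean isRetained(String pid, Set<String> retainNamespaces) {
        if (pid == null) {
            return false;
        }
        int colon = pid.indexOf(':');
        if (colon <= 0) {
            return false; // no namespace part
        }
        return retainNamespaces.contains("*")
                || retainNamespaces.contains(pid.substring(0, colon));
    }

    // 1. Keep the object's own PID if its namespace is retained.
    // 2. Otherwise use a PID from the recovery context, if present.
    // 3. Otherwise ask the generator (stand-in for generatePID) for a new PID.
    static String choosePid(String objectPid, String recoveryPid,
                            Set<String> retainNamespaces,
                            Supplier<String> generator) {
        if (isRetained(objectPid, retainNamespaces)) {
            return objectPid;
        }
        if (recoveryPid != null) {
            return recoveryPid;
        }
        return generator.get();
    }
}
```

For example, with a retain list of "demo test", an object arriving as demo:5 would keep its PID in this sketch, while an object arriving as other:5 would receive whatever the generator returns.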

Requirements for valid PIDs

To be valid and accepted by Fedora, a PID must conform to the following grammar (see class fedora.common.PID). It is the grammar for a qualified prefixed name (QName).
 PID:
   Length : maximum 64
   Syntax : namespace-id ":" object-id

 namespace-id:
   Syntax : ( [A-Z] | [a-z] | [0-9] | "-" | "." ) 1+

 object-id:
   Syntax : ( [A-Z] | [a-z] | [0-9] | "-" | "." | "~" | "_" | escaped-octet ) 1+

 escaped-octet:
   Syntax : "%" hex-digit hex-digit

 hex-digit:
   Syntax : [0-9] | [A-F]
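
The grammar above can be translated into a regular expression. The following Java sketch does so; the class PidSyntaxSketch and its method isValidPid are hypothetical names, and the check is a direct reading of the grammar as printed here, not a copy of the logic in fedora.common.PID.

```java
import java.util.regex.Pattern;

// Hypothetical validator derived from the grammar above.
// It is not the implementation found in fedora.common.PID.
final class PidSyntaxSketch {
    static final int MAX_LENGTH = 64;

    private static final Pattern PID_PATTERN = Pattern.compile(
            "[A-Za-z0-9.-]+"                          // namespace-id
            + ":"                                     // separator
            + "(?:[A-Za-z0-9.~_-]|%[0-9A-F]{2})+");   // object-id with escaped octets

    // True if the string matches the grammar and is at most 64 characters long.
    static boolean isValidPid(String pid) {
        return pid != null
                && pid.length() <= MAX_LENGTH
                && PID_PATTERN.matcher(pid).matches();
    }
}
```

Under this reading, demo:1 and changeme:abc_%3A are accepted, while demo: (empty object-id), de mo:1 (space in the namespace), and demo:%3a (lowercase hex digit) are rejected.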

This article refers to Fedora Commons version 3.0b1

This page (revision-1) was last changed on 21-May-2008 11:49 by unknown