fedora.server.storage.translation
Class DOTranslationUtility

java.lang.Object
  extended by fedora.server.storage.translation.DOTranslationUtility

public abstract class DOTranslationUtility
extends java.lang.Object

Utility methods for use by digital object serializers and deserializers. This class provides methods for detecting various forms of relative repository URLs, which are URLs that point to the hostname and port of the local repository. The methods detect these URLs in datastream location fields and in special cases of inline XML. Methods are available to convert these URLs back and forth between relative URL syntax, Fedora's internal local URL syntax, and absolute URL syntax. This utility class defines different "translation contexts", and the format of these relative URLs is set according to the context. Currently defined translation contexts are:

0=Deserialize XML into java object appropriate for in-memory usage
1=Serialize java object to XML appropriate for "public" export (absolute URLs)
2=Serialize java object to XML appropriate for move/migrate to another repository
3=Serialize java object to XML appropriate for internal storage
4=Serialize java object to XML appropriate for a stand-alone archive (see SERIALIZE_EXPORT_ARCHIVE)

The public "normalize*" methods in this class should be called to make the right decisions about what conversions should occur for what contexts. Other utility methods set default values for datastreams and disseminators.

Version:
$Id: DOTranslationUtility.java 6618 2008-02-19 12:17:43Z cwilper $
Author:
payette@cs.cornell.edu

Field Summary
static int DESERIALIZE_INSTANCE
          DESERIALIZE_INSTANCE: Deserialize XML into a java object appropriate for in-memory usage.
static java.util.regex.Pattern s_fedoraLocalPattern
           
static java.util.regex.Pattern s_getItemPattern
           
static int SERIALIZE_EXPORT_ARCHIVE
          SERIALIZE_EXPORT_ARCHIVE: Serialize digital object to XML in a manner appropriate for creating a stand alone archive of objects from a repository that will NOT be available after objects have been exported.
static int SERIALIZE_EXPORT_MIGRATE
          SERIALIZE_EXPORT_MIGRATE: Serialize digital object to XML in a manner appropriate for migrating or moving objects from one repository to another.
static int SERIALIZE_EXPORT_PUBLIC
          SERIALIZE_EXPORT_PUBLIC: Serialize digital object to XML appropriate for "public" external use.
static int SERIALIZE_STORAGE_INTERNAL
          SERIALIZE_STORAGE_INTERNAL: Serialize java object to XML appropriate for persistent storage in the repository, ensuring that any URLs that are relative to the local repository are stored with the Fedora local URL syntax.
 
Constructor Summary
DOTranslationUtility()
           
 
Method Summary
static void appendXMLStream(java.io.InputStream in, java.lang.StringBuffer buf, java.lang.String encoding)
          Appends XML to a StringBuffer.
static Datastream normalizeDSLocationURLs(java.lang.String PID, Datastream origDS, int transContext)
           
static java.lang.String normalizeInlineXML(java.lang.String xml, int transContext)
          Utility method to normalize a chunk of inline XML depending on the translation context.
static Datastream setDatastreamDefaults(Datastream ds)
          Check for null values in attributes and set them to empty string so 'null' does not appear in XML attribute values.
static Disseminator setDisseminatorDefaults(Disseminator diss)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DESERIALIZE_INSTANCE

public static final int DESERIALIZE_INSTANCE
DESERIALIZE_INSTANCE: Deserialize XML into a java object appropriate for in-memory usage. This will make the value of relative repository URLs appropriate for instantiations of the digital object in memory. For External (E) and Redirected (R) datastreams, any URLs that are relative to the local repository are converted to absolute URLs using the currently configured hostname:port of the repository. To do this, the dsLocation is searched for instances of the Fedora local URL string ("http://local.fedora.server"), which is how Fedora internally keeps track of relative repository URLs. For Managed Content (M) datastreams, the internal identifiers are instantiated as is. Also, certain reserved inline XML datastreams (WSDL and SERVICE_PROFILE) are searched for relative repository URLs and they are made absolute.
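
The conversion described above for External (E) and Redirected (R) datastreams can be pictured with a minimal sketch. This is not the class's actual implementation: the class name LocalUrlSketch, the method makeAbsolute, and the example host, port, and dsLocation are hypothetical, and only the "http://local.fedora.server" string comes from the description.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LocalUrlSketch {
    // The Fedora local URL string named in the description above.
    static final String FEDORA_LOCAL = "http://local.fedora.server";

    // Matches the local URL prefix followed by a slash (hypothetical pattern,
    // not necessarily what s_fedoraLocalPattern contains).
    static final Pattern LOCAL_PATTERN =
            Pattern.compile(Pattern.quote(FEDORA_LOCAL) + "/");

    // Replace every occurrence of the local prefix with the configured host:port.
    static String makeAbsolute(String dsLocation, String host, int port) {
        String replacement = "http://" + host + ":" + port + "/";
        return LOCAL_PATTERN.matcher(dsLocation)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // Hypothetical dsLocation value, for illustration only.
        String stored = "http://local.fedora.server/fedora/get/demo:1/DS1";
        System.out.println(makeAbsolute(stored, "example.org", 8080));
        // prints http://example.org:8080/fedora/get/demo:1/DS1
    }
}
```

URLs that do not contain the local prefix pass through unchanged, which matches the description that only URLs relative to the local repository are rewritten.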

See Also:
Constant Field Values

SERIALIZE_EXPORT_PUBLIC

public static final int SERIALIZE_EXPORT_PUBLIC
SERIALIZE_EXPORT_PUBLIC: Serialize digital object to XML appropriate for "public" external use. This context is appropriate when the exporting repository will continue to exist and will continue to support callback URLs for datastream content and disseminations. This gives a "public" export of an object in which all relative repository URLs AND internal identifiers are converted to absolute callback URLs. For External (E) and Redirected (R) datastreams, any URLs that are relative to the local repository are converted to absolute URLs using the currently configured hostname:port of the repository. For Managed Content (M) datastreams, the internal identifiers in dsLocation are converted to default dissemination URLs so they can serve as callbacks to the repository to obtain the internally managed content. Also, selected inline XML datastreams (i.e., WSDL and SERVICE_PROFILE) are searched for relative repository URLs and they are made absolute.

See Also:
Constant Field Values

SERIALIZE_EXPORT_MIGRATE

public static final int SERIALIZE_EXPORT_MIGRATE
SERIALIZE_EXPORT_MIGRATE: Serialize digital object to XML in a manner appropriate for migrating or moving objects from one repository to another. This context is appropriate when the local repository will NOT be available after objects have been migrated to a new repository. For External (E) and Redirected (R) datastreams, any URLs that are relative to the local repository will be expressed with the Fedora local URL syntax (which consists of the string "local.fedora.server" standing in place of the actual "hostname:port"). This enables a new repository to ingest the serialization and maintain the relative nature of the URLs (they will become relative to the *new* repository). Also, for Managed Content (M) datastreams, the internal identifiers in dsLocation are converted to default dissemination URLs. This enables the new repository to call back to the old repository to obtain the content bytestream to be stored in the new repository. Also, within selected inline XML datastreams (i.e., WSDL and SERVICE_PROFILE) any URLs that are relative to the local repository will also be expressed with the Fedora local URL syntax.

See Also:
Constant Field Values

SERIALIZE_STORAGE_INTERNAL

public static final int SERIALIZE_STORAGE_INTERNAL
SERIALIZE_STORAGE_INTERNAL: Serialize java object to XML appropriate for persistent storage in the repository, ensuring that any URLs that are relative to the local repository are stored with the Fedora local URL syntax. (The Fedora local URL syntax consists of the string "local.fedora.server" standing in place of the actual "hostname:port" in the URL.) Managed Content (M) datastreams are stored with internal identifiers in dsLocation. Also, within selected inline XML datastreams (i.e., WSDL and SERVICE_PROFILE) any URLs that are relative to the local repository will also be stored with the Fedora local URL syntax. Note that a view of the storage serialization can be obtained via the getObjectXML method of API-M.

See Also:
Constant Field Values

SERIALIZE_EXPORT_ARCHIVE

public static final int SERIALIZE_EXPORT_ARCHIVE
SERIALIZE_EXPORT_ARCHIVE: Serialize digital object to XML in a manner appropriate for creating a stand alone archive of objects from a repository that will NOT be available after objects have been exported. For External (E) and Redirected (R) datastreams, any URLs that are relative to the local repository will be expressed with the Fedora local URL syntax (which consists of the string "local.fedora.server" standing in place of the actual "hostname:port"). This enables a new repository to ingest the serialization and maintain the relative nature of the URLs (they will become relative to the *new* repository). Also, for Managed Content (M) datastreams, the internal identifiers in dsLocation are converted to default dissemination URLs, and the contents of the URLs are included inline via base-64 encoding. This enables the new repository to recreate the content bytestream to be stored in the new repository when the original repository is no longer available. Also, within selected inline XML datastreams (i.e., WSDL and SERVICE_PROFILE) any URLs that are relative to the local repository will also be expressed with the Fedora local URL syntax.

See Also:
Constant Field Values

s_fedoraLocalPattern

public static java.util.regex.Pattern s_fedoraLocalPattern

s_getItemPattern

public static java.util.regex.Pattern s_getItemPattern
Constructor Detail

DOTranslationUtility

public DOTranslationUtility()
Method Detail

normalizeDSLocationURLs

public static Datastream normalizeDSLocationURLs(java.lang.String PID,
                                                 Datastream origDS,
                                                 int transContext)

normalizeInlineXML

public static java.lang.String normalizeInlineXML(java.lang.String xml,
                                                  int transContext)
Utility method to normalize a chunk of inline XML depending on the translation context. This is mainly to deal with certain inline XML datastreams found in Behavior Mechanism objects that may contain a service URL that references the host:port of the local Fedora server. This method will usually only be called to check WSDL and SERVICE_PROFILE inline XML datastreams, but is of general utility for dealing with any datastreams that may contain URLs that reference the local Fedora server. However, this method should be used sparingly, and only on inline XML datastreams where the impact of the conversions is well understood.

Parameters:
xml - a chunk of XML that is the content of an inline XML datastream
transContext - Integer value indicating the serialization or deserialization context. Valid values are defined as constants in fedora.server.storage.translation.DOTranslationUtility:
0=DOTranslationUtility.DESERIALIZE_INSTANCE
1=DOTranslationUtility.SERIALIZE_EXPORT_PUBLIC
2=DOTranslationUtility.SERIALIZE_EXPORT_MIGRATE
3=DOTranslationUtility.SERIALIZE_STORAGE_INTERNAL
4=DOTranslationUtility.SERIALIZE_EXPORT_ARCHIVE
Returns:
the inline XML contents with appropriate conversions.
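
How the translation context selects the direction of the conversion can be sketched as follows. This is an illustration based on the field descriptions, not the method's real code: the class InlineXmlContextSketch, its normalize method, the hostPort parameter, and the sample XML are hypothetical, and the constants are redeclared locally with the values listed for transContext so that the example is self-contained.

```java
public class InlineXmlContextSketch {
    // Local copies of the context values listed for transContext.
    static final int DESERIALIZE_INSTANCE = 0;
    static final int SERIALIZE_EXPORT_PUBLIC = 1;
    static final int SERIALIZE_EXPORT_MIGRATE = 2;
    static final int SERIALIZE_STORAGE_INTERNAL = 3;
    static final int SERIALIZE_EXPORT_ARCHIVE = 4;

    static final String LOCAL = "http://local.fedora.server/";

    // hostPort is an assumed parameter; the real class reads the
    // configured hostname:port of the repository instead.
    static String normalize(String xml, int transContext, String hostPort) {
        String absolute = "http://" + hostPort + "/";
        switch (transContext) {
            case DESERIALIZE_INSTANCE:
            case SERIALIZE_EXPORT_PUBLIC:
                // In-memory use and public export: local syntax becomes absolute.
                return xml.replace(LOCAL, absolute);
            case SERIALIZE_EXPORT_MIGRATE:
            case SERIALIZE_STORAGE_INTERNAL:
            case SERIALIZE_EXPORT_ARCHIVE:
                // Migration, storage, and archive: absolute becomes local syntax.
                return xml.replace(absolute, LOCAL);
            default:
                throw new IllegalArgumentException(
                        "Unknown translation context: " + transContext);
        }
    }

    public static void main(String[] args) {
        String xml = "<address location=\"http://local.fedora.server/svc\"/>";
        String inMemory = normalize(xml, DESERIALIZE_INSTANCE, "example.org:8080");
        System.out.println(inMemory);
        System.out.println(normalize(inMemory, SERIALIZE_STORAGE_INTERNAL, "example.org:8080"));
    }
}
```

The second call restores the original string, which shows why a round trip through deserialization and internal storage keeps the URLs relative to whichever repository holds the object.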

setDatastreamDefaults

public static Datastream setDatastreamDefaults(Datastream ds)
                                        throws ObjectIntegrityException
Check for null values in attributes and set them to empty string so 'null' does not appear in XML attribute values. This helps in XML validation of required attributes. If 'null' is the attribute value then validation would incorrectly consider it a valid non-empty value. Also, we set some other default values here.

Parameters:
ds - The Datastream object to work on.
Returns:
The Datastream value with defaults set.
Throws:
ObjectIntegrityException
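
The reason for replacing null with the empty string can be seen in a small sketch. The DemoDatastream type and its two fields are hypothetical stand-ins for the real Datastream class, whose fields are not listed on this page; the sketch only shows that Java string concatenation writes a null reference as the literal text "null".

```java
public class DefaultsSketch {
    // Hypothetical stand-in for a datastream with two attribute fields.
    static class DemoDatastream {
        String label;
        String mimeType;
    }

    static String orEmpty(String value) {
        return value == null ? "" : value;
    }

    // Replace null attribute values with the empty string.
    static DemoDatastream setDefaults(DemoDatastream ds) {
        ds.label = orEmpty(ds.label);
        ds.mimeType = orEmpty(ds.mimeType);
        return ds;
    }

    // Naive attribute serialization using string concatenation.
    static String toXmlAttributes(DemoDatastream ds) {
        return "LABEL=\"" + ds.label + "\" MIMETYPE=\"" + ds.mimeType + "\"";
    }

    public static void main(String[] args) {
        DemoDatastream ds = new DemoDatastream();
        ds.mimeType = "text/xml";
        System.out.println(toXmlAttributes(ds));              // LABEL="null" MIMETYPE="text/xml"
        System.out.println(toXmlAttributes(setDefaults(ds))); // LABEL="" MIMETYPE="text/xml"
    }
}
```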

setDisseminatorDefaults

public static Disseminator setDisseminatorDefaults(Disseminator diss)
                                            throws ObjectIntegrityException
Throws:
ObjectIntegrityException

appendXMLStream

public static void appendXMLStream(java.io.InputStream in,
                                   java.lang.StringBuffer buf,
                                   java.lang.String encoding)
                            throws ObjectIntegrityException,
                                   java.io.UnsupportedEncodingException,
                                   StreamIOException
Appends XML to a StringBuffer. Essentially, just appends all text content of the InputStream, trimming any leading and trailing whitespace. It does this in a streaming fashion, with resource consumption entirely comprised of fixed internal buffers.

Parameters:
in - InputStream containing serialized XML.
buf - StringBuffer to write XML content to.
encoding - Character set encoding.
Throws:
ObjectIntegrityException
java.io.UnsupportedEncodingException
StreamIOException
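
One way to append a stream with trimmed leading and trailing whitespace while reading through a fixed-size buffer is sketched below. It is an assumption about the general approach, not the method's actual code: the class AppendSketch, the method appendTrimmed, and the 4096-character chunk size are hypothetical, and the sketch throws plain IOException rather than the Fedora exception types.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

public class AppendSketch {
    static void appendTrimmed(InputStream in, StringBuffer buf, String encoding)
            throws IOException {
        Reader reader = new InputStreamReader(in, encoding);
        char[] chunk = new char[4096];   // fixed internal buffer
        int start = buf.length();        // never trim text that was already in buf
        boolean seenContent = false;
        int n;
        while ((n = reader.read(chunk)) != -1) {
            int from = 0;
            if (!seenContent) {
                // Skip leading whitespace until the first real character.
                while (from < n && Character.isWhitespace(chunk[from])) {
                    from++;
                }
                seenContent = from < n;
            }
            buf.append(chunk, from, n - from);
        }
        // Drop trailing whitespace from the newly appended portion only.
        int end = buf.length();
        while (end > start && Character.isWhitespace(buf.charAt(end - 1))) {
            end--;
        }
        buf.setLength(end);
    }

    public static void main(String[] args) throws IOException {
        StringBuffer buf = new StringBuffer("<root>");
        byte[] bytes = "  \n<a>b</a>\n  ".getBytes("UTF-8");
        appendTrimmed(new ByteArrayInputStream(bytes), buf, "UTF-8");
        buf.append("</root>");
        System.out.println(buf); // <root><a>b</a></root>
    }
}
```

The sketch leaves closing the stream to the caller and keeps whitespace inside the content intact; only the edges of the appended text are trimmed.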