File Formats and Digital Preservation of Electronic Records

The primary goal of digital preservation is to ensure that computer files remain accessible without being compromised. Computer files are structured according to rules known as file formats. These rules are set out in a document called a format specification (Ashraf, Sharma, & Gulati, 2010). The specification provides the information needed to develop applications that can read and render the data. Even though several formats are independent of particular software, software updates can make formats inaccessible over time. Thus, it is the responsibility of an organization to adopt a policy that implements best practices for file formats, because these practices determine the success of digital preservation. Therefore, an appropriate selection of file formats, coupled with effective preservation techniques, can significantly improve the digital preservation of electronic records.

Issues of File Formats

Obsolescence

File formats change as users and developers introduce new functionality. The new versions may cause obsolescence, as newer software may not support older formats (Borghoff, 2010). If the software lacks backward compatibility, data may become inaccessible. Moreover, when the industry is slow to adopt a particular format, obsolescence is likely, because little or no compatible software is created to read the data. In other instances, obsolescence occurs when a firm buys a competitor's software that performs the same function as its own product and then withdraws it from use.

Proliferation

When formats are not normalized, a company acquires a huge number of diverse formats that are difficult to track and manage. The determination of the formats that are at risk and the tools required is a complex task.
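
A first step toward tracking a diverse collection is a simple inventory of the formats it holds. The sketch below is illustrative only: it counts files by extension and flags extensions that occur rarely. The file names, the helper names, and the threshold are assumptions made for this example, and an extension is only a rough indicator of the real format.

```python
from collections import Counter
from pathlib import PurePath


def format_inventory(filenames):
    """Count files per lowercase extension; '' means no extension."""
    return Counter(PurePath(name).suffix.lower() for name in filenames)


def rare_formats(inventory, threshold=1):
    """Return extensions held in 'threshold' or fewer files, sorted."""
    return sorted(ext for ext, count in inventory.items() if count <= threshold)


# Hypothetical holdings, used only to exercise the functions.
holdings = ["report.DOC", "minutes.doc", "scan.tif", "budget.wk1", "photo.TIF"]
inventory = format_inventory(holdings)
at_risk_candidates = rare_formats(inventory)
```

Rarely used extensions are not necessarily at risk, but they are a reasonable place to start a closer review.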

Selecting File Formats

File formats encode information in a specific structure that can only be processed and rendered by suitable hardware and software. The availability of this information is at risk as technology advances. Therefore, the choice of file formats should take into account both the immediate environment and long-term sustainability (Giaretta, 2011). Electronic records are useful only when they can be accessed throughout their lifecycle. Managing a vast collection of electronic records becomes more feasible when the number of different file formats involved is reduced (Borghoff, 2010). Furthermore, the choice of target formats for migration poses additional problems, because the formats need to meet requirements of authenticity and ease of access (Ashraf, Sharma, & Gulati, 2010). When digital records are ingested into a digital repository, they are identified by their extension (Gladney, 2010). The kind of file that a firm uses has a significant impact on how preservation practices can be applied. Thus, the factors below should be considered when selecting a file format.

Metadata Support

Many file formats allow metadata to be embedded in the file. The metadata can be generated automatically by the creating application, entered by users, or produced by a blend of both methods. Metadata has fundamental value both during active use and for lasting preservation, since it records the provenance and technical characteristics of the data (Smallwood, 2013). Moreover, metadata can describe the subdivision, internal relationships, sequence, size, and encoding of the content, which aids digital preservation.

Viability

Some formats offer error detection features that reveal file corruption arising during data transmission. Others embed a cyclic redundancy check (CRC) value computed over the data, so that a reader can recompute the value and detect changes (Borghoff, 2010). For instance, the PNG format begins with a byte sequence chosen to expose several kinds of transfer errors and stores a CRC with each chunk of data (Standard International, 2014). As a result, such formats are robust, which explains why they are preferred.
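
The check described above can be sketched in a few lines. The example below is a minimal illustration, not a full PNG validator: it builds a tiny PNG in memory, damages one byte, and recomputes the CRC-32 of each chunk. It assumes the standard PNG layout (an 8-byte signature followed by chunks of length, type, data, and CRC); the function names are made up for this example.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data):
    """Build one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def check_png_chunks(blob):
    """Return (chunk_type, crc_ok) for every chunk in a PNG byte string."""
    if not blob.startswith(PNG_SIGNATURE):
        raise ValueError("missing PNG signature")
    results = []
    offset = len(PNG_SIGNATURE)
    while offset + 12 <= len(blob):
        (length,) = struct.unpack(">I", blob[offset:offset + 4])
        chunk_type = blob[offset + 4:offset + 8]
        data = blob[offset + 8:offset + 8 + length]
        stored = blob[offset + 8 + length:offset + 12 + length]
        if len(data) != length or len(stored) != 4:
            results.append((chunk_type, False))  # chunk is truncated
            break
        (stored_crc,) = struct.unpack(">I", stored)
        computed = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
        results.append((chunk_type, stored_crc == computed))
        offset += 12 + length
    return results


# A 1x1 greyscale image: header, one compressed scanline, end marker.
header = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
sample = (PNG_SIGNATURE
          + make_chunk(b"IHDR", header)
          + make_chunk(b"IDAT", zlib.compress(b"\x00\x00"))
          + make_chunk(b"IEND", b""))

# Simulate corruption by flipping the first data byte of the IHDR chunk.
damaged = bytearray(sample)
damaged[16] ^= 0xFF

intact_report = check_png_chunks(sample)
damaged_report = check_png_chunks(bytes(damaged))
```

In the intact file every chunk passes; in the damaged copy only the altered chunk fails, which shows how a per-chunk CRC localizes corruption.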

Proprietary Format

In a proprietary format, the format specification document cannot be freely accessed by the public. Files in a proprietary format can usually be opened only by the software that was used to create them (Ashraf, Sharma, & Gulati, 2010). Consequently, such files are exposed to a high risk of obsolescence, since software upgrades may leave the application unable to open older files.

Open File Format

Open file formats are formats whose specification document is accessible to the public. In simple terms, files in such a format can be interpreted by software other than the software used to create them (Borghoff, 2010). Thus, the files do not depend on the original software. Importantly, an open format helps ensure that data can be accessed over a longer period in its original form. Digital preservation therefore aims to store content in open and popular formats (Gladney, 2010). Because popular formats are used by many people, there is a large community able to develop software that reads them.

Interoperability

The capability to exchange digital records with different users and IT applications is a vital consideration. Formats that are supported by a range of software and that are platform-independent are preferred, since they also support sustainability by easing migration from one technical setting to another (Giaretta, 2011). Therefore, they are less likely to become obsolete.

Documentation Quality

File formats should have comprehensive documentation that allows people or applications to interpret digital content stored in the format. As a result, properly documented formats can be retrieved and read even when the original software becomes outdated.

Digital Preservation Techniques

Migration

Migration is the transfer of digital content from one hardware or software environment to another. The term also covers the conversion of non-digital media to digital content. This technique aims to protect the integrity of digital material by preserving the basic features of the data and retaining the capability to retrieve and view it, regardless of changes in technology (Gladney, 2010). Nonetheless, with migration it is usually impossible to ensure that the copies are completely identical to the original data while also making them compatible with new technology. The migration process should be documented in metadata and should be reversible, so that a backward migration leads to a precise recreation of the original content (Ashraf, Sharma, & Gulati, 2010). In practice, this is not always possible, as some information is lost during the process. A further weakness of migration is that it consumes a significant amount of time and financial resources.
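
The reversibility requirement can be illustrated with a deliberately small case: migrating text from one character encoding to another. The sketch below records checksums of the original and migrated bytes and tests whether a backward migration reproduces the original exactly. The record fields and function names are assumptions made for this example, and real format migrations are far more involved than re-encoding text.

```python
import hashlib


def sha256_hex(data):
    """Return the SHA-256 digest of a byte string as hexadecimal text."""
    return hashlib.sha256(data).hexdigest()


def migrate_text(data, source_encoding, target_encoding):
    """Re-encode text bytes; characters the target cannot express become '?'."""
    return data.decode(source_encoding).encode(target_encoding, errors="replace")


def migration_record(original, source_encoding, target_encoding):
    """Migrate forward, migrate back, and document the outcome."""
    migrated = migrate_text(original, source_encoding, target_encoding)
    restored = migrate_text(migrated, target_encoding, source_encoding)
    return {
        "source_encoding": source_encoding,
        "target_encoding": target_encoding,
        "original_sha256": sha256_hex(original),
        "migrated_sha256": sha256_hex(migrated),
        "reversible": restored == original,
    }


# Hypothetical record content containing one non-ASCII character.
original = "Caf\u00e9 records, 1998".encode("latin-1")

lossless = migration_record(original, "latin-1", "utf-8")
lossy = migration_record(original, "latin-1", "ascii")
```

Here the migration to UTF-8 can be reversed exactly, while the migration to ASCII loses the accented character and cannot. Recording this outcome alongside the checksums is one way to document a migration in metadata.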

Emulation

Emulation is the reproduction of the features of one system on a different system, so that the second system offers the full functionality of the first. It is a technique used to address technological obsolescence, as it preserves the means of accessing digital material that might otherwise be lost when software and hardware are upgraded (Smallwood, 2013). Unlike migration, this process does not alter the original data in any way. It is a beneficial method since no further action on the data is needed once the emulation process has been completed. Emulation entails the development of emulators, which are applications that translate code written for one computing environment so that it runs correctly in another (Standard International, 2014). Nevertheless, this digital preservation approach is costly and time-consuming.

Encapsulation

This strategy is defined as the process of grouping together digital material and metadata required to offer access to the object. Importantly, it reduces the likelihood of losing components needed to interpret the object (Ashraf, Sharma, & Gulati, 2010). Appropriate types of metadata needed for encapsulation include reference, provenance, fixity, representation, and context information (Gladney, 2010). It is a suitable solution to technological obsolescence since all information required to decode the bits is accessible.
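
As a rough illustration of encapsulation, the sketch below bundles a digital object and its metadata into a single ZIP archive and stores a SHA-256 digest as fixity information. The archive layout, the metadata.json file name, and the sample values are assumptions made for this example; they do not follow any particular packaging standard.

```python
import hashlib
import io
import json
import zipfile


def encapsulate(name, content, metadata):
    """Bundle content bytes and a metadata.json file into one ZIP archive."""
    record = dict(metadata)
    record["fixity"] = {
        "algorithm": "SHA-256",
        "digest": hashlib.sha256(content).hexdigest(),
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(name, content)
        archive.writestr("metadata.json", json.dumps(record, indent=2))
    return buffer.getvalue()


def verify(package, name):
    """Re-open the package and compare the content with its stored digest."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        content = archive.read(name)
        record = json.loads(archive.read("metadata.json"))
    return hashlib.sha256(content).hexdigest() == record["fixity"]["digest"]


# Hypothetical object and metadata covering the categories named above.
package = encapsulate(
    "minutes.txt",
    b"Board minutes, 12 March",
    {
        "reference": "REC-0001",
        "provenance": "Created by the records office",
        "representation": {"format": "plain text", "encoding": "ASCII"},
        "context": "Part of the board meeting series",
    },
)
package_ok = verify(package, "minutes.txt")
```

Because the object and the information needed to interpret it travel together, a later reader can check integrity and decode the content from the package alone.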

Normalization

This strategy entails the migration of digital material to standard formats. Normalization is the most extensively used technique of digital preservation. The file format is identified, and the file is then converted to an open format. The target formats are well documented and widely available. However, the authenticity of the digital content may be compromised if the metadata is affected by the conversion (Borghoff, 2010).

Technology Preservation

This method involves preserving the technical environment that runs particular software. The technical environment includes the operating system, media drives, supporting software, and the original application. If this environment is maintained, old data remains accessible, since both the software and the hardware needed to use the data are still available (Gladney, 2010). Nevertheless, this method is rarely viable in the long term due to the huge cost of preserving the technical environment.

Universal Virtual Computer

This strategy is a type of emulation. An independent program is developed that simulates the essential architecture shared by computers since their beginnings, encompassing memory, registers, and rules for transferring information among them (Ashraf, Sharma, & Gulati, 2010). Users could create files using any application of their choice, but the data would be saved in a manner that could be read and retrieved by the universal computer. Therefore, reading a file in the future would require only a single emulation layer between the universal computer and the computers of that time.

Software for Digital Preservation

PRONOM

PRONOM is a web-based technical registry that supports digital preservation. The registry offers a searchable online database of technical information about file formats, the software tools needed to access them, and the technical environment required to read them. Users can search for formats in the database using different criteria, including file extension and name (Ashraf, Sharma, & Gulati, 2010). Furthermore, the registry contains information about support periods for applications and can be queried on this basis.

The PRONOM Persistent Unique Identifier (PUID) is an extensible scheme of persistent, unique, and unambiguous identifiers for records in the registry. These identifiers are fundamental to the exchange and management of digital material, because they allow people to identify and share the information needed to support access to individual digital objects (Ashraf, Sharma, & Gulati, 2010). Currently, the PUID scheme is restricted to one particular class of representation information: the format in which the digital content is encoded.

DROID is a software tool that performs automated batch identification of file formats. It uses byte sequence signatures and file extension signatures to identify the precise format of digital files. The signatures are kept in an XML file. New signatures are regularly added to PRONOM, and DROID can be configured to download updated signature files automatically (Ashraf, Sharma, & Gulati, 2010). Moreover, PRONOM provides links to format documentation. Importantly, the stability of a file format may be assessed by its age and the rate at which new versions are released, and PRONOM allows users to search for all known versions of a format and their release dates. Furthermore, the compatible software search feature in PRONOM helps users find software that supports a given file format.
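
The idea of combining byte sequence signatures with file extensions can be shown with a toy identifier. The sketch below is not DROID's algorithm and does not read a PRONOM signature file; the signature table contains only a handful of widely known leading byte sequences, and the format labels and function name are assumptions made for this example.

```python
import os

# A few widely known leading byte sequences (illustrative, not exhaustive).
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"PK\x03\x04", "ZIP container"),
]

EXTENSIONS = {
    ".png": "PNG image",
    ".pdf": "PDF document",
    ".gif": "GIF image",
    ".jpg": "JPEG image",
    ".jpeg": "JPEG image",
    ".zip": "ZIP container",
}


def identify(filename, head):
    """Identify a format from its first bytes, falling back to the extension."""
    extension = os.path.splitext(filename)[1].lower()
    by_extension = EXTENSIONS.get(extension)
    for magic, name in SIGNATURES:
        if head.startswith(magic):
            return {
                "format": name,
                "basis": "signature",
                "extension_agrees": by_extension == name,
            }
    return {
        "format": by_extension,
        "basis": "extension" if by_extension else None,
        "extension_agrees": by_extension is not None,
    }


matched = identify("scan.pdf", b"%PDF-1.4\n")
mislabeled = identify("photo.jpg", b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR")
fallback = identify("notes.png", b"plain text, no signature")
unknown = identify("data.bin", b"")
```

A byte signature is treated as stronger evidence than the extension, so a PNG file saved with a .jpg extension is still reported as PNG and flagged as disagreeing with its name.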

Conclusion

Obsolescence and proliferation are the main issues of file formats. The selection of file formats is an important process in digital preservation. Individuals and firms can use various criteria to choose a file format, although it is not always possible to select formats that meet all of them. The criteria considered when selecting file formats include metadata support, viability, open and proprietary formats, documentation quality, backward compatibility, and interoperability. Digital preservation techniques that can be used include migration, emulation, encapsulation, normalization, technology preservation, and the universal virtual computer. The choice of strategy depends on the resources, personnel, and time available. The organizational policy adopted will determine the success of digital preservation. PRONOM is an effective aid in addressing many of these concerns. For instance, it has tools that allow users to search the database by file extension or by name.