Quality report 2024
The Year 2024 of the National Digital Preservation Services in Finland
General
Digital preservation services (DPS) refer to services provided for the digital preservation of cultural heritage content and research data. The DPS are developed continuously and in close cooperation with the partner organizations. The aim of the Digital Preservation Service for Cultural Heritage is to preserve the most significant digitized and born-digital cultural heritage content for future generations and to keep that content usable in the long term. Similarly, the Digital Preservation Service for Research Data ensures the availability and preservation of digital research data. Both services use a shared digital preservation system for bit-level preservation.
The Digital Preservation Service for Cultural Heritage started preserving content in 2015, and the Digital Preservation Service for Research Data in late 2019. Organizations that use the Digital Preservation Service for Research Data for preparing and storing content can also make more extensive use of Fairdata services, including the packaging service and the management interface.
The Main Results of the Year 2024
In 2024, the annual growth in preserved content was 582 terabytes. More than 1,950,000 new Archival Information Packages (AIPs) were accumulated for preservation, which is a new record in annual growth. The amount of content in preservation exceeded 3.3 petabytes and the number of AIPs exceeded 5,500,000.
The DPS carried out their first large-scale mass migration. The ARC/WARC mass migration project migrated online content harvested by the National Library of Finland’s Web Archive into a new format in keeping with current standards. The number of migrated packages was approx. 620,000, and they take up around 65 TB of disk space.
The operating system of DPS production was migrated from CentOS 7 to Red Hat Enterprise Linux 9. Several third-party tools, including those used in ingest, were also updated to newer versions. Support for file formats in ingest was improved by extending it to new formats and updating the validation tools.
Efforts to calculate the carbon footprint of the DPS were continued in 2024 on the basis of the previous year’s calculations. In this context, the DPS gained international visibility at the annual iPRES conference in Ghent, Belgium. At this conference, a presentation of the model for calculating the carbon footprint of the DPS infrastructure won the best poster award. This visibility has led to international cooperation in 2025, and the DPS participates in the Carbon Footprint Task Force organized by the Digital Preservation Coalition (DPC).
The website of the National Digital Preservation Services was updated. The new website serves both research organizations and the cultural heritage sector better than before. Its layout is closer to that of the Fairdata service website, while the site is still profiled as a distinct DPS website. The new site has a more logical structure, and new content was created for it.
Partner Organizations
Organization | Purpose of use | Capacity (TB) |
---|---|---|
Celia | Master archive and selected new audiobooks for long-term preservation | 160 |
EMMA – Espoon modernin taiteen museo | Media art in EMMA's collections | 7 |
Kansallinen audiovisuaalinen instituutti | A selected part of the domestic film materials to be digitized | 2400 |
Kansallisarkisto | Born-digital central government records received by Kansallisarkisto | 41 |
Kansallisarkisto | Data transferred to the VAPA system | 1 |
Kansallisarkisto | Materials from Kansallisarkisto's mass digitization project | 114 |
Kansallisarkisto | Materials to be transferred from Kansallisarkisto's digital archive and retrospective digitization materials | 805 |
Kansallisarkisto | Private archive materials held by Kansallisarkisto exclusively in digital form | 27 |
Kansallisgalleria | Long-term preservation of Kiasma's media art works | 20 |
Kansalliskirjasto | Cultural heritage materials digitized by Kansalliskirjasto | 1083 |
Kansalliskirjasto | Materials collected under the Finnish cultural materials act (kulttuuriaineistolaki) | 355 |
Kotimaisten kielten keskus Kotus | Long-term preservation of Kotus's linguistic research and cultural heritage materials | 60 |
Museovirasto | Research reports on the cultural environment | 1 |
Musiikkiarkisto | Musiikkiarkisto's materials for long-term preservation | 70 |
Postimuseo | Long-term preservation of Postimuseo's philatelic collection | 2 |
Svenska Litteratursällskapet SLS | SLS's materials for long-term preservation | 50 |
Yhteiskuntatieteellinen tietoarkisto, FSD | Long-term preservation of the collection of research data archived by FSD | 1 |
Organization | Purpose of use | Capacity (TB) |
---|---|---|
Geologian Tutkimuskeskus | Datasets produced by GTK's tomography device | 16 |
Geologian Tutkimuskeskus | Datasets produced by the X-ray fluorescence imaging device | 2 |
Helsingin yliopisto | A selection of meteorological and air quality measurements from Helsingin yliopisto's SMEAR data | 2 |
Helsingin yliopisto | M. cinxia and C. melitaearum in the Åland metapopulation system | 2 |
Helsingin yliopisto | FIRE (The Finnish Reflection Experiment) | 1 |
Helsingin yliopisto | Luomus materials | 150 |
Helsingin yliopisto | Finnish funerals during the Covid-19 epidemic | 1 |
Itä-Suomen yliopisto | SENSOTRA | 1 |
Jyväskylän yliopiston kiihdytinlaboratorio | Decay spectroscopy of nobelium-250 | 1 |
Oulun yliopisto, Sodankylän geofysikaalinen observatorio | Observation data | 30 |
Tampereen yliopisto | General collection (Yleiskokoelma) of the folklore archive Kansanperinteen arkisto | 5 |
Tampereen yliopisto | A-K collection of the Faculty of Social Sciences' folklore archive Kansanperinteen arkisto | 2 |
Turun yliopisto | Materials of the archive for the study of history, culture and the arts (HKT-arkisto) | 20 |
Åbo Akademi | Collections at the Åbo Akademi library | 10 |
Data Accumulation in 2024
Approx. 582 terabytes of new data were ingested during the year, and at the end of 2024, there were more than 3.3 petabytes of content in preservation. The data accumulation during 2024 is shown in the figure below.

In 2024, the DPS assumed responsibility for preserving more than 1,950,000 information packages, and at the end of the year, more than 5,572,000 packages were in preservation. The accumulation of Archival Information Packages during 2024 is shown in the figure below.
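As a rough cross-check of the figures above, the share of the year-end holdings that arrived during 2024 can be computed directly. The sketch below assumes decimal units (1 PB = 1,000 TB) and treats the reported "more than" values as exact, so the results are approximations only.

```python
# Figures reported for 2024; the "more than" values are treated as exact here.
NEW_DATA_TB = 582
TOTAL_DATA_TB = 3.3 * 1000      # 3.3 PB, assuming 1 PB = 1,000 TB
NEW_PACKAGES = 1_950_000
TOTAL_PACKAGES = 5_572_000


def share_percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


data_share = share_percent(NEW_DATA_TB, TOTAL_DATA_TB)
package_share = share_percent(NEW_PACKAGES, TOTAL_PACKAGES)

print(f"Data ingested in 2024: {data_share:.1f} % of year-end volume")
print(f"Packages ingested in 2024: {package_share:.1f} % of year-end count")
```

Under these assumptions, roughly 18 % of the preserved data volume and 35 % of the preserved packages were ingested during 2024.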

Digital Preservation Services and Sustainability
The annual carbon footprint of the DPS has been updated. The storage capacity of the DPS is 7.5 petabytes, and the annual carbon emissions allocated according to the life cycle of its infrastructure are 21,169 kg CO2 eq. Presented as carbon emissions per terabyte, this is approx. 3 kg CO2 eq.
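The per-terabyte figure can be reproduced from the two reported values. The following minimal sketch assumes decimal units (1 PB = 1,000 TB) and divides by storage capacity rather than by the amount of content currently stored, since that is the division that matches the reported value of approx. 3 kg.

```python
ANNUAL_EMISSIONS_KG = 21_169   # kg CO2 eq per year, as reported
CAPACITY_PB = 7.5              # storage capacity, as reported
TB_PER_PB = 1000               # assumption: decimal units


def emissions_per_tb(total_kg: float, capacity_pb: float) -> float:
    """Annual emissions per terabyte of storage capacity."""
    return total_kg / (capacity_pb * TB_PER_PB)


per_tb = emissions_per_tb(ANNUAL_EMISSIONS_KG, CAPACITY_PB)
print(f"{per_tb:.2f} kg CO2 eq per TB of capacity per year")
```

The result is about 2.82 kg CO2 eq per terabyte, which rounds to the reported 3 kg.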
According to 2023 policies, the life cycle of the DPS hardware is at least 7 years. Due to the increase in storage capacity over the years, the infrastructure has components of different ages. The oldest components are now six years old, and their ageing does not yet appear to present a major risk of jeopardizing content preservation. The DPS closely monitor the condition of the infrastructure and will react promptly to any sign that components need replacing.
Maintenance of the Digital Preservation Services
A wide range of actions is needed to provide digital preservation services:
- maintenance tasks,
- method and model development,
- software and hardware infrastructure, and
- administrative work.
The following section focuses on the maintenance tasks of the DPS. It follows the model for quality reporting on the production operations of IT services, which typically emphasizes actual production over a given period as well as incidents and recovery from them.
The main objectives of maintaining the Digital Preservation Services are:
- ensuring the integrity and availability of AIPs,
- monitoring the functioning of the service, and
- supporting organizations in using the DPS (incl. fixing invalid or incomplete Submission Information Packages (SIPs) detected during ingest).
Monitoring of the Digital Preservation Services
The monitoring of the DPS has been automated as far as possible. Monitoring provides status and event information both for the maintenance of the services and for the partner organizations, enabling experts to infer the status of the service and take the necessary action.
Currently, the following are automatically monitored in the DPS:
- hardware failures (including faulty hard drives),
- faulty tape drives,
- server availability,
- disk area fill rate,
- visibility of distributed storage areas on different servers,
- up-to-date status of the virus check database,
- storage layer integrity,
- availability of tape libraries,
- SSL certificate life cycles, and
- failed login attempts on the SFTP port of front-end servers.
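Two of the checks listed above, disk area fill rate and SSL certificate life cycle, can be illustrated with a short sketch. This is not the DPS monitoring implementation: the function names and the thresholds (90 % fill, 30 days before expiry) are hypothetical and chosen for illustration only.

```python
import ssl
from datetime import datetime, timezone


def fill_rate_percent(used_bytes: int, total_bytes: int) -> float:
    """Return how full a disk area is, as a percentage."""
    return 100.0 * used_bytes / total_bytes


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days left before a certificate expires.

    not_after uses the certificate date format,
    for example 'Jan 11 00:00:00 2025 GMT'.
    """
    expiry_ts = ssl.cert_time_to_seconds(not_after)
    return (expiry_ts - now.timestamp()) / 86400


def alerts(fill_percent: float, cert_days: float,
           fill_limit: float = 90.0, cert_limit_days: float = 30.0) -> list[str]:
    """Collect alert messages; both limits are illustrative assumptions."""
    found = []
    if fill_percent >= fill_limit:
        found.append("disk area nearly full")
    if cert_days <= cert_limit_days:
        found.append("certificate expiring soon")
    return found


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
remaining = days_until_expiry("Jan 11 00:00:00 2025 GMT", now)
print(alerts(fill_rate_percent(95, 100), remaining))
```

In a live check, the byte counts could come from `shutil.disk_usage(path)` and the expiry date from the `notAfter` field of a certificate retrieved over TLS.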
Manual monitoring additionally covers:
- progress of the job queue in the ingest,
- processing of SIPs stuck in the job queue,
- checking of AIP integrity,
- analyzing problems associated with rejected SIPs,
- replicating faulty media, and
- creating dark archive copies.
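Checking the integrity of an AIP copy typically means recomputing a checksum and comparing it with a stored reference value. The report does not describe how the DPS performs this check, so the sketch below is a generic fixity check; the choice of SHA-256 and the function names are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: Path, expected_sha256: str) -> bool:
    """Return True if the file still matches its recorded checksum."""
    return sha256_of(path) == expected_sha256.lower()
```

Reading in chunks keeps memory use constant, which matters when individual packages are large.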
As part of efforts to develop the DPS, monitoring of the service is also being improved and new processes will be automated. This will enable cost-effective maintenance of the service while the volume of content to be preserved increases.
Quality Deviations Relating to Preserved Content in 2024
Quality in digital preservation has been considered by the DPS together with partner organizations. The parties’ mutual understanding is that the integrity of content and reliability of preservation are particularly important. Consequently, quality deviations are situations in which the preservation of content has been threatened rather than, for instance, those where the service is temporarily unavailable. Reporting on the quality of the service on this basis is somewhat challenging, as the usual indicators for IT environments (incl. service accessibility rate) do not describe deviations in or actual threats to the preservation of content. Situations in which the preservation of content is threatened have been defined as those in which fewer than three intact copies of an Archival Information Package remain. Typically, recovery from these situations relies on a copy in another media type, and the DPS maintenance is able to restore normal preservation status as part of its normal operations.
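The threshold in this definition can be expressed as a simple rule. The sketch below is illustrative only: the `CopyStatus` type and the media labels are hypothetical, and the report does not state how many copies of each AIP are normally kept.

```python
from dataclasses import dataclass

MIN_INTACT_COPIES = 3  # threshold from the definition of a quality deviation


@dataclass(frozen=True)
class CopyStatus:
    medium: str    # hypothetical label such as "tape" or "disk"
    intact: bool   # result of the latest integrity check


def intact_count(copies: list[CopyStatus]) -> int:
    """Count the copies that passed their latest integrity check."""
    return sum(1 for copy in copies if copy.intact)


def is_preservation_threatened(copies: list[CopyStatus]) -> bool:
    """True when fewer than three intact copies remain."""
    return intact_count(copies) < MIN_INTACT_COPIES


# Hypothetical AIP with one corrupt copy out of three.
example = [
    CopyStatus("tape", True),
    CopyStatus("tape", False),
    CopyStatus("disk", True),
]
print(is_preservation_threatened(example))
```

With one of three copies corrupt, only two intact copies remain, so the rule flags this hypothetical AIP until a replacement copy is produced.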
During the year, the DPS solution experienced seven disk failures, four faulty memory modules and one faulty RAID controller. The backup battery of one RAID controller was replaced. Three power supplies and three tape drives became defective in the tape library, in addition to which two faulty tapes were discovered. The remote control card of one server became faulty. These issues did not compromise the integrity or availability of AIP copies.
During the year, one corrupt AIP copy was detected on one tape and a new copy was produced to replace it. The contents of one tape were lost due to a software error, but the tape was restored from other copies.
There were two deviations in DPS availability in 2024, one in August and another in November. Both of these availability deviations lasted for less than an hour and did not in any way put the preserved content at risk.
New Features of Software Development
In 2024, tape-to-tape copying using open source tools was implemented in the DPS. This speeds up and facilitates data replication and refreshment in the tape infrastructure and ensures that bit-level preservation in the DPS remains firmly supplier independent.
Technical support for partner organizations was improved by publishing a new library for preparing content for digital preservation and by offering a new RPM channel for tools published by the DPS. They both improve and simplify the production of high-quality SIPs and significantly facilitate the installation of tools provided by the DPS. An upgrade of the DPS interface was also initiated in the background.
The Digital Preservation Service for Research Data simplified the content pre-ingest process. In addition, extensive work was carried out to integrate the upgraded Metax metadata repository into the packaging service and the DPS management interface.
Support for Partner Organizations
The DPS helps partner organizations with questions related to the digital preservation of content. While this support is provided particularly during the DPS deployment process, organizations may also submit service requests in other situations. Support requests are received at the DPS support address: pas-support@csc.fi.
In 2024, a total of 136 service requests were received from partner organizations. In addition to dealing with service requests, discussions with partner organizations take place in such forums as the digital preservation collaboration group, which meets three to four times each year. The established routines of the digital preservation collaboration group include agreeing on specification changes together with the partner organizations. This includes agreeing on the file formats preserved in the DPS. The group also discusses large themes related to digital preservation which, in 2024, included removal of content from the DPS, an upgrade of the DPS interface and principles of bit-level content preservation.
Information on events and topical issues of the DPS was provided on the digitalpreservation.fi website, X channel (@dpres_fi) and the email list intended for information purposes. Conversations with partner organizations continued at monthly #PASKaffe events.