Web ARChive

The Web ARChive (WARC) format specifies a method for combining multiple digital resources, together with related information, into a single aggregate archive file. WARC is a revision of the Internet Archive's ARC file format,^[1] which has traditionally been used to store "web crawls" as sequences of content blocks harvested from the World Wide Web. The WARC format generalizes the older format to better support the harvesting, access, and exchange needs of archiving organizations. Besides the primary content currently recorded, the revision accommodates related secondary content, such as assigned metadata, abbreviated duplicate detection events, and later-date transformations.^[2]
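As a rough illustration of how primary and secondary content can sit side by side in one aggregate file, the following Python sketch serializes two records and concatenates them. It assumes the WARC 1.0 record layout (a version line, named header fields, a blank line, the content block, and two trailing CRLF pairs). The function name `build_warc_record` and the example URL are hypothetical, and the sketch is not a validated or complete writer.

```python
import uuid
from datetime import datetime, timezone


def build_warc_record(record_type, payload, target_uri=None,
                      content_type="application/octet-stream"):
    """Serialize one record as bytes (hypothetical helper, WARC 1.0 layout assumed)."""
    headers = [
        ("WARC-Type", record_type),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("Content-Type", content_type),
        ("Content-Length", str(len(payload))),  # length of the block in bytes
    ]
    if target_uri is not None:
        # Place the target URI before the content headers.
        headers.insert(3, ("WARC-Target-URI", target_uri))
    # Version line, header lines, then a blank line that ends the header.
    head = "WARC/1.0\r\n" + "".join(f"{k}: {v}\r\n" for k, v in headers) + "\r\n"
    # The block is followed by two CRLF pairs that close the record.
    return head.encode("utf-8") + payload + b"\r\n\r\n"


# An aggregate file is a plain concatenation of records: here one piece of
# primary content and one piece of assigned metadata about the same URI.
archive = b"".join([
    build_warc_record("resource", b"<html>hello</html>",
                      "http://example.com/", "text/html"),
    build_warc_record("metadata", b"note: captured for illustration\r\n",
                      "http://example.com/", "application/warc-fields"),
])
```

Writing `archive` to disk in binary mode would give a small uncompressed file of this shape. Real tools typically add further header fields and often compress each record, which this sketch leaves out.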

WARC is now recognised by most national library systems as the standard to follow for web archival.^[3]

Software

Online Services

Webrecorder.io: 5 GB of free space per account (by Rhizome.org)
Perma.cc: 10 free links per month per user, stored at Harvard Law School Library (by lil.law.harvard.edu)

References

[1] ARC_IA, Internet Archive ARC file format. In: www.digitalpreservation.gov. Retrieved 9 May 2015.
[2] WARC, Web ARChive file format. In: www.digitalpreservation.gov. Retrieved 9 May 2015.
[3] http://digitalia.sbn.it/article/view/1473
[4] Giuseppe Scrivano: GNU wget 1.14 released. Free Software Foundation, Inc., 6 August 2012. Retrieved 25 February 2016.
