Pdf-parser

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by PDFAnalyst at 10:59, 8 October 2010 (Added reference to Internet Storm Center per non-notable prod). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

pdf-parser is a command-line program that parses and analyses PDF documents. It can extract raw data from PDF documents, such as compressed images. It can also handle malicious PDF documents that use obfuscation features of the PDF language.[1] In addition, the tool can be used to extract data from damaged or corrupt PDF documents.
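To illustrate the kind of work described above, the following is a minimal Python sketch and not pdf-parser's own code. It assumes two things that the article does not spell out: that the compressed data sits in Flate (zlib) streams delimited by the `stream` and `endstream` keywords, and that the obfuscation in question is the `#xx` hex escape that PDF allows inside names. The function names `decode_pdf_name` and `inflate_streams` are hypothetical and chosen only for this example.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords.
# The lookbehind keeps the pattern from starting inside "endstream".
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)\r?\nendstream", re.DOTALL)

# A "#" followed by two hex digits is an escaped byte inside a PDF name.
NAME_ESCAPE_RE = re.compile(rb"#([0-9A-Fa-f]{2})")


def decode_pdf_name(name):
    """Undo #xx hex escapes in a PDF name (hypothetical helper)."""
    return NAME_ESCAPE_RE.sub(lambda m: bytes([int(m.group(1), 16)]), name)


def inflate_streams(pdf_bytes):
    """Return the decompressed body of every stream that zlib can inflate.

    The file is scanned as raw bytes rather than parsed through its
    cross-reference table, so a damaged file can still yield data.
    """
    results = []
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            results.append(zlib.decompress(match.group(1)))
        except zlib.error:
            # Not Flate-compressed, or too damaged to inflate: skip it.
            continue
    return results
```

Scanning the raw bytes instead of trusting the file's internal index is a deliberate choice in this sketch, because a corrupt or deliberately malformed document may have an index that is missing or misleading. A real tool would also need to honour the `/Filter` and `/Length` entries of each stream dictionary, which this example ignores.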

pdf-parser is released into the public domain.

pdf-parser was originally created in 2008 and last updated on April 4, 2010.

It is written in the Python programming language and can be used on any platform that supports the Python interpreter, including smartphones.

References

  1. PDF Babushka by Bojan Zdrnja, Internet Storm Center, January 14, 2010
  • pdf-parser official site, with documentation and changelog