Pdf-parser

pdf-parser is a command-line program that parses and analyses PDF documents. It can extract raw data from a PDF document, such as compressed images, and it can handle malicious PDF documents that use obfuscation features of the PDF language^[1]. The tool can also be used to extract data from damaged or corrupt PDF documents.
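As a rough illustration of what extracting raw data from a PDF document involves, the following minimal Python sketch locates stream objects in a byte string and inflates those that are zlib-compressed (the FlateDecode filter of the PDF format). It is not pdf-parser's own code: the function name `extract_flate_streams`, the regular expression, and the sample object are assumptions made for this example, and a real parser must also honour the /Length entry, filter chains and cross-reference tables.

```python
import re
import zlib

# Hypothetical helper for illustration only; not part of pdf-parser.
# Matches everything between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def extract_flate_streams(pdf_bytes):
    """Return the inflated payload of every stream that zlib can decode."""
    results = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes that follow
            # the compressed data before the "endstream" keyword.
            results.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            # Not zlib data (or too damaged to start decoding); skip it.
            continue
    return results


# Build a tiny in-memory PDF object holding one compressed stream.
payload = b"Hello from a compressed stream"
compressed = zlib.compress(payload)
sample = (
    b"1 0 obj\n<< /Length " + str(len(compressed)).encode()
    + b" /Filter /FlateDecode >>\nstream\n"
    + compressed
    + b"\nendstream\nendobj\n"
)

print(extract_flate_streams(sample))
```

Scanning for the stream keywords rather than following the document's cross-reference table is a deliberate simplification here. It is also one plausible way to recover data when other parts of a file are damaged.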

pdf-parser is released into the public domain.

pdf-parser was originally created in 2008 and last updated on April 4, 2010.

It is written in the Python programming language and can be used on any platform that supports the Python interpreter, including smartphones.

References

[1] PDF Babushka by Bojan Zdrnja, Internet Storm Center, January 14, 2010

External links

pdf-parser Official site, with documentation and changelog
