PDF2Text by PDFTron Systems, Inc.

PDF2Text 4.5

A PDF Conversion Control/Component

Easy and accurate PDF text extraction!

PDF2Text is a stand-alone solution for high-quality and efficient text extraction from PDF documents. PDF2Text can be used to extract text from any PDF document as Unicode or as structured XML.

PDF2Text is offered as an easy-to-use command-line application and as a software development component that can be used as a building block for other client and server-based applications.

Key Features:
  • Extracts text from any PDF document as plain text or as structured XML.

  • Offers a choice of Unicode text encodings (UTF-8 or UTF-16).

  • Provides positioning, font, and styling information for every paragraph, line, word, or glyph on a page.

  • Offers options to control the level of detail and the formatting in the output XML.

  • Offers advanced options to control ligature expansion, hyphen removal, and removal of duplicate text (duplicated text is sometimes used for drop-shadow effects).

  • Allows text extraction to be limited to a clip rectangle, and text in specific regions of a page to be hidden.

  • Offers an option to remove hidden text or text that is obscured by other page elements (such as images or rectangles).

  • Supports all versions of the PDF format (PDF 1.0 to ISO 32000).

  • Supports automation and batch operation.
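
To make the text clean-up options above more concrete, here is a minimal Python sketch of what ligature expansion, hyphen removal, and duplicate-text removal generally mean. It is not PDF2Text's implementation or API: the function names, the (text, x, y) word tuples, and the tolerance value are all hypothetical and chosen only for illustration.

```python
import re

# Assumption: only the common Latin ligatures from the Unicode
# "Alphabetic Presentation Forms" block are handled in this sketch.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
}


def expand_ligatures(text):
    """Replace single ligature code points with their component letters."""
    return "".join(LIGATURES.get(ch, ch) for ch in text)


def remove_line_end_hyphens(text):
    """Rejoin words split by a hyphen at the end of a line.

    A naive rule: a hyphen preceded by a letter and followed by a newline
    and a lowercase letter is treated as a line-break hyphen. Genuinely
    hyphenated compounds that happen to break there are joined too.
    """
    return re.sub(r"(?<=[A-Za-z])-\n(?=[a-z])", "", text)


def drop_shadow_duplicates(words, tol=1.0):
    """Drop words that repeat an earlier word at almost the same position.

    `words` is a list of hypothetical (text, x, y) tuples; `tol` is the
    largest offset, in page units, still treated as the same position.
    """
    kept = []
    for text, x, y in words:
        is_duplicate = any(
            text == kept_text and abs(x - kx) <= tol and abs(y - ky) <= tol
            for kept_text, kx, ky in kept
        )
        if not is_duplicate:
            kept.append((text, x, y))
    return kept


if __name__ == "__main__":
    raw = "The \ufb01rst exam-\nple"
    print(remove_line_end_hyphens(expand_ligatures(raw)))  # The first example
```

A real extractor would typically apply such rules using font and layout information from the page rather than plain strings, which is why these steps are usually exposed as options instead of being applied unconditionally.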

Technical Information:
    Component Type - Contains the following types of components...

     • ASP.Net Web Control

    Product Type: Control/Component
    Product Version: 4.5
    Prices From: $250.00