AutoSplit: Naming Output PDF Files By Text Search
Introduction
This tutorial shows how to name output PDF files in the "Split Document" operation provided by the AutoSplit™ plug-in for Adobe® Acrobat®. The "Split Document" operation can automatically name output PDF files using text from the document. Multiple naming methods can be combined to create a wide variety of file naming schemes.
We are going to split a multi-page PDF document into single-page files and automatically name each output file with text extracted from its first page. A text search is used to extract the text from the document, and the tutorial illustrates the use of text patterns.
An input document example
Every page in the sample input PDF document is stamped with a Bates number. The numbers use the "XYZ-ABC123456" format. We will use a text pattern search to find matching text on the first page of each output document. The matching text will be used to name the output files from the splitting process. As a result, the input PDF document will be split into 12 single-page documents, each named by its corresponding Bates number.
The input document will be split into 12 single-page documents, each named by its Bates number
This operation is also available in the Action Wizard (Acrobat's batch processing tool) and can be used to automate document processing workflows.
Prerequisites
You need a copy of Adobe® Acrobat® with the AutoSplit™ plug-in installed on your computer in order to use this tutorial. You can download trial versions of both Adobe® Acrobat® and AutoSplit™.
Step 1 - Open a PDF Document
Start the Adobe® Acrobat® application and open a PDF document using the "File > Open..." menu.
Please note that if an input PDF document does not contain any searchable text, then it cannot be used for any text-based processing. If you are using a scanned paper document, make sure the "Recognize Text" operation (also known as "Optical Character Recognition" or OCR) is applied to the document prior to processing.
Open a PDF file
Step 2 - Open the "Split Document Settings" Dialog
Select "Plug-ins > Split Documents > Split Document..." from the main Adobe® Acrobat® menu to open the "Split Document Settings" dialog.
Open the Split Document Settings dialog
Step 3 - Select Split Method
Select the desired document splitting method. In this example, we split the input document into equal-size output documents (one page per file).
Select split method
Step 4 - Specify Output File Naming
We are going to show how to use the "Text By Search" method for naming the files.
Press the "Add" button in the "Output Naming and Destination" section to add a new component to the name.
Click the Add button
Select the "Text By Search" option. Click "Next >>".
Select the Text By Search option
Enter a search pattern (using regular expression syntax) into the "Find what" box.
For example, enter "\b[A-Z\-]+\d{6}\b" to match Bates numbers that follow the "XYZ-ABC123456" format. The pattern describes one or more uppercase letters or hyphens followed by exactly six digits, with a word boundary on each side. Text that conforms to this pattern is found and used as part of the filename.
Enter a search expression
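To check a pattern before running the split, it can help to try it on a few sample strings outside Acrobat. The following is a minimal sketch using Python's built-in re module. It assumes the plug-in's regular expression engine treats this particular pattern the same way Python does, which may not hold for every pattern. The function name find_bates_number and the sample text are hypothetical and used only for illustration.

```python
import re

# The pattern from Step 4, written as a Python raw string.
BATES_PATTERN = re.compile(r"\b[A-Z\-]+\d{6}\b")


def find_bates_number(page_text):
    """Return the first Bates-style match in page_text, or None."""
    match = BATES_PATTERN.search(page_text)
    return match.group(0) if match else None


# Hypothetical page text for illustration only.
sample = "CONFIDENTIAL  XYZ-ABC123456  Page 1 of 12"
print(find_bates_number(sample))          # XYZ-ABC123456

# Five digits instead of six: the pattern does not match.
print(find_bates_number("XYZ-ABC12345"))  # None
```

In Python the pattern is case-sensitive, so a lowercase stamp such as "xyz-abc123456" is not matched by this sketch.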
Click "OK" in the "Find text" dialog to close it.
Close the dialog
Step 5 - Specify an Output Folder
Press the "Browse..." button to select an output folder. Optionally, specify a name prefix and/or a base filename.
Click "OK" to start splitting the document.
Specify an output folder
Step 6 - Start Extraction Process
Click "OK" in the confirmation dialog.
Start extraction
Step 7 - Inspect the Results
Check the list of the output files displayed in the "AutoSplit Results" dialog. Click "Open Output Folder" to inspect output PDF files.
Inspect the results
The output folder contains 12 single-page documents, each named after its corresponding Bates number.
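The naming logic behind this result can be sketched in a few lines of Python. This is an illustration of the idea only, not the plug-in's implementation: the page texts are made-up stand-ins for the 12 stamped pages, the function name name_output_files is hypothetical, and the fallback name for pages without a match is an assumption rather than documented AutoSplit behavior.

```python
import re

# The pattern entered in the "Find what" box in Step 4.
BATES_PATTERN = re.compile(r"\b[A-Z\-]+\d{6}\b")


def name_output_files(page_texts, fallback="page"):
    """Map each single-page document to an output file name.

    Uses the first pattern match on the page. Pages without a match
    get a hypothetical fallback name based on the page number.
    """
    names = []
    for number, text in enumerate(page_texts, start=1):
        match = BATES_PATTERN.search(text)
        stem = match.group(0) if match else f"{fallback}{number:03d}"
        names.append(stem + ".pdf")
    return names


# Made-up page texts standing in for the 12 stamped pages.
pages = [f"Sample page body\nXYZ-ABC{123456 + i:06d}" for i in range(12)]
for name in name_output_files(pages):
    print(name)
```

With these sample pages the sketch produces 12 names, from XYZ-ABC123456.pdf through XYZ-ABC123467.pdf.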