Bookmarking PDF Documents by Text Style
Introduction
This tutorial shows how to generate PDF bookmarks based on text style using the AutoBookmark™ plug-in for Adobe® Acrobat®. Use this method to automatically generate multi-level bookmarks from text attributes such as font style, text size, indentation, and/or text pattern. All text that uses a selected font and/or text size is bookmarked automatically.
Bookmarking PDF documents by text style
The main part of the tutorial contains step-by-step instructions for bookmarking by text style.
In addition, an advanced section covers the various bookmarking settings in detail.
This operation is available via the application menu/toolbar and via the Action Wizard (Acrobat's batch processing tool).
Prerequisites
You need a copy of Adobe® Acrobat® with the AutoBookmark™ plug-in installed on your computer in order to use this tutorial. You can download trial versions of both Adobe® Acrobat® and the AutoBookmark™ plug-in.
Bookmarking by Text Style
Step 1 - Check for Searchable Text
Open the PDF document that needs to be bookmarked using the "File > Open..." menu.
Open a PDF document
The first step is to verify that the input PDF document actually contains searchable text. If you can highlight a text string and copy and paste it into another application (such as Notepad, MS Word, or even Outlook), then the document contains searchable text and can be used for bookmarking by style.
If the PDF file has been scanned from a paper document, then it needs to be processed with the "Text Recognition" tool to make it searchable. Select "Enhance Scans" from the "Tools" menu and click "Recognize Text" to run text recognition on the currently open PDF document.
Run optical character recognition in Adobe Acrobat
Step 2 - Select a Sample Text
Use the selection tool to select a sample of the text that needs to be bookmarked.
Select a sample text
Step 3 - Start Bookmarking Tool
Select "Plug-Ins > Bookmarks > Generate From Text Styles..." from the main menu to open the "Generate Bookmarks From Text Style" dialog.
Open the Generate Bookmarks From Text Style tool
Step 4 - Add a Bookmark Level
Click "Add..." to create a new bookmark level description. The level description defines the settings that are used to find and bookmark text for one bookmark level. In many applications, only a single level of bookmarks is required: the top one. However, if multiple levels of bookmarks are necessary, then each level needs its own set of settings. For example, the top-level bookmarks can be created from 20pt Arial text, and the second-level bookmarks from 15pt Tahoma text.
Click the Add button
If sample text has been selected on the page, then the "Add New Level Definition" dialog prompts you to create a new level based on the selected text style. Click "OK" to proceed. Optionally, check the "Open settings dialog to edit level parameters" option to configure the settings in detail.
Specify bookmarking level
The new bookmark level is now added to the list:
A new bookmark level appears
Step 5 - Add Additional Level(s)
If you want to bookmark only text that uses the selected style, then click "OK" to start bookmarking.
If you want to add one or more additional levels of bookmarks, then move the "Generate Bookmarks From Text Style" dialog to the side of the screen and use the selection tool to highlight sample text for another level.
Select a sample text for a new bookmarking level
Step 6 - Add a New Bookmark Level
Click "Add..." to create a new bookmark level description.
Click the Add button
The "Add New Level Definition" dialog appears on the screen, prompting you to create a new level based on the selected text style. Optionally, specify the desired bookmark level from the "Bookmark at Level:" pull-down menu. It is possible to create multiple descriptions for the same bookmark level. For example, use two bookmark level descriptions for the top level to create bookmarks from text that uses either 20pt Arial or 25pt Times font.
Click "OK" to confirm adding a new level.
Specify bookmarking level
Now there are two levels defined. If you want to add more levels, then repeat steps 5-6.
Second bookmark level appears
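The relationship between level descriptions and bookmark levels can be sketched in a few lines of Python. This is only an illustration of the idea described above, not the plug-in's implementation: the `LevelDescription` class and the `level_for` function are hypothetical names, and the sketch matches font name and size exactly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelDescription:
    """Hypothetical stand-in for one entry in the level list."""
    level: int      # 1 = top-level bookmark
    font: str
    size_pt: float


def level_for(font, size_pt, descriptions):
    """Return the bookmark level of the first matching description, or None."""
    for d in descriptions:
        if d.font == font and d.size_pt == size_pt:
            return d.level
    return None


# Two descriptions share level 1, as in the Arial/Times example above.
descriptions = [
    LevelDescription(1, "Arial", 20),
    LevelDescription(1, "Times", 25),
    LevelDescription(2, "Tahoma", 15),
]
```

With these descriptions, text set in 25pt Times maps to level 1, 15pt Tahoma maps to level 2, and any other style produces no bookmark.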
Step 7 - Configure Processing Settings
Many processing settings can be customized to get the desired results. The advanced section explains how to configure the various processing parameters.
Configure processing settings
Step 8 - Start Bookmarking
Click "OK" in the "Generate Bookmarks From Text Style" dialog to start the bookmarking process.
Start the bookmarking process
Click "OK" again to confirm the processing.
Confirm the processing
The report dialog appears on the screen at the end of the processing showing the number of bookmarks created. Click "OK" to close it.
The report dialog will appear
Step 9 - Inspect the Results
The software searches the document pages for all occurrences of text that matches the bookmark level description(s). The matching text is used to create bookmarks. The bookmarks are automatically arranged into a nested hierarchy based on the configuration settings.
The bookmark panel is automatically opened at the end of processing to show the bookmarks created. Inspect the bookmarks to make sure everything is bookmarked correctly. If there are any problems, adjust processing settings accordingly.
Inspect the results
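As a rough illustration of how a flat sequence of matched headings can be arranged into a nested hierarchy, consider the following sketch. It assumes the matches arrive in reading order as (level, title) pairs; the `Bookmark` class and `nest` function are hypothetical and do not describe the plug-in's internals.

```python
from dataclasses import dataclass, field


@dataclass
class Bookmark:
    title: str
    level: int
    children: list = field(default_factory=list)


def nest(flat):
    """Arrange (level, title) pairs, given in reading order, into a tree."""
    roots = []
    stack = []  # chain of currently open ancestor bookmarks
    for level, title in flat:
        node = Bookmark(title, level)
        # Close every open bookmark at the same or a deeper level.
        while stack and stack[-1].level >= level:
            stack.pop()
        # Attach to the nearest shallower bookmark, or to the top of the tree.
        (stack[-1].children if stack else roots).append(node)
        stack.append(node)
    return roots
```

For example, a level-1 "Chapter 1" followed by two level-2 sections and a level-1 "Chapter 2" yields two top-level bookmarks, the first of which has two children.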
Configuring Processing Settings
Processing Page Range
Specify the page range to search for bookmark text. Enter the first and last page numbers in the "Generate Bookmarks From Text Style" dialog. This option is useful when it is necessary to exclude certain portions of the document from processing, such as the table of contents or the index.
Specify processing page range
Where to Insert New Bookmarks
Use the "Insert bookmarks" pull-down menu to specify where to insert new bookmarks: after the existing bookmarks, before them, or in place of them (replacing them).
Select where to insert new bookmarks
Here are examples of the output:
Bookmark examples
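The three insertion modes can be summarized with a small sketch that treats the bookmark panel as a plain list of top-level titles. The `insert_bookmarks` function and the mode names are hypothetical; they only mirror the three menu choices described above.

```python
def insert_bookmarks(existing, new, mode):
    """Combine existing and newly generated top-level bookmarks."""
    if mode == "after":
        return existing + new
    if mode == "before":
        return new + existing
    if mode == "replace":
        return list(new)  # the existing bookmarks are discarded
    raise ValueError(f"unknown mode: {mode}")
```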
Using Stop Words
The "stop words" feature can be used to filter out unwanted bookmark titles. If any stop word is present in a bookmark title, then that bookmark is excluded from the output. A stop word can be a single word or a phrase. For example, use "Annual report" to avoid creating bookmarks that contain "Annual Report" anywhere in the bookmark title.
Click the "Options..." button in the "Generate Bookmarks From Text Style" dialog to enter a list of stop words.
Click the Options button
Check the "Ignore text that contains stop words:" option if you want to enter a list of stop words. Click the "Edit Stop Words..." button to manage the list.
Open the Edit Stop Words dialog
Enter each stop word or regular expression on a separate line in the text editing area of the "Edit Stop Words" dialog.
Enter stop words
You can enter stop words by:
  • Manually typing them in the editing area of the dialog. Each entry should appear on a separate line.
  • Using the "Select Text" tool to copy the desired text from a document to the clipboard, and then pasting it into the stop words list.
  • Copying text from another text editor.
Check the "Use regular expressions (text patterns)" option to indicate that the stop words use regular expression syntax.
Check the "Match case" option to match words down to the letter case.
Check the "Match whole words" option to match only whole words. For example, if this option is on, then "Account" will not match "Accounts" or "Accounting".
Click the "OK" button to finish editing the stop words and return to the "Bookmarking Options" dialog.
Stop word example:
Example of the stop words functionality
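The interaction of the three stop-word options can be illustrated with Python's `re` module. This is a sketch under stated assumptions rather than the plug-in's logic: `build_stop_filter` is a hypothetical helper, matching is assumed to be case-insensitive unless "Match case" is on (as the "Annual report" example suggests), and the plug-in's regular expression dialect may differ from Python's.

```python
import re


def build_stop_filter(stop_words, use_regex=False, match_case=False, whole_words=False):
    """Return a function that tells whether a bookmark title should be excluded."""
    flags = 0 if match_case else re.IGNORECASE
    patterns = []
    for entry in stop_words:
        # Plain entries are escaped so that they match literally.
        body = entry if use_regex else re.escape(entry)
        if whole_words:
            body = rf"\b(?:{body})\b"
        patterns.append(re.compile(body, flags))

    def is_excluded(title):
        return any(p.search(title) for p in patterns)

    return is_excluded
```

With `whole_words=True`, the entry "Account" excludes "Account Summary" but keeps "Accounts Payable" and "Accounting", which matches the behavior described for the "Match whole words" option.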
Bookmarking Options
Check the "Ignore consecutive duplicate bookmarks" option to skip consecutive bookmarks that have the same title. Only the first bookmark will be retained.
Check the "Sort bookmarks vertically within each page" option to sort the resulting bookmarks within each page before adding them to the document's bookmarks. PDF documents are not really "text documents" in the traditional sense: a PDF file might store text elements in a different order than they appear on the page, which can result in the wrong nesting order of bookmarks. It is generally recommended to turn this option on unless the input document has multi-column text, in which case the vertical order of bookmarks does not reflect the logical order of the text on the page.
Click the "OK" button to return to the "Generate Bookmarks From Text Style" dialog.
Select desired processing options
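Both options are simple list operations in principle. The sketch below is illustrative only: the function names are hypothetical, and each item is assumed to carry a page index and a vertical position measured downward from the top of the page (native PDF coordinates run upward from the bottom, so a real implementation would have to convert).

```python
from itertools import groupby


def drop_consecutive_duplicates(titles):
    """Keep only the first of each run of identical neighboring titles."""
    return [title for title, _run in groupby(titles)]


def sort_within_pages(items):
    """Sort (page, y_from_top, title) tuples by page, then top to bottom."""
    return sorted(items, key=lambda item: (item[0], item[1]))
```

Note that only neighboring duplicates are dropped: a title that reappears later, after a different title, is kept.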
Set Bookmarking Level
Select the bookmarking level in the tree and click "Set Level..." to modify the bookmarking level.
Click the Set Level button
The "Set Bookmarking Level" dialog appears. Select the desired level from the "Set Bookmarking Level:" pull-down menu. Click "OK".
Set bookmarking level
Set Matching Text Style
Double-click a bookmark level in the list to edit its bookmarking settings. Alternatively, select the bookmark level and click "Edit...". The "Bookmark Level Description" dialog will appear.
Open the Bookmark Level Description dialog
In the "Text Matching" tab of the "Bookmark Level Description" dialog, select any combination of text attributes that you want to use for generating the bookmarks automatically.
The most commonly used attributes are the font name(s) and size. Most documents contain various section headings that have a distinctive font style that is different from the surrounding text. You can either enter these parameters manually or use a sample text from the document. It is possible to use more than a single font name to describe a desired bookmark level.
Modify matching text attributes by selecting a font name and specifying font size manually. Click "Add..." and select desired font name from the list. The list contains only the most common font names. However, you can retrieve the names of all fonts that are used within the current document by pressing the "Update Fonts" button. The software scans all pages in the PDF document and enumerates the font names.
Optionally, use a sample text from the document and click the "Set Font Style From Selected Text" button to set the font name and size to match a currently selected sample text. You have to select the sample text prior to opening the "Bookmark Level Description" dialog.
Modify matching text attributes
Set Tolerance for Matching Text
The software lets you specify a tolerance for matching the text size. The tolerance is the maximum acceptable deviation from the target value.
Double-click on the specific bookmark level in the tree of the "Generate Bookmarks From Text Style" dialog. Alternatively, select the bookmark level and click "Edit...". The "Bookmark Level Description" dialog will appear.
Open the Bookmark Level Description dialog
Select the "Text Matching" tab in the "Bookmark Level Description" dialog. Adjust the "Tolerance" parameter for the text size. For example, if the font size parameter is set to 10pt and tolerance is set to 1pt, then the software will match all text that uses font size between 9 and 11 pt.
You may also check the "Allow partial match for font names" option to relax font name matching requirement and allow matching similar font names. For example, if you specified "Helvetica" font, then all fonts that have "Helvetica" anywhere in their names ("Helvetica-Bold" or "Helvetica-Italic") will also produce a match.
Check the "Allow characters with different style and size inside text line" option to ignore differences in style or size in the middle and at the end of a line. The software then matches the text style only for the first character/word on the line and accepts everything else regardless of style and size.
Specify tolerance for matching text
Tolerance example:
Text size tolerance example
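The size tolerance and partial font-name rules reduce to two small comparisons. The following sketch uses hypothetical function names and assumes that the range boundaries are inclusive and that font names are compared case-sensitively; neither detail is confirmed by the description above.

```python
def size_matches(actual_pt, target_pt, tolerance_pt):
    """True if the text size is within the tolerance of the target size."""
    return abs(actual_pt - target_pt) <= tolerance_pt


def font_matches(actual_name, wanted_name, allow_partial=False):
    """Exact match by default; substring match when partial matching is allowed."""
    if allow_partial:
        return wanted_name in actual_name
    return actual_name == wanted_name
```

With a 10pt target and a 1pt tolerance, sizes of 9pt and 11pt match while 11.5pt does not. With partial matching allowed, "Helvetica" matches "Helvetica-Bold".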
Using Text Patterns
The software can bookmark only the text that matches a user-defined text pattern. Use this option to bookmark text that can be described by a pattern, such as an email address, account number, date, or repeating header/footer.
Double-click on the specific bookmark level in the tree of the "Generate Bookmarks From Text Style" dialog. Alternatively, select the bookmark level and click "Edit...". The "Bookmark Level Description" dialog will appear.
Open the Bookmark Level Description dialog
Select the "Text Matching" tab in the "Bookmark Level Description" dialog.
Check the "Match Text Pattern" option to specify a text matching pattern. Note that it is used in addition to the other matching parameters (such as text style). There is also a separate bookmarking method that focuses on text patterns.
A text pattern is a sequence of letters and symbols that defines what characters can appear in the matching text string. The AutoBookmark™ plug-in uses regular expressions for defining text patterns. Only text that matches the specified pattern is used to create a bookmark. The software does not require an exact match; it only checks whether a text line contains the given pattern.
For example, if you specify the word "Chapter" as the matching text pattern, then multiple text lines might match it. "Chapter 1" and "Chapter 1 - Functionality Overview" both satisfy the matching criteria, so both text lines are bookmarked. The whole text line that matches the pattern is used for the bookmark title. However, this might produce very long titles or titles that contain undesired text.
Check the "Limit bookmark titles to matching pattern only" option to use only the portion of the text string that matches a specified pattern. For example, when using the following text pattern "Chapter \d" (\d - matches any digit) both "Chapter 1" and "Chapter 2 - Functionality Overview" text strings will be matched. If "Limit bookmark titles to matching pattern only" option is checked, then bookmark titles will read "Chapter 1" and "Chapter 2" respectively.
Specify matching text content attributes
Text pattern example:
Text pattern example
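The difference between using the whole line and limiting the title to the matching pattern can be demonstrated with Python's `re` module. This is a sketch, not the plug-in's code: `bookmark_title` is a hypothetical helper, and Python's regular expression syntax is assumed to be close enough to the plug-in's for this simple pattern.

```python
import re


def bookmark_title(line, pattern, limit_to_match=False):
    """Return the bookmark title for a text line, or None if it does not match."""
    match = re.search(pattern, line)  # a partial match anywhere in the line is enough
    if match is None:
        return None
    return match.group(0) if limit_to_match else line
```

For the line "Chapter 2 - Functionality Overview" and the pattern `Chapter \d`, the function returns the whole line by default and only "Chapter 2" when `limit_to_match` is set.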
Set Page Area for Text Search
Sometimes unwanted text gets bookmarked because it uses the same font style as the legitimate text. Use a processing page area to limit the text search to a specific part of the page.
Double-click on the specific bookmark level in the tree of the "Generate Bookmarks From Text Style" dialog. Alternatively, select the bookmark level and click "Edit...". The "Bookmark Level Description" dialog will appear.
Open the Bookmark Level Description dialog
Select the "Text Location" tab in the "Bookmark Level Description" dialog.
Check the "Match text located only in the following area:" option. Click "Set Page Area From a Sample Page...".
Click the Set Page Area From a Sample Page button
Select a sample page number. Specify a text location on a sample page by drawing a rectangle. The text search will be limited to the selected area on the page. Click "OK" once done.
Specify page area
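Conceptually, the page area acts as a rectangular filter on text position. The sketch below assumes that a text line must lie entirely inside the selected rectangle to be considered; whether the plug-in requires full containment or only an overlap is not stated, so treat this as one possible interpretation. `Rect` and `is_inside` are hypothetical names, and coordinates are in points with the origin at the bottom-left corner of the page.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    left: float
    bottom: float
    right: float
    top: float


def is_inside(area, box):
    """True if the text bounding box lies entirely within the search area."""
    return (area.left <= box.left and box.right <= area.right
            and area.bottom <= box.bottom and box.top <= area.top)


# Search only the top strip of a US Letter page (612 x 792 points).
header_area = Rect(0, 700, 612, 792)
```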
Define the Visual Appearance of the Bookmarks
The software lets you customize the visual appearance of the resulting bookmarks by specifying the text color and style.
Double-click on the specific bookmark level in the tree of the "Generate Bookmarks From Text Style" dialog. Alternatively, select the bookmark level and click "Edit...". The "Bookmark Level Description" dialog will appear.
Open the Bookmark Level Description dialog
Select the "Appearance" tab in the "Bookmark Level Description" dialog.
Set a desired text style (Plain, Bold, Italic, Bold & Italic).
The zoom option defines how a bookmarked page is displayed in the viewer when a bookmark is clicked:
  • Inherit Zoom - Displays a page designated by a bookmark using the current zoom factor. The page is positioned in the viewer so that the bookmarked text appears at the top of the view window. This only happens when the page layout mode of the viewer is set to "Continuous". Use the "View/Page Layout" menu to set the desired page layout mode.
  • Fit Page - Displays a page designated by a bookmark, with its contents magnified just enough to fit the entire page within the window both horizontally and vertically. If the required horizontal and vertical magnification factors are different, uses the smaller of the two, centering the page within the window in the other dimension.
  • Fit Width - Displays a page designated by a bookmark, with the vertical coordinate positioned at the top edge of the window and the contents of the page magnified just enough to fit the entire width of the page within the window.
  • Fit Visible - Displays a page designated by a bookmark, with its contents magnified just enough to fit its bounding box entirely within the window both horizontally and vertically. If the required horizontal and vertical magnification factors are different, uses the smaller of the two, centering the bounding box within the window in the other dimension.
  • Actual Size - Displays a page designated by a bookmark with 100% magnification factor.
Check the "Show expanded" option to display all bookmarks at this level expanded.
Click "OK" once done.
Close the tool
Examples of the different bookmark styles:
Examples of the different bookmark styles
Customize the Bookmark Titles
The software lets you customize the titles of the resulting bookmarks by enforcing a text case, adding leading numbers, inserting custom text, and/or performing a search and replace operation.
Double-click on the specific bookmark level in the tree of the "Generate Bookmarks From Text Style" dialog. Alternatively, select the bookmark level and click "Edit...". The "Bookmark Level Description" dialog will appear.
Open the Bookmark Level Description dialog
Select the "Content" tab in the "Bookmark Level Description" dialog.
Initially, the original text from the document is used for the bookmark titles. This text can be modified in a number of ways.
The text case can be altered to produce uniformly formatted titles. The available options are:
  • Do Not Change - no changes are made to the original text.
  • UPPERCASE - all titles are converted to uppercase characters.
  • Title Case - the first letter of each word is capitalized.
  • Sentence Case - only the first letter of the title is capitalized; all others appear in lowercase.
  • lowercase - all characters appear in lowercase.
Text case examples:
Text case examples
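The five case options listed above can be approximated in Python as follows. The `apply_case` function and its mode names are hypothetical, and the sketch assumes that Title Case lowercases the remainder of each word and that words are separated by single spaces; the plug-in may treat punctuation or mixed-case words differently.

```python
def apply_case(title, mode):
    """Apply one of the text case options to a bookmark title."""
    if mode == "upper":
        return title.upper()
    if mode == "lower":
        return title.lower()
    if mode == "title":
        # Capitalize the first letter of each space-separated word.
        return " ".join(w[:1].upper() + w[1:].lower() for w in title.split(" "))
    if mode == "sentence":
        # Capitalize only the first letter of the whole title.
        return title[:1].upper() + title[1:].lower()
    return title  # "Do Not Change"
```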
Additional text can be added to all bookmark titles. Check the "Insert this before each title" option and enter a desired text in the editing box to the right. This text will be inserted before each title. Check the "Insert this text after each title" option and enter a desired text in the editing box to the right. This text will be appended to the end of each bookmark title.
Leading numbers can optionally be added to or removed from bookmark titles. Numbers, letters, or Roman numerals can be used as leading numbers. The format of the leading numbers is set separately for each bookmark level, which makes it possible to create arbitrary numbering schemes.
Bookmark text can be limited to a certain number of characters to avoid accidentally creating long, unreadable titles. Enter the maximum allowed title length (in characters) in the "Maximum title length" entry box. The default value is 128 characters.
Customize bookmark titles
Examples of the bookmarks customization:
Examples of the bookmarks customization
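The title customizations above (leading numbers, inserted text, and a maximum length) can be combined in one small function. Everything here is a hedged sketch: `to_roman`, `to_letter`, and `decorate` are hypothetical names, the "N. Title" layout is an assumption about how a leading number is joined to the title, and truncation is assumed to apply to the final decorated title.

```python
def to_roman(n):
    """Convert a positive integer to an uppercase Roman numeral."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
             (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
             (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in pairs:
        count, n = divmod(n, value)
        out.append(symbol * count)
    return "".join(out)


def to_letter(n):
    """Convert 1, 2, ..., 26, 27 to A, B, ..., Z, AA."""
    out = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        out = chr(ord("A") + rem) + out
    return out


def decorate(title, index, style="number", prefix="", suffix="", max_length=128):
    """Add a leading number and optional text, then cap the title length."""
    label = {"number": str(index),
             "letter": to_letter(index),
             "roman": to_roman(index)}[style]
    return f"{prefix}{label}. {title}{suffix}"[:max_length]
```

Because the format is chosen per level, a scheme such as Roman numerals for chapters and letters for sections can be produced by calling `decorate` with a different `style` for each level.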
Format the Resulting Bookmarks' Titles with Text Patterns
Bookmark titles can be modified and formatted using powerful text patterns called regular expressions. The AutoBookmark™ plug-in can search bookmark titles with a regular expression and either remove the matching text completely or replace it with other text. It is possible, for example, to replace all phone numbers or email addresses with something else, or to perform advanced formatting such as moving words within the bookmark titles.
Double-click on the specific bookmark level in the tree of the "Generate Bookmarks From Text Style" dialog. Alternatively, select the bookmark level and click "Edit...". The "Bookmark Level Description" dialog will appear.
Open the Bookmark Level Description dialog
Select the "Content" tab in the "Bookmark Level Description" dialog.
Check the "Search and replace bookmark titles with text patterns" option to search and replace bookmark text.
When performing this kind of formatting with regular expressions, you must know what you want to format and how. First, write a regular expression that finds the desired substring in a bookmark title. Second, specify a replacement pattern that will replace this substring in the bookmark title. In its simplest form, you can just specify the text string that you want to find and the string that you want to replace it with.
For example, if bookmark titles contain words "the court of last resort" that you want to replace with "COLR", then enter "the court of last resort" as a search pattern and "COLR" as the replace pattern. However, the real potential of this operation comes when you start using the full power of the regular expressions that allow you to match dynamic text and refer to substrings while performing the replacement. You can completely transform the input text into anything you want.
Format bookmark titles
Sample output after text "search and replace" operation:
Sample output after the text "search and replace" operation
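Both the simple replacement and the "refer to substrings" case can be tried out with Python's `re.sub`. This is only an approximation of the feature: `replace_in_title` is a hypothetical helper, and Python's `\1`, `\2` backreference syntax is used here as an assumption, since the plug-in's replacement syntax may differ.

```python
import re


def replace_in_title(title, search, replacement):
    """Replace every match of the search pattern in a bookmark title."""
    return re.sub(search, replacement, title)


# Literal replacement, as in the "COLR" example.
short = replace_in_title("Appeal to the court of last resort",
                         "the court of last resort", "COLR")

# Capture groups let the replacement reorder parts of the title.
swapped = replace_in_title("Smith, John", r"^(\w+), (\w+)$", r"\2 \1")
```

Here `short` becomes "Appeal to COLR" and `swapped` becomes "John Smith".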
Save Configuration Settings
Bookmarking settings can be saved into a settings file for later reuse. The settings file stores all processing parameters including the stop words and bookmark level descriptions. This helps to save time when the same processing settings need to be frequently used.
Click "Save..." in the "Generate Bookmarks From Text Style" dialog to save bookmarking settings into a file.
The "Save As" dialog appears on the screen. Browse to the desired storage folder and enter an appropriate file name. Click "Save". Settings will be stored in the file with *.ABM extension.
Save settings
Load Configuration Settings
Configuration settings can be loaded from a previously saved *.ABM file. The configuration file stores all processing parameters, including the stop words and bookmark level descriptions.
Open the "Generate Bookmarks From Text Style" dialog by selecting "Plug-Ins > Bookmarks > Generate From Text Styles..." in the main menu. Click "Load..." to use earlier saved configuration settings.
Click the Load button
The "Open" dialog appears on the screen. Browse to the desired storage folder and select a particular AutoBookmark™ settings file with *.ABM extension. Click "Open".
Load settings
The settings are loaded and the user interface is updated. All current settings will be lost.
Settings will be loaded