Difference between revisions of "Text in PDF"

Revision as of 14:58, 13 February 2018

Introduction

The Apitron PDF Kit and Apitron PDF Rasterizer libraries implement all text features described in the PDF specification. They also automatically handle bidirectional text, which is common in Arabic and some Asian scripts.

Any text in PDF has the following key attributes:

  • Font: one of the standard fonts, an externally linked font, or an embedded font
  • Text positioning and showing operators, which describe the text transformation and state
  • Stroking and non-stroking colors

The subtopics below cover each of these attributes and show how to use them in practice.

Fonts in PDF

The PDF specification defines several font types and describes them in terms of font file format, encodings, character maps and other usual font characteristics. We will look at fonts from a different point of view, because in most cases you won’t care whether your font is stored in TrueType, OpenType, CFF or another font file format. What matters most is whether the font will be available to the viewer of the finished document and how it will affect the resulting PDF file.

From this point of view, there are three kinds of fonts you have to deal with:

Standard fonts

These are the fonts that, according to the PDF specification, every conforming PDF reader must support, so documents that use them should always be viewable. They don’t require any font data to be written into the resulting PDF file and don’t affect its size. The standard fonts are: Times-Roman, Helvetica, Courier, Symbol, Times-Bold, Helvetica-Bold, Courier-Bold, ZapfDingbats, Times-Italic, Helvetica-Oblique, Courier-Oblique, Times-BoldItalic, Helvetica-BoldOblique, Courier-BoldOblique.

Apitron PDF Kit defines a StandardFonts enum that maps one-to-one to this set.

See section 9.6.2.2 “Standard Type 1 Fonts (Standard 14 Fonts)” of the PDF specification for details. The sample code below shows how to use a standard font for a text object:

Usage in Fixed layout API

// create text object based on standard Type1 font
TextObject text = new TextObject(StandardFonts.TimesBold, 12);
 

Usage in Flow layout API

// set the font using inline style property
TextBlock text = new TextBlock("Hello world!") { Font = new Font(StandardFonts.HelveticaBold, 16) };
 

External fonts

External fonts are assumed to be installed in the default system fonts folder, e.g. “C:\Windows\Fonts”, or to be shipped with the reader application. Any conforming reader can load them when the document is viewed. These fonts don’t affect the document's size either, because their data is not included in the resulting file. If the requested font is not found while the document is being generated or rendered, a fallback or substitution font is used. It's also possible to specify font substitutions for both the Apitron PDF Kit and Apitron PDF Rasterizer libraries.

Usage in Fixed layout API

// create text object based on external font that should exist in the target system
TextObject text = new TextObject("Arial", 12);
 

Usage in Flow layout API

// set the external font using inline style property
TextBlock text = new TextBlock("Hello world!") { Font = new Font("Arial", 16) };
 

Embedded fonts

As their name suggests, these fonts are included in the PDF file itself, which makes it self-contained and viewable on every system that has a conforming reader. They also increase the resulting file size. To limit that cost, it’s possible to embed only the data needed to display the text a particular PDF file actually contains, and this is what Apitron PDF Kit does when it has to embed font data. The technique is called font subsetting: only the glyphs used in the document's text, along with the accompanying data needed to describe the new font subset, are embedded into the resulting PDF file as a reduced-size font file. Apitron PDF Rasterizer fully supports embedded font programs and doesn't need any special handling for them. Both the Fixed layout API and the Flow layout API handle this case automatically.
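
The first step of font subsetting, deciding which characters the document actually uses, can be sketched in a few lines. This is only an illustration of the idea and not how Apitron PDF Kit implements it; the class and method names (SubsetSketch, CollectUsedCodePoints) are hypothetical, and a real subsetter would go on to map these characters to glyphs and rewrite the font tables.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Apitron PDF Kit.
public static class SubsetSketch
{
    // Collects the distinct Unicode code points used in the given strings.
    // A font subsetter would keep glyphs only for these characters.
    // Assumes well-formed UTF-16 input; char.ConvertToUtf32 throws otherwise.
    public static SortedSet<int> CollectUsedCodePoints(IEnumerable<string> texts)
    {
        var used = new SortedSet<int>();
        foreach (string text in texts)
        {
            for (int i = 0; i < text.Length; i++)
            {
                used.Add(char.ConvertToUtf32(text, i));
                // A surrogate pair occupies two chars; skip the second half.
                if (char.IsHighSurrogate(text[i])) i++;
            }
        }
        return used;
    }

    public static void Main()
    {
        var used = CollectUsedCodePoints(new[] { "Hello world!", "Hello PDF" });
        Console.WriteLine("Distinct characters: " + used.Count);
        Console.WriteLine(string.Concat(used.Select(cp => char.ConvertFromUtf32(cp))));
    }
}
```

Repeated characters cost nothing extra, which is why a subset font for a short text is usually much smaller than the full font file.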

Text operators

A PDF text object consists of operators that may show text strings, move the text position, and set the text state and certain other parameters. A text object represents a sequence of text commands, so the result depends on their order. The code samples below, created using the Fixed layout API, show how to use text objects and the supported operators to place text on a document's page.

Add text using one of the standard fonts

// create output PDF file stream
using (FileStream outputStream = new FileStream("outfile.pdf", FileMode.Create, FileAccess.Write))
{
    // create new document
    using(FixedDocument document = new FixedDocument())
    {
        // add blank first page
        document.Pages.Add(new Page(Boundaries.A4));

        // create text object and append text to it
        TextObject textObject = new TextObject(StandardFonts.Helvetica,12);                

        // apply identity matrix, that doesn't change default appearance
        textObject.SetTextMatrix(1,0,0,1,0,0);
        textObject.AppendText("Hello world using Apitron PDF Kit!");

        // set current transformation matrix so text will be added to the top of the page,
        // PDF coordinate system has Y-axis directed from bottom to top.
        document.Pages[0].Content.Translate(10, 820);

        // add text object to page content, it will automatically create text showing operators                                
        document.Pages[0].Content.AppendText(textObject);

        // save to output stream
        document.Save(outputStream);
    }
}
 

The output can be found below:

[Image: Text object usage]

We created an empty document, added a new page to it and appended an instance of the TextObject class to its content. For this text object we also set a text matrix and indicated that we’d like to use one of the standard fonts. This example, based on the Fixed layout API, hides many low-level details behind the scenes, e.g. the creation of the necessary operators, and gives you a clean and straightforward way to get the job done. Other text options can be set on the text object instance, e.g. leading, character and word spacing, text rise, rendering mode etc. See sections 9.3 “Text State Parameters and Operators” and 9.4 “Text Objects” of the PDF specification for the complete list.

Another thing to notice is how we positioned the text on the page. We did it by altering the current transformation of the page's content, to which our text object was subsequently added. This transformation set the initial position of the text object's coordinate space.

// set current transformation matrix, so the text will be added to the top of the page,
// PDF coordinate system has Y-axis pointing from bottom to top.
document.Pages[0].Content.Translate(10, 820);
 

So we’ve added a 10pt offset to the X coordinate and an 820pt offset to the Y coordinate of the page's initial transformation, moving our objects from left to right and from bottom to top. Because an A4 page is 842pt tall, the text object ends up near the top of the page. If another text object were added after the first one, it would need an additional transformation so that it doesn't overlap the first one.
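
To make the bottom-to-top coordinate system concrete, here is a minimal sketch that computes where successive lines would sit if they were stacked below the first one. It is plain arithmetic and uses no library calls; the helper name BaselineY and the 14.4pt line spacing are assumptions chosen for illustration. Whether a further Translate call is relative to the current transformation or to the page origin is not stated here, so the sketch works with absolute Y values only.

```csharp
using System;

// Hypothetical helper, not part of Apitron PDF Kit.
public static class LineLayoutSketch
{
    // Returns the baseline Y of the line with the given 0-based index.
    // PDF user space has Y growing upward, so each later line has a smaller Y.
    public static double BaselineY(double firstBaselineY, double lineSpacing, int lineIndex)
    {
        return firstBaselineY - lineIndex * lineSpacing;
    }

    public static void Main()
    {
        const double pageHeight = 842;    // A4 height in points
        const double firstBaseline = 820; // same Y offset as in the sample above
        const double spacing = 14.4;      // assumed: 120% of the 12pt font size

        for (int i = 0; i < 3; i++)
        {
            double y = BaselineY(firstBaseline, spacing, i);
            Console.WriteLine($"line {i}: y = {y:F1}, distance from top = {pageHeight - y:F1}");
        }
    }
}
```

Moving a line down the page therefore means subtracting from Y, which is the opposite of most screen coordinate systems.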

Add text using one of the external fonts, set its color and other properties

// create output PDF file
using (FileStream outputStream = new FileStream("outfile.pdf", FileMode.Create, FileAccess.Write))
{
    // create new document
    using(FixedDocument document = new FixedDocument())
    {
        // add blank first page
        document.Pages.Add(new Page(Boundaries.A4));

        // create text object that uses an external font
        TextObject textObject = new TextObject("Arial", 14);

        // first line: identity text matrix, which doesn't change the default appearance
        textObject.SetTextMatrix(1, 0, 0, 1, 0, 0);
        textObject.AppendText("Hello world using Apitron PDF Kit!");

        // second line: switch to the italic face
        textObject.SetFont("ArialItalic", 14);
        // apply 2.5x vertical scaling and move 40pt down
        textObject.SetTextMatrix(1, 0, 0, 2.5, 0, -40);

        // set rendering mode to stroke only (outlined glyphs, no fill)
        textObject.SetTextRenderingMode(RenderingMode.StrokeText);
        textObject.AppendText("Hello world using Apitron PDF Kit!");

        // set current stroking and non-stroking color
        document.Pages[0].Content.SetDeviceStrokingColor(new double[]{1,0,0});
        document.Pages[0].Content.SetDeviceNonStrokingColor(new double[]{1,0,0});

        // set current transformation                    
        document.Pages[0].Content.Translate(10, 820);
        // add text object to page content, it will automatically create text showing operators                                
        document.Pages[0].Content.AppendText(textObject);
                   
        // save to output stream
        document.Save(outputStream);
    }
}
 

This code produces the following results:

[Image: Add text to PDF file using external font and set text properties]

Both lines of text were added using a single text object. As noted earlier, a text object is a sequence of text commands, so one text object can contain several runs of text with different appearance.
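
The effect of the two text matrices can be checked with a little matrix arithmetic. The sketch below assumes the standard PDF convention for a matrix written as six numbers (a, b, c, d, e, f), where a point maps to (a·x + c·y + e, b·x + d·y + f), and treats the page translation of (10, 820) as the only other transformation in effect. The Affine type is hypothetical and exists only for this illustration. Under these assumptions the origin of the second line lands at (10, 780) and one unit of text-space height becomes 2.5 units on the page.

```csharp
using System;

// Hypothetical value type for illustration, not part of Apitron PDF Kit.
public readonly struct Affine
{
    public readonly double A, B, C, D, E, F;

    public Affine(double a, double b, double c, double d, double e, double f)
    {
        A = a; B = b; C = c; D = d; E = e; F = f;
    }

    // Returns the matrix that applies this transformation first, then 'next'.
    public Affine Then(Affine next) => new Affine(
        A * next.A + B * next.C, A * next.B + B * next.D,
        C * next.A + D * next.C, C * next.B + D * next.D,
        E * next.A + F * next.C + next.E, E * next.B + F * next.D + next.F);

    // Maps a point using the PDF convention for [a b c d e f].
    public (double X, double Y) Apply(double x, double y) =>
        (A * x + C * y + E, B * x + D * y + F);
}

public static class TextMatrixSketch
{
    public static void Main()
    {
        var page = new Affine(1, 0, 0, 1, 10, 820);                       // Translate(10, 820)
        var firstLine = new Affine(1, 0, 0, 1, 0, 0).Then(page);          // identity text matrix
        var secondLine = new Affine(1, 0, 0, 2.5, 0, -40).Then(page);     // scaled and shifted

        Console.WriteLine("first line origin:  " + firstLine.Apply(0, 0));
        Console.WriteLine("second line origin: " + secondLine.Apply(0, 0));
        Console.WriteLine("second line, one unit up: " + secondLine.Apply(0, 1));
    }
}
```

Note that setting a text matrix replaces the previous one rather than adding to it, so the -40 offset of the second line is measured from the text object's origin, not from the end of the first line.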

Note also that we changed the text color using this code:

// set current stroking and non-stroking color
document.Pages[0].Content.SetDeviceStrokingColor(new double[]{1,0,0});
document.Pages[0].Content.SetDeviceNonStrokingColor(new double[]{1,0,0});
 

These calls set both the stroking and non-stroking colors to red by specifying an RGB value in a so-called device color space. The color space was detected automatically from the number of components passed, three in this case, and set to DeviceRGB (see section 8.6.4.3 “DeviceRGB Colour Space” of the PDF specification). All subsequent drawing commands that fill or stroke use this color. In our example the text was added after these calls, so it became red.
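
Detection by component count can be pictured as follows. The PDF specification defines three device color spaces with one (DeviceGray), three (DeviceRGB) and four (DeviceCMYK) components. The sketch below maps an array length to one of these names; it is an assumption about how such detection might work, not the library's actual code, and the names DeviceColorSketch and DeviceColorSpaceFor are hypothetical.

```csharp
using System;

// Hypothetical helper, not part of Apitron PDF Kit.
public static class DeviceColorSketch
{
    // Picks a device color space name from the number of color components.
    // Each component is expected to be in the range 0.0 to 1.0.
    public static string DeviceColorSpaceFor(double[] components)
    {
        if (components == null) throw new ArgumentNullException(nameof(components));

        foreach (double c in components)
        {
            if (c < 0.0 || c > 1.0)
                throw new ArgumentOutOfRangeException(nameof(components), "Components must be between 0 and 1.");
        }

        switch (components.Length)
        {
            case 1: return "DeviceGray";
            case 3: return "DeviceRGB";
            case 4: return "DeviceCMYK";
            default:
                throw new ArgumentException("Expected 1, 3 or 4 components.", nameof(components));
        }
    }

    public static void Main()
    {
        Console.WriteLine(DeviceColorSpaceFor(new double[] { 1, 0, 0 }));    // red, as in the sample
        Console.WriteLine(DeviceColorSpaceFor(new double[] { 0.5 }));        // mid gray
        Console.WriteLine(DeviceColorSpaceFor(new double[] { 0, 1, 1, 0 })); // red expressed in CMYK
    }
}
```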