How to write UTF-8 characters to a PDF file using iTextSharp?

teenup · May 24, 2011 · Viewed 44.5k times

I have searched Google extensively but could not find a solution.

Any help is appreciated.

Please find the code below:

protected void Page_Load(object sender, EventArgs e)
{
    StreamReader read = new StreamReader(@"D:\queryUnicode.txt", Encoding.Unicode);
    string str = read.ReadToEnd();

    Paragraph para = new Paragraph(str);

    FileStream file = new FileStream(@"D:\Query.pdf", FileMode.Create);

    Document pdfDoc = new Document();
    PdfWriter writer = PdfWriter.GetInstance(pdfDoc, file);

    pdfDoc.Open();
    pdfDoc.Add(para);
    pdfDoc.Close();

    Response.Write("Pdf file generated");
}

Answer

Chris Haas · May 24, 2011

Are you converting HTML to PDF? If so, please say so; if not, ignore this. I ask only because your last comment about getting æ suggests it. If you are, check out this post: iTextSharp 5 polish character

Also, when people say "Unicode", they sometimes really mean getting symbols like Wingdings into a PDF. If that's what you mean, check out this post, and note that Unicode and Wingdings symbols aren't related at all: Unicode symbols in iTextSharp

Here's a complete working example that writes Unicode characters in two ways: one using the character itself and one using the C# escape sequence. Make sure to save your source file in an encoding that can represent the characters you type, such as UTF-8. This sample uses iTextSharp 5.0.5.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using iTextSharp.text;
using iTextSharp.text.pdf;
using System.IO;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create our document object
            Document Doc = new Document(PageSize.LETTER);

            //Create our file stream
            using (FileStream fs = new FileStream(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "Test.pdf"), FileMode.Create, FileAccess.Write, FileShare.Read))
            {
                //Bind PDF writer to document and stream
                PdfWriter writer = PdfWriter.GetInstance(Doc, fs);

                //Open document for writing
                Doc.Open();

                //Add a page
                Doc.NewPage();

                //Full path to the Unicode Arial file
                string ARIALUNI_TTF = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "ARIALUNI.TTF");

                //Create a base font object making sure to specify IDENTITY-H
                BaseFont bf = BaseFont.CreateFont(ARIALUNI_TTF, BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);

                //Create a specific font object
                Font f = new Font(bf, 12, Font.NORMAL);

                //Write some text, the last character is 0x0278 - LATIN SMALL LETTER PHI
                Doc.Add(new Phrase("This is a test ɸ", f));

                //Write some more text, the last character is 0x0682 - ARABIC LETTER HAH WITH TWO DOTS VERTICAL ABOVE
                Doc.Add(new Phrase("Hello\u0682", f));

                //Close the PDF
                Doc.Close();
            }
        }
    }
}
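
As a quick check that the two notations in the sample produce the same string, the sketch below prints the UTF-16 code units of a string. The names EscapeDemo and CodeUnits are made up for illustration, and only standard .NET calls are used. The literal and the escape compare equal only if the compiler reads the source file in the encoding it was saved in.

```csharp
using System;

static class EscapeDemo
{
    // Formats each UTF-16 code unit of s as U+XXXX, separated by spaces
    public static string CodeUnits(string s)
    {
        var parts = new string[s.Length];
        for (int i = 0; i < s.Length; i++)
            parts[i] = "U+" + ((int)s[i]).ToString("X4");
        return string.Join(" ", parts);
    }

    static void Main()
    {
        string literal = "ɸ";      // depends on the source file's encoding
        string escaped = "\u0278"; // independent of the source file's encoding

        // Should print True when the file was saved and compiled as UTF-8
        Console.WriteLine(literal == escaped);

        // Prints U+0048 U+0065 U+006C U+006C U+006F U+0682
        Console.WriteLine(CodeUnits("Hello\u0682"));
    }
}
```

If the first line prints False, the source file's encoding is the likely culprit rather than iTextSharp.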

When working with iTextSharp, you have to make sure that the font you use supports the Unicode code points you want to write. You also need to specify IDENTITY-H when creating the font. I don't completely know what it means, but there's some discussion of it here: iTextSharp international text
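
One way to catch a font that lacks a glyph before the PDF is written is to test each code point up front. This is a rough sketch, assuming a predicate that answers "does the font have this character?". The names GlyphCheck and FindMissing are hypothetical, and the demo predicate is a stand-in that accepts only ASCII so the block runs without iTextSharp.

```csharp
using System;
using System.Collections.Generic;

static class GlyphCheck
{
    // Returns the distinct code points in text for which hasGlyph returns false
    public static List<int> FindMissing(string text, Func<int, bool> hasGlyph)
    {
        var missing = new List<int>();
        for (int i = 0; i < text.Length; i++)
        {
            int cp;
            if (char.IsSurrogatePair(text, i))
            {
                // Characters outside the BMP occupy two UTF-16 code units
                cp = char.ConvertToUtf32(text, i);
                i++;
            }
            else
            {
                cp = text[i];
            }

            if (!hasGlyph(cp) && !missing.Contains(cp))
                missing.Add(cp);
        }
        return missing;
    }

    static void Main()
    {
        // Stand-in predicate (assumption): pretend the font only covers ASCII
        Func<int, bool> asciiOnly = cp => cp < 128;

        // Prints: Missing glyph for U+0682
        foreach (int cp in FindMissing("Hello\u0682", asciiOnly))
            Console.WriteLine("Missing glyph for U+" + cp.ToString("X4"));
    }
}
```

With iTextSharp, BaseFont.CharExists(int) should be usable as the predicate, for example GlyphCheck.FindMissing(text, bf.CharExists). Treat that as an assumption and check it against the version you have installed.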