How to set Font in itextSharp for HTML to PDF

Aneel Mehta picture Aneel Mehta · Mar 5, 2012 · Viewed 11.4k times · Source

I have to create run time pdf from html in Web application developed in VB.Net and MSSQL 2005, using itextSharp.

The HTML is saved in database. which contains Gujarati, Hindi and English contents.

Can anyone tell me how to set font for the html and which fonts should I use that display Enlgish, Gujarati and Hindi I tried Arial Unicode MS but its does not display Hindi accurate.

Thank you in advance

here is the code of method that i am using to convert html string into pdf file which user can save on local machine.

Private Sub ExporttoPDF(ByVal FullHtml As String, ByVal fileName As String)
    Try
            Response.Clear()    ' Clear Response and set content type and disposition so that user get save file dialogue. 
            Response.ContentType = "application/pdf"
            Response.AddHeader("content-disposition", String.Format("attachment;filename={0}.pdf", fileName))
            Response.Cache.SetCacheability(HttpCacheability.NoCache)


            Dim sr As StringReader = New StringReader(FullHtml)
            Dim pdfDoc As iTextSharp.text.Document = New iTextSharp.text.Document(PageSize.A4.Rotate, 10, 10, 10, 10)
            Dim htmlparser As HTMLWorker = New HTMLWorker(pdfDoc)
            PdfWriter.GetInstance(pdfDoc, Response.OutputStream)
            pdfDoc.Open()
            Dim fontpath As String = System.Web.HttpContext.Current.Request.PhysicalApplicationPath + "\fonts\ARIALUNI.TTF" 
            '  "ARIALUNI.TTF" file copied from fonts folder and placed in the folder
            Dim bf As BaseFont = BaseFont.CreateFont(fontpath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED)

            FontFactory.RegisterDirectory( System.Web.HttpContext.Current.Request.PhysicalApplicationPath , True)

            FontFactory.Register(fontpath, "Arial Unicode MS")
            FontFactory.RegisterFamily("Arial Unicode MS", "Arial Unicode MS", fontpath)    

            'parse html from String reader "sr"
            htmlparser.Parse(sr)
            pdfDoc.Close()
            Response.Write(pdfDoc)
            Response.End()

    Catch ex As Exception
        Throw ex
    End Try
End Sub

Here is how I am using the code

dim htmlstring as string = "<html><body encoding=""" + BaseFont.IDENTITY_H + """ style=""font-family:Arial Unicode MS;font-size:12;""> <h2> set Font in itextSharp for HTML to PDF  </h2> <span> I (aneel/અનિલ/अनिल) am facing problem to create a pdf from html that contains enlish, ગુજરાતી, हिंदी and other unicode characters.  </span> </body></html>"         
ExporttoPDF(htmlstring ,"sample.pdf")   

In result for Gujarati it is displaying અનલિ where as expected is અનિલ

Where as for Hindi it is displaying अनलि where as it should be अनिल .

Answer

Stefan picture Stefan · Mar 5, 2012

try

pdfDoc.Add(New Header(iTextSharp.text.html.Markup.HTML_ATTR_STYLESHEET, “yourcssfile.css”)) // or path to your css file

then

Dim styles As iTextSharp.text.html.simpleparser.StyleSheet
styles = New iTextSharp.text.html.simpleparser.StyleSheet
styles.LoadTagStyle("ol", "leading", "16")

you can add everything to styles. then replace html.parse with this

HTMLWorker.ParseToList(New StreamReader("htmlpath.html", Encoding.Default), styles);