I am trying to display a PDF that the backend returns as a binary string. This is the AJAX call I am making:
$.ajax({
    type : 'GET',
    url : '<url>',
    data : oParameters,
    contentType : 'application/pdf;charset=UTF-8',
    success : function(odata) {
        window.open("data:application/pdf;charset=utf-8," + escape(odata));
    }
});
When I try to open the PDF in a new window, the URL looks like this:
data:application/pdf;charset=utf-8,%25PDF-1.3%0D%0A%25%uFFFD%uFFFD%uFFFD%uFFFD%0D%0A2%200%20obj%0D%0A/WinAnsiEncoding%0D........
As you can see, it uses "WinAnsiEncoding" to display the PDF. Because of this, some of the characters are not being displayed properly. How do I change this to UTF-8?
EDIT : The backend is in ABAP. I am converting a smartform to OTF and then to a string using the function module "CONVERT_OTF".
CALL FUNCTION fname
  EXPORTING
    user_settings      = space
    control_parameters = ls_ctropt
    output_options     = ls_output
    gv_lang            = lv_lang
  IMPORTING
    job_output_info    = ls_body_text
  EXCEPTIONS
    formatting_error   = 1
    internal_error     = 2
    send_error         = 3
    user_canceled      = 4
    OTHERS             = 5.
CALL FUNCTION 'CONVERT_OTF'
  EXPORTING
    format                = 'PDF'
  IMPORTING
    bin_filesize          = ls_pdf_len
    bin_file              = ls_pdf_xstring
  TABLES
    otf                   = ls_body_text-otfdata
    lines                 = lt_lines
  EXCEPTIONS
    err_max_linewidth     = 1
    err_format            = 2
    err_conv_not_possible = 3
    err_bad_otf           = 4
    OTHERS                = 5.
CALL METHOD server->response->set_header_field( name  = 'Content-Type'
                                                value = 'application/pdf;charset=UTF-8' ).
CALL METHOD server->response->append_data( data   = lv_pdf_string
                                           length = lv_len ).
Concerning your remark that it uses "WinAnsiEncoding" to display the PDF:
After the comma in
data:application/pdf;charset=utf-8,%25PDF-1.3%0D%0A%25%uFFFD%uFFFD%uFFFD%uFFFD%0D%0A2%200%20obj%0D%0A/WinAnsiEncoding%0D........
everything is pure data. Thus, "WinAnsiEncoding" is merely part of the content of the PDF, and if it is the cause of your trouble, the PDF generator must be made to produce the PDF differently.
In the case at hand, your data is:
%PDF-1.3
%...
2 0 obj
/WinAnsiEncoding
........
which is completely normal PDF structure. It merely means that PDF object 2 is defined as /WinAnsiEncoding, which may or may not be used for some font definition; even if it is used, it may still be adapted by a /Differences array to include the characters you require. Furthermore, it does not make sense to change this to UTF-8 (as you request), because UTF-8 is not a standard encoding for PDF page content. If you somehow put UTF-8 there, you will break the PDF even more.
I'm afraid, though, that there are other problems, too.
You add a charset parameter to the type application/pdf. This does not make sense: PDF is a binary format, i.e. a sequence of bytes is expected, and therefore no charset is involved.
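To illustrate why treating PDF bytes as UTF-8 text is lossy, here is a minimal sketch. The sample bytes are an assumption (a "%" followed by four bytes at or above 0x80, similar to the binary marker many PDF generators write on the second line), and the helper names are hypothetical. It is probably a decoding step like this, somewhere between server and script, that produced the U+FFFD characters visible in your URL.

```javascript
// Hypothetical sample: "%" followed by four bytes >= 0x80 (not valid UTF-8).
const sampleBytes = new Uint8Array([0x25, 0xe2, 0xe3, 0xcf, 0xd3]);

// Decode the bytes as if they were UTF-8 text, the way a text-mode
// response handler would when told charset=UTF-8.
function decodeAsUtf8Text(bytes) {
  return new TextDecoder("utf-8").decode(bytes);
}

// Count U+FFFD replacement characters, i.e. bytes that could not be decoded.
function countReplacementChars(text) {
  return (text.match(/\uFFFD/g) || []).length;
}

const decoded = decodeAsUtf8Text(sampleBytes);
console.log("replacement characters:", countReplacementChars(decoded));
```

Once a byte has been replaced by U+FFFD, the original value cannot be recovered, so the PDF has to be transported as bytes from end to end.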
Your call escape(odata) creates %uFFFD%uFFFD%uFFFD%uFFFD. This is invalid according to the RFCs, which only define percent-encoding of octets:
A percent-encoding mechanism is used to represent a data octet in a component when that octet's corresponding character is outside the allowed set or is being used as a delimiter of, or within, the component. A percent-encoded octet is encoded as a character triplet, consisting of the percent character "%" followed by the two hexadecimal digits representing that octet's numeric value.
(RFC 3986, section 2.1)
Because the percent ("%") character serves as the indicator for percent-encoded octets, it must be percent-encoded as "%25" for that octet to be used as data within a URI.
(ibidem, section 2.4)
Thus, %uFFFD%uFFFD%uFFFD%uFFFD is invalid.
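For comparison, a percent-encoding that conforms to the quoted definition works on octets, not on characters. A minimal sketch, assuming the PDF is already available as a Uint8Array (the function name is hypothetical):

```javascript
// Percent-encode every octet as "%" plus two hexadecimal digits,
// as described in RFC 3986, section 2.1.
function percentEncodeBytes(bytes) {
  let out = "";
  for (const b of bytes) {
    out += "%" + b.toString(16).toUpperCase().padStart(2, "0");
  }
  return out;
}

// Example: the octets 0x25 ("%"), 0x50 ("P"), 0xE2, 0x0D
console.log(percentEncodeBytes(new Uint8Array([0x25, 0x50, 0xe2, 0x0d])));
```

This sketch encodes every octet for simplicity, including ones that would be allowed unencoded, so the result is valid but roughly three times the size of the input.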
PDF, being a binary format, is better suited to Base64 encoding, i.e.
data:application/pdf;base64,BASE_64_ENCODED_PDF
Thus, I propose you change your client-side process accordingly.
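A minimal sketch of building such a URI on the client, assuming the PDF bytes arrive as a Uint8Array rather than as decoded text (the function name is hypothetical, and this is one possible approach, not the only one):

```javascript
// Build a Base64 data URI from raw PDF bytes.
function pdfBytesToDataUri(bytes) {
  // btoa expects a "binary string": one character per byte, code 0..255.
  let binary = "";
  for (const b of bytes) {
    binary += String.fromCharCode(b);
  }
  return "data:application/pdf;base64," + btoa(binary);
}

// Example with the first eight bytes of a PDF header, "%PDF-1.3"
const header = new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d, 0x31, 0x2e, 0x33]);
console.log(pdfBytesToDataUri(header));
```

In a browser, the bytes could come from an XMLHttpRequest whose responseType is set to "arraybuffer", wrapped as new Uint8Array(xhr.response). Alternatively, the backend could deliver the Base64 string itself, in which case the client only has to prepend the data:application/pdf;base64, prefix.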