Top "Unicode" questions

Unicode is a standard for encoding, representing, and handling text. It aims to cover every character needed for written text across all writing systems, including technical symbols and punctuation.
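To make the distinction between a character and its encoded bytes concrete, here is a minimal sketch assuming Python 3. The variable names are illustrative, and the character chosen (U+2265, GREATER-THAN OR EQUAL TO) is just an example. It shows one code point and the byte sequences it produces under UTF-8 and under both UTF-16 byte orders.

```python
# Illustrative sketch (assumes Python 3): one character, its code point,
# and the bytes it becomes under different Unicode encodings.
ch = "\u2265"  # GREATER-THAN OR EQUAL TO

code_point = ord(ch)               # the abstract Unicode code point
utf8_bytes = ch.encode("utf-8")    # variable-width encoding, 3 bytes here
utf16_be = ch.encode("utf-16-be")  # big endian: high byte first
utf16_le = ch.encode("utf-16-le")  # little endian: low byte first

print(hex(code_point))   # 0x2265
print(utf8_bytes.hex())  # e289a5
print(utf16_be.hex())    # 2265
print(utf16_le.hex())    # 6522

# Decoding with the matching encoding recovers the original character.
assert utf8_bytes.decode("utf-8") == ch
```

The same code point yields different byte sequences depending on the encoding, which is why a decoder needs to know which encoding (and, for UTF-16, which byte order) was used.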

Python: Using .format() on a Unicode-escaped string

I am using Python 2.6.5. My code requires the use of the "more than or equal to" sign. Here it goes: &…

python string unicode python-2.x

setting a UTF-8 in java and csv file

I am using this code to add Persian words to a CSV file via OpenCSV: String[] entries="\u0645 \u062E\…

java unicode csv utf-8 opencsv

Is there a list of characters that look similar to English letters?

I’m having a crack at profanity filtering for a web forum written in Python. As part of that, I’…

python unicode glyph profanity

How to internationalize a Java web application?

I learnt from Google that internationalization is the process by which I can make my web application use all …

java jsp unicode internationalization

Really Good, Bad UTF-8 example test data

So we have the XSS cheat sheet to test our XSS filtering - but other than an example benign page …

unicode utf-8

How to match Cyrillic characters with a regular expression

How do I match French and Russian Cyrillic alphabet characters with a regular expression? I only want to do the …

regex unicode character-properties

Golang converting from rune to string

I have the following code, it is supposed to cast a rune into a string and print it. However, I …

string parsing go unicode rune

How to correctly parse UTF-8 encoded HTML to Unicode strings with BeautifulSoup?

I'm running a Python program which fetches a UTF-8-encoded web page, and I extract some text from the HTML …

python unicode utf-8 beautifulsoup urllib2

TypeError: coercing to Unicode: need string or buffer, int found

I have 2 APIs. I am fetching data from them. I want to assign particular code parts to string so that …

python api unicode buffer typeerror

Difference between Big Endian and little Endian Byte order

What is the difference between big-endian and little-endian byte order? Both of these seem to be related to …

unicode utf-16 endianness