JonathanNemo
Member
And to give some visibility into what the world was like before UTF-8 Unicode (thankfully now widespread), one friend explained the situation back then as follows:
The old ASCII encoding was largely made to accommodate American English (the "A" in ASCII stands for American) and not much else. Before any attempt at a universal encoding, every country had several encodings, some loosely based on ASCII, each trying to handle its own language. It's also a big part of why, from the 1950s through the mid-1990s, it wasn't uncommon to see individual nations develop their own computer architectures.
Unicode also had its own pain points (see: the two byte orders of UTF-16, plus UTF-32), but UTF-8 largely solved those in a number of ways. See also: Rob Pike's history of UTF-8.
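To make the byte-order pain concrete, here's a quick sketch using only Python's standard codecs (the sample string is mine, not from the post). UTF-16 gives you two different byte streams for the same text unless a BOM is prepended, while UTF-8 gives you exactly one:

    s = "café"
    print(s.encode("utf-16-be").hex())  # 00630061006600e9 (big-endian code units)
    print(s.encode("utf-16-le").hex())  # 630061006600e900 (same text, bytes swapped)
    print(s.encode("utf-16").hex())     # starts with a BOM (fffe or feff) to mark the byte order
    print(s.encode("utf-8").hex())      # 636166c3a9 (one unambiguous, ASCII-compatible stream)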
When there is no common standard, people find alternatives. In Taiwan, developers decades ago created environments that trapped system calls, interpreted special two-byte combinations, and drew ideographic characters stored as 24-by-24-dot bitmaps on floppies.
They got more than five different encodings running. Unicode sorted out most of the problem, but some documents from that era are still around. To view them correctly, people have to try one encoding, check whether the contents make sense, then try another, because the encoding is not stored with the data.
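That guess-and-check workflow looks roughly like this in Python; the candidate list and file name are just illustrative, not from the post. Note that a clean decode still isn't proof you guessed right, which is why a human has to eyeball the result:

    CANDIDATES = ["big5", "cp950", "utf-8"]  # hypothetical guesses for an old Taiwanese file

    def try_decode(raw: bytes):
        # Return (encoding, text) for the first candidate that decodes without error.
        for enc in CANDIDATES:
            try:
                return enc, raw.decode(enc)
            except UnicodeDecodeError:
                continue  # wrong guess, move on to the next encoding
        return None, None

    with open("legacy_document.txt", "rb") as f:  # path is made up for the example
        enc, text = try_decode(f.read())
    print(enc, (text or "no candidate decoded cleanly")[:80])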
Japan still has two encoding schemes in use to this day.
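The post doesn't name them, but Shift_JIS and EUC-JP are the usual pair, and both ship as standard Python codecs. The same three characters come out as three different byte sequences:

    text = "日本語"
    print(text.encode("shift_jis").hex())  # 93fa967b8cea
    print(text.encode("euc_jp").hex())     # c6fccbdcb8ec
    print(text.encode("utf-8").hex())      # e697a5e69cace8aa9e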