Discussion: ANSI encoded file and Japanese Character rendering on a Japanese workstation
Lakshmi
2005-02-21 22:47:53 UTC
A Japanese customer sent me a file, ids.txt, containing a list of employee
IDs. These IDs are a mix of Japanese characters and English characters. I can
open this file in Notepad on a Japanese-enabled workstation, and when I do a
File->Save, it shows the encoding as Unicode. When I save this file as
"ids_ansi.txt" with the encoding changed to ANSI and issue a "type
ids_ansi.txt" command at a cmd prompt, I can see that the Japanese & English
characters are rendered correctly.

I am under the impression that Japanese characters can be rendered correctly
only if the underlying charset is Unicode. If so, how did the cmd window
render the Japanese & English characters in the ANSI encoded file correctly?
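
In case it helps to make the question concrete, here is a minimal C# sketch
of what I suspect Notepad's "ANSI" save amounts to on a Japanese system
locale (the sample ID string below is made up):

using System;
using System.IO;
using System.Text;

class AnsiSaveDemo
{
    static void Main()
    {
        // Hypothetical mixed Japanese/English employee id.
        string id = "社員123";

        // On Japanese Windows, "ANSI" means the system code page 932
        // (Shift-JIS), not Unicode; it covers both scripts used here.
        Encoding cp932 = Encoding.GetEncoding(932);

        // Roughly what Notepad's "Save as ANSI" does...
        File.WriteAllText("ids_ansi.txt", id, cp932);

        // ...and roughly what `type` does on the same machine.
        string back = File.ReadAllText("ids_ansi.txt", cp932);

        Console.WriteLine(back == id); // True: nothing was lost in cp932
    }
}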

Here is the reason why I am asking the question.
We have a .NET Windows Forms application which can save data to any RDBMS
using the ADO.NET architecture. Some of the databases have been around for a
while and have only the cp850 character set installed on them, and customers
resist changing to Unicode. I wanted to know if there is an easy enough way
to specify that string traffic between the Windows Forms -> ADO.NET ->
database path is only in cp850, such that Japanese character rendering occurs
on the forms without any "loss of translation." In short, I would like to
have the same behaviour as the cmd prompt that I described above. I
understand that with this approach the sort order, the size of
local-language characters that can be stored, etc., will be an issue.

Any tips, hints appreciated.
Thanks
Severian
2005-02-22 05:01:39 UTC
For DBCS, Japanese uses CP 932, which will be the default on a Japanese
system, which is why you could type the file. AFAIK, there is no way to
encode Japanese characters in CP 850 (Western European).

If you move the file to a system using CP 850, you'll see very different
output.
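
A minimal C# sketch of the loss (the sample string is made up):

using System;
using System.Text;

class Cp850Demo
{
    static void Main()
    {
        string id = "社員123"; // hypothetical mixed Japanese/English id
        Encoding cp850 = Encoding.GetEncoding(850);

        // cp850 has no byte sequences for Japanese characters, so the
        // default encoder falls back to '?' and the data is destroyed.
        byte[] bytes = cp850.GetBytes(id);
        string back = cp850.GetString(bytes);

        Console.WriteLine(back); // "??123" -- the round trip is lossy
    }
}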

--
Sev
Garrett McGowan[MSFT]
2005-02-24 01:32:00 UTC
I'd avoid submitting non-Western European character data to
char/varchar/text fields if the database is set to a cp850 collation. You
might get away with it by setting Auto Translate = False in your connection
string, but you would need to run the client under the matching system
locale in order to retrieve the data intact.

If you even suspect that you'll need to store data from multiple codepages,
you should consider migrating the data to a new database that uses Unicode
datatypes (e.g., nchar/nvarchar/ntext) for the affected fields.
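
As a rough sketch of the Unicode-datatype route (the table name, column, and
connection string below are hypothetical):

using System;
using System.Data;
using System.Data.SqlClient;

class UnicodeInsertDemo
{
    static void Main()
    {
        // Assumes: CREATE TABLE EmployeeIds (Id nvarchar(50))
        using (SqlConnection conn = new SqlConnection(
            "Server=.;Database=Hr;Integrated Security=true"))
        using (SqlCommand cmd = new SqlCommand(
            "INSERT INTO EmployeeIds (Id) VALUES (@id)", conn))
        {
            // NVarChar keeps the value in UTF-16 end to end, so no
            // ANSI code page (cp850 or otherwise) ever touches it.
            cmd.Parameters.Add("@id", SqlDbType.NVarChar, 50).Value = "社員123";
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}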

Garrett McGowan [MSFT Developer International]

This posting is provided "AS IS" with no warranties, and confers no rights.
Use of included script samples is subject to the terms specified at
http://www.microsoft.com/info/cpyright.htm
Steven Cheng
2005-02-24 10:58:56 UTC
Hi Lakshmi,

In addition to using Unicode, if the OS's system locale (used for non-Unicode
programs) matches the charset (MBCS) of your text file, which contains those
Japanese or other Far East characters, the console or a Windows app can also
handle it correctly.
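
A minimal sketch of that point (on .NET Framework, Encoding.Default is the
ANSI code page of the system locale; the file name matches Lakshmi's
example):

using System;
using System.IO;
using System.Text;

class SystemLocaleDemo
{
    static void Main()
    {
        // 932 on a Japanese system locale; 1252 on a Western European one.
        Console.WriteLine(Encoding.Default.CodePage);

        // Works only when the system locale matches the file's charset...
        string viaDefault = File.ReadAllText("ids_ansi.txt", Encoding.Default);

        // ...whereas naming the code page explicitly works on any locale.
        string viaCp932 = File.ReadAllText("ids_ansi.txt", Encoding.GetEncoding(932));

        Console.WriteLine(viaDefault == viaCp932); // True only when they match
    }
}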

Thanks,

Steven