csharp: Converting Chinese characters to Unicode

Summary: a VBScript function, chinese2unicode, followed by a C# port. Both replace each character whose code is outside 0-255 with an HTML hexadecimal character reference such as &#x4EB2; and leave all other characters unchanged.

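' Replaces every character whose code is outside 0-255 with an HTML
' hexadecimal character reference and returns the result.
Function chinese2unicode(Str)
    Dim i
    Dim Str_one: Str_one = ""
    Dim Str_unicode: Str_unicode = ""
    For i = 1 To Len(Str)
        Str_one = Mid(Str, i, 1)
        ' AscW returns a signed 16-bit value, so characters at or above
        ' U+8000 come back negative; the "< 0" test catches them.
        If AscW(Str_one) < 0 Or AscW(Str_one) > 255 Then
            Str_unicode = Str_unicode & Chr(38)   ' "&"
            Str_unicode = Str_unicode & Chr(35)   ' "#"
            Str_unicode = Str_unicode & Chr(120)  ' "x"
            Str_unicode = Str_unicode & Hex(AscW(Str_one))
            Str_unicode = Str_unicode & Chr(59)   ' ";"
        Else
            Str_unicode = Str_unicode & Str_one
        End If
    Next
    chinese2unicode = Str_unicode
End Function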

        /// <summary>
        /// Replaces every character whose UTF-16 code unit is above 255 with an HTML
        /// hexadecimal character reference and leaves all other characters unchanged.
        /// Example result, URL-encoded:
        /// %26%23x4EB2%3B%26%23x7231%3B%26%23x7684%3B%26%23x4F1A%3B%26%23x5458%3BTeresaLiu%2C%26%23x516D%3B%26%23x798F%3B%26%23x73E0%3B%26%23x5BF6%3B%26%23x6703%3B%26%23x54E1%3B%26%23x5BC6%3B%26%23x78BC%3B%26%23x4FEE%3B%26%23x6539%3B%26%23x9805%3B%26%23x901A%3B%26%23x77E5%3B%26%23xFF1A%3B%26%23x95A3%3B%26%23x4E0B%3B%26%23x5DF2%3B%26%23x6210%3B%26%23x529F%3B%26%23x66F4%3B%26%23x6539%3B%26%23x5BC6%3B%26%23x78BC%3B%26%23xFF0C%3B%26%23x5982%3B%26%23x6709%3B%26%23x67E5%3B%26%23x8A62%3B%26%23xFF0C%3B%26%23x8ACB%3B%26%23x81F4%3B%26%23x96FB%3B%26%23x9999%3B%26%23x6E2F%3B27109368%26%23xFF0F%3B%26%23x4E2D%3B%26%23x570B%3B4008846222
        /// 塗聚文 20140724
        /// </summary>
        /// <param name="str">The text to convert.</param>
        /// <returns>The converted text, or an empty string if the input is null or empty.</returns>
        private string chinese2uncode(string str)
        {
            string outStr = "";
            if (!string.IsNullOrEmpty(str))
            {
                for (int i = 0; i < str.Length; i++)
                {
                    // AscW and Hex come from Microsoft.VisualBasic, so the project must reference that assembly.
                    int code = Microsoft.VisualBasic.Strings.AscW(str[i].ToString());
                    // Convert the character if it is Chinese. An alternative test is
                    // Regex.IsMatch(str[i].ToString(), @"[\u4e00-\u9fa5]").
                    // The "< 0" test is kept from the VBScript version; in .NET, AscW returns 0..65535.
                    if (code < 0 || code > 255)
                    {
                        outStr = outStr + (char)38;  // "&"
                        outStr = outStr + (char)35;  // "#"
                        outStr = outStr + (char)120; // "x"
                        outStr = outStr + Microsoft.VisualBasic.Conversion.Hex(code);
                        outStr = outStr + (char)59;  // ";"
                    }
                    else
                    {
                        outStr += str[i];
                    }
                }
            }
            return outStr;
        }
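
The C# port above depends on Microsoft.VisualBasic and builds the result by repeated string concatenation. The following is a minimal sketch of the same conversion without that dependency, not the original author's code. The names `HtmlHexEncoder` and `Encode` are hypothetical. It assumes that a valid surrogate pair should be emitted as one reference for the whole code point (for example U+1F600) instead of two surrogate references, which is a deliberate difference from the function above. An unpaired surrogate is still written out as its own code unit, as before.

```csharp
using System;
using System.Net;
using System.Text;

public static class HtmlHexEncoder
{
    // Hypothetical helper: replaces every code point above U+00FF with an
    // HTML hexadecimal character reference and copies everything else as-is.
    public static string Encode(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return string.Empty;
        }

        var sb = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            int codePoint;
            if (char.IsHighSurrogate(input[i])
                && i + 1 < input.Length
                && char.IsLowSurrogate(input[i + 1]))
            {
                // Assumption: combine a valid surrogate pair into one code point.
                codePoint = char.ConvertToUtf32(input[i], input[i + 1]);
                i++; // skip the low surrogate that was just consumed
            }
            else
            {
                codePoint = input[i];
            }

            if (codePoint > 255)
            {
                sb.Append("&#x").Append(codePoint.ToString("X")).Append(';');
            }
            else
            {
                sb.Append((char)codePoint);
            }
        }
        return sb.ToString();
    }
}

public static class Program
{
    public static void Main()
    {
        string encoded = HtmlHexEncoder.Encode("亲爱的会员TeresaLiu");

        // &#x4EB2;&#x7231;&#x7684;&#x4F1A;&#x5458;TeresaLiu
        Console.WriteLine(encoded);

        // URL-encoding the result gives the form shown in the doc comment above:
        // %26%23x4EB2%3B%26%23x7231%3B%26%23x7684%3B%26%23x4F1A%3B%26%23x5458%3BTeresaLiu
        Console.WriteLine(Uri.EscapeDataString(encoded));

        // Round trip: HtmlDecode turns the references back into the original characters.
        Console.WriteLine(WebUtility.HtmlDecode(encoded) == "亲爱的会员TeresaLiu");
    }
}
```

`StringBuilder` avoids allocating a new string for every appended character, which matters for long messages. If the output must match the original function exactly for characters outside the Basic Multilingual Plane, drop the surrogate-pair branch and treat every `char` as its own code unit.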
