Byte sizes of Chinese and English characters under ASCII, UTF-8, and Unicode (UTF-16) encoding

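- ASCII cannot represent Chinese characters.

- UTF-8 is a variable-length encoding. An ASCII character takes 1 byte in UTF-8, with the same byte value and length as in ASCII. In Unicode (here meaning .NET's Encoding.Unicode, which is little-endian UTF-16), an ASCII character takes 2 bytes, and the second byte is zero.

- UTF-8 needs 3 bytes for each Chinese character tested here, while Unicode (UTF-16) needs only 2. This holds for characters in the Basic Multilingual Plane, which covers the commonly used Chinese characters.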

Code example:

using System;
using System.Text;

class Program {
    static void Main() {
        ShowCode();
    }

    // Encodes each sample string as ASCII, UTF-8, and Unicode (UTF-16 LE),
    // prints the resulting bytes, then decodes them back into a string.
    private static void ShowCode() {
        string[] strArray = { "b", "abcd", "乙", "甲乙丙丁" };
        byte[] buffer;
        string mode, back;

        foreach (string str in strArray) {
            for (int i = 0; i <= 2; i++) {
                if (i == 0) {
                    // Characters outside ASCII are replaced with '?' (byte 63).
                    buffer = Encoding.ASCII.GetBytes(str);
                    back = Encoding.ASCII.GetString(buffer, 0, buffer.Length);
                    mode = "ASCII";
                } else if (i == 1) {
                    buffer = Encoding.UTF8.GetBytes(str);
                    back = Encoding.UTF8.GetString(buffer, 0, buffer.Length);
                    mode = "UTF8";
                } else {
                    // Encoding.Unicode is little-endian UTF-16.
                    buffer = Encoding.Unicode.GetBytes(str);
                    back = Encoding.Unicode.GetString(buffer, 0, buffer.Length);
                    mode = "Unicode";
                }

                Console.WriteLine("Mode: {0}, String: {1}, Buffer.Length: {2}",
                    mode, str, buffer.Length);

                // Print each byte as a decimal value, separated by spaces.
                Console.Write("Buffer: ");
                for (int j = 0; j <= buffer.Length - 1; j++) {
                    Console.Write(buffer[j] + " ");
                }

                Console.WriteLine("\nRetrieved: {0}\n", back);
            }
        }
    }
}

Output:

Mode: ASCII, String: b, Buffer.Length: 1
Buffer: 98
Retrieved: b

Mode: UTF8, String: b, Buffer.Length: 1
Buffer: 98
Retrieved: b

Mode: Unicode, String: b, Buffer.Length: 2
Buffer: 98 0
Retrieved: b

Mode: ASCII, String: abcd, Buffer.Length: 4
Buffer: 97 98 99 100
Retrieved: abcd

Mode: UTF8, String: abcd, Buffer.Length: 4
Buffer: 97 98 99 100
Retrieved: abcd

Mode: Unicode, String: abcd, Buffer.Length: 8
Buffer: 97 0 98 0 99 0 100 0
Retrieved: abcd

Mode: ASCII, String: 乙, Buffer.Length: 1
Buffer: 63
Retrieved: ?

Mode: UTF8, String: 乙, Buffer.Length: 3
Buffer: 228 185 153
Retrieved: 乙

Mode: Unicode, String: 乙, Buffer.Length: 2
Buffer: 89 78
Retrieved: 乙

Mode: ASCII, String: 甲乙丙丁, Buffer.Length: 4
Buffer: 63 63 63 63
Retrieved: ????

Mode: UTF8, String: 甲乙丙丁, Buffer.Length: 12
Buffer: 231 148 178 228 185 153 228 184 153 228 184 129
Retrieved: 甲乙丙丁

Mode: Unicode, String: 甲乙丙丁, Buffer.Length: 8
Buffer: 50 117 89 78 25 78 1 78
Retrieved: 甲乙丙丁
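
The bytes printed for 乙 above (228, 185, 153 in UTF-8 and 89, 78 in Unicode) can be reproduced by hand from its code point, U+4E59. The following is a minimal sketch rather than a full encoder: it assumes a code point in the range U+0800 to U+FFFF and does not handle 1-, 2-, or 4-byte UTF-8 sequences. The names ManualEncodingSketch, Utf8ThreeBytes, and Utf16LeTwoBytes are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Text;

static class ManualEncodingSketch
{
    // Packs a code point in U+0800..U+FFFF into the 3-byte UTF-8 pattern
    // 1110xxxx 10xxxxxx 10xxxxxx. Surrogate code points are rejected.
    public static byte[] Utf8ThreeBytes(int codePoint)
    {
        if (codePoint < 0x0800 || codePoint > 0xFFFF ||
            (codePoint >= 0xD800 && codePoint <= 0xDFFF))
            throw new ArgumentOutOfRangeException(nameof(codePoint));

        return new byte[]
        {
            (byte)(0xE0 | (codePoint >> 12)),         // top 4 bits
            (byte)(0x80 | ((codePoint >> 6) & 0x3F)), // middle 6 bits
            (byte)(0x80 | (codePoint & 0x3F))         // low 6 bits
        };
    }

    // Splits the same code point into two UTF-16 little-endian bytes:
    // low byte first, then high byte.
    public static byte[] Utf16LeTwoBytes(int codePoint)
    {
        if (codePoint < 0 || codePoint > 0xFFFF ||
            (codePoint >= 0xD800 && codePoint <= 0xDFFF))
            throw new ArgumentOutOfRangeException(nameof(codePoint));

        return new byte[] { (byte)(codePoint & 0xFF), (byte)(codePoint >> 8) };
    }

    public static void Main()
    {
        int yi = '乙'; // U+4E59

        Console.WriteLine(string.Join(" ", Utf8ThreeBytes(yi)));  // 228 185 153
        Console.WriteLine(string.Join(" ", Utf16LeTwoBytes(yi))); // 89 78

        // Cross-check against the framework encoders used in the example.
        Console.WriteLine(string.Join(" ", Encoding.UTF8.GetBytes("乙")));
        Console.WriteLine(string.Join(" ", Encoding.Unicode.GetBytes("乙")));
    }
}
```

The low-byte-first order is also why an ASCII character such as b comes out as 98 0 under Encoding.Unicode: its high byte is zero and is written second.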

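Conclusions:

1. ASCII cannot represent Chinese characters (as everyone probably knows already). Each one is replaced with '?' (byte 63).

2. UTF-8 is a variable-length encoding. An ASCII character takes 1 byte in UTF-8, with the same byte value and length as in ASCII. In Unicode (Encoding.Unicode, little-endian UTF-16), an ASCII character takes 2 bytes, and the second byte is zero.

3. UTF-8 needs 3 bytes for each Chinese character tested here, while Unicode (UTF-16) needs only 2.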
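
The third conclusion holds for the four characters tested, which all lie in the Basic Multilingual Plane. As a hedged illustration of where it stops holding, the sketch below compares byte counts with Encoding.GetByteCount and adds one character from outside the original test, U+20000 (a CJK Extension B ideograph, picked here only as an example). Characters beyond U+FFFF need 4 bytes in both UTF-8 and UTF-16. The names ByteCountSketch and Describe are hypothetical.

```csharp
using System;
using System.Text;

static class ByteCountSketch
{
    // Reports the code point of the first character in s together with
    // the number of bytes the whole string needs in UTF-8 and UTF-16.
    public static string Describe(string s)
    {
        int codePoint = char.ConvertToUtf32(s, 0);
        return string.Format("U+{0:X4}: UTF-8 = {1}, UTF-16 = {2}",
            codePoint,
            Encoding.UTF8.GetByteCount(s),
            Encoding.Unicode.GetByteCount(s));
    }

    public static void Main()
    {
        // "\U00020000" is a single character stored as a surrogate pair.
        string[] samples = { "b", "乙", "\U00020000" };
        foreach (string s in samples)
        {
            Console.WriteLine(Describe(s));
        }
        // U+0062: UTF-8 = 1, UTF-16 = 2
        // U+4E59: UTF-8 = 3, UTF-16 = 2
        // U+20000: UTF-8 = 4, UTF-16 = 4
    }
}
```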
