UTF-8: General? Bin? Unicode?

Question

I'm trying to figure out which collation I should use for various types of data. 100% of the content I will be storing is user-submitted.

My understanding is that I should be using UTF-8 General CI (case-insensitive) instead of UTF-8 Binary. However, I can't find a clear distinction between UTF-8 General CI and UTF-8 Unicode CI.

  1. Should I store user-submitted content in UTF-8 General CI or UTF-8 Unicode CI columns?
  2. What type of data is UTF-8 Binary suited to?

Recommended answer

In general, utf8_general_ci is faster than utf8_unicode_ci, but less correct.

The difference is as follows:

For any Unicode character set, operations performed using the _general_ci collation are faster than those for the _unicode_ci collation. For example, comparisons for the utf8_general_ci collation are faster, but slightly less correct, than comparisons for utf8_unicode_ci. The reason for this is that utf8_unicode_ci supports mappings such as expansions; that is, when one character compares as equal to combinations of other characters. For example, in German and some other languages "ß" is equal to "ss". utf8_unicode_ci also supports contractions and ignorable characters. utf8_general_ci is a legacy collation that does not support expansions, contractions, or ignorable characters. It can make only one-to-one comparisons between characters.
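
To make the idea of an expansion concrete, here is a small Python sketch. It is only an analogy: it does not use MySQL or its collation tables, and the helper names one_to_one_key and expansion_key are hypothetical. It relies on the fact that Python's str.casefold() expands "ß" to "ss", whereas a per-character lowercase mapping that keeps exactly one character cannot.

```python
def one_to_one_key(text: str) -> str:
    """Comparison key where every character maps to exactly one character.

    Hypothetical helper: mimics a "one-to-one only" restriction by
    lowercasing each character separately and keeping a single character.
    """
    return "".join(ch.lower()[:1] for ch in text)


def expansion_key(text: str) -> str:
    """Comparison key that allows expansions (one character -> several).

    str.casefold() applies Unicode full case folding, which maps
    "ß" to "ss".
    """
    return text.casefold()


if __name__ == "__main__":
    a, b = "Straße", "STRASSE"
    # One-to-one mapping: "ß" stays a single character, so the keys differ.
    print(one_to_one_key(a) == one_to_one_key(b))
    # Expansion-aware mapping: "ß" becomes "ss", so the keys match.
    print(expansion_key(a) == expansion_key(b))
```

The expansion-aware key has to be prepared for one input character turning into several, which gives a rough intuition for why the quoted documentation describes the one-to-one collation as faster but less correct.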

Quoted from: http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-sets.html

For a more detailed explanation, please read the following post from the MySQL forums: http://forums.mysql.com/read.php?103,187048,188748

As for utf8_bin: both utf8_general_ci and utf8_unicode_ci perform case-insensitive comparison. In contrast, utf8_bin is case-sensitive (among other differences), because it compares the binary values of the characters.
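
The contrast between binary and case-insensitive comparison can be sketched in Python as well. Again this is an illustration under assumptions, not MySQL's implementation: binary_equal and case_insensitive_equal are hypothetical helpers, with the first comparing the UTF-8 encoded bytes and the second comparing case-folded strings.

```python
def binary_equal(a: str, b: str) -> bool:
    """Compare the raw UTF-8 bytes, so any difference in case matters."""
    return a.encode("utf-8") == b.encode("utf-8")


def case_insensitive_equal(a: str, b: str) -> bool:
    """Compare after case folding, so differences in case are ignored."""
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    # Binary comparison distinguishes "Alice" from "alice".
    print(binary_equal("Alice", "alice"))
    # Case-insensitive comparison treats them as equal.
    print(case_insensitive_equal("Alice", "alice"))

    names = ["b", "A", "a", "B"]
    # Sorting by bytes puts uppercase ASCII letters before lowercase ones.
    print(sorted(names, key=lambda s: s.encode("utf-8")))
    # Sorting by the case-folded key groups letters regardless of case.
    print(sorted(names, key=str.casefold))
```

A byte-wise comparison of this kind fits values where case is significant and must match exactly, while a case-insensitive comparison fits text that people search or sort by.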
