The Unicode Character Database (UCD) consists of a number of data files listing Unicode character properties and related data. It also includes data files containing test data for conformance to several important Unicode algorithms. Full documentation for the UCD can be found in Unicode Standard Annex #44, "Unicode Character Database": http://www.unicode.org/reports/tr44/
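
As an illustration of what these data files look like to a program, the following is a minimal Python sketch that splits one line in the semicolon-separated style used by UnicodeData.txt. The helper name parse_ucd_line and the hard-coded sample line are assumptions made for this example and are not part of the UCD; field meanings and file-specific conventions should be checked against UAX #44 rather than taken from this sketch.

```python
def parse_ucd_line(line):
    """Split one UCD-style line into stripped fields.

    Hypothetical helper for illustration only. Returns None for lines
    that are blank or contain only a '#' comment.
    """
    # Many UCD files use '#' to start a comment; drop anything after it.
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    # Fields are separated by semicolons; surrounding spaces are not significant.
    return [field.strip() for field in data.split(";")]


# Sample entry in UnicodeData.txt style, for U+0041.
sample = "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"

fields = parse_ucd_line(sample)
code_point = int(fields[0], 16)   # first field: code point in hexadecimal
name = fields[1]                  # second field: character name
general_category = fields[2]      # third field: General_Category value

print(f"U+{code_point:04X} {name} ({general_category})")
```

Real files should be read with an explicit UTF-8 encoding, and some files use code point ranges (for example 0000..007F) in the first field, which this sketch does not handle.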