About WURCS

The computational analysis of complex carbohydrates, or glycans, has produced a number of linear and non-linear notations for representing these structures. Each format has advantages and disadvantages relative to the others, depending on the application. In recent years, the Semantic Web has become a focus of life science database development as a means of linking life science data effectively and efficiently. For carbohydrate data to be used with this technology, we have determined that a carbohydrate data representation must satisfy two requirements: (1) it must be a linear notation that can be used as a URI if needed, and (2) it must be a unique notation, such that any published glycan structure can be represented distinctly. The latter requirement includes the representation of non-standard monosaccharide units as part of the glycan structure, as well as compositions, repeating units, and ambiguous structures in which linkages or linkage positions are unidentified. Since none of the existing formats could completely satisfy these requirements, we have developed the Web3 Unique Representation of Carbohydrate Structures (WURCS), a new linear notation for representing carbohydrates for the Semantic Web. In this format, any published carbohydrate structure can be represented as a linear string, which can serve as a unique identifier for the Semantic Web. We hope that all carbohydrate databases will map their structures to this identifier so that they can be linked together appropriately.