This paper investigates problems of legal information digitalization. Conditions for extracting information from legal texts (i.a. normative acts) are outlined in relation to the processing of common texts (non-legal terms, in English). Problems of dimensionality reduction and the application of similarity measures are discussed. Sample results of similarity analysis are presented. Further research aimed at the semantic analysis of legal texts is outlined.