The invention relates to a distributed full-text retrieval system and method based on Spark SQL. The system comprises an SQL translation layer, a data source management layer, a parallel computation layer, and a distributed storage layer. The invention proposes an SQL-based full-text retrieval method and the process by which full-text retrieval SQL statements are translated across the modules of the SQL translation layer. A method for parallelizing the full-text retrieval process is designed in the data source management module. In the retrieval optimization module, two index storage models are designed, each with a corresponding strategy for reducing the original table data read at query time; for the storage model based on an index-designated column, a partition-aligned join algorithm with O(n) complexity is designed to reduce the original table data during a query. Under the two storage models, index construction time is shortened to 0.6% and 0.5%, respectively, of that of a traditional database, query time is shortened to 1% and 10%, respectively, of that of a traditional database, and index storage size is reduced to 55.0% of that of a traditional database. The method strengthens the data analysis capability of Spark SQL and satisfies the requirements of migrating traditional business workloads and of performing full-text retrieval over massive data in existing business.
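
The abstract does not spell out the partition-aligned join itself, so the following is only a minimal sketch of how a join between index hits and original table rows could run in linear time. It assumes, hypothetically, that index partition k covers exactly the rows of table partition k, that both sides are sorted by a unique row id, and that rows are (row_id, text) pairs. The names aligned_merge_join and partition_aligned_join are illustrative and are not taken from the invention.

```python
from typing import Iterator, List, Tuple

Row = Tuple[int, str]  # hypothetical layout: (row_id, text)


def aligned_merge_join(hit_ids: List[int], rows: List[Row]) -> Iterator[Row]:
    """Yield the rows whose id appears in hit_ids.

    Both inputs must be sorted by row id and contain unique ids, so one
    forward pass suffices: O(len(hit_ids) + len(rows)).
    """
    i = j = 0
    while i < len(hit_ids) and j < len(rows):
        row_id = rows[j][0]
        if hit_ids[i] == row_id:
            yield rows[j]
            i += 1
            j += 1
        elif hit_ids[i] < row_id:
            i += 1  # hit has no matching row in this partition
        else:
            j += 1  # row was not matched by the index; skip it


def partition_aligned_join(
    index_parts: List[List[int]], table_parts: List[List[Row]]
) -> List[Row]:
    """Join partition k of the index with partition k of the table only."""
    if len(index_parts) != len(table_parts):
        raise ValueError("index and table must have the same partition count")
    result: List[Row] = []
    for hits, rows in zip(index_parts, table_parts):
        result.extend(aligned_merge_join(hits, rows))
    return result


# Two aligned partitions; the index matched rows 0 and 2 in the first
# partition and row 4 in the second.
table_parts = [
    [(0, "spark sql"), (1, "full text"), (2, "index")],
    [(3, "join"), (4, "spark core")],
]
index_parts = [[0, 2], [4]]
matched = partition_aligned_join(index_parts, table_parts)
```

Because each partition pair is processed independently and no data moves between partitions, the total work under these assumptions is proportional to the number of index hits plus the number of table rows.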