Ngrams Project

We are developing a proprietary web crawler and a scalable content analysis system. In a public study, we analyzed the main pages of the top 10M domains, extracted 5,231,405,510 shingles (n-grams), and checked their uniqueness. We created a report for each domain containing the data obtained.
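
To illustrate what extracting shingles and checking their uniqueness can look like, here is a minimal Python sketch. It is not the project's actual implementation: the word-level tokenization, the shingle length n=3, the in-memory counting, and the function names shingles and unique_shingles are all assumptions made for this example.

```python
from collections import Counter


def shingles(text, n=3):
    """Return the word-level n-grams (shingles) of text, in order.

    Assumption: tokens are lowercase whitespace-separated words.
    """
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def unique_shingles(pages, n=3):
    """Map each domain to the shingles that occur on no other domain.

    pages is a dict of {domain: main page text} (hypothetical input shape).
    """
    per_domain = {}
    domain_counts = Counter()
    for domain, text in pages.items():
        # A set removes repeats within one page, so each domain
        # contributes at most 1 to a shingle's count.
        per_domain[domain] = set(shingles(text, n))
        domain_counts.update(per_domain[domain])
    return {
        domain: {s for s in found if domain_counts[s] == 1}
        for domain, found in per_domain.items()
    }


if __name__ == "__main__":
    pages = {
        "x.example": "the quick brown fox",
        "y.example": "the quick brown dog",
    }
    for domain, unique in unique_shingles(pages).items():
        print(domain, sorted(unique))
```

A Counter held in memory works for a handful of pages. At the scale of billions of shingles, a system would more plausibly store hashes of shingles in a distributed or on-disk structure, but that design choice is outside what the text above describes.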