    I started this thread to discuss statistics for cracking ciphertexts. If I want to make my hill-climbing script more advanced, I am going to need better measures for telling English from nonsense.

    I had an idea for a ‘word fitness’, a bit different from n-gram fitness: the most common words are counted from a corpus, and pairs of words that appear next to each other more often get a higher fitness together. This could go well with some sort of dictionary attack, where candidate words are substituted in to find which ones fit best. Does anyone have any ideas as to how this could be computed?
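
    To make the idea concrete, here is a rough sketch of what I have in mind, not a tested method. It counts single words and adjacent word pairs in a corpus, then scores a candidate plaintext as the sum of log-probabilities of each word given the one before it. The names `train_word_model` and `word_fitness` and the smoothing parameter `alpha` are made up for illustration, and it assumes the candidate already has word boundaries (a ciphertext without spaces would need to be split into words first).

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep only runs of letters/apostrophes as words.
    return re.findall(r"[a-z']+", text.lower())


def train_word_model(corpus_text):
    # Count single words and adjacent word pairs in the corpus.
    words = tokenize(corpus_text)
    unigrams = Counter(words)
    bigrams = Counter(zip(words, words[1:]))
    return unigrams, bigrams


def word_fitness(candidate, unigrams, bigrams, alpha=1.0):
    # Sum of log P(current word | previous word), with add-alpha
    # smoothing so unseen pairs get a small non-zero probability.
    words = tokenize(candidate)
    vocab = len(unigrams) + 1  # +1 leaves room for unseen words
    score = 0.0
    for prev, cur in zip(words, words[1:]):
        numerator = bigrams[(prev, cur)] + alpha
        denominator = unigrams[prev] + alpha * vocab
        score += math.log(numerator / denominator)
    return score


if __name__ == "__main__":
    corpus = "the cat sat on the mat the cat ate the fish"
    uni, bi = train_word_model(corpus)
    # Higher (less negative) means the word order looks more like the corpus.
    print(word_fitness("the cat sat", uni, bi))
    print(word_fitness("cat the sat", uni, bi))
```

    With a toy corpus like the one above, word orders seen in the corpus should score higher than scrambled ones. A real attempt would need a much larger corpus, and the score would probably need normalising by length before comparing candidates with different word counts.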

