Inverse Document Frequency

Inverse Document Frequency (IDF) is a measure used in text analysis to assess how informative a word is across a collection of documents. A word that appears in many documents does little to distinguish one document from another, so its IDF score is low. A word that appears in only a few documents gets a high IDF score, because its presence helps identify specific content. In effect, IDF gives more weight to words that are rare across the entire document set.
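
To make the definition concrete, here is a minimal sketch in Python. It assumes one common formulation, IDF(t) = log(N / df(t)), where N is the number of documents and df(t) is the number of documents containing the term t. The function name, the whitespace tokenization, the natural logarithm, and the sample documents are all illustrative assumptions; real libraries often use smoothed variants of this formula.

```python
import math


def inverse_document_frequency(documents):
    """Return {term: log(N / df)} for every term in the collection.

    Assumes the unsmoothed variant with a natural logarithm and
    simple lowercase whitespace tokenization (illustrative choices).
    """
    n_docs = len(documents)
    doc_freq = {}
    for doc in documents:
        # Count each term once per document, however often it repeats.
        for term in set(doc.lower().split()):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


# Hypothetical three-document collection.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the bird sang",
]
idf = inverse_document_frequency(docs)

print(idf["the"])   # appears in all 3 documents: log(3/3) = 0.0
print(idf["cat"])   # appears in 2 documents: log(3/2)
print(idf["bird"])  # appears in 1 document: log(3)
```

In this sketch, "the" occurs in every document and scores 0, while "bird" occurs in only one and scores highest, which matches the behavior described above.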