{"id":26186,"date":"2020-03-20T05:37:00","date_gmt":"2020-03-20T00:37:00","guid":{"rendered":"http:\/\/3.85.220.248\/?p=26186"},"modified":"2024-05-01T07:24:31","modified_gmt":"2024-05-01T02:24:31","slug":"understanding-tf-idf-and-bm-25","status":"publish","type":"post","link":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/","title":{"rendered":"Understanding TF-IDF and BM-25"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"26186\" class=\"elementor elementor-26186\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-3568240c elementor-section-stretched elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"3568240c\" data-element_type=\"section\" data-e-type=\"section\" data-settings=\"{&quot;stretch_section&quot;:&quot;section-stretched&quot;}\">\r\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-thegem\"><div class=\"elementor-row\">\r\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-20788c4a\" data-id=\"20788c4a\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-46ee24a0 flex-horizontal-align-default flex-horizontal-align-tablet-default flex-horizontal-align-mobile-default flex-vertical-align-default flex-vertical-align-tablet-default flex-vertical-align-mobile-default elementor-widget elementor-widget-text-editor\" data-id=\"46ee24a0\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<div class=\"elementor-text-editor elementor-clearfix\">\r\n\t\t\t\t\t\t\n<h4 class=\"wp-block-heading\">Introduction<\/h4>\n\n\n\n<p 
class=\"wp-block-paragraph\">This article is for search practitioners who want to achieve a deep understanding of the ranking functions TF-IDF and BM25 (also called \u201csimilarities\u201d in Lucene). If you\u2019re like many practitioners, you\u2019re already familiar with TF-IDF, but when you first saw the complicated BM25 formula, you thought \u201cmaybe later.\u201d Now is the time to finally understand it! You\u2019ve probably heard that BM25 is similar to TF-IDF but works better in practice. This article will show you precisely how BM25 builds upon TF-IDF, what its parameters do, and why it is so effective. If you\u2019d rather skip over the math and work with practical examples that demonstrate BM25\u2019s behaviors, check out our companion article on\u00a0<a href=\"https:\/\/kmwllc.com\/index.php\/2020\/03\/10\/understanding-scoring-through-examples\/\">Understanding Scoring Through Examples<\/a>.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Reviewing TF-IDF<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">Let\u2019s review TF-IDF by trying to develop it from scratch. Imagine we\u2019re building a search engine. Assume we\u2019ve already got a way to find the documents that&nbsp;<em>match<\/em>&nbsp;a user\u2019s search. What we need now is a&nbsp;<strong>ranking function<\/strong>&nbsp;that will tell us how to order those documents. The higher a document\u2019s score according to this function, the higher up we\u2019ll place it in the list of results that we return to the user.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The goal of TF-IDF and similar ranking functions is to reward&nbsp;<em>relevance<\/em>. Say a user searches for the term \u201cdogs.\u201d If Document 1 is more relevant to the subject of dogs than Document 2, then we want the score of Document 1 to be higher than the score of Document 2, so we\u2019ll show the better result first and the user will be happy. How much higher does Document 1\u2019s score have to be? 
It doesn\u2019t really matter, as long as the score order matches the relevance order.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">You might feel a little shocked by the audacity of what we\u2019re attempting to do: we\u2019re going to try to judge the relevance of millions or billions of documents using a mathematical function, without knowing anything about the person who\u2019s doing the search, and without actually reading the documents and understanding what they\u2019re about! How is this possible?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We\u2019ll make a simple but profoundly helpful assumption. We\u2019ll assume that the more times a document contains a term, the more likely it is to be&nbsp;<em>about<\/em>&nbsp;that term. That\u2019s to say, we\u2019ll use&nbsp;<strong>term frequency (TF)<\/strong>, the number of occurrences of a term in a document, as a proxy for relevance. This one assumption creates a path for us to solve a seemingly impossible problem using simple math. Our assumption isn\u2019t perfect, and it goes very wrong sometimes, but it works often enough to be useful. So from here on, we\u2019ll view term frequency as a good thing\u200a\u2014\u200aa thing we want to reward.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>TF-IDF: Attempt 1<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">As a starting point for our ranking function, let\u2019s do the simplest, easiest thing possible. We\u2019ll set the score of a document equal to its term frequency. If we\u2019re searching for a term T and evaluating the relevance of a document D, then:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-cyan-blue-color\">score(D, T) = termFrequency(D, T)<\/mark><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">When a query has multiple terms, like \u201cdogs and cats,\u201d how should we handle that? 
Should we try to analyze the relationships between the various terms and then blend the per-term scores together in a complex way? Not so fast! The simplest approach is to just add the scores for each term together. So we\u2019ll do that, and hope for the best. If we have a multi-term query Q, then we\u2019ll set:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-cyan-blue-color\">score(D, Q) = sum over all terms T in Q of score(D, T)<\/mark>\n<\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">How well does our simple ranking function work? Unfortunately, it\u2019s got some problems:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">1) Longer documents are given an unfair advantage over shorter ones because they have more space to include more occurrences of a term, even though they might not be more relevant to the term. Let\u2019s ignore this problem for now.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">2) All terms in a query are treated equally, with no consideration for which ones are more meaningful or important. When we sum the scores for each term together, insignificant terms like \u201cand\u201d and \u201cthe\u201d which happen to be very frequent will dominate the combined score. Say you search for \u201celephants and cows.\u201d Perhaps there\u2019s a single document in the index that includes all three terms (\u201celephants\u201d, \u201cand\u201d, \u201ccows\u201d), but instead of seeing this ideal result first, you see the document that has the most occurrences of \u201cand\u201d\u200a\u2014\u200amaybe it has 10,000 of them. This preference for filler words is clearly not what we want.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>TF-IDF: Attempt 2<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To prevent filler words from dominating, we need some way of judging the&nbsp;<em>importance<\/em>&nbsp;of the terms in a query. 
Since we can\u2019t encode an understanding of natural language into our scoring function, we\u2019ll try to find a proxy for importance. Our best bet is&nbsp;<em>rarity<\/em>. If a term doesn\u2019t occur in most documents in the corpus, then whenever it does occur, we\u2019ll guess that this occurrence is significant. On the other hand, if a term occurs in most of the documents in our corpus, then the presence of that term in any particular document will lose its value as an indicator of relevance.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So high term frequency is a good thing, but its goodness is offset by high&nbsp;<strong>document frequency (DF)\u200a<\/strong>\u2014\u200athe number of documents that contain the term\u200a\u2014\u200awhich we\u2019ll think of as a bad thing.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To update our function in a way that rewards term frequency but penalizes document frequency, we could try dividing TF by DF:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-cyan-blue-color\">score(D, T) = termFrequency(D, T) \/ docFrequency(T)<\/mark><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">What\u2019s wrong with this? Unfortunately, DF by itself tells us nothing. If DF for the term \u201celephant\u201d is 100, then is \u201celephant\u201d a rare term or a common term? It depends on the size of the corpus. If the corpus contains 100 documents, \u201celephant\u201d is common; if it contains 100,000 documents, \u201celephant\u201d is rare.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>TF-IDF: Attempt 3<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Instead of looking at DF by itself, let\u2019s look at N\/DF, where N is the number of documents in the search index or corpus. 
Notice how N\/DF is low for common terms (a DF of 100 for \u201celephant\u201d in a corpus of size 100 would give N\/DF = 1), and high for rare ones (a DF of 100 for \u201celephant\u201d in a corpus of size 100,000 would give N\/DF = 1000). That\u2019s exactly what we want: matches for common terms should get low scores, matches for rare terms should get high ones. Our improved formula might go like this:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-cyan-blue-color\">score(D, T) = termFrequency(D, T) * (N \/ docFrequency(T))<\/mark><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">We\u2019re doing better, but let\u2019s take a closer look at how N\/DF behaves. Say we have 100 documents and \u201celephant\u201d occurs in 1 of them while \u201cgiraffe\u201d occurs in 2 of them. Both terms are similarly rare, but elephant\u2019s N\/DF value would come out to 100 and giraffe\u2019s would be half that, at 50. Should a match for giraffe get half the score of a match for elephant just because giraffe\u2019s document frequency is one higher than elephant\u2019s? The penalty for one additional document containing the word seems too high. Arguably, if we have 100 documents, it shouldn\u2019t make much of a difference whether a term\u2019s DF is 1, 2, 3, or 4.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>TF-IDF: Attempt 4<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">As we\u2019ve seen, when DF is in a very low range, small differences in DF can have a dramatic impact on N\/DF and hence on the score. We might like to smooth out the decline of N\/DF when DF is in the lowest end of its range. One way to do this is to take the&nbsp;<strong>log<\/strong>&nbsp;of N\/DF. If we wanted, we could try to use a different smoothing function here, but log is straightforward and it does what we want. 
This chart compares N\/DF and log(N\/DF) assuming N=100:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img fetchpriority=\"high\" width=\"1024\" height=\"707\" src=\"https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/ApplyingLogToIDF-1024x707.png\" alt=\"\" class=\"wp-image-26402\" srcset=\"https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/ApplyingLogToIDF-1024x707.png 1024w, https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/ApplyingLogToIDF-300x207.png 300w, https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/ApplyingLogToIDF-768x530.png 768w, https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/ApplyingLogToIDF.png 1087w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Let\u2019s call log(N\/DF) the&nbsp;<strong>inverse document frequency (IDF)<\/strong>&nbsp;of a term. Our ranking function can now be expressed as TF * IDF or:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-cyan-blue-color\">score(D, T) = termFrequency(D, T) * log(N \/ docFrequency(T))<\/mark><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">We\u2019ve arrived at the traditional definition of TF-IDF and even though we made some bold assumptions to get here, the function works pretty well in practice: it has gathered a long track record of successful application in search engines. Are we done or could we do even better?<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Developing BM25<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">As you might have guessed, we\u2019re not ready to stop at TF-IDF. In this section, we\u2019ll build the BM25 function, which can be seen as an improvement on TF-IDF. 
We\u2019re going to keep the same structure of the TF * IDF formula, but we\u2019ll replace the TF and IDF components with refinements of those values.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Step 1: Term Saturation<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We\u2019ve been saying that TF is a good thing, and indeed our TF-IDF formula rewards it. But if a document contains 200 occurrences of \u201celephant,\u201d is it really&nbsp;<em>twice<\/em>&nbsp;as relevant as a document that contains 100 occurrences? We could argue that if \u201celephant\u201d occurs a large enough number of times, say 100, the document is almost certainly relevant, and any further mentions don\u2019t really increase the likelihood of relevance. To put it a different way, once a document is&nbsp;<em>saturated<\/em>&nbsp;with occurrences of a term, more occurrences shouldn\u2019t have a significant impact on the score. So we\u2019d like a way to control the contribution of TF to our score. We\u2019d like this contribution to increase fast when TF is small and then increase more slowly, approaching a limit, as TF gets very big.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">One common way to tame TF is to take the square root of it, but that\u2019s still an unbounded quantity. We\u2019d like to do something more sophisticated. We\u2019d like to put a bound on TF\u2019s contribution to the score, and we\u2019d like to be able to control how rapidly the contribution approaches that bound. Wouldn\u2019t it be nice if we had a parameter&nbsp;<strong>k<\/strong>&nbsp;that could control the shape of this saturation curve? That way, we\u2019d be able to experiment with different values of k and see what works best for a particular corpus.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To achieve this, we\u2019ll pull out a trick. 
Instead of using raw TF in our ranking formula, we\u2019ll use the value:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-cyan-blue-color\">TF \/ (TF + k)<\/mark><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">If k is set to 1, this would generate the sequence 1\/2, 2\/3, 3\/4, 4\/5, 5\/6 as TF increases 1, 2, 3, etc. Notice how this sequence grows fast in the beginning and then more slowly, approaching 1 in smaller and smaller increments. That\u2019s what we want. Now if we change k to 2, we\u2019d get 1\/3, 2\/4, 3\/5, 4\/6 which grows a little more slowly. Here\u2019s a graph of the formula TF\/(TF + k) for k = 1, 2, 3, 4:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img width=\"1024\" height=\"387\" src=\"https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/TFSaturation-1024x387.png\" alt=\"\" class=\"wp-image-26403\" srcset=\"https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/TFSaturation-1024x387.png 1024w, https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/TFSaturation-300x113.png 300w, https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/TFSaturation-768x290.png 768w, https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/TFSaturation.png 1347w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">This TF\/(TF + k) trick is really the backbone of BM25. It lets us control the contribution of TF to the score in a tunable way.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Aside: Term Saturation and Multi-Term Queries<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">A fortunate side-effect of using TF\/(TF + k) to account for term saturation is that we end up rewarding complete matches over partial ones. 
That\u2019s to say, we reward documents that match more of the terms in a multi-term query over documents that have lots of matches for just one of the terms.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Let\u2019s say that \u201ccat\u201d and \u201cdog\u201d have the same IDF values. If we search for \u201ccat dog\u201d we\u2019d like a document that contains one instance of each term to do better than a document that has two instances of \u201ccat\u201d and none of \u201cdog.\u201d If we were using raw TF they\u2019d both get the same score. But let\u2019s do our improved calculation assuming k=1. In our \u201ccat dog\u201d document, \u201ccat\u201d and \u201cdog\u201d each have TF=1, so each is going to contribute TF\/(TF+1) = 1\/2 to the score, for a total of 1. In our \u201ccat cat\u201d document, \u201ccat\u201d has a TF of 2, so it\u2019s going to contribute TF\/(TF+1) = 2\/3 to the score. The \u201ccat dog\u201d document wins, because \u201ccat\u201d and \u201cdog\u201d contribute more when each occurs once than \u201ccat\u201d contributes when it occurs twice.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Assuming the IDF of two terms is the same, it\u2019s always better to have one instance of each term than to have two instances of one of them.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Step 2: Document Length<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now let\u2019s go back to the problem we skipped over when we were first building TF-IDF: document length. If a document happens to be really short and it contains \u201celephant\u201d once, that\u2019s a good indicator that \u201celephant\u201d is important to the content. But if the document is really, really long and it mentions \u201celephant\u201d only once, the document is probably not about elephants. So we\u2019d like to reward matches in short documents, while penalizing matches in long documents. 
How can we achieve this?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">First, we\u2019ve got to decide what it means for a document to be short or long. We need a frame of reference, so we\u2019ll use the corpus itself as our frame of reference. A short document is simply one that is&nbsp;<em>shorter than average<\/em>&nbsp;for the corpus.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Let\u2019s go back to our TF\/(TF + k) trick. Of course as k increases, the value of TF\/(TF + k) decreases. To penalize long documents, we can adjust k up if the document is longer than average, and adjust it down if the document is shorter than average. We\u2019ll achieve this by multiplying k by the ratio&nbsp;<strong>dl\/adl<\/strong>. Here,&nbsp;<em>dl<\/em>&nbsp;is the document\u2019s length, and&nbsp;<em>adl<\/em>&nbsp;is the average document length across the corpus.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">When a document is of average length, dl\/adl =1, and our multiplier doesn\u2019t affect k at all. For a document that\u2019s shorter than average, we\u2019ll be multiplying k by a value between 0 and 1, thereby reducing it, and increasing TF\/(TF+k). For a document that\u2019s longer than average, we\u2019ll be multiplying k by a value greater than 1, thereby increasing it, and reducing TF\/(TF+k). The multiplier also puts us on a different TF saturation curve. Shorter documents will approach a TF saturation point more quickly while longer documents will approach it more gradually.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Step 3: Parameterizing Document Length<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In the last section, we updated our ranking function to account for document length, but is this always a good idea? Just how much importance should we place on document length in any particular corpus? Might there be some collections of documents where length matters a lot and some where it doesn\u2019t? 
We might like to treat the importance of document length as a second parameter that we can experiment with.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We\u2019re going to achieve this tunability with another trick. We\u2019ll add a new parameter&nbsp;<strong>b<\/strong>&nbsp;into the mix (it must be between 0 and 1). Instead of multiplying k by dl\/adl as we were doing before, we\u2019ll multiply k by the following value based on dl\/adl and b:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-cyan-blue-color\">1 \u2013 b + b*dl\/adl<\/mark><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">What does this do for us? You can see if b is 1, we get (1 \u2013 1 + 1*dl\/adl) and this reduces to the multiplier we had before, dl\/adl. On the other hand, if b is 0, the whole thing becomes 1 and document length isn\u2019t considered at all. As b is cranked up from 0 towards 1, the multiplier responds more quickly to changes in dl\/adl. 
The chart below shows how our multiplier behaves as dl\/adl grows, when b=.2 versus when b=.8.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img width=\"1024\" height=\"545\" src=\"https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/DocLength-1024x545.png\" alt=\"\" class=\"wp-image-26404\" srcset=\"https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/DocLength-1024x545.png 1024w, https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/DocLength-300x160.png 300w, https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/DocLength-768x409.png 768w, https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/DocLength.png 1056w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Recap: Fancy TF<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To recap, we\u2019ve been working on modifying the TF term in TF * IDF so that it\u2019s responsive to term saturation and document length. 
To account for term saturation, we introduced the TF\/(TF + k) trick. To account for document length, we added the (1 \u2013 b + b*dl\/adl) multiplier. Now, instead of using raw TF in our ranking function, we\u2019re using this \u201cfancy\u201d version of TF:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-cyan-blue-color\">TF\/(TF + k*(1 - b + b*dl\/adl)) <\/mark><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Recall that k is the knob that controls the term saturation curve, and b is the knob that controls the importance of document length.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Indeed, this is the version of TF that\u2019s used in BM25. And congratulations: if you\u2019ve followed this far, you now understand all the really interesting stuff about BM25.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Step 4: Fancy or Not-So-Fancy IDF<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We\u2019re not done just yet, though: we have to return to the way BM25 handles document frequency. Earlier, we had defined IDF as log(N\/DF), but BM25 defines it as:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-cyan-blue-color\">log((N - DF + .5)\/(DF + .5)) <\/mark><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Why the difference?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">As you may have observed, we\u2019ve been developing our scoring function through a set of heuristics. Researchers in the field of Information Retrieval have wanted to put ranking functions on a more rigorous theoretical footing so they can actually prove things about their behavior rather than just experimenting and hoping for the best. To derive a theoretically sound version of IDF, researchers took something called the Robertson-Sp\u00e4rck Jones weight, made a simplifying assumption, and came up with log((N-DF+.5)\/(DF+.5)). 
We\u2019re not going to go into the details, but we\u2019ll just focus on the practical significance of this flavor of IDF. The&nbsp;.5\u2019s don\u2019t really do much here, so let\u2019s just consider log (N-DF)\/DF, which is sometimes referred to as \u201cprobabilistic IDF.\u201d Here we compare our vanilla IDF with probabilistic IDF where N=10.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"477\" src=\"https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/ProbabilisticIDF-1024x477.png\" alt=\"\" class=\"wp-image-26405\" srcset=\"https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/ProbabilisticIDF-1024x477.png 1024w, https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/ProbabilisticIDF-300x140.png 300w, https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/ProbabilisticIDF-768x358.png 768w, https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/ProbabilisticIDF.png 1050w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">You can see that probabilistic IDF takes a sharp drop for terms that are in most of the documents. This might be desirable because if a term really exists in 98% of the documents, it\u2019s probably a stopword like \u201cand\u201d or \u201cor\u201d and it should get much, much less weight than a term that\u2019s very common, like in 70% of the documents, but still not utterly ubiquitous.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The catch is that log (N-DF)\/DF is negative for terms that are in more than half of the corpus. (Remember that the log function goes negative on values between 0 and 1.) We don\u2019t want negative values coming out of our ranking function because the presence of a query term in a document should never count against retrieval\u200a\u2014\u200ait should never cause a lower score than if the term was simply absent. 
In order to prevent negative values, Lucene\u2019s implementation of BM25 adds a 1 like this:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-cyan-blue-color\">IDF = log (1 + (N - DF + .5)\/(DF + .5))<\/mark><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">This 1 might seem like an innocent modification but it totally changes the behavior of the formula! If we forget again about those pesky&nbsp;.5\u2019s, and we note that adding 1 is the same as adding DF\/DF, you can see that the formula reduces to the vanilla version of IDF that we used before: log (N\/DF).<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-cyan-blue-color\">log (1 + (N - DF + .5)\/(DF + .5)) \u2248\nlog (1 + (N - DF)\/DF ) =\nlog (DF\/DF + (N - DF)\/DF) = \nlog ((DF + N - DF)\/DF) = \nlog (N\/DF)<\/mark><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">So although it looks like BM25 is using a fancy version of IDF, in practice (as implemented in Lucene) it\u2019s basically using the same old version of IDF that\u2019s used in traditional TF\/IDF, without the accelerated decline for high DF values.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Cashing In<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">We\u2019re ready to cash in on our new understanding by looking at the explain output from a Lucene query. You\u2019ll see something like this:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-cyan-blue-color\">\u201cscore(freq=3.0), product of:\u201d\n\n\u201cidf, computed as log(1 + (N \u2014 n + 0.5) \/ (n + 0.5)) from:\u201d\n\n\u201ctf, computed as freq \/ (freq + k1 * (1 \u2014 b + b * dl \/ avgdl)) from:\u201d<\/mark><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">We\u2019re finally prepared to understand this gobbledygook. 
You can see that Lucene is using a TF*IDF product where TF and IDF have their special BM25 definitions. Lowercase n means DF here. The IDF term is the supposedly fancy version that turns out to be approximately the same as traditional IDF, log(N\/n).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The TF term is based on our saturation trick: freq\/(freq + k). The use of&nbsp;<strong>k1<\/strong>&nbsp;instead of k in the explain output is historical\u200a\u2014\u200ait comes from a time when there was more than one k in the formula. What we\u2019ve been calling raw TF is denoted as&nbsp;<strong>freq<\/strong>&nbsp;here.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We can see that k1 is multiplied by a factor that penalizes above-average document length while rewarding below-average document length: (1-b + b *dl\/avgdl). What we\u2019ve been calling adl is denoted as&nbsp;<strong>avgdl<\/strong>&nbsp;here.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And of course we can see that there are parameters, which are set to k1 = 1.2 and b =&nbsp;.75 in Lucene by default. You probably won\u2019t need to tweak these, but you can if you want.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>In summary, simple TF-IDF rewards term frequency and penalizes document frequency. BM25 goes beyond this to account for document length and term frequency saturation.&nbsp;<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">It\u2019s worth noting that before Lucene introduced BM25 as the default ranking function as of version 6, it implemented TF-IDF through something called the&nbsp;<a href=\"https:\/\/www.elastic.co\/guide\/en\/elasticsearch\/guide\/2.x\/practical-scoring-function.html\">Practical Scoring Function<\/a>, which was a set of enhancements (including \u201ccoord\u201d and field length normalization) that made TF-IDF more like BM25. 
So the behavior difference one might have observed when Lucene made the switch to BM25 was probably less dramatic than it would have been if Lucene had been using pure TF-IDF all along. In any case, the consensus is that BM25 is an improvement, and now you can see why.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If you\u2019re a search engineer, the Lucene explain output is the most likely place where you\u2019ll encounter the details of the BM25 formula. However, if you delve into theoretical papers or check out the&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Okapi_BM25\">Wikipedia article on BM25<\/a>, you\u2019ll see it written out as an equation like this:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"441\" src=\"https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/bm25demystified-1024x441.png\" alt=\"\" class=\"wp-image-26406\" srcset=\"https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/bm25demystified-1024x441.png 1024w, https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/bm25demystified-300x129.png 300w, https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/bm25demystified-768x331.png 768w, https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/bm25demystified-1536x662.png 1536w, https:\/\/kmwllc.com\/wp-content\/uploads\/2021\/05\/bm25demystified.png 2000w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Hopefully this tour has made you more comfortable with how the two most popular search ranking functions work. Thanks for following along!<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Further Reading<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">This article follows in the footsteps of some other great tours of BM25 that are out there. 
These two are highly recommended:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/opensourceconnections.com\/blog\/2015\/10\/16\/bm25-the-next-generation-of-lucene-relevation\/\"><strong>BM25 The Next Generation of Lucene Relevance by Doug Turnbull<\/strong><\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/www.elastic.co\/blog\/practical-bm25-part-2-the-bm25-algorithm-and-its-variables\"><strong>Practical BM25 \u2013 Part 2: The BM25 Algorithm and its Variables by Shane Connelly<\/strong><\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">There are many theoretical treatments of ranking out there. A good starting place is&nbsp;<a href=\"http:\/\/www.staff.city.ac.uk\/~sb317\/papers\/foundations_bm25_review.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">\u201cThe Probabilistic Relevance Framework: BM25 and Beyond\u201d by Robertson and Zaragoza<\/a>.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">See also the paper&nbsp;<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/okapi_trec3.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">\u201cOkapi at TREC-3\u201d<\/a>&nbsp;where BM25 was first introduced.<\/p>\n\t\t\t\t\t\t\t<\/div>\r\n\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div><\/div>\r\n\t\t<\/section>\r\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-e0c25bc elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"e0c25bc\" data-element_type=\"section\" data-e-type=\"section\">\r\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-thegem\"><div class=\"elementor-row\">\r\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-a5e6f8a\" data-id=\"a5e6f8a\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap 
elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-637fda5 flex-horizontal-align-default flex-horizontal-align-tablet-default flex-horizontal-align-mobile-default flex-vertical-align-default flex-vertical-align-tablet-default flex-vertical-align-mobile-default elementor-widget elementor-widget-global elementor-global-28083 elementor-widget-post-navigation\" data-id=\"637fda5\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"post-navigation.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-post-navigation\" role=\"navigation\" aria-label=\"Post Navigation\">\n\t\t\t<div class=\"elementor-post-navigation__prev elementor-post-navigation__link\">\n\t\t\t\t<a href=\"https:\/\/kmwllc.com\/index.php\/2020\/03\/13\/relevancy-tuning-in-elastic\/\" rel=\"prev\"><span class=\"elementor-post-navigation__link__prev\"><span class=\"post-navigation__prev--label\">Previous Post<\/span><span class=\"post-navigation__prev--title\">Relevancy Tuning in Elastic<\/span><\/span><\/a>\t\t\t<\/div>\n\t\t\t\t\t\t<div class=\"elementor-post-navigation__next elementor-post-navigation__link\">\n\t\t\t\t<a href=\"https:\/\/kmwllc.com\/index.php\/2020\/04\/01\/solr-json-facets-for-reporting-and-data-aggregation\/\" rel=\"next\"><span class=\"elementor-post-navigation__link__next\"><span class=\"post-navigation__next--label\">Next Post<\/span><span class=\"post-navigation__next--title\">Solr JSON Facets for Reporting and Data Aggregation<\/span><\/span><\/a>\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div><\/div>\r\n\t\t<\/section>\r\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>This post will show you precisely how BM25 builds upon TF-IDF, what its parameters do, and why it is so 
effective.<\/p>\n","protected":false},"author":4,"featured_media":29716,"comment_status":"closed","ping_status":"open","sticky":false,"template":"elementor_theme","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[46,47,36],"tags":[],"class_list":["post-26186","post","type-post","status-publish","format-standard","has-post-thumbnail","category-lucene","category-relevancy","category-search"],"aioseo_notices":[],"aioseo_head":"\n\t\t<!-- All in One SEO 4.9.8 - aioseo.com -->\n\t<meta name=\"description\" content=\"This post will show you precisely how BM25 builds upon TF-IDF, what its parameters do, and why it is so effective.\" \/>\n\t<meta name=\"robots\" content=\"max-image-preview:large\" \/>\n\t<meta name=\"author\" content=\"Rudi Seitz\"\/>\n\t<link rel=\"canonical\" href=\"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/\" \/>\n\t<meta name=\"generator\" content=\"All in One SEO (AIOSEO) 4.9.8\" \/>\n\t\t<meta property=\"og:locale\" content=\"en_US\" \/>\n\t\t<meta property=\"og:site_name\" content=\"KMW Technology - Search Professional Services\" \/>\n\t\t<meta property=\"og:type\" content=\"article\" \/>\n\t\t<meta property=\"og:title\" content=\"Understanding TF-IDF and BM-25 - KMW Technology\" \/>\n\t\t<meta property=\"og:description\" content=\"This post will show you precisely how BM25 builds upon TF-IDF, what its parameters do, and why it is so effective.\" \/>\n\t\t<meta property=\"og:url\" content=\"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/\" \/>\n\t\t<meta property=\"og:image\" content=\"https:\/\/kmwllc.com\/wp-content\/uploads\/2020\/03\/blog_understandingTFIDF1200x900.png\" \/>\n\t\t<meta property=\"og:image:secure_url\" content=\"https:\/\/kmwllc.com\/wp-content\/uploads\/2020\/03\/blog_understandingTFIDF1200x900.png\" 
\/>\n\t\t<meta property=\"og:image:width\" content=\"1201\" \/>\n\t\t<meta property=\"og:image:height\" content=\"901\" \/>\n\t\t<meta property=\"article:published_time\" content=\"2020-03-20T00:37:00+00:00\" \/>\n\t\t<meta property=\"article:modified_time\" content=\"2024-05-01T02:24:31+00:00\" \/>\n\t\t<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n\t\t<meta name=\"twitter:site\" content=\"@kmw_technology\" \/>\n\t\t<meta name=\"twitter:title\" content=\"Understanding TF-IDF and BM-25 - KMW Technology\" \/>\n\t\t<meta name=\"twitter:description\" content=\"This post will show you precisely how BM25 builds upon TF-IDF, what its parameters do, and why it is so effective.\" \/>\n\t\t<meta name=\"twitter:creator\" content=\"@kmw_technology\" \/>\n\t\t<meta name=\"twitter:image\" content=\"https:\/\/kmwllc.com\/wp-content\/uploads\/2020\/03\/blog_understandingTFIDF1200x900.png\" \/>\n\t\t<script type=\"application\/ld+json\" class=\"aioseo-schema\">\n\t\t\t{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"BlogPosting\",\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/2020\\\/03\\\/20\\\/understanding-tf-idf-and-bm-25\\\/#blogposting\",\"name\":\"Understanding TF-IDF and BM-25 - KMW Technology\",\"headline\":\"Understanding TF-IDF and 
BM-25\",\"author\":{\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/author\\\/rseitz\\\/#author\"},\"publisher\":{\"@id\":\"https:\\\/\\\/kmwllc.com\\\/#organization\"},\"image\":{\"@type\":\"ImageObject\",\"url\":\"https:\\\/\\\/kmwllc.com\\\/wp-content\\\/uploads\\\/2020\\\/03\\\/blog_understandingTFIDF1200x900.png\",\"width\":1201,\"height\":901},\"datePublished\":\"2020-03-20T05:37:00+05:00\",\"dateModified\":\"2024-05-01T07:24:31+05:00\",\"inLanguage\":\"en-US\",\"commentCount\":2,\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/2020\\\/03\\\/20\\\/understanding-tf-idf-and-bm-25\\\/#webpage\"},\"isPartOf\":{\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/2020\\\/03\\\/20\\\/understanding-tf-idf-and-bm-25\\\/#webpage\"},\"articleSection\":\"Lucene, Relevancy, Search\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/2020\\\/03\\\/20\\\/understanding-tf-idf-and-bm-25\\\/#breadcrumblist\",\"itemListElement\":[{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/kmwllc.com#listItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/kmwllc.com\",\"nextItem\":{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/category\\\/search\\\/#listItem\",\"name\":\"Search\"}},{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/category\\\/search\\\/#listItem\",\"position\":2,\"name\":\"Search\",\"item\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/category\\\/search\\\/\",\"nextItem\":{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/2020\\\/03\\\/20\\\/understanding-tf-idf-and-bm-25\\\/#listItem\",\"name\":\"Understanding TF-IDF and BM-25\"},\"previousItem\":{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/kmwllc.com#listItem\",\"name\":\"Home\"}},{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/2020\\\/03\\\/20\\\/understanding-tf-idf-and-bm-25\\\/#listItem\",\"position\":3,\"name\":\"Understanding TF-IDF and 
BM-25\",\"previousItem\":{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/category\\\/search\\\/#listItem\",\"name\":\"Search\"}}]},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/kmwllc.com\\\/#organization\",\"name\":\"KMW Technology\",\"description\":\"Search Professional Services\",\"url\":\"https:\\\/\\\/kmwllc.com\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"url\":\"https:\\\/\\\/kmwllc.com\\\/wp-content\\\/uploads\\\/2024\\\/06\\\/KMWLogo_blk-1-1.png\",\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/2020\\\/03\\\/20\\\/understanding-tf-idf-and-bm-25\\\/#organizationLogo\",\"width\":858,\"height\":264},\"image\":{\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/2020\\\/03\\\/20\\\/understanding-tf-idf-and-bm-25\\\/#organizationLogo\"},\"sameAs\":[\"https:\\\/\\\/twitter.com\\\/kmw_technology\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/author\\\/rseitz\\\/#author\",\"url\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/author\\\/rseitz\\\/\",\"name\":\"Rudi Seitz\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/2020\\\/03\\\/20\\\/understanding-tf-idf-and-bm-25\\\/#authorImage\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/2f38cf292686a6230ea3ced94cd7923d3ef778a2e738f47bed23ab4b57c29e9c?s=96&d=mm&r=g\",\"width\":96,\"height\":96,\"caption\":\"Rudi Seitz\"}},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/2020\\\/03\\\/20\\\/understanding-tf-idf-and-bm-25\\\/#webpage\",\"url\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/2020\\\/03\\\/20\\\/understanding-tf-idf-and-bm-25\\\/\",\"name\":\"Understanding TF-IDF and BM-25 - KMW Technology\",\"description\":\"This post will show you precisely how BM25 builds upon TF-IDF, what its parameters do, and why it is so 
effective.\",\"inLanguage\":\"en-US\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/kmwllc.com\\\/#website\"},\"breadcrumb\":{\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/2020\\\/03\\\/20\\\/understanding-tf-idf-and-bm-25\\\/#breadcrumblist\"},\"author\":{\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/author\\\/rseitz\\\/#author\"},\"creator\":{\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/author\\\/rseitz\\\/#author\"},\"image\":{\"@type\":\"ImageObject\",\"url\":\"https:\\\/\\\/kmwllc.com\\\/wp-content\\\/uploads\\\/2020\\\/03\\\/blog_understandingTFIDF1200x900.png\",\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/2020\\\/03\\\/20\\\/understanding-tf-idf-and-bm-25\\\/#mainImage\",\"width\":1201,\"height\":901},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/2020\\\/03\\\/20\\\/understanding-tf-idf-and-bm-25\\\/#mainImage\"},\"datePublished\":\"2020-03-20T05:37:00+05:00\",\"dateModified\":\"2024-05-01T07:24:31+05:00\"},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/kmwllc.com\\\/#website\",\"url\":\"https:\\\/\\\/kmwllc.com\\\/\",\"name\":\"KMW Technology\",\"description\":\"Search Professional Services\",\"inLanguage\":\"en-US\",\"publisher\":{\"@id\":\"https:\\\/\\\/kmwllc.com\\\/#organization\"}}]}\n\t\t<\/script>\n\t\t<!-- All in One SEO -->\n\n","aioseo_head_json":{"title":"Understanding TF-IDF and BM-25 - KMW Technology","description":"This post will show you precisely how BM25 builds upon TF-IDF, what its parameters do, and why it is so effective.","canonical_url":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/","robots":"max-image-preview:large","keywords":"","webmasterTools":{"miscellaneous":""},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"BlogPosting","@id":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/#blogposting","name":"Understanding TF-IDF and BM-25 - KMW Technology","headline":"Understanding TF-IDF and 
BM-25","author":{"@id":"https:\/\/kmwllc.com\/index.php\/author\/rseitz\/#author"},"publisher":{"@id":"https:\/\/kmwllc.com\/#organization"},"image":{"@type":"ImageObject","url":"https:\/\/kmwllc.com\/wp-content\/uploads\/2020\/03\/blog_understandingTFIDF1200x900.png","width":1201,"height":901},"datePublished":"2020-03-20T05:37:00+05:00","dateModified":"2024-05-01T07:24:31+05:00","inLanguage":"en-US","commentCount":2,"mainEntityOfPage":{"@id":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/#webpage"},"isPartOf":{"@id":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/#webpage"},"articleSection":"Lucene, Relevancy, Search"},{"@type":"BreadcrumbList","@id":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/#breadcrumblist","itemListElement":[{"@type":"ListItem","@id":"https:\/\/kmwllc.com#listItem","position":1,"name":"Home","item":"https:\/\/kmwllc.com","nextItem":{"@type":"ListItem","@id":"https:\/\/kmwllc.com\/index.php\/category\/search\/#listItem","name":"Search"}},{"@type":"ListItem","@id":"https:\/\/kmwllc.com\/index.php\/category\/search\/#listItem","position":2,"name":"Search","item":"https:\/\/kmwllc.com\/index.php\/category\/search\/","nextItem":{"@type":"ListItem","@id":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/#listItem","name":"Understanding TF-IDF and BM-25"},"previousItem":{"@type":"ListItem","@id":"https:\/\/kmwllc.com#listItem","name":"Home"}},{"@type":"ListItem","@id":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/#listItem","position":3,"name":"Understanding TF-IDF and BM-25","previousItem":{"@type":"ListItem","@id":"https:\/\/kmwllc.com\/index.php\/category\/search\/#listItem","name":"Search"}}]},{"@type":"Organization","@id":"https:\/\/kmwllc.com\/#organization","name":"KMW Technology","description":"Search Professional 
Services","url":"https:\/\/kmwllc.com\/","logo":{"@type":"ImageObject","url":"https:\/\/kmwllc.com\/wp-content\/uploads\/2024\/06\/KMWLogo_blk-1-1.png","@id":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/#organizationLogo","width":858,"height":264},"image":{"@id":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/#organizationLogo"},"sameAs":["https:\/\/twitter.com\/kmw_technology"]},{"@type":"Person","@id":"https:\/\/kmwllc.com\/index.php\/author\/rseitz\/#author","url":"https:\/\/kmwllc.com\/index.php\/author\/rseitz\/","name":"Rudi Seitz","image":{"@type":"ImageObject","@id":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/#authorImage","url":"https:\/\/secure.gravatar.com\/avatar\/2f38cf292686a6230ea3ced94cd7923d3ef778a2e738f47bed23ab4b57c29e9c?s=96&d=mm&r=g","width":96,"height":96,"caption":"Rudi Seitz"}},{"@type":"WebPage","@id":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/#webpage","url":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/","name":"Understanding TF-IDF and BM-25 - KMW Technology","description":"This post will show you precisely how BM25 builds upon TF-IDF, what its parameters do, and why it is so 
effective.","inLanguage":"en-US","isPartOf":{"@id":"https:\/\/kmwllc.com\/#website"},"breadcrumb":{"@id":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/#breadcrumblist"},"author":{"@id":"https:\/\/kmwllc.com\/index.php\/author\/rseitz\/#author"},"creator":{"@id":"https:\/\/kmwllc.com\/index.php\/author\/rseitz\/#author"},"image":{"@type":"ImageObject","url":"https:\/\/kmwllc.com\/wp-content\/uploads\/2020\/03\/blog_understandingTFIDF1200x900.png","@id":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/#mainImage","width":1201,"height":901},"primaryImageOfPage":{"@id":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/#mainImage"},"datePublished":"2020-03-20T05:37:00+05:00","dateModified":"2024-05-01T07:24:31+05:00"},{"@type":"WebSite","@id":"https:\/\/kmwllc.com\/#website","url":"https:\/\/kmwllc.com\/","name":"KMW Technology","description":"Search Professional Services","inLanguage":"en-US","publisher":{"@id":"https:\/\/kmwllc.com\/#organization"}}]},"og:locale":"en_US","og:site_name":"KMW Technology - Search Professional Services","og:type":"article","og:title":"Understanding TF-IDF and BM-25 - KMW Technology","og:description":"This post will show you precisely how BM25 builds upon TF-IDF, what its parameters do, and why it is so effective.","og:url":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/","og:image":"https:\/\/kmwllc.com\/wp-content\/uploads\/2020\/03\/blog_understandingTFIDF1200x900.png","og:image:secure_url":"https:\/\/kmwllc.com\/wp-content\/uploads\/2020\/03\/blog_understandingTFIDF1200x900.png","og:image:width":1201,"og:image:height":901,"article:published_time":"2020-03-20T00:37:00+00:00","article:modified_time":"2024-05-01T02:24:31+00:00","twitter:card":"summary_large_image","twitter:site":"@kmw_technology","twitter:title":"Understanding TF-IDF and BM-25 - KMW Technology","twitter:description":"This post will show you 
precisely how BM25 builds upon TF-IDF, what its parameters do, and why it is so effective.","twitter:creator":"@kmw_technology","twitter:image":"https:\/\/kmwllc.com\/wp-content\/uploads\/2020\/03\/blog_understandingTFIDF1200x900.png"},"aioseo_meta_data":{"post_id":"26186","title":null,"description":null,"keywords":null,"keyphrases":null,"primary_term":null,"canonical_url":null,"og_title":null,"og_description":null,"og_object_type":"default","og_image_type":"default","og_image_url":null,"og_image_width":null,"og_image_height":null,"og_image_custom_url":null,"og_image_custom_fields":null,"og_video":null,"og_custom_url":null,"og_article_section":null,"og_article_tags":null,"twitter_use_og":false,"twitter_card":"default","twitter_image_type":"default","twitter_image_url":null,"twitter_image_custom_url":null,"twitter_image_custom_fields":null,"twitter_title":null,"twitter_description":null,"schema":{"blockGraphs":[],"customGraphs":[],"default":{"data":{"Article":[],"Course":[],"Dataset":[],"FAQPage":[],"Movie":[],"Person":[],"Product":[],"ProductReview":[],"Car":[],"Recipe":[],"Service":[],"SoftwareApplication":[],"WebPage":[]},"graphName":"","isEnabled":true},"graphs":[]},"schema_type":"default","schema_type_options":null,"pillar_content":false,"robots_default":true,"robots_noindex":false,"robots_noarchive":false,"robots_nosnippet":false,"robots_nofollow":false,"robots_noimageindex":false,"robots_noodp":false,"robots_notranslate":false,"robots_max_snippet":null,"robots_max_videopreview":null,"robots_max_imagepreview":"large","priority":null,"frequency":null,"local_seo":null,"breadcrumb_settings":null,"limit_modified_date":false,"ai":null,"created":"2024-06-26 16:31:53","updated":"2025-12-10 05:12:01","seo_analyzer_scan_date":null},"aioseo_breadcrumb":"<div class=\"aioseo-breadcrumbs\"><span class=\"aioseo-breadcrumb\">\n\t\t\t<a href=\"https:\/\/kmwllc.com\" title=\"Home\">Home<\/a>\n\t\t<\/span><span class=\"aioseo-breadcrumb-separator\">&raquo;<\/span><span 
class=\"aioseo-breadcrumb\">\n\t\t\t<a href=\"https:\/\/kmwllc.com\/index.php\/category\/search\/\" title=\"Search\">Search<\/a>\n\t\t<\/span><span class=\"aioseo-breadcrumb-separator\">&raquo;<\/span><span class=\"aioseo-breadcrumb\">\n\t\t\tUnderstanding TF-IDF and BM-25\n\t\t<\/span><\/div>","aioseo_breadcrumb_json":[{"label":"Home","link":"https:\/\/kmwllc.com"},{"label":"Search","link":"https:\/\/kmwllc.com\/index.php\/category\/search\/"},{"label":"Understanding TF-IDF and BM-25","link":"https:\/\/kmwllc.com\/index.php\/2020\/03\/20\/understanding-tf-idf-and-bm-25\/"}],"post_meta_fields":{"_wp_page_template":["elementor_theme"],"_edit_lock":["1714530277:7"],"_zilla_likes":["29"],"_wp_old_date":["2021-03-13"],"_edit_last":["7"],"_customize_sidebars":["yes"],"thegem_page_data":["a:190:{s:10:\"title_show\";s:7:\"default\";s:11:\"title_style\";s:1:\"2\";s:14:\"title_template\";s:5:\"27835\";s:23:\"title_use_page_settings\";i:0;s:12:\"title_xlarge\";i:0;s:18:\"title_rich_content\";i:0;s:13:\"title_content\";s:0:\"\";s:21:\"title_background_type\";s:5:\"color\";s:22:\"title_background_image\";s:0:\"\";s:29:\"title_background_image_repeat\";i:0;s:27:\"title_background_position_x\";s:6:\"center\";s:27:\"title_background_position_y\";s:3:\"top\";s:21:\"title_background_size\";s:5:\"cover\";s:28:\"title_background_image_color\";s:0:\"\";s:30:\"title_background_image_overlay\";s:0:\"\";s:30:\"title_background_gradient_type\";s:6:\"linear\";s:31:\"title_background_gradient_angle\";i:90;s:34:\"title_background_gradient_position\";s:13:\"center 
center\";s:38:\"title_background_gradient_point1_color\";s:9:\"#00BCD4BF\";s:41:\"title_background_gradient_point1_position\";i:0;s:38:\"title_background_gradient_point2_color\";s:9:\"#354093BF\";s:41:\"title_background_gradient_point2_position\";i:100;s:23:\"title_background_effect\";s:6:\"normal\";s:36:\"title_background_ken_burns_direction\";s:7:\"zoom_in\";s:43:\"title_background_ken_burns_transition_speed\";i:15000;s:37:\"title_background_video_play_on_mobile\";i:0;s:22:\"title_background_color\";s:7:\"#333144\";s:27:\"title_background_video_type\";s:0:\"\";s:22:\"title_background_video\";s:0:\"\";s:35:\"title_background_video_aspect_ratio\";s:0:\"\";s:36:\"title_background_video_overlay_color\";s:0:\"\";s:38:\"title_background_video_overlay_opacity\";s:0:\"\";s:29:\"title_background_video_poster\";s:0:\"\";s:19:\"title_menu_on_video\";s:0:\"\";s:16:\"title_text_color\";s:7:\"#ffffff\";s:24:\"title_excerpt_text_color\";s:7:\"#ffffff\";s:13:\"title_excerpt\";s:0:\"\";s:17:\"title_title_width\";i:0;s:19:\"title_excerpt_width\";i:0;s:22:\"title_font_preset_html\";s:0:\"\";s:23:\"title_font_preset_style\";s:0:\"\";s:24:\"title_font_preset_weight\";s:0:\"\";s:27:\"title_font_preset_transform\";s:0:\"\";s:30:\"title_excerpt_font_preset_html\";s:0:\"\";s:31:\"title_excerpt_font_preset_style\";s:0:\"\";s:32:\"title_excerpt_font_preset_weight\";s:0:\"\";s:35:\"title_excerpt_font_preset_transform\";s:0:\"\";s:17:\"title_padding_top\";i:80;s:24:\"title_padding_top_tablet\";i:80;s:24:\"title_padding_top_mobile\";i:80;s:20:\"title_padding_bottom\";i:80;s:27:\"title_padding_bottom_tablet\";i:80;s:27:\"title_padding_bottom_mobile\";i:80;s:18:\"title_padding_left\";i:0;s:25:\"title_padding_left_tablet\";i:0;s:25:\"title_padding_left_mobile\";i:0;s:19:\"title_padding_right\";i:0;s:26:\"title_padding_right_tablet\";i:0;s:26:\"title_padding_right_mobile\";i:0;s:16:\"title_top_margin\";s:0:\"\";s:23:\"title_top_margin_tablet\";i:0;s:23:\"title_top_margin_mobile\";i:0;s:24:\"title_
excerpt_top_margin\";i:18;s:31:\"title_excerpt_top_margin_tablet\";i:18;s:31:\"title_excerpt_top_margin_mobile\";i:18;s:17:\"title_breadcrumbs\";i:1;s:15:\"title_alignment\";s:0:\"\";s:15:\"title_icon_pack\";s:7:\"elegant\";s:10:\"title_icon\";s:0:\"\";s:16:\"title_icon_color\";s:0:\"\";s:18:\"title_icon_color_2\";s:0:\"\";s:27:\"title_icon_background_color\";s:0:\"\";s:16:\"title_icon_shape\";s:6:\"circle\";s:23:\"title_icon_border_color\";s:0:\"\";s:15:\"title_icon_size\";s:5:\"large\";s:16:\"title_icon_style\";s:0:\"\";s:18:\"title_icon_opacity\";d:0;s:25:\"breadcrumbs_default_color\";s:0:\"\";s:24:\"breadcrumbs_active_color\";s:0:\"\";s:23:\"breadcrumbs_hover_color\";s:0:\"\";s:27:\"title_breadcrumbs_alignment\";s:6:\"center\";s:18:\"header_transparent\";i:0;s:14:\"header_opacity\";i:0;s:22:\"header_menu_logo_light\";i:0;s:20:\"header_hide_top_area\";s:7:\"default\";s:27:\"header_hide_top_area_tablet\";s:7:\"default\";s:27:\"header_hide_top_area_mobile\";s:7:\"default\";s:9:\"menu_show\";s:7:\"default\";s:12:\"menu_options\";s:7:\"default\";s:18:\"header_custom_menu\";i:0;s:27:\"header_top_area_transparent\";i:0;s:23:\"header_top_area_opacity\";i:0;s:16:\"top_area_options\";s:7:\"default\";s:13:\"header_source\";s:7:\"default\";s:14:\"header_builder\";s:1:\"0\";s:29:\"header_builder_sticky_desktop\";i:0;s:28:\"header_builder_sticky_mobile\";i:0;s:34:\"header_builder_sticky_hide_desktop\";i:0;s:33:\"header_builder_sticky_hide_mobile\";i:1;s:21:\"header_builder_sticky\";s:1:\"0\";s:29:\"header_builder_sticky_opacity\";i:80;s:26:\"header_builder_light_color\";s:7:\"#FFFFFF\";s:32:\"header_builder_light_color_hover\";s:7:\"#00bcd4\";s:20:\"main_background_type\";b:0;s:21:\"main_background_color\";s:7:\"#ffffff\";s:21:\"main_background_image\";s:0:\"\";s:28:\"main_background_image_repeat\";i:0;s:26:\"main_background_position_x\";s:6:\"center\";s:26:\"main_background_position_y\";s:6:\"center\";s:20:\"main_background_size\";s:4:\"auto\";s:27:\"main_background_image_co
lor\";s:0:\"\";s:29:\"main_background_image_overlay\";s:0:\"\";s:29:\"main_background_gradient_type\";s:6:\"linear\";s:30:\"main_background_gradient_angle\";i:90;s:33:\"main_background_gradient_position\";s:0:\"\";s:37:\"main_background_gradient_point1_color\";s:9:\"#E9ECDAFF\";s:40:\"main_background_gradient_point1_position\";i:0;s:37:\"main_background_gradient_point2_color\";s:9:\"#D5F6FAFF\";s:40:\"main_background_gradient_point2_position\";i:100;s:23:\"main_background_pattern\";s:0:\"\";s:19:\"content_padding_top\";i:135;s:26:\"content_padding_top_tablet\";s:0:\"\";s:26:\"content_padding_top_mobile\";s:0:\"\";s:22:\"content_padding_bottom\";i:110;s:29:\"content_padding_bottom_tablet\";s:0:\"\";s:29:\"content_padding_bottom_mobile\";s:0:\"\";s:20:\"content_area_options\";s:7:\"default\";s:18:\"footer_custom_show\";s:7:\"default\";s:13:\"footer_custom\";s:5:\"24822\";s:19:\"footer_hide_default\";s:7:\"default\";s:23:\"footer_hide_widget_area\";s:7:\"default\";s:16:\"effects_disabled\";i:0;s:17:\"effects_one_pager\";i:0;s:23:\"effects_parallax_footer\";i:0;s:24:\"effects_no_bottom_margin\";i:0;s:21:\"effects_no_top_margin\";i:0;s:19:\"redirect_to_subpage\";i:0;s:19:\"effects_hide_header\";s:7:\"default\";s:19:\"effects_hide_footer\";s:7:\"default\";s:21:\"effects_page_scroller\";i:0;s:28:\"effects_page_scroller_mobile\";i:0;s:26:\"effects_page_scroller_type\";s:8:\"advanced\";s:22:\"fullpage_disabled_dots\";i:0;s:19:\"fullpage_style_dots\";s:7:\"outline\";s:31:\"fullpage_disabled_tooltips_dots\";i:0;s:25:\"fullpage_fixed_background\";b:0;s:26:\"fullpage_enable_continuous\";i:0;s:24:\"fullpage_disabled_mobile\";i:0;s:22:\"fullpage_scroll_effect\";s:6:\"normal\";s:21:\"enable_page_preloader\";s:7:\"default\";s:14:\"slideshow_type\";s:0:\"\";s:19:\"slideshow_slideshow\";s:0:\"\";s:21:\"slideshow_layerslider\";s:0:\"\";s:19:\"slideshow_revslider\";s:0:\"\";s:19:\"slideshow_preloader\";i:1;s:12:\"sidebar_show\";s:8:\"disabled\";s:16:\"sidebar_position\";s:4:\"left\";s:1
4:\"sidebar_sticky\";i:0;s:24:\"product_header_separator\";i:0;s:23:\"page_layout_breadcrumbs\";s:7:\"default\";s:37:\"page_layout_breadcrumbs_default_color\";s:9:\"#99A9B5FF\";s:36:\"page_layout_breadcrumbs_active_color\";s:9:\"#3C3950FF\";s:35:\"page_layout_breadcrumbs_hover_color\";s:9:\"#3C3950FF\";s:33:\"page_layout_breadcrumbs_alignment\";s:4:\"left\";s:38:\"page_layout_breadcrumbs_bottom_spacing\";s:1:\"0\";s:37:\"page_layout_breadcrumbs_shop_category\";i:0;s:18:\"delay_js_execution\";i:0;s:13:\"disable_cache\";i:0;s:27:\"title_video_overlay_opacity\";s:0:\"\";s:31:\"title_breadcrumbs_shop_category\";s:1:\"0\";s:20:\"title_padding_locked\";s:0:\"\";s:27:\"title_padding_tablet_locked\";s:0:\"\";s:27:\"title_padding_mobile_locked\";s:0:\"\";s:24:\"title_background_pattern\";s:0:\"\";s:30:\"title_background_video_overlay\";s:0:\"\";s:16:\"title_icon__pack\";s:0:\"\";s:21:\"title_icon_shape_show\";s:0:\"\";s:25:\"footer_widget_woocommerce\";s:1:\"1\";s:19:\"portfolio_item_data\";a:9:{s:8:\"back_url\";s:0:\"\";s:9:\"highlight\";s:0:\"\";s:14:\"highlight_type\";s:0:\"\";s:14:\"overview_title\";s:0:\"\";s:16:\"overview_summary\";s:0:\"\";s:12:\"project_link\";s:0:\"\";s:12:\"project_text\";s:0:\"\";s:9:\"fullwidth\";s:0:\"\";s:19:\"project_button_show\";s:0:\"\";}s:23:\"portfolio_elements_data\";a:7:{s:23:\"portfolio_page_elements\";s:7:\"default\";s:19:\"portfolio_hide_date\";s:0:\"\";s:19:\"portfolio_hide_sets\";s:0:\"\";s:20:\"portfolio_hide_likes\";s:0:\"\";s:22:\"portfolio_hide_socials\";s:0:\"\";s:29:\"portfolio_hide_top_navigation\";s:0:\"\";s:32:\"portfolio_hide_bottom_navigation\";s:0:\"\";}s:17:\"product_item_data\";a:109:{s:9:\"highlight\";s:0:\"\";s:14:\"highlight_type\";s:7:\"squared\";s:28:\"thegem_product_disable_hover\";s:1:\"0\";s:10:\"size_guide\";s:7:\"default\";s:16:\"size_guide_image\";s:0:\"\";s:23:\"product_layout_settings\";s:7:\"default\";s:21:\"product_layout_source\";s:7:\"default\";s:24:\"product_builder_template\";s:0:\"\";s:19:\"product
_page_layout\";s:7:\"default\";s:25:\"product_page_layout_style\";s:15:\"horizontal_tabs\";s:28:\"product_page_layout_centered\";s:1:\"0\";s:39:\"product_page_layout_centered_top_margin\";s:2:\"42\";s:34:\"product_page_layout_centered_boxed\";s:1:\"0\";s:40:\"product_page_layout_centered_boxed_color\";s:0:\"\";s:29:\"product_page_layout_fullwidth\";s:1:\"0\";s:26:\"product_page_layout_sticky\";s:1:\"0\";s:33:\"product_page_layout_sticky_offset\";s:1:\"0\";s:28:\"product_page_skeleton_loader\";s:1:\"0\";s:30:\"product_page_layout_background\";s:0:\"\";s:30:\"product_page_layout_title_area\";s:8:\"disabled\";s:29:\"product_page_ajax_add_to_cart\";s:1:\"1\";s:31:\"product_page_desc_review_source\";s:17:\"extra_description\";s:31:\"product_page_desc_review_layout\";s:4:\"tabs\";s:42:\"product_page_desc_review_layout_tabs_style\";s:10:\"horizontal\";s:46:\"product_page_desc_review_layout_tabs_alignment\";s:4:\"left\";s:44:\"product_page_desc_review_layout_acc_position\";s:13:\"below_gallery\";s:65:\"product_page_desc_review_layout_one_by_one_description_background\";s:9:\"#F4F6F7FF\";s:69:\"product_page_desc_review_layout_one_by_one_additional_info_background\";s:9:\"#FFFFFFFF\";s:61:\"product_page_desc_review_layout_one_by_one_reviews_background\";s:9:\"#F4F6F7FF\";s:36:\"product_page_desc_review_description\";s:1:\"1\";s:42:\"product_page_desc_review_description_title\";s:11:\"Description\";s:40:\"product_page_desc_review_additional_info\";s:1:\"1\";s:46:\"product_page_desc_review_additional_info_title\";s:15:\"Additional Info\";s:32:\"product_page_desc_review_reviews\";s:1:\"1\";s:38:\"product_page_desc_review_reviews_title\";s:7:\"Reviews\";s:36:\"product_page_button_add_to_cart_text\";s:11:\"Add to 
Cart\";s:36:\"product_page_button_add_to_cart_icon\";s:4:\"f1e7\";s:41:\"product_page_button_add_to_cart_icon_pack\";s:8:\"material\";s:45:\"product_page_button_add_to_cart_icon_position\";s:4:\"left\";s:40:\"product_page_button_add_to_wishlist_icon\";s:4:\"f37b\";s:45:\"product_page_button_add_to_wishlist_icon_pack\";s:8:\"material\";s:42:\"product_page_button_added_to_wishlist_icon\";s:4:\"f377\";s:47:\"product_page_button_added_to_wishlist_icon_pack\";s:8:\"material\";s:41:\"product_page_button_clear_attributes_text\";s:15:\"Clear selection\";s:31:\"product_page_elements_prev_next\";s:1:\"1\";s:38:\"product_page_elements_preview_on_hover\";s:1:\"1\";s:34:\"product_page_elements_back_to_shop\";s:1:\"1\";s:39:\"product_page_elements_back_to_shop_link\";s:9:\"main_shop\";s:50:\"product_page_elements_back_to_shop_link_custom_url\";s:0:\"\";s:27:\"product_page_elements_title\";s:1:\"1\";s:32:\"product_page_elements_attributes\";s:1:\"0\";s:37:\"product_page_elements_attributes_data\";s:0:\"\";s:29:\"product_page_elements_reviews\";s:1:\"1\";s:34:\"product_page_elements_reviews_text\";s:16:\"customer reviews\";s:27:\"product_page_elements_price\";s:1:\"1\";s:41:\"product_page_elements_price_strikethrough\";s:1:\"1\";s:33:\"product_page_elements_description\";s:1:\"1\";s:34:\"product_page_elements_stock_amount\";s:1:\"1\";s:39:\"product_page_elements_stock_amount_text\";s:17:\"Products in 
stock\";s:32:\"product_page_elements_size_guide\";s:1:\"1\";s:25:\"product_page_elements_sku\";s:1:\"1\";s:31:\"product_page_elements_sku_title\";s:3:\"SKU\";s:32:\"product_page_elements_categories\";s:1:\"1\";s:38:\"product_page_elements_categories_title\";s:10:\"Categories\";s:26:\"product_page_elements_tags\";s:1:\"1\";s:32:\"product_page_elements_tags_title\";s:4:\"Tags\";s:27:\"product_page_elements_share\";s:1:\"1\";s:33:\"product_page_elements_share_title\";s:5:\"Share\";s:36:\"product_page_elements_share_facebook\";s:1:\"1\";s:35:\"product_page_elements_share_twitter\";s:1:\"1\";s:37:\"product_page_elements_share_pinterest\";s:1:\"1\";s:34:\"product_page_elements_share_tumblr\";s:1:\"1\";s:36:\"product_page_elements_share_linkedin\";s:1:\"1\";s:34:\"product_page_elements_share_reddit\";s:1:\"1\";s:28:\"product_page_elements_upsell\";s:1:\"1\";s:34:\"product_page_elements_upsell_title\";s:17:\"You may also like\";s:44:\"product_page_elements_upsell_title_alignment\";s:4:\"left\";s:34:\"product_page_elements_upsell_items\";s:2:\"-1\";s:44:\"product_page_elements_upsell_columns_desktop\";s:2:\"4x\";s:43:\"product_page_elements_upsell_columns_tablet\";s:2:\"3x\";s:43:\"product_page_elements_upsell_columns_mobile\";s:2:\"2x\";s:40:\"product_page_elements_upsell_columns_100\";s:1:\"5\";s:29:\"product_page_elements_related\";s:1:\"1\";s:35:\"product_page_elements_related_title\";s:16:\"Related 
Products\";s:45:\"product_page_elements_related_title_alignment\";s:4:\"left\";s:35:\"product_page_elements_related_items\";s:2:\"-1\";s:45:\"product_page_elements_related_columns_desktop\";s:2:\"4x\";s:44:\"product_page_elements_related_columns_tablet\";s:2:\"3x\";s:44:\"product_page_elements_related_columns_mobile\";s:2:\"2x\";s:41:\"product_page_elements_related_columns_100\";s:1:\"5\";s:15:\"product_gallery\";s:7:\"enabled\";s:20:\"product_gallery_type\";s:10:\"horizontal\";s:31:\"product_gallery_column_position\";s:4:\"left\";s:28:\"product_gallery_column_width\";s:2:\"50\";s:26:\"product_gallery_show_image\";s:5:\"hover\";s:20:\"product_gallery_zoom\";s:1:\"1\";s:24:\"product_gallery_lightbox\";s:1:\"1\";s:22:\"product_gallery_labels\";s:1:\"1\";s:26:\"product_gallery_label_sale\";s:1:\"1\";s:25:\"product_gallery_label_new\";s:1:\"1\";s:31:\"product_gallery_label_out_stock\";s:1:\"1\";s:27:\"product_gallery_auto_height\";s:1:\"1\";s:30:\"product_gallery_elements_color\";s:0:\"\";s:28:\"product_gallery_grid_columns\";s:2:\"1x\";s:25:\"product_gallery_grid_gaps\";s:2:\"42\";s:30:\"product_gallery_grid_gaps_hide\";s:1:\"0\";s:31:\"product_gallery_grid_top_margin\";s:1:\"0\";s:30:\"product_gallery_video_autoplay\";s:1:\"0\";s:15:\"size_guide_text\";s:10:\"Size 
guide\";}s:25:\"product_archive_item_data\";a:2:{s:29:\"product_archive_layout_source\";s:7:\"default\";s:32:\"product_archive_builder_template\";s:0:\"\";}s:22:\"blog_archive_item_data\";a:2:{s:26:\"blog_archive_layout_source\";s:7:\"default\";s:29:\"blog_archive_builder_template\";s:0:\"\";}s:24:\"options_current_contents\";N;s:16:\"options_modified\";N;s:34:\"options_outside_parameter_modified\";b:0;s:22:\"options_saved_contents\";N;s:8:\"settings\";a:3:{s:5:\"theme\";s:5:\"light\";s:24:\"background_image_gallery\";a:0:{}s:21:\"colorpicker_favorites\";a:1:{s:7:\"default\";a:0:{}}}s:26:\"delay_js_execution_desktop\";s:1:\"0\";s:25:\"delay_js_execution_mobile\";s:1:\"0\";}"],"thegem_post_general_item_data":["a:22:{s:20:\"post_layout_settings\";s:7:\"default\";s:18:\"post_layout_source\";s:7:\"default\";s:21:\"post_builder_template\";s:1:\"0\";s:26:\"show_featured_posts_slider\";i:0;s:21:\"show_featured_content\";s:8:\"disabled\";s:10:\"video_type\";s:7:\"youtube\";s:5:\"video\";s:0:\"\";s:18:\"video_aspect_ratio\";s:0:\"\";s:10:\"quote_text\";s:0:\"\";s:12:\"quote_author\";s:0:\"\";s:16:\"quote_background\";s:0:\"\";s:18:\"quote_author_color\";s:0:\"\";s:5:\"audio\";s:0:\"\";s:7:\"gallery\";i:0;s:18:\"gallery_autoscroll\";i:0;s:9:\"highlight\";i:0;s:14:\"highlight_type\";s:7:\"squared\";s:15:\"highlight_style\";s:7:\"default\";s:31:\"highlight_title_left_background\";s:0:\"\";s:26:\"highlight_title_left_color\";s:0:\"\";s:32:\"highlight_title_right_background\";s:0:\"\";s:27:\"highlight_title_right_color\";s:0:\"\";}"],"thegem_show_featured_posts_slider":["0"],"_thumbnail_id":["29716"],"_oembed_79d813f23eb66ee4bdc588fa6494d268":["{{unknown}}"],"thegem_page_data_old":["a:46:{s:11:\"title_style\";s:1:\"1\";s:14:\"title_template\";i:0;s:23:\"title_use_page_settings\";i:0;s:12:\"title_xlarge\";i:0;s:18:\"title_rich_content\";i:0;s:13:\"title_content\";s:0:\"\";s:22:\"title_background_image\";s:0:\"\";s:25:\"title_background_parallax\";i:0;s:22:\"title_background_color\
";s:0:\"\";s:16:\"title_video_type\";s:0:\"\";s:22:\"title_video_background\";s:0:\"\";s:24:\"title_video_aspect_ratio\";s:0:\"\";s:25:\"title_video_overlay_color\";s:0:\"\";s:27:\"title_video_overlay_opacity\";s:0:\"\";s:18:\"title_video_poster\";s:0:\"\";s:19:\"title_menu_on_video\";s:0:\"\";s:16:\"title_text_color\";s:0:\"\";s:24:\"title_excerpt_text_color\";s:0:\"\";s:13:\"title_excerpt\";s:0:\"\";s:17:\"title_title_width\";s:1:\"0\";s:19:\"title_excerpt_width\";s:1:\"0\";s:17:\"title_padding_top\";s:2:\"80\";s:20:\"title_padding_bottom\";s:2:\"80\";s:16:\"title_top_margin\";s:1:\"0\";s:24:\"title_excerpt_top_margin\";s:2:\"18\";s:17:\"title_breadcrumbs\";s:1:\"1\";s:15:\"title_alignment\";s:0:\"\";s:15:\"title_icon_pack\";s:7:\"elegant\";s:10:\"title_icon\";s:0:\"\";s:16:\"title_icon_color\";s:0:\"\";s:18:\"title_icon_color_2\";s:0:\"\";s:27:\"title_icon_background_color\";s:0:\"\";s:16:\"title_icon_shape\";s:6:\"circle\";s:23:\"title_icon_border_color\";s:0:\"\";s:15:\"title_icon_size\";s:5:\"large\";s:16:\"title_icon_style\";s:0:\"\";s:18:\"title_icon_opacity\";d:0;s:14:\"header_opacity\";s:1:\"0\";s:23:\"header_top_area_opacity\";s:1:\"0\";s:13:\"footer_custom\";s:5:\"24822\";s:16:\"sidebar_position\";s:0:\"\";s:14:\"slideshow_type\";s:0:\"\";s:19:\"slideshow_slideshow\";s:0:\"\";s:19:\"fullpage_style_dots\";s:7:\"outline\";s:22:\"fullpage_scroll_effect\";s:6:\"normal\";s:14:\"sidebar_sticky\";i:0;}"],"thegem_post_general_item_data_old":["a:19:{s:26:\"show_featured_posts_slider\";i:0;s:21:\"show_featured_content\";i:0;s:10:\"video_type\";s:7:\"youtube\";s:5:\"video\";s:0:\"\";s:18:\"video_aspect_ratio\";s:0:\"\";s:10:\"quote_text\";s:0:\"\";s:12:\"quote_author\";s:0:\"\";s:16:\"quote_background\";s:0:\"\";s:18:\"quote_author_color\";s:0:\"\";s:5:\"audio\";s:0:\"\";s:7:\"gallery\";i:0;s:18:\"gallery_autoscroll\";i:0;s:9:\"highlight\";i:0;s:14:\"highlight_type\";s:7:\"squared\";s:15:\"highlight_style\";s:7:\"default\";s:31:\"highlight_title_left_background\";s
:0:\"\";s:26:\"highlight_title_left_color\";s:0:\"\";s:32:\"highlight_title_right_background\";s:0:\"\";s:27:\"highlight_title_right_color\";s:0:\"\";}"],"thegem_post_page_elements_data":["a:12:{s:13:\"post_elements\";s:7:\"default\";s:11:\"show_author\";i:0;s:16:\"blog_hide_author\";i:0;s:14:\"blog_hide_date\";i:0;s:26:\"blog_hide_date_in_blog_cat\";i:0;s:20:\"blog_hide_categories\";i:0;s:14:\"blog_hide_tags\";i:0;s:18:\"blog_hide_comments\";i:0;s:15:\"blog_hide_likes\";i:0;s:20:\"blog_hide_navigation\";i:0;s:17:\"blog_hide_socials\";i:0;s:17:\"blog_hide_realted\";i:0;}"],"_elementor_edit_mode":["builder"],"_elementor_template_type":["wp-post"],"_elementor_version":["3.18.3"],"_elementor_data":["[{\"id\":\"3568240c\",\"elType\":\"section\",\"settings\":{\"stretch_section\":\"section-stretched\"},\"elements\":[{\"id\":\"20788c4a\",\"elType\":\"column\",\"settings\":{\"_column_size\":100,\"thegem_column_breakpoints_list\":[]},\"elements\":[{\"id\":\"46ee24a0\",\"elType\":\"widget\",\"settings\":{\"editor\":\"<!-- wp:heading {\\\"level\\\":4} -->\\n<h4>Introduction<\\\/h4>\\n<!-- \\\/wp:heading -->\\n\\n<!-- wp:paragraph -->\\n<p>This article is for search practitioners who want to achieve a deep understanding of the ranking functions TF-IDF and BM25 (also called \\u201csimilarities\\u201d in Lucene). If you\\u2019re like many practitioners, you\\u2019re already familiar with TF-IDF, but when you first saw the complicated BM25 formula, you thought \\u201cmaybe later.\\u201d Now is the time to finally understand it! You\\u2019ve probably heard that BM25 is similar to TF-IDF but works better in practice. This article will show you precisely how BM25 builds upon TF-IDF, what its parameters do, and why it is so effective. 
If you\\u2019d rather skip over the math and work with practical examples that demonstrate BM25\\u2019s behaviors, check out our companion article on\\u00a0<a href=\\\"https:\\\/\\\/kmwllc.com\\\/index.php\\\/2020\\\/03\\\/10\\\/understanding-scoring-through-examples\\\/\\\">Understanding Scoring Through Examples<\\\/a>.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:heading {\\\"level\\\":4} -->\\n<h4>Reviewing TF-IDF<\\\/h4>\\n<!-- \\\/wp:heading -->\\n\\n<!-- wp:paragraph -->\\n<p>Let\\u2019s review TF-IDF by trying to develop it from scratch. Imagine we\\u2019re building a search engine. Assume we\\u2019ve already got a way to find the documents that&nbsp;<em>match<\\\/em>&nbsp;a user\\u2019s search. What we need now is a&nbsp;<strong>ranking function<\\\/strong>&nbsp;that will tell us how to order those documents. The higher a document\\u2019s score according to this function, the higher up we\\u2019ll place it in the list of results that we return to the user.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>The goal of TF-IDF and similar ranking functions is to reward&nbsp;<em>relevance<\\\/em>. Say a user searches for the term \\u201cdogs.\\u201d If Document 1 is more relevant to the subject of dogs than Document 2, then we want the score of Document 1 to be higher than the score of Document 2, so we\\u2019ll show the better result first and the user will be happy. How much higher does Document 1\\u2019s score have to be? It doesn\\u2019t really matter, as long as the score order matches the relevance order.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>You might feel a little shocked by the audacity of what we\\u2019re attempting to do: we\\u2019re going to try to judge the relevance of millions or billions of documents using a mathematical function, without knowing anything about the person who\\u2019s doing the search, and without actually reading the documents and understanding what they\\u2019re about! 
How is this possible?<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>We\\u2019ll make a simple but profoundly helpful assumption. We\\u2019ll assume that the more times a document contains a term, the more likely it is to be&nbsp;<em>about<\\\/em>&nbsp;that term. That\\u2019s to say, we\\u2019ll use&nbsp;<strong>term frequency (TF)<\\\/strong>, the number of occurrences of a term in a document, as a proxy for relevance. This one assumption creates a path for us to solve a seemingly impossible problem using simple math. Our assumption isn\\u2019t perfect, and it goes very wrong sometimes, but it works often enough to be useful. So from here on, we\\u2019ll view term frequency as a good thing\\u200a\\u2014\\u200aa thing we want to reward.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p><strong>TF-IDF: Attempt 1<\\\/strong><\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>As a starting point for our ranking function, let\\u2019s do the simplest, easiest thing possible. We\\u2019ll set the score of a document equal to its term frequency. If we\\u2019re searching for a term T and evaluating the relevance of a document D, then:<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:preformatted -->\\n<pre class=\\\"wp-block-preformatted\\\"><mark style=\\\"background-color:rgba(0, 0, 0, 0)\\\" class=\\\"has-inline-color has-vivid-cyan-blue-color\\\">score(D, T) = termFrequency(D, T)<\\\/mark><\\\/pre>\\n<!-- \\\/wp:preformatted -->\\n\\n<!-- wp:paragraph -->\\n<p>When a query has multiple terms, like \\u201cdogs and cats,\\u201d how should we handle that? Should we try to analyze the relationships between the various terms and then blend the per-term scores together in a complex way? Not so fast! The simplest approach is to just add the scores for each term together. So we\\u2019ll do that, and hope for the best. 
If we have a multi-term query Q, then we\\u2019ll set:<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:preformatted -->\\n<pre class=\\\"wp-block-preformatted\\\"><mark style=\\\"background-color:rgba(0, 0, 0, 0)\\\" class=\\\"has-inline-color has-vivid-cyan-blue-color\\\">score(D, Q) = sum over all terms T in Q of score(D, T)<\\\/mark>\\n<\\\/pre>\\n<!-- \\\/wp:preformatted -->\\n\\n<!-- wp:paragraph -->\\n<p>How well does our simple ranking function work? Unfortunately, it\\u2019s got some problems:<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>1) Longer documents are given an unfair advantage over shorter ones because they have more space to include more occurrences of a term, even though they might not be more relevant to the term. Let\\u2019s ignore this problem for now.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>2) All terms in a query are treated equally, with no consideration for which ones are more meaningful or important. When we sum the scores for each term together, insignificant terms like \\u201cand\\u201d and \\u201cthe\\u201d which happen to be very frequent will dominate the combined score. Say you search for \\u201celephants and cows.\\u201d Perhaps there\\u2019s a single document in the index that includes all three terms (\\u201celephants\\u201d, \\u201cand\\u201d, \\u201ccows\\u201d), but instead of seeing this ideal result first, you see the document that has the most occurrences of \\u201cand\\u201d\\u200a\\u2014\\u200amaybe it has 10,000 of them. This preference for filler words is clearly not what we want.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p><strong>TF-IDF: Attempt 2<\\\/strong><\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>To prevent filler words from dominating, we need some way of judging the&nbsp;<em>importance<\\\/em>&nbsp;of the terms in a query. 
Since we can\\u2019t encode an understanding of natural language into our scoring function, we\\u2019ll try to find a proxy for importance. Our best bet is&nbsp;<em>rarity<\\\/em>. If a term doesn\\u2019t occur in most documents in the corpus, then whenever it does occur, we\\u2019ll guess that this occurrence is significant. On the other hand, if a term occurs in most of the documents in our corpus, then the presence of that term in any particular document will lose its value as an indicator of relevance.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>So high term frequency is a good thing, but its goodness is offset by high&nbsp;<strong>document frequency (DF)\\u200a<\\\/strong>\\u2014\\u200athe number of documents that contain the term\\u200a\\u2014\\u200awhich we\\u2019ll think of as a bad thing.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>To update our function in a way that rewards term frequency but penalizes document frequency, we could try dividing TF by DF:<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:preformatted -->\\n<pre class=\\\"wp-block-preformatted\\\"><mark style=\\\"background-color:rgba(0, 0, 0, 0)\\\" class=\\\"has-inline-color has-vivid-cyan-blue-color\\\">score(D, T) = termFrequency(D, T) \\\/ docFrequency(T)<\\\/mark><\\\/pre>\\n<!-- \\\/wp:preformatted -->\\n\\n<!-- wp:paragraph -->\\n<p>What\\u2019s wrong with this? Unfortunately, DF by itself tells us nothing. If DF for the term \\u201celephant\\u201d is 100, then is \\u201celephant\\u201d a rare term or a common term? It depends on the size of the corpus. 
If the corpus contains 100 documents, \\u201celephant\\u201d is common; if it contains 100,000 documents, \\u201celephant\\u201d is rare.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p><strong>TF-IDF: Attempt 3<\\\/strong><\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>Instead of looking at DF by itself, let\\u2019s look at N\\\/DF, where N is the size of the search index or corpus. Notice how N\\\/DF is low for common terms (a DF of 100 for \\u201celephant\\u201d in a corpus of size 100 would give N\\\/DF = 1), and high for rare ones (a DF of 100 for \\u201celephant\\u201d in a corpus of size 100,000 would give N\\\/DF = 1000). That\\u2019s exactly what we want: matches for common terms should get low scores, matches for rare terms should get high ones. Our improved formula might go like this:<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:preformatted -->\\n<pre class=\\\"wp-block-preformatted\\\"><mark style=\\\"background-color:rgba(0, 0, 0, 0)\\\" class=\\\"has-inline-color has-vivid-cyan-blue-color\\\">score(D, T) = termFrequency(D, T) * (N \\\/ docFrequency(T))<\\\/mark><\\\/pre>\\n<!-- \\\/wp:preformatted -->\\n\\n<!-- wp:paragraph -->\\n<p>We\\u2019re doing better, but let\\u2019s take a closer look at how N\\\/DF behaves. Say we have 100 documents and \\u201celephant\\u201d occurs in 1 of them while \\u201cgiraffe\\u201d occurs in 2 of them. Both terms are similarly rare, but elephant\\u2019s N\\\/DF value would come out to 100 and giraffe\\u2019s would be half that, at 50. Should a match for giraffe get half the score of a match for elephant just because giraffe\\u2019s document frequency is one higher than elephant\\u2019s? The penalty for one additional document containing the word seems too high. 
Arguably, if we have 100 documents, it shouldn\\u2019t make much of a difference whether a term\\u2019s DF is 1, 2, 3, or 4.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p><strong>TF-IDF: Attempt 4<\\\/strong><\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>As we\\u2019ve seen, when DF is in a very low range, small differences in DF can have a dramatic impact on N\\\/DF and hence on the score. We might like to smooth out the decline of N\\\/DF when DF is at the lowest end of its range. One way to do this is to take the&nbsp;<strong>log<\\\/strong>&nbsp;of N\\\/DF. If we wanted, we could try to use a different smoothing function here, but log is straightforward and it does what we want. This chart compares N\\\/DF and log(N\\\/DF) assuming N=100:<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:image {\\\"id\\\":26402,\\\"sizeSlug\\\":\\\"large\\\",\\\"linkDestination\\\":\\\"none\\\"} -->\\n<figure class=\\\"wp-block-image size-large\\\"><img src=\\\"https:\\\/\\\/kmwllc.com\\\/wp-content\\\/uploads\\\/2021\\\/05\\\/ApplyingLogToIDF-1024x707.png\\\" alt=\\\"\\\" class=\\\"wp-image-26402\\\"\\\/><\\\/figure>\\n<!-- \\\/wp:image -->\\n\\n<!-- wp:paragraph -->\\n<p>Let\\u2019s call log(N\\\/DF) the&nbsp;<strong>inverse document frequency (IDF)<\\\/strong>&nbsp;of a term. Our ranking function can now be expressed as TF * IDF or:<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:preformatted -->\\n<pre class=\\\"wp-block-preformatted\\\"><mark style=\\\"background-color:rgba(0, 0, 0, 0)\\\" class=\\\"has-inline-color has-vivid-cyan-blue-color\\\">score(D, T) = termFrequency(D, T) * log(N \\\/ docFrequency(T))<\\\/mark><\\\/pre>\\n<!-- \\\/wp:preformatted -->\\n\\n<!-- wp:paragraph -->\\n<p>We\\u2019ve arrived at the traditional definition of TF-IDF and even though we made some bold assumptions to get here, the function works pretty well in practice: it has gathered a long track record of successful application in search engines. 
Are we done or could we do even better?<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:heading {\\\"level\\\":4} -->\\n<h4>Developing BM25<\\\/h4>\\n<!-- \\\/wp:heading -->\\n\\n<!-- wp:paragraph -->\\n<p>As you might have guessed, we\\u2019re not ready to stop at TF-IDF. In this section, we\\u2019ll build the BM25 function, which can be seen as an improvement on TF-IDF. We\\u2019re going to keep the same structure of the TF * IDF formula, but we\\u2019ll replace the TF and IDF components with refinements of those values.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p><strong>Step 1: Term Saturation<\\\/strong><\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>We\\u2019ve been saying that TF is a good thing, and indeed our TF-IDF formula rewards it. But if a document contains 200 occurrences of \\u201celephant,\\u201d is it really&nbsp;<em>twice<\\\/em>&nbsp;as relevant as a document that contains 100 occurrences? We could argue that if \\u201celephant\\u201d occurs a large enough number of times, say 100, the document is almost certainly relevant, and any further mentions don\\u2019t really increase the likelihood of relevance. To put it a different way, once a document is&nbsp;<em>saturated<\\\/em>&nbsp;with occurrences of a term, more occurrences shouldn\\u2019t have a significant impact on the score. So we\\u2019d like a way to control the contribution of TF to our score. We\\u2019d like this contribution to increase fast when TF is small and then increase more slowly, approaching a limit, as TF gets very big.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>One common way to tame TF is to take the square root of it, but that\\u2019s still an unbounded quantity. We\\u2019d like to do something more sophisticated. We\\u2019d like to put a bound on TF\\u2019s contribution to the score, and we\\u2019d like to be able to control how rapidly the contribution approaches that bound. 
Wouldn\\u2019t it be nice if we had a parameter&nbsp;<strong>k<\\\/strong>&nbsp;that could control the shape of this saturation curve? That way, we\\u2019d be able to experiment with different values of k and see what works best for a particular corpus.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>To achieve this, we\\u2019ll pull out a trick. Instead of using raw TF in our ranking formula, we\\u2019ll use the value:<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:preformatted -->\\n<pre class=\\\"wp-block-preformatted\\\"><mark style=\\\"background-color:rgba(0, 0, 0, 0)\\\" class=\\\"has-inline-color has-vivid-cyan-blue-color\\\">TF \\\/ (TF + k)<\\\/mark><\\\/pre>\\n<!-- \\\/wp:preformatted -->\\n\\n<!-- wp:paragraph -->\\n<p>If k is set to 1, this would generate the sequence 1\\\/2, 2\\\/3, 3\\\/4, 4\\\/5, 5\\\/6 as TF increases 1, 2, 3, etc. Notice how this sequence grows fast in the beginning and then more slowly, approaching 1 in smaller and smaller increments. That\\u2019s what we want. Now if we change k to 2, we\\u2019d get 1\\\/3, 2\\\/4, 3\\\/5, 4\\\/6 which grows a little more slowly. Here\\u2019s a graph of the formula TF\\\/(TF + k) for k = 1, 2, 3, 4:<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:image {\\\"id\\\":26403,\\\"sizeSlug\\\":\\\"large\\\",\\\"linkDestination\\\":\\\"none\\\"} -->\\n<figure class=\\\"wp-block-image size-large\\\"><img src=\\\"https:\\\/\\\/kmwllc.com\\\/wp-content\\\/uploads\\\/2021\\\/05\\\/TFSaturation-1024x387.png\\\" alt=\\\"\\\" class=\\\"wp-image-26403\\\"\\\/><\\\/figure>\\n<!-- \\\/wp:image -->\\n\\n<!-- wp:paragraph -->\\n<p>This TF\\\/(TF + k) trick is really the backbone of BM25. 
It lets us control the contribution of TF to the score in a tunable way.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p><strong>Aside: Term Saturation and Multi-Term Queries<\\\/strong><\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>A fortunate side-effect of using TF\\\/(TF + k) to account for term saturation is that we end up rewarding complete matches over partial ones. That\\u2019s to say, we reward documents that match more of the terms in a multi-term query over documents that have lots of matches for just one of the terms.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>Let\\u2019s say that \\u201ccat\\u201d and \\u201cdog\\u201d have the same IDF values. If we search for \\u201ccat dog\\u201d we\\u2019d like a document that contains one instance of each term to do better than a document that has two instances of \\u201ccat\\u201d and none of \\u201cdog.\\u201d If we were using raw TF they\\u2019d both get the same score. But let\\u2019s do our improved calculation assuming k=1. In our \\u201ccat dog\\u201d document, \\u201ccat\\u201d and \\u201cdog\\u201d each have TF=1, so each are going to contribute TF\\\/(TF+1) = 1\\\/2 to the score, for a total of 1. In our \\u201ccat cat\\u201d document, \\u201ccat\\u201d has a TF of 2, so it\\u2019s going to contribute TF\\\/(TF+1) = 2\\\/3 to the score. 
The \\u201ccat dog\\u201d document wins, because \\u201ccat\\u201d and \\u201cdog\\u201d contribute more when each occurs once than \\u201ccat\\u201d contributes when it occurs twice.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>Assuming the IDF of two terms is the same, it\\u2019s always better to have one instance of each term than to have two instances of one of them.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p><strong>Step 2: Document Length<\\\/strong><\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>Now let\\u2019s go back to the problem we skipped over when we were first building TF-IDF: document length. If a document happens to be really short and it contains \\u201celephant\\u201d once, that\\u2019s a good indicator that \\u201celephant\\u201d is important to the content. But if the document is really, really long and it mentions elephant only once, the document is probably not about elephants. So we\\u2019d like to reward matches in short documents, while penalizing matches in long documents. How can we achieve this?<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>First, we\\u2019ve got to decide what it means for a document to be short or long. We need a frame of reference, so we\\u2019ll use the corpus itself as our frame of reference. A short document is simply one that is&nbsp;<em>shorter than average<\\\/em>&nbsp;for the corpus.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>Let\\u2019s go back to our TF\\\/(TF + k) trick. Of course as k increases, the value of TF\\\/(TF + k) decreases. To penalize long documents, we can adjust k up if the document is longer than average, and adjust it down if the document is shorter than average. We\\u2019ll achieve this by multiplying k by the ratio&nbsp;<strong>dl\\\/adl<\\\/strong>. 
Here,&nbsp;<em>dl<\\\/em>&nbsp;is the document\\u2019s length, and&nbsp;<em>adl<\\\/em>&nbsp;is the average document length across the corpus.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>When a document is of average length, dl\\\/adl =1, and our multiplier doesn\\u2019t affect k at all. For a document that\\u2019s shorter than average, we\\u2019ll be multiplying k by a value between 0 and 1, thereby reducing it, and increasing TF\\\/(TF+k). For a document that\\u2019s longer than average, we\\u2019ll be multiplying k by a value greater than 1, thereby increasing it, and reducing TF\\\/(TF+k). The multiplier also puts us on a different TF saturation curve. Shorter documents will approach a TF saturation point more quickly while longer documents will approach it more gradually.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p><strong>Step 3: Parameterizing Document Length<\\\/strong><\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>In the last section, we updated our ranking function to account for document length, but is this always a good idea? Just how much importance should we place on document length in any particular corpus? Might there be some collections of documents where length matters a lot and some where it doesn\\u2019t? We might like to treat the importance of document length as a second parameter that we can experiment with.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>We\\u2019re going to achieve this tunability with another trick. We\\u2019ll add a new parameter&nbsp;<strong>b<\\\/strong>&nbsp;into the mix (it must be between 0 and 1). 
Instead of multiplying k by dl\\\/adl as we were doing before, we\\u2019ll multiply k by the following value based on dl\\\/adl and b:<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:preformatted -->\\n<pre class=\\\"wp-block-preformatted\\\"><mark style=\\\"background-color:rgba(0, 0, 0, 0)\\\" class=\\\"has-inline-color has-vivid-cyan-blue-color\\\">1 - b + b*dl\\\/adl<\\\/mark><\\\/pre>\\n<!-- \\\/wp:preformatted -->\\n\\n<!-- wp:paragraph -->\\n<p>What does this do for us? You can see that if b is 1, we get (1 - 1 + 1*dl\\\/adl) and this reduces to the multiplier we had before, dl\\\/adl. On the other hand, if b is 0, the whole thing becomes 1 and document length isn\\u2019t considered at all. As b is cranked up from 0 towards 1, the multiplier responds more quickly to changes in dl\\\/adl. The chart below shows how our multiplier behaves as dl\\\/adl grows, when b=0.2 versus when b=0.8.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:image {\\\"id\\\":26404,\\\"sizeSlug\\\":\\\"large\\\",\\\"linkDestination\\\":\\\"none\\\"} -->\\n<figure class=\\\"wp-block-image size-large\\\"><img src=\\\"https:\\\/\\\/kmwllc.com\\\/wp-content\\\/uploads\\\/2021\\\/05\\\/DocLength-1024x545.png\\\" alt=\\\"\\\" class=\\\"wp-image-26404\\\"\\\/><\\\/figure>\\n<!-- \\\/wp:image -->\\n\\n<!-- wp:paragraph -->\\n<p><strong>Recap: Fancy TF<\\\/strong><\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>To recap, we\\u2019ve been modifying the TF term in TF * IDF so that it\\u2019s responsive to term saturation and document length. To account for term saturation, we introduced the TF\\\/(TF + k) trick. To account for document length, we added the (1 - b + b*dl\\\/adl) multiplier. 
Now, instead of using raw TF in our ranking function, we\\u2019re using this \\u201cfancy\\u201d version of TF:<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:preformatted -->\\n<pre class=\\\"wp-block-preformatted\\\"><mark style=\\\"background-color:rgba(0, 0, 0, 0)\\\" class=\\\"has-inline-color has-vivid-cyan-blue-color\\\">TF\\\/(TF + k*(1 - b + b*dl\\\/adl)) <\\\/mark><\\\/pre>\\n<!-- \\\/wp:preformatted -->\\n\\n<!-- wp:paragraph -->\\n<p>Recall that k is the knob that controls the term saturation curve, and b is the knob that controls the importance of document length.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>Indeed, this is the version of TF that\\u2019s used in BM25. And congratulations: if you\\u2019ve followed this far, you now understand all the really interesting stuff about BM25.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p><strong>Step 4: Fancy or Not-So-Fancy IDF<\\\/strong><\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>We\\u2019re not done just yet, though: we have to return to the way BM25 handles document frequency. Earlier, we had defined IDF as log(N\\\/DF), but BM25 defines it as:<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:preformatted -->\\n<pre class=\\\"wp-block-preformatted\\\"><mark style=\\\"background-color:rgba(0, 0, 0, 0)\\\" class=\\\"has-inline-color has-vivid-cyan-blue-color\\\">log((N - DF + .5)\\\/(DF + .5)) <\\\/mark><\\\/pre>\\n<!-- \\\/wp:preformatted -->\\n\\n<!-- wp:paragraph -->\\n<p>Why the difference?<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>As you may have observed, we\\u2019ve been developing our scoring function through a set of heuristics. Researchers in the field of Information Retrieval have wanted to put ranking functions on a more rigorous theoretical footing so they can actually prove things about their behavior rather than just experimenting and hoping for the best. 
To derive a theoretically sound version of IDF, researchers took something called the Robertson-Sp\\u00e4rck Jones weight, made a simplifying assumption, and came up with log (N-DF+.5)\\\/(DF+.5). We\\u2019re not going to go into the details, but we\\u2019ll just focus on the practical significance of this flavor of IDF. The&nbsp;.5\\u2019s don\\u2019t really do much here, so let\\u2019s just consider log (N-DF)\\\/DF, which is sometimes referred to as \\u201cprobabilistic IDF.\\u201d Here we compare our vanilla IDF with probabilistic IDF where N=10.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:image {\\\"id\\\":26405,\\\"sizeSlug\\\":\\\"large\\\",\\\"linkDestination\\\":\\\"none\\\"} -->\\n<figure class=\\\"wp-block-image size-large\\\"><img src=\\\"https:\\\/\\\/kmwllc.com\\\/wp-content\\\/uploads\\\/2021\\\/05\\\/ProbabilisticIDF-1024x477.png\\\" alt=\\\"\\\" class=\\\"wp-image-26405\\\"\\\/><\\\/figure>\\n<!-- \\\/wp:image -->\\n\\n<!-- wp:paragraph -->\\n<p>You can see that probabilistic IDF takes a sharp drop for terms that are in most of the documents. This might be desirable because if a term really exists in 98% of the documents, it\\u2019s probably a stopword like \\u201cand\\u201d or \\u201cor\\u201d and it should get much, much less weight than a term that\\u2019s very common, like in 70% of the documents, but still not utterly ubiquitous.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>The catch is that log (N-DF)\\\/DF is negative for terms that are in more than half of the corpus. (Remember that the log function goes negative on values between 0 and 1.) We don\\u2019t want negative values coming out of our ranking function because the presence of a query term in a document should never count against retrieval\\u200a\\u2014\\u200ait should never cause a lower score than if the term was simply absent. 
In order to prevent negative values, Lucene\\u2019s implementation of BM25 adds a 1 like this:<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:preformatted -->\\n<pre class=\\\"wp-block-preformatted\\\"><mark style=\\\"background-color:rgba(0, 0, 0, 0)\\\" class=\\\"has-inline-color has-vivid-cyan-blue-color\\\">IDF = log (1 + (N - DF + .5)\\\/(DF + .5))<\\\/mark><\\\/pre>\\n<!-- \\\/wp:preformatted -->\\n\\n<!-- wp:paragraph -->\\n<p>This 1 might seem like an innocent modification but it totally changes the behavior of the formula! If we forget again about those pesky&nbsp;.5\\u2019s, and we note that adding 1 is the same as adding DF\\\/DF, you can see that the formula reduces to the vanilla version of IDF that we used before: log (N\\\/DF).<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:preformatted -->\\n<pre class=\\\"wp-block-preformatted\\\"><mark style=\\\"background-color:rgba(0, 0, 0, 0)\\\" class=\\\"has-inline-color has-vivid-cyan-blue-color\\\">log (1 + (N - DF + .5)\\\/(DF + .5)) \\u2248\\nlog (1 + (N - DF)\\\/DF ) =\\nlog (DF\\\/DF + (N - DF)\\\/DF) = \\nlog ((DF + N - DF)\\\/DF) = \\nlog (N\\\/DF)<\\\/mark><\\\/pre>\\n<!-- \\\/wp:preformatted -->\\n\\n<!-- wp:paragraph -->\\n<p>So although it looks like BM25 is using a fancy version of IDF, in practice (as implemented in Lucene) it\\u2019s basically using the same old version of IDF that\\u2019s used in traditional TF\\\/IDF, without the accelerated decline for high DF values.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:heading {\\\"level\\\":4} -->\\n<h4>Cashing In<\\\/h4>\\n<!-- \\\/wp:heading -->\\n\\n<!-- wp:paragraph -->\\n<p>We\\u2019re ready to cash in on our new understanding by looking at the explain output from a Lucene query. 
You\\u2019ll see something like this:<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:preformatted -->\\n<pre class=\\\"wp-block-preformatted\\\"><mark style=\\\"background-color:rgba(0, 0, 0, 0)\\\" class=\\\"has-inline-color has-vivid-cyan-blue-color\\\">\\u201cscore(freq=3.0), product of:\\u201d\\n\\n\\u201cidf, computed as log(1 + (N - n + 0.5) \\\/ (n + 0.5)) from:\\u201d\\n\\n\\u201ctf, computed as freq \\\/ (freq + k1 * (1 - b + b * dl \\\/ avgdl)) from:\\u201d<\\\/mark><\\\/pre>\\n<!-- \\\/wp:preformatted -->\\n\\n<!-- wp:paragraph -->\\n<p>We\\u2019re finally prepared to understand this gobbledygook. You can see that Lucene is using a TF*IDF product where TF and IDF have their special BM25 definitions. Lowercase n means DF here. The IDF term is the supposedly fancy version that turns out to be approximately the same as traditional IDF, log(N\\\/n).<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>The TF term is based on our saturation trick: freq\\\/(freq + k). The use of&nbsp;<strong>k1<\\\/strong>&nbsp;instead of k in the explain output is historical\\u200a\\u2014\\u200ait comes from a time when there was more than one k in the formula. What we\\u2019ve been calling raw TF is denoted as&nbsp;<strong>freq<\\\/strong>&nbsp;here.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>We can see that k1 is multiplied by a factor that penalizes above-average document length while rewarding below-average document length: (1 - b + b * dl\\\/avgdl). What we\\u2019ve been calling adl is denoted as&nbsp;<strong>avgdl<\\\/strong>&nbsp;here.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>And of course we can see that there are parameters, which are set to k1 = 1.2 and b =&nbsp;.75 in Lucene by default. You probably won\\u2019t need to tweak these, but you can if you want.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p><strong>In summary, simple TF-IDF rewards term frequency and penalizes document frequency. 
BM25 goes beyond this to account for document length and term frequency saturation.&nbsp;<\\\/strong><\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>It\\u2019s worth noting that before Lucene introduced BM25 as the default ranking function as of version 6, it implemented TF-IDF through something called the&nbsp;<a href=\\\"https:\\\/\\\/www.elastic.co\\\/guide\\\/en\\\/elasticsearch\\\/guide\\\/2.x\\\/practical-scoring-function.html\\\">Practical Scoring Function<\\\/a>, which was a set of enhancements (including \\u201ccoord\\u201d and field length normalization) that made TF-IDF more like BM25. So the behavior difference one might have observed when Lucene made the switch to BM25 was probably less dramatic than it would have been if Lucene had been using pure TF-IDF all along. In any case, the consensus is that BM25 is an improvement, and now you can see why.<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>If you\\u2019re a search engineer, the Lucene explain output is the most likely place where you\\u2019ll encounter the details of the BM25 formula. However, if you delve into theoretical papers or check out the&nbsp;<a href=\\\"https:\\\/\\\/en.wikipedia.org\\\/wiki\\\/Okapi_BM25\\\">Wikipedia article on BM25<\\\/a>, you\\u2019ll see it written out as an equation like this:<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:image {\\\"id\\\":26406,\\\"sizeSlug\\\":\\\"large\\\",\\\"linkDestination\\\":\\\"none\\\"} -->\\n<figure class=\\\"wp-block-image size-large\\\"><img src=\\\"https:\\\/\\\/kmwllc.com\\\/wp-content\\\/uploads\\\/2021\\\/05\\\/bm25demystified-1024x441.png\\\" alt=\\\"\\\" class=\\\"wp-image-26406\\\"\\\/><\\\/figure>\\n<!-- \\\/wp:image -->\\n\\n<!-- wp:paragraph -->\\n<p>Hopefully this tour has made you more comfortable with how the two most popular search ranking functions work. 
Thanks for following along!<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:heading {\\\"level\\\":4} -->\\n<h4>Further Reading<\\\/h4>\\n<!-- \\\/wp:heading -->\\n\\n<!-- wp:paragraph -->\\n<p>This article follows in the footsteps of some other great tours of BM25 that are out there. These two are highly recommended:<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p><a href=\\\"https:\\\/\\\/opensourceconnections.com\\\/blog\\\/2015\\\/10\\\/16\\\/bm25-the-next-generation-of-lucene-relevation\\\/\\\"><strong>BM25 The Next Generation of Lucene Relevance by Doug Turnbull<\\\/strong><\\\/a><\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p><a href=\\\"https:\\\/\\\/www.elastic.co\\\/blog\\\/practical-bm25-part-2-the-bm25-algorithm-and-its-variables\\\"><strong>Practical BM25 \\u2013 Part 2: The BM25 Algorithm and its Variables by Shane Connelly<\\\/strong><\\\/a><\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>There are many theoretical treatments of ranking out there. 
A good starting place is&nbsp;<a href=\\\"http:\\\/\\\/www.staff.city.ac.uk\\\/~sb317\\\/papers\\\/foundations_bm25_review.pdf\\\" target=\\\"_blank\\\" rel=\\\"noreferrer noopener\\\">\\u201cThe Probabilistic Relevance Framework: BM25 and Beyond\\u201d by Robertson and Zaragoza<\\\/a>.&nbsp;<\\\/p>\\n<!-- \\\/wp:paragraph -->\\n\\n<!-- wp:paragraph -->\\n<p>See also the paper&nbsp;<a href=\\\"https:\\\/\\\/www.microsoft.com\\\/en-us\\\/research\\\/wp-content\\\/uploads\\\/2016\\\/02\\\/okapi_trec3.pdf\\\" target=\\\"_blank\\\" rel=\\\"noreferrer noopener\\\">\\u201cOkapi at TREC-3\\u201d<\\\/a>&nbsp;where BM25 was first introduced.<\\\/p>\\n<!-- \\\/wp:paragraph -->\"},\"elements\":[],\"widgetType\":\"text-editor\"}],\"isInner\":false}],\"isInner\":false},{\"id\":\"e0c25bc\",\"elType\":\"section\",\"settings\":[],\"elements\":[{\"id\":\"a5e6f8a\",\"elType\":\"column\",\"settings\":{\"_column_size\":100,\"_inline_size\":null,\"thegem_column_breakpoints_list\":[]},\"elements\":[{\"id\":\"637fda5\",\"elType\":\"widget\",\"settings\":{\"prev_label\":\"Previous Post\",\"next_label\":\"Next Post\",\"show_borders\":\"\",\"title_typography_typography\":\"custom\",\"title_typography_font_size\":{\"unit\":\"px\",\"size\":14,\"sizes\":[]},\"title_typography_font_weight\":\"700\",\"__globals__\":{\"arrow_color\":\"globals\\\/colors?id=primary\",\"label_color\":\"globals\\\/colors?id=secondary\"},\"arrow\":\"fa 
fa-caret-left\",\"show_arrow\":\"\"},\"elements\":[],\"widgetType\":\"global\",\"templateID\":\"28083\"}],\"isInner\":false}],\"isInner\":false}]"],"_elementor_pro_version":["3.18.3"],"_last_editor_used_jetpack":["block-editor"],"thegem_popups_data":["a:2:{s:20:\"popups_layout_source\";s:7:\"default\";s:12:\"thegemPopups\";a:0:{}}"],"_yoast_wpseo_primary_category":["36"],"_yoast_wpseo_content_score":["30"],"_yoast_wpseo_estimated-reading-time-minutes":["17"],"_yoast_wpseo_wordproof_timestamp":[""],"_elementor_controls_usage":["a:4:{s:11:\"text-editor\";a:3:{s:5:\"count\";i:1;s:15:\"control_percent\";i:0;s:8:\"controls\";a:1:{s:7:\"content\";a:1:{s:14:\"section_editor\";a:1:{s:6:\"editor\";i:1;}}}}s:6:\"column\";a:3:{s:5:\"count\";i:2;s:15:\"control_percent\";i:0;s:8:\"controls\";a:1:{s:6:\"layout\";a:1:{s:6:\"layout\";a:1:{s:12:\"_inline_size\";i:1;}}}}s:7:\"section\";a:3:{s:5:\"count\";i:2;s:15:\"control_percent\";i:0;s:8:\"controls\";a:1:{s:6:\"layout\";a:1:{s:14:\"section_layout\";a:1:{s:15:\"stretch_section\";i:1;}}}}s:15:\"post-navigation\";a:3:{s:5:\"count\";i:1;s:15:\"control_percent\";i:2;s:8:\"controls\";a:2:{s:7:\"content\";a:1:{s:31:\"section_post_navigation_content\";a:4:{s:10:\"prev_label\";i:1;s:10:\"next_label\";i:1;s:12:\"show_borders\";i:1;s:10:\"show_arrow\";i:1;}}s:5:\"style\";a:1:{s:11:\"title_style\";a:3:{s:27:\"title_typography_typography\";i:1;s:26:\"title_typography_font_size\";i:1;s:28:\"title_typography_font_weight\";i:1;}}}}}"],"_elementor_page_assets":["a:2:{s:7:\"scripts\";a:2:{i:0;s:18:\"elementor-frontend\";i:1;s:15:\"post-navigation\";}s:6:\"styles\";a:1:{i:0;s:22:\"widget-post-navigation\";}}"],"_elementor_css":["a:6:{s:4:\"time\";i:1782356529;s:5:\"fonts\";a:0:{}s:5:\"icons\";a:0:{}s:20:\"dynamic_elements_ids\";a:0:{}s:6:\"status\";s:4:\"file\";i:0;s:0:\"\";}"],"_elementor_element_cache":["{\"timeout\":1783962801,\"value\":{\"content\":\"\\t\\t<section class=\\\"elementor-section elementor-top-section elementor-element 
elementor-element-3568240c elementor-section-stretched elementor-section-boxed elementor-section-height-default elementor-section-height-default\\\" data-id=\\\"3568240c\\\" data-element_type=\\\"section\\\" data-e-type=\\\"section\\\" data-settings=\\\"{&quot;stretch_section&quot;:&quot;section-stretched&quot;}\\\">\\r\\n\\t\\t\\t\\t\\t\\t<div class=\\\"elementor-container elementor-column-gap-thegem\\\"><div class=\\\"elementor-row\\\">\\r\\n\\t\\t\\t\\t\\t<div class=\\\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-20788c4a\\\" data-id=\\\"20788c4a\\\" data-element_type=\\\"column\\\" data-e-type=\\\"column\\\">\\n\\t\\t\\t<div class=\\\"elementor-widget-wrap elementor-element-populated\\\">\\n\\t\\t\\t\\t[elementor-element k=\\\"9109a976d8649ee6d2c8fef8daebbb8b\\\" data=\\\"eyJpZCI6IjQ2ZWUyNGEwIiwiZWxUeXBlIjoid2lkZ2V0Iiwic2V0dGluZ3MiOnsiZWRpdG9yIjoiPCEtLSB3cDpoZWFkaW5nIHtcImxldmVsXCI6NH0gLS0+XG48aDQ+SW50cm9kdWN0aW9uPFwvaDQ+XG48IS0tIFwvd3A6aGVhZGluZyAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5UaGlzIGFydGljbGUgaXMgZm9yIHNlYXJjaCBwcmFjdGl0aW9uZXJzIHdobyB3YW50IHRvIGFjaGlldmUgYSBkZWVwIHVuZGVyc3RhbmRpbmcgb2YgdGhlIHJhbmtpbmcgZnVuY3Rpb25zIFRGLUlERiBhbmQgQk0yNSAoYWxzbyBjYWxsZWQgXHUyMDFjc2ltaWxhcml0aWVzXHUyMDFkIGluIEx1Y2VuZSkuIElmIHlvdVx1MjAxOXJlIGxpa2UgbWFueSBwcmFjdGl0aW9uZXJzLCB5b3VcdTIwMTlyZSBhbHJlYWR5IGZhbWlsaWFyIHdpdGggVEYtSURGLCBidXQgd2hlbiB5b3UgZmlyc3Qgc2F3IHRoZSBjb21wbGljYXRlZCBCTTI1IGZvcm11bGEsIHlvdSB0aG91Z2h0IFx1MjAxY21heWJlIGxhdGVyLlx1MjAxZCBOb3cgaXMgdGhlIHRpbWUgdG8gZmluYWxseSB1bmRlcnN0YW5kIGl0ISBZb3VcdTIwMTl2ZSBwcm9iYWJseSBoZWFyZCB0aGF0IEJNMjUgaXMgc2ltaWxhciB0byBURi1JREYgYnV0IHdvcmtzIGJldHRlciBpbiBwcmFjdGljZS4gVGhpcyBhcnRpY2xlIHdpbGwgc2hvdyB5b3UgcHJlY2lzZWx5IGhvdyBCTTI1IGJ1aWxkcyB1cG9uIFRGLUlERiwgd2hhdCBpdHMgcGFyYW1ldGVycyBkbywgYW5kIHdoeSBpdCBpcyBzbyBlZmZlY3RpdmUuIElmIHlvdVx1MjAxOWQgcmF0aGVyIHNraXAgb3ZlciB0aGUgbWF0aCBhbmQgd29yayB3aXRoIHByYWN0aWNhbCBleGFtcGxlcyB0aGF0IGRlbW9uc3RyYXRlIEJNMjVcdTIwMTlzIGJlaGF2aW9ycywgY
2hlY2sgb3V0IG91ciBjb21wYW5pb24gYXJ0aWNsZSBvblx1MDBhMDxhIGhyZWY9XCJodHRwczpcL1wva213bGxjLmNvbVwvaW5kZXgucGhwXC8yMDIwXC8wM1wvMTBcL3VuZGVyc3RhbmRpbmctc2NvcmluZy10aHJvdWdoLWV4YW1wbGVzXC9cIj5VbmRlcnN0YW5kaW5nIFNjb3JpbmcgVGhyb3VnaCBFeGFtcGxlczxcL2E+LjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOmhlYWRpbmcge1wibGV2ZWxcIjo0fSAtLT5cbjxoND5SZXZpZXdpbmcgVEYtSURGPFwvaDQ+XG48IS0tIFwvd3A6aGVhZGluZyAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5MZXRcdTIwMTlzIHJldmlldyBURi1JREYgYnkgdHJ5aW5nIHRvIGRldmVsb3AgaXQgZnJvbSBzY3JhdGNoLiBJbWFnaW5lIHdlXHUyMDE5cmUgYnVpbGRpbmcgYSBzZWFyY2ggZW5naW5lLiBBc3N1bWUgd2VcdTIwMTl2ZSBhbHJlYWR5IGdvdCBhIHdheSB0byBmaW5kIHRoZSBkb2N1bWVudHMgdGhhdCZuYnNwOzxlbT5tYXRjaDxcL2VtPiZuYnNwO2EgdXNlclx1MjAxOXMgc2VhcmNoLiBXaGF0IHdlIG5lZWQgbm93IGlzIGEmbmJzcDs8c3Ryb25nPnJhbmtpbmcgZnVuY3Rpb248XC9zdHJvbmc+Jm5ic3A7dGhhdCB3aWxsIHRlbGwgdXMgaG93IHRvIG9yZGVyIHRob3NlIGRvY3VtZW50cy4gVGhlIGhpZ2hlciBhIGRvY3VtZW50XHUyMDE5cyBzY29yZSBhY2NvcmRpbmcgdG8gdGhpcyBmdW5jdGlvbiwgdGhlIGhpZ2hlciB1cCB3ZVx1MjAxOWxsIHBsYWNlIGl0IGluIHRoZSBsaXN0IG9mIHJlc3VsdHMgdGhhdCB3ZSByZXR1cm4gdG8gdGhlIHVzZXIuPFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+VGhlIGdvYWwgb2YgVEYtSURGIGFuZCBzaW1pbGFyIHJhbmtpbmcgZnVuY3Rpb25zIGlzIHRvIHJld2FyZCZuYnNwOzxlbT5yZWxldmFuY2U8XC9lbT4uIFNheSBhIHVzZXIgc2VhcmNoZXMgZm9yIHRoZSB0ZXJtIFx1MjAxY2RvZ3MuXHUyMDFkIElmIERvY3VtZW50IDEgaXMgbW9yZSByZWxldmFudCB0byB0aGUgc3ViamVjdCBvZiBkb2dzIHRoYW4gRG9jdW1lbnQgMiwgdGhlbiB3ZSB3YW50IHRoZSBzY29yZSBvZiBEb2N1bWVudCAxIHRvIGJlIGhpZ2hlciB0aGFuIHRoZSBzY29yZSBvZiBEb2N1bWVudCAyLCBzbyB3ZVx1MjAxOWxsIHNob3cgdGhlIGJldHRlciByZXN1bHQgZmlyc3QgYW5kIHRoZSB1c2VyIHdpbGwgYmUgaGFwcHkuIEhvdyBtdWNoIGhpZ2hlciBkb2VzIERvY3VtZW50IDFcdTIwMTlzIHNjb3JlIGhhdmUgdG8gYmU\\\/IEl0IGRvZXNuXHUyMDE5dCByZWFsbHkgbWF0dGVyLCBhcyBsb25nIGFzIHRoZSBzY29yZSBvcmRlciBtYXRjaGVzIHRoZSByZWxldmFuY2Ugb3JkZXIuPFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+WW91IG1pZ2h0IGZlZWwgYSBsaXR0bGUgc2hvY2tlZCBieSB0aGUgYXVkYWNpdHkgb2Ygd2hhdCB3ZVx1Mj
AxOXJlIGF0dGVtcHRpbmcgdG8gZG86IHdlXHUyMDE5cmUgZ29pbmcgdG8gdHJ5IHRvIGp1ZGdlIHRoZSByZWxldmFuY2Ugb2YgbWlsbGlvbnMgb3IgYmlsbGlvbnMgb2YgZG9jdW1lbnRzIHVzaW5nIGEgbWF0aGVtYXRpY2FsIGZ1bmN0aW9uLCB3aXRob3V0IGtub3dpbmcgYW55dGhpbmcgYWJvdXQgdGhlIHBlcnNvbiB3aG9cdTIwMTlzIGRvaW5nIHRoZSBzZWFyY2gsIGFuZCB3aXRob3V0IGFjdHVhbGx5IHJlYWRpbmcgdGhlIGRvY3VtZW50cyBhbmQgdW5kZXJzdGFuZGluZyB3aGF0IHRoZXlcdTIwMTlyZSBhYm91dCEgSG93IGlzIHRoaXMgcG9zc2libGU\\\/PFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+V2VcdTIwMTlsbCBtYWtlIGEgc2ltcGxlIGJ1dCBwcm9mb3VuZGx5IGhlbHBmdWwgYXNzdW1wdGlvbi4gV2VcdTIwMTlsbCBhc3N1bWUgdGhhdCB0aGUgbW9yZSB0aW1lcyBhIGRvY3VtZW50IGNvbnRhaW5zIGEgdGVybSwgdGhlIG1vcmUgbGlrZWx5IGl0IGlzIHRvIGJlJm5ic3A7PGVtPmFib3V0PFwvZW0+Jm5ic3A7dGhhdCB0ZXJtLiBUaGF0XHUyMDE5cyB0byBzYXksIHdlXHUyMDE5bGwgdXNlJm5ic3A7PHN0cm9uZz50ZXJtIGZyZXF1ZW5jeSAoVEYpPFwvc3Ryb25nPiwgdGhlIG51bWJlciBvZiBvY2N1cnJlbmNlcyBvZiBhIHRlcm0gaW4gYSBkb2N1bWVudCwgYXMgYSBwcm94eSBmb3IgcmVsZXZhbmNlLiBUaGlzIG9uZSBhc3N1bXB0aW9uIGNyZWF0ZXMgYSBwYXRoIGZvciB1cyB0byBzb2x2ZSBhIHNlZW1pbmdseSBpbXBvc3NpYmxlIHByb2JsZW0gdXNpbmcgc2ltcGxlIG1hdGguIE91ciBhc3N1bXB0aW9uIGlzblx1MjAxOXQgcGVyZmVjdCwgYW5kIGl0IGdvZXMgdmVyeSB3cm9uZyBzb21ldGltZXMsIGJ1dCBpdCB3b3JrcyBvZnRlbiBlbm91Z2ggdG8gYmUgdXNlZnVsLiBTbyBmcm9tIGhlcmUgb24sIHdlXHUyMDE5bGwgdmlldyB0ZXJtIGZyZXF1ZW5jeSBhcyBhIGdvb2QgdGhpbmdcdTIwMGFcdTIwMTRcdTIwMGFhIHRoaW5nIHdlIHdhbnQgdG8gcmV3YXJkLjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPjxzdHJvbmc+VEYtSURGOiBBdHRlbXB0IDE8XC9zdHJvbmc+PFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+QXMgYSBzdGFydGluZyBwb2ludCBmb3Igb3VyIHJhbmtpbmcgZnVuY3Rpb24sIGxldFx1MjAxOXMgZG8gdGhlIHNpbXBsZXN0LCBlYXNpZXN0IHRoaW5nIHBvc3NpYmxlLiBXZVx1MjAxOWxsIHNldCB0aGUgc2NvcmUgb2YgYSBkb2N1bWVudCBlcXVhbCB0byBpdHMgdGVybSBmcmVxdWVuY3kuIElmIHdlXHUyMDE5cmUgc2VhcmNoaW5nIGZvciBhIHRlcm0gVCBhbmQgZXZhbHVhdGluZyB0aGUgcmVsZXZhbmNlIG9mIGEgZG9jdW1lbnQgRCwgdGhlbjo8XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpwcmVmb3J
tYXR0ZWQgLS0+XG48cHJlIGNsYXNzPVwid3AtYmxvY2stcHJlZm9ybWF0dGVkXCI+PG1hcmsgc3R5bGU9XCJiYWNrZ3JvdW5kLWNvbG9yOnJnYmEoMCwgMCwgMCwgMClcIiBjbGFzcz1cImhhcy1pbmxpbmUtY29sb3IgaGFzLXZpdmlkLWN5YW4tYmx1ZS1jb2xvclwiPnNjb3JlKEQsIFQpID0gdGVybUZyZXF1ZW5jeShELCBUKTxcL21hcms+PFwvcHJlPlxuPCEtLSBcL3dwOnByZWZvcm1hdHRlZCAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5XaGVuIGEgcXVlcnkgaGFzIG11bHRpcGxlIHRlcm1zLCBsaWtlIFx1MjAxY2RvZ3MgYW5kIGNhdHMsXHUyMDFkIGhvdyBzaG91bGQgd2UgaGFuZGxlIHRoYXQ\\\/IFNob3VsZCB3ZSB0cnkgdG8gYW5hbHl6ZSB0aGUgcmVsYXRpb25zaGlwcyBiZXR3ZWVuIHRoZSB2YXJpb3VzIHRlcm1zIGFuZCB0aGVuIGJsZW5kIHRoZSBwZXItdGVybSBzY29yZXMgdG9nZXRoZXIgaW4gYSBjb21wbGV4IHdheT8gTm90IHNvIGZhc3QhIFRoZSBzaW1wbGVzdCBhcHByb2FjaCBpcyB0byBqdXN0IGFkZCB0aGUgc2NvcmVzIGZvciBlYWNoIHRlcm0gdG9nZXRoZXIuIFNvIHdlXHUyMDE5bGwgZG8gdGhhdCwgYW5kIGhvcGUgZm9yIHRoZSBiZXN0LiBJZiB3ZSBoYXZlIGEgbXVsdGktdGVybSBxdWVyeSBRLCB0aGVuIHdlXHUyMDE5bGwgc2V0OjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnByZWZvcm1hdHRlZCAtLT5cbjxwcmUgY2xhc3M9XCJ3cC1ibG9jay1wcmVmb3JtYXR0ZWRcIj48bWFyayBzdHlsZT1cImJhY2tncm91bmQtY29sb3I6cmdiYSgwLCAwLCAwLCAwKVwiIGNsYXNzPVwiaGFzLWlubGluZS1jb2xvciBoYXMtdml2aWQtY3lhbi1ibHVlLWNvbG9yXCI+c2NvcmUoRCwgUSkgPSBzdW0gb3ZlciBhbGwgdGVybXMgVCBpbiBRIG9mIHNjb3JlKEQsIFQpPFwvbWFyaz5cbjxcL3ByZT5cbjwhLS0gXC93cDpwcmVmb3JtYXR0ZWQgLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+SG93IHdlbGwgZG9lcyBvdXIgc2ltcGxlIHJhbmtpbmcgZnVuY3Rpb24gd29yaz8gVW5mb3J0dW5hdGVseSwgaXRcdTIwMTlzIGdvdCBzb21lIHByb2JsZW1zOjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPjEpIExvbmdlciBkb2N1bWVudHMgYXJlIGdpdmVuIGFuIHVuZmFpciBhZHZhbnRhZ2Ugb3ZlciBzaG9ydGVyIG9uZXMgYmVjYXVzZSB0aGV5IGhhdmUgbW9yZSBzcGFjZSB0byBpbmNsdWRlIG1vcmUgb2NjdXJyZW5jZXMgb2YgYSB0ZXJtLCBldmVuIHRob3VnaCB0aGV5IG1pZ2h0IG5vdCBiZSBtb3JlIHJlbGV2YW50IHRvIHRoZSB0ZXJtLiBMZXRcdTIwMTlzIGlnbm9yZSB0aGlzIHByb2JsZW0gZm9yIG5vdy48XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD4yKSBBbGwgdGVybXMgaW4gYSBxdWVyeSBhcmUgdHJlYXRlZCBlcXVhbGx5LCB3aXRoIG5vIGNvbnNpZGVy
YXRpb24gZm9yIHdoaWNoIG9uZXMgYXJlIG1vcmUgbWVhbmluZ2Z1bCBvciBpbXBvcnRhbnQuIFdoZW4gd2Ugc3VtIHRoZSBzY29yZXMgZm9yIGVhY2ggdGVybSB0b2dldGhlciwgaW5zaWduaWZpY2FudCB0ZXJtcyBsaWtlIFx1MjAxY2FuZFx1MjAxZCBhbmQgXHUyMDFjdGhlXHUyMDFkIHdoaWNoIGhhcHBlbiB0byBiZSB2ZXJ5IGZyZXF1ZW50IHdpbGwgZG9taW5hdGUgdGhlIGNvbWJpbmVkIHNjb3JlLiBTYXkgeW91IHNlYXJjaCBmb3IgXHUyMDFjZWxlcGhhbnRzIGFuZCBjb3dzLlx1MjAxZCBQZXJoYXBzIHRoZXJlXHUyMDE5cyBhIHNpbmdsZSBkb2N1bWVudCBpbiB0aGUgaW5kZXggdGhhdCBpbmNsdWRlcyBhbGwgdGhyZWUgdGVybXMgKFx1MjAxY2VsZXBoYW50c1x1MjAxZCwgXHUyMDFjYW5kXHUyMDFkLCBcdTIwMWNjb3dzXHUyMDFkKSwgYnV0IGluc3RlYWQgb2Ygc2VlaW5nIHRoaXMgaWRlYWwgcmVzdWx0IGZpcnN0LCB5b3Ugc2VlIHRoZSBkb2N1bWVudCB0aGF0IGhhcyB0aGUgbW9zdCBvY2N1cnJlbmNlcyBvZiBcdTIwMWNhbmRcdTIwMWRcdTIwMGFcdTIwMTRcdTIwMGFtYXliZSBpdCBoYXMgMTAsMDAwIG9mIHRoZW0uIFRoaXMgcHJlZmVyZW5jZSBmb3IgZmlsbGVyIHdvcmRzIGlzIGNsZWFybHkgbm90IHdoYXQgd2Ugd2FudC48XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD48c3Ryb25nPlRGLUlERjogQXR0ZW1wdCAyPFwvc3Ryb25nPjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPlRvIHByZXZlbnQgZmlsbGVyIHdvcmRzIGZyb20gZG9taW5hdGluZywgd2UgbmVlZCBzb21lIHdheSBvZiBqdWRnaW5nIHRoZSZuYnNwOzxlbT5pbXBvcnRhbmNlPFwvZW0+Jm5ic3A7b2YgdGhlIHRlcm1zIGluIGEgcXVlcnkuIFNpbmNlIHdlIGNhblx1MjAxOXQgZW5jb2RlIGFuIHVuZGVyc3RhbmRpbmcgb2YgbmF0dXJhbCBsYW5ndWFnZSBpbnRvIG91ciBzY29yaW5nIGZ1bmN0aW9uLCB3ZVx1MjAxOWxsIHRyeSB0byBmaW5kIGEgcHJveHkgZm9yIGltcG9ydGFuY2UuIE91ciBiZXN0IGJldCBpcyZuYnNwOzxlbT5yYXJpdHk8XC9lbT4uIElmIGEgdGVybSBkb2Vzblx1MjAxOXQgb2NjdXIgaW4gbW9zdCBkb2N1bWVudHMgaW4gdGhlIGNvcnB1cywgdGhlbiB3aGVuZXZlciBpdCBkb2VzIG9jY3VyLCB3ZVx1MjAxOWxsIGd1ZXNzIHRoYXQgdGhpcyBvY2N1cnJlbmNlIGlzIHNpZ25pZmljYW50LiBPbiB0aGUgb3RoZXIgaGFuZCwgaWYgYSB0ZXJtIG9jY3VycyBpbiBtb3N0IG9mIHRoZSBkb2N1bWVudHMgaW4gb3VyIGNvcnB1cywgdGhlbiB0aGUgcHJlc2VuY2Ugb2YgdGhhdCB0ZXJtIGluIGFueSBwYXJ0aWN1bGFyIGRvY3VtZW50IHdpbGwgbG9zZSBpdHMgdmFsdWUgYXMgYW4gaW5kaWNhdG9yIG9mIHJlbGV2YW5jZS48XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5TbyBo
aWdoIHRlcm0gZnJlcXVlbmN5IGlzIGEgZ29vZCB0aGluZywgYnV0IGl0cyBnb29kbmVzcyBpcyBvZmZzZXQgYnkgaGlnaCZuYnNwOzxzdHJvbmc+ZG9jdW1lbnQgZnJlcXVlbmN5IChERilcdTIwMGE8XC9zdHJvbmc+XHUyMDE0XHUyMDBhdGhlIG51bWJlciBvZiBkb2N1bWVudHMgdGhhdCBjb250YWluIHRoZSB0ZXJtXHUyMDBhXHUyMDE0XHUyMDBhd2hpY2ggd2VcdTIwMTlsbCB0aGluayBvZiBhcyBhIGJhZCB0aGluZy48XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5UbyB1cGRhdGUgb3VyIGZ1bmN0aW9uIGluIGEgd2F5IHRoYXQgcmV3YXJkcyB0ZXJtIGZyZXF1ZW5jeSBidXQgcGVuYWxpemVzIGRvY3VtZW50IGZyZXF1ZW5jeSwgd2UgY291bGQgdHJ5IGRpdmlkaW5nIFRGIGJ5IERGOjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnByZWZvcm1hdHRlZCAtLT5cbjxwcmUgY2xhc3M9XCJ3cC1ibG9jay1wcmVmb3JtYXR0ZWRcIj48bWFyayBzdHlsZT1cImJhY2tncm91bmQtY29sb3I6cmdiYSgwLCAwLCAwLCAwKVwiIGNsYXNzPVwiaGFzLWlubGluZS1jb2xvciBoYXMtdml2aWQtY3lhbi1ibHVlLWNvbG9yXCI+c2NvcmUoRCwgVCkgPSB0ZXJtRnJlcXVlbmN5KEQsIFQpIFwvIGRvY0ZyZXF1ZW5jeShUKTxcL21hcms+PFwvcHJlPlxuPCEtLSBcL3dwOnByZWZvcm1hdHRlZCAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5XaGF0XHUyMDE5cyB3cm9uZyB3aXRoIHRoaXM\\\/IFVuZm9ydHVuYXRlbHksIERGIGJ5IGl0c2VsZiB0ZWxscyB1cyBub3RoaW5nLiBJZiBERiBmb3IgdGhlIHRlcm0gXHUyMDFjZWxlcGhhbnRcdTIwMWQgaXMgMTAwLCB0aGVuIGlzIFx1MjAxY2VsZXBoYW50XHUyMDFkIGEgcmFyZSB0ZXJtIG9yIGEgY29tbW9uIHRlcm0\\\/IEl0IGRlcGVuZHMgb24gdGhlIHNpemUgb2YgdGhlIGNvcnB1cy4gSWYgdGhlIGNvcnB1cyBjb250YWlucyAxMDAgZG9jdW1lbnRzLCBcdTIwMWNlbGVwaGFudFx1MjAxZCBpcyBjb21tb24sIGlmIGl0IGNvbnRhaW5zIDEwMCwwMDAgZG9jdW1lbnRzLCBcdTIwMWNlbGVwaGFudFx1MjAxZCBpcyByYXJlLjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPjxzdHJvbmc+VEYtSURGOiBBdHRlbXB0IDM8XC9zdHJvbmc+PFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+SW5zdGVhZCBvZiBsb29raW5nIGF0IERGIGJ5IGl0c2VsZiwgbGV0XHUyMDE5cyBsb29rIGF0IE5cL0RGLCB3aGVyZSBOIGlzIHRoZSBzaXplIG9mIHRoZSBzZWFyY2ggaW5kZXggb3IgY29ycHVzLiBOb3RpY2UgaG93IE5cL0RGIGlzIGxvdyBmb3IgY29tbW9uIHRlcm1zICgxMDAgb2NjdXJyZW5jZXMgb2YgXHUyMDFjZWxlcGhhbnRcdTIwMWQgaW4gYSBjb3JwdXMgb2Ygc2l6ZSAxMDAgd291bGQgZ2l2ZSBOXC9ERiA9ID
EpLCBhbmQgaGlnaCBmb3IgcmFyZSBvbmVzICgxMDAgb2NjdXJyZW5jZXMgb2YgXHUyMDFjZWxlcGhhbnQgaW4gYSBjb3JwdXMgb2Ygc2l6ZSAxMDAsMDAwIHdvdWxkIGdpdmUgTlwvREYgPSAxMDAwKS4gVGhhdFx1MjAxOXMgZXhhY3RseSB3aGF0IHdlIHdhbnQ6IG1hdGNoZXMgZm9yIGNvbW1vbiB0ZXJtcyBzaG91bGQgZ2V0IGxvdyBzY29yZXMsIG1hdGNoZXMgZm9yIHJhcmUgdGVybXMgc2hvdWxkIGdldCBoaWdoIG9uZXMuIE91ciBpbXByb3ZlZCBmb3JtdWxhIG1pZ2h0IGdvIGxpa2UgdGhpczo8XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpwcmVmb3JtYXR0ZWQgLS0+XG48cHJlIGNsYXNzPVwid3AtYmxvY2stcHJlZm9ybWF0dGVkXCI+PG1hcmsgc3R5bGU9XCJiYWNrZ3JvdW5kLWNvbG9yOnJnYmEoMCwgMCwgMCwgMClcIiBjbGFzcz1cImhhcy1pbmxpbmUtY29sb3IgaGFzLXZpdmlkLWN5YW4tYmx1ZS1jb2xvclwiPnNjb3JlKEQsIFQpID0gdGVybUZyZXF1ZW5jeShELCBUKSAqIChOIFwvIGRvY0ZyZXF1ZW5jeShUKSk8XC9tYXJrPjxcL3ByZT5cbjwhLS0gXC93cDpwcmVmb3JtYXR0ZWQgLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+V2VcdTIwMTlyZSBkb2luZyBiZXR0ZXIsIGJ1dCBsZXRcdTIwMTlzIHRha2UgYSBjbG9zZXIgbG9vayBhdCBob3cgTlwvREYgYmVoYXZlcy4gU2F5IHdlIGhhdmUgMTAwIGRvY3VtZW50cyBhbmQgXHUyMDFjZWxlcGhhbnRcdTIwMWQgb2NjdXJzIGluIDEgb2YgdGhlbSB3aGlsZSBcdTIwMWNnaXJhZmZlXHUyMDFkIG9jY3VycyBpbiAyIG9mIHRoZW0uIEJvdGggdGVybXMgYXJlIHNpbWlsYXJseSByYXJlLCBidXQgZWxlcGhhbnRcdTIwMTlzIE5cL0RGIHZhbHVlIHdvdWxkIGNvbWUgb3V0IHRvIDEwMCBhbmQgZ2lyYWZmZVx1MjAxOXMgd291bGQgYmUgaGFsZiB0aGF0LCBhdCA1MC4gU2hvdWxkIGEgbWF0Y2ggZm9yIGdpcmFmZmUgZ2V0IGhhbGYgdGhlIHNjb3JlIG9mIG1hdGNoIGZvciBlbGVwaGFudCBqdXN0IGJlY2F1c2UgZ2lyYWZmZVx1MjAxOXMgZG9jdW1lbnQgZnJlcXVlbmN5IGlzIG9uZSBoaWdoZXIgdGhlbiBlbGVwaGFudFx1MjAxOXM\\\/IFRoZSBwZW5hbHR5IGZvciBvbmUgYWRkaXRpb25hbCBvY2N1cnJlbmNlIG9mIHRoZSB3b3JkIGluIHRoZSBjb3JwdXMgc2VlbXMgdG9vIGhpZ2guIEFyZ3VhYmx5LCBpZiB3ZSBoYXZlIDEwMCBkb2N1bWVudHMsIGl0IHNob3VsZG5cdTIwMTl0IG1ha2UgbXVjaCBvZiBhIGRpZmZlcmVuY2Ugd2hldGhlciBhIHRlcm1cdTIwMTlzIERGIGlzIDEsIDIsIDMsIG9yIDQmbmJzcDsuPFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+PHN0cm9uZz5URi1JREY6IEF0dGVtcHQgNDxcL3N0cm9uZz48XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5BcyB3ZVx1MjAxOXZlIHNlZW4sIHdoZW4gREYgaXMgaW4gYSB
2ZXJ5IGxvdyByYW5nZSwgc21hbGwgZGlmZmVyZW5jZXMgaW4gREYgY2FuIGhhdmUgYSBkcmFtYXRpYyBpbXBhY3Qgb24gTlwvREYgYW5kIGhlbmNlIG9uIHRoZSBzY29yZS4gV2UgbWlnaHQgbGlrZSB0byBzbW9vdGggb3V0IHRoZSBkZWNsaW5lIG9mIE5cL0RGIHdoZW4gREYgaXMgaW4gdGhlIGxvd2VzdCBlbmQgb2YgaXRzIHJhbmdlLiBPbmUgd2F5IHRvIGRvIHRoaXMgaXMgdG8gdGFrZSB0aGUmbmJzcDs8c3Ryb25nPmxvZzxcL3N0cm9uZz4mbmJzcDtvZiBOXC9ERi4gSWYgd2Ugd2FudGVkLCB3ZSBjb3VsZCB0cnkgdG8gdXNlIGEgZGlmZmVyZW50IHNtb290aGluZyBmdW5jdGlvbiBoZXJlLCBidXQgbG9nIGlzIHN0cmFpZ2h0Zm9yd2FyZCBhbmQgaXQgZG9lcyB3aGF0IHdlIHdhbnQuIFRoaXMgY2hhcnQgY29tcGFyZXMgTlwvREYgYW5kIGxvZyhOXC9ERikgYXNzdW1pbmcgTj0xMDA6PFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6aW1hZ2Uge1wiaWRcIjoyNjQwMixcInNpemVTbHVnXCI6XCJsYXJnZVwiLFwibGlua0Rlc3RpbmF0aW9uXCI6XCJub25lXCJ9IC0tPlxuPGZpZ3VyZSBjbGFzcz1cIndwLWJsb2NrLWltYWdlIHNpemUtbGFyZ2VcIj48aW1nIHNyYz1cImh0dHBzOlwvXC9rbXdsbGMuY29tXC93cC1jb250ZW50XC91cGxvYWRzXC8yMDIxXC8wNVwvQXBwbHlpbmdMb2dUb0lERi0xMDI0eDcwNy5wbmdcIiBhbHQ9XCJcIiBjbGFzcz1cIndwLWltYWdlLTI2NDAyXCJcLz48XC9maWd1cmU+XG48IS0tIFwvd3A6aW1hZ2UgLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+TGV0XHUyMDE5cyBjYWxsIGxvZyhOXC9ERikgdGhlJm5ic3A7PHN0cm9uZz5pbnZlcnNlIGRvY3VtZW50IGZyZXF1ZW5jeSAoSURGKTxcL3N0cm9uZz4mbmJzcDtvZiBhIHRlcm0uIE91ciByYW5raW5nIGZ1bmN0aW9uIGNhbiBub3cgYmUgZXhwcmVzc2VkIGFzIFRGICogSURGIG9yOjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnByZWZvcm1hdHRlZCAtLT5cbjxwcmUgY2xhc3M9XCJ3cC1ibG9jay1wcmVmb3JtYXR0ZWRcIj48bWFyayBzdHlsZT1cImJhY2tncm91bmQtY29sb3I6cmdiYSgwLCAwLCAwLCAwKVwiIGNsYXNzPVwiaGFzLWlubGluZS1jb2xvciBoYXMtdml2aWQtY3lhbi1ibHVlLWNvbG9yXCI+c2NvcmUoRCwgVCkgPSB0ZXJtRnJlcXVlbmN5KEQsIFQpICogbG9nKE4gXC8gZG9jRnJlcXVlbmN5KFQpKTxcL21hcms+PFwvcHJlPlxuPCEtLSBcL3dwOnByZWZvcm1hdHRlZCAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5XZVx1MjAxOXZlIGFycml2ZWQgYXQgdGhlIHRyYWRpdGlvbmFsIGRlZmluaXRpb24gb2YgVEYtSURGIGFuZCBldmVuIHRob3VnaCB3ZSBtYWRlIHNvbWUgYm9sZCBhc3N1bXB0aW9ucyB0byBnZXQgaGVyZSwgdGhlIGZ1bmN0aW9uIHdvcmtzIHByZXR0eSB3ZWxsIGluIHByYWN0aWNlOiBpdCBoYXMgZ2F0aGVyZWQgYSBsb25nIHRyYWNrIHJlY29yZCBvZiB
zdWNjZXNzZnVsIGFwcGxpY2F0aW9uIGluIHNlYXJjaCBlbmdpbmVzLiBBcmUgd2UgZG9uZSBvciBjb3VsZCB3ZSBkbyBldmVuIGJldHRlcj88XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpoZWFkaW5nIHtcImxldmVsXCI6NH0gLS0+XG48aDQ+RGV2ZWxvcGluZyBCTTI1PFwvaDQ+XG48IS0tIFwvd3A6aGVhZGluZyAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5BcyB5b3UgbWlnaHQgaGF2ZSBndWVzc2VkLCB3ZVx1MjAxOXJlIG5vdCByZWFkeSB0byBzdG9wIGF0IFRGLUlERi4gSW4gdGhpcyBzZWN0aW9uLCB3ZVx1MjAxOWxsIGJ1aWxkIHRoZSBCTTI1IGZ1bmN0aW9uLCB3aGljaCBjYW4gYmUgc2VlbiBhcyBhbiBpbXByb3ZlbWVudCBvbiBURi1JREYuIFdlXHUyMDE5cmUgZ29pbmcgdG8ga2VlcCB0aGUgc2FtZSBzdHJ1Y3R1cmUgb2YgdGhlIFRGICogSURGIGZvcm11bGEsIGJ1dCB3ZVx1MjAxOWxsIHJlcGxhY2UgdGhlIFRGIGFuZCBJREYgY29tcG9uZW50cyB3aXRoIHJlZmluZW1lbnRzIG9mIHRob3NlIHZhbHVlcy48XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD48c3Ryb25nPlN0ZXAgMTogVGVybSBTYXR1cmF0aW9uPFwvc3Ryb25nPjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPldlXHUyMDE5dmUgYmVlbiBzYXlpbmcgdGhhdCBURiBpcyBhIGdvb2QgdGhpbmcsIGFuZCBpbmRlZWQgb3VyIFRGLUlERiBmb3JtdWxhIHJld2FyZHMgaXQuIEJ1dCBpZiBhIGRvY3VtZW50IGNvbnRhaW5zIDIwMCBvY2N1cnJlbmNlcyBvZiBcdTIwMWNlbGVwaGFudCxcdTIwMWQgaXMgaXQgcmVhbGx5Jm5ic3A7PGVtPnR3aWNlPFwvZW0+Jm5ic3A7YXMgcmVsZXZhbnQgYXMgYSBkb2N1bWVudCB0aGF0IGNvbnRhaW5zIDEwMCBvY2N1cnJlbmNlcz8gV2UgY291bGQgYXJndWUgdGhhdCBpZiBcdTIwMWNlbGVwaGFudFx1MjAxZCBvY2N1cnMgYSBsYXJnZSBlbm91Z2ggbnVtYmVyIG9mIHRpbWVzLCBzYXkgMTAwLCB0aGUgZG9jdW1lbnQgaXMgYWxtb3N0IGNlcnRhaW5seSByZWxldmFudCwgYW5kIGFueSBmdXJ0aGVyIG1lbnRpb25zIGRvblx1MjAxOXQgcmVhbGx5IGluY3JlYXNlIHRoZSBsaWtlbGlob29kIG9mIHJlbGV2YW5jZS4gVG8gcHV0IGl0IGEgZGlmZmVyZW50IHdheSwgb25jZSBhIGRvY3VtZW50IGlzJm5ic3A7PGVtPnNhdHVyYXRlZDxcL2VtPiZuYnNwO3dpdGggb2NjdXJyZW5jZXMgb2YgYSB0ZXJtLCBtb3JlIG9jY3VycmVuY2VzIHNob3VsZG5cdTIwMTl0IGEgaGF2ZSBhIHNpZ25pZmljYW50IGltcGFjdCBvbiB0aGUgc2NvcmUuIFNvIHdlXHUyMDE5ZCBsaWtlIGEgd2F5IHRvIGNvbnRyb2wgdGhlIGNvbnRyaWJ1dGlvbiBvZiBURiB0byBvdXIgc2NvcmUuIFdlXHUyMDE5ZCBsaWtlIHRoaXMgY29udHJpYnV0aW9uIHRvIGluY3JlYXNlIGZhc3Qgd2hlbiBURiBpcyBzbWFsbCBhbmQgdGh
lbiBpbmNyZWFzZSBtb3JlIHNsb3dseSwgYXBwcm9hY2hpbmcgYSBsaW1pdCwgYXMgVEYgZ2V0cyB2ZXJ5IGJpZy48XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5PbmUgY29tbW9uIHdheSB0byB0YW1lIFRGIGlzIHRvIHRha2UgdGhlIHNxdWFyZSByb290IG9mIGl0LCBidXQgdGhhdFx1MjAxOXMgc3RpbGwgYW4gdW5ib3VuZGVkIHF1YW50aXR5LiBXZVx1MjAxOWQgbGlrZSB0byBkbyBzb21ldGhpbmcgbW9yZSBzb3BoaXN0aWNhdGVkLiBXZVx1MjAxOWQgbGlrZSB0byBwdXQgYSBib3VuZCBvbiBURlx1MjAxOXMgY29udHJpYnV0aW9uIHRvIHRoZSBzY29yZSwgYW5kIHdlXHUyMDE5ZCBsaWtlIHRvIGJlIGFibGUgdG8gY29udHJvbCBob3cgcmFwaWRseSB0aGUgY29udHJpYnV0aW9uIGFwcHJvYWNoZXMgdGhhdCBib3VuZC4gV291bGRuXHUyMDE5dCBpdCBiZSBuaWNlIGlmIHdlIGhhZCBhIHBhcmFtZXRlciZuYnNwOzxzdHJvbmc+azxcL3N0cm9uZz4mbmJzcDt0aGF0IGNvdWxkIGNvbnRyb2wgdGhlIHNoYXBlIG9mIHRoaXMgc2F0dXJhdGlvbiBjdXJ2ZT8gVGhhdCB3YXksIHdlXHUyMDE5ZCBiZSBhYmxlIHRvIGV4cGVyaW1lbnQgd2l0aCBkaWZmZXJlbnQgdmFsdWVzIG9mIGsgYW5kIHNlZSB3aGF0IHdvcmtzIGJlc3QgZm9yIGEgcGFydGljdWxhciBjb3JwdXMuPFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+VG8gYWNoaWV2ZSB0aGlzLCB3ZVx1MjAxOWxsIHB1bGwgb3V0IGEgdHJpY2suIEluc3RlYWQgb2YgdXNpbmcgcmF3IFRGIGluIG91ciByYW5raW5nIGZvcm11bGEsIHdlXHUyMDE5bGwgdXNlIHRoZSB2YWx1ZTo8XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpwcmVmb3JtYXR0ZWQgLS0+XG48cHJlIGNsYXNzPVwid3AtYmxvY2stcHJlZm9ybWF0dGVkXCI+PG1hcmsgc3R5bGU9XCJiYWNrZ3JvdW5kLWNvbG9yOnJnYmEoMCwgMCwgMCwgMClcIiBjbGFzcz1cImhhcy1pbmxpbmUtY29sb3IgaGFzLXZpdmlkLWN5YW4tYmx1ZS1jb2xvclwiPlRGIFwvIChURiArIGspPFwvbWFyaz48XC9wcmU+XG48IS0tIFwvd3A6cHJlZm9ybWF0dGVkIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPklmIGsgaXMgc2V0IHRvIDEsIHRoaXMgd291bGQgZ2VuZXJhdGUgdGhlIHNlcXVlbmNlIDFcLzIsIDJcLzMsIDNcLzQsIDRcLzUsIDVcLzYgYXMgVEYgaW5jcmVhc2VzIDEsIDIsIDMsIGV0Yy4gTm90aWNlIGhvdyB0aGlzIHNlcXVlbmNlIGdyb3dzIGZhc3QgaW4gdGhlIGJlZ2lubmluZyBhbmQgdGhlbiBtb3JlIHNsb3dseSwgYXBwcm9hY2hpbmcgMSBpbiBzbWFsbGVyIGFuZCBzbWFsbGVyIGluY3JlbWVudHMuIFRoYXRcdTIwMTlzIHdoYXQgd2Ugd2FudC4gTm93IGlmIHdlIGNoYW5nZSBrIHRvIDIsIHdlXHUyMDE5ZCBnZXQgMVwvMywgMlwvNCwgM1wvNSwgNFwvNiB3aGljaCBncm93cyBhIGxpdHR
sZSBtb3JlIHNsb3dseS4gSGVyZVx1MjAxOXMgYSBncmFwaCBvZiB0aGUgZm9ybXVsYSBURlwvKFRGICsgaykgZm9yIGsgPSAxLCAyLCAzLCA0OjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOmltYWdlIHtcImlkXCI6MjY0MDMsXCJzaXplU2x1Z1wiOlwibGFyZ2VcIixcImxpbmtEZXN0aW5hdGlvblwiOlwibm9uZVwifSAtLT5cbjxmaWd1cmUgY2xhc3M9XCJ3cC1ibG9jay1pbWFnZSBzaXplLWxhcmdlXCI+PGltZyBzcmM9XCJodHRwczpcL1wva213bGxjLmNvbVwvd3AtY29udGVudFwvdXBsb2Fkc1wvMjAyMVwvMDVcL1RGU2F0dXJhdGlvbi0xMDI0eDM4Ny5wbmdcIiBhbHQ9XCJcIiBjbGFzcz1cIndwLWltYWdlLTI2NDAzXCJcLz48XC9maWd1cmU+XG48IS0tIFwvd3A6aW1hZ2UgLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+VGhpcyBURlwvKFRGICsgaykgdHJpY2sgaXMgcmVhbGx5IHRoZSBiYWNrYm9uZSBvZiBCTTI1LiBJdCBsZXRzIHVzIGNvbnRyb2wgdGhlIGNvbnRyaWJ1dGlvbiBvZiBURiB0byB0aGUgc2NvcmUgaW4gYSB0dW5hYmxlIHdheS48XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD48c3Ryb25nPkFzaWRlOiBUZXJtIFNhdHVyYXRpb24gYW5kIE11bHRpLVRlcm0gUXVlcmllczxcL3N0cm9uZz48XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5BIGZvcnR1bmF0ZSBzaWRlLWVmZmVjdCBvZiB1c2luZyBURlwvKFRGICsgaykgdG8gYWNjb3VudCBmb3IgdGVybSBzYXR1cmF0aW9uIGlzIHRoYXQgd2UgZW5kIHVwIHJld2FyZGluZyBjb21wbGV0ZSBtYXRjaGVzIG92ZXIgcGFydGlhbCBvbmVzLiBUaGF0XHUyMDE5cyB0byBzYXksIHdlIHJld2FyZCBkb2N1bWVudHMgdGhhdCBtYXRjaCBtb3JlIG9mIHRoZSB0ZXJtcyBpbiBhIG11bHRpLXRlcm0gcXVlcnkgb3ZlciBkb2N1bWVudHMgdGhhdCBoYXZlIGxvdHMgb2YgbWF0Y2hlcyBmb3IganVzdCBvbmUgb2YgdGhlIHRlcm1zLjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPkxldFx1MjAxOXMgc2F5IHRoYXQgXHUyMDFjY2F0XHUyMDFkIGFuZCBcdTIwMWNkb2dcdTIwMWQgaGF2ZSB0aGUgc2FtZSBJREYgdmFsdWVzLiBJZiB3ZSBzZWFyY2ggZm9yIFx1MjAxY2NhdCBkb2dcdTIwMWQgd2VcdTIwMTlkIGxpa2UgYSBkb2N1bWVudCB0aGF0IGNvbnRhaW5zIG9uZSBpbnN0YW5jZSBvZiBlYWNoIHRlcm0gdG8gZG8gYmV0dGVyIHRoYW4gYSBkb2N1bWVudCB0aGF0IGhhcyB0d28gaW5zdGFuY2VzIG9mIFx1MjAxY2NhdFx1MjAxZCBhbmQgbm9uZSBvZiBcdTIwMWNkb2cuXHUyMDFkIElmIHdlIHdlcmUgdXNpbmcgcmF3IFRGIHRoZXlcdTIwMTlkIGJvdGggZ2V0IHRoZSBzYW1lIHNjb3JlLiBCdXQgbGV0XHUyMDE5cyBkbyBvdXIgaW1wcm92ZWQgY2FsY3VsYXRpb24gYXNzdW1
pbmcgaz0xLiBJbiBvdXIgXHUyMDFjY2F0IGRvZ1x1MjAxZCBkb2N1bWVudCwgXHUyMDFjY2F0XHUyMDFkIGFuZCBcdTIwMWNkb2dcdTIwMWQgZWFjaCBoYXZlIFRGPTEsIHNvIGVhY2ggYXJlIGdvaW5nIHRvIGNvbnRyaWJ1dGUgVEZcLyhURisxKSA9IDFcLzIgdG8gdGhlIHNjb3JlLCBmb3IgYSB0b3RhbCBvZiAxLiBJbiBvdXIgXHUyMDFjY2F0IGNhdFx1MjAxZCBkb2N1bWVudCwgXHUyMDFjY2F0XHUyMDFkIGhhcyBhIFRGIG9mIDIsIHNvIGl0XHUyMDE5cyBnb2luZyB0byBjb250cmlidXRlIFRGXC8oVEYrMSkgPSAyXC8zIHRvIHRoZSBzY29yZS4gVGhlIFx1MjAxY2NhdCBkb2dcdTIwMWQgZG9jdW1lbnQgd2lucywgYmVjYXVzZSBcdTIwMWNjYXRcdTIwMWQgYW5kIFx1MjAxY2RvZ1x1MjAxZCBjb250cmlidXRlIG1vcmUgd2hlbiBlYWNoIG9jY3VycyBvbmNlIHRoYW4gXHUyMDFjY2F0XHUyMDFkIGNvbnRyaWJ1dGVzIHdoZW4gaXQgb2NjdXJzIHR3aWNlLjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPkFzc3VtaW5nIHRoZSBJREYgb2YgdHdvIHRlcm1zIGlzIHRoZSBzYW1lLCBpdFx1MjAxOXMgYWx3YXlzIGJldHRlciB0byBoYXZlIG9uZSBpbnN0YW5jZSBvZiBlYWNoIHRlcm0gdGhhbiB0byBoYXZlIHR3byBpbnN0YW5jZXMgb2Ygb25lIG9mIHRoZW0uPFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+PHN0cm9uZz5TdGVwIDI6IERvY3VtZW50IExlbmd0aDxcL3N0cm9uZz48XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5Ob3cgbGV0XHUyMDE5cyBnbyBiYWNrIHRvIHRoZSBwcm9ibGVtIHdlIHNraXBwZWQgb3ZlciB3aGVuIHdlIHdlcmUgZmlyc3QgYnVpbGRpbmcgVEYtSURGOiBkb2N1bWVudCBsZW5ndGguIElmIGEgZG9jdW1lbnQgaGFwcGVucyB0byBiZSByZWFsbHkgc2hvcnQgYW5kIGl0IGNvbnRhaW5zIFx1MjAxY2VsZXBoYW50XHUyMDFkIG9uY2UsIHRoYXRcdTIwMTlzIGEgZ29vZCBpbmRpY2F0b3IgdGhhdCBcdTIwMWNlbGVwaGFudFx1MjAxZCBpcyBpbXBvcnRhbnQgdG8gdGhlIGNvbnRlbnQuIEJ1dCBpZiB0aGUgZG9jdW1lbnQgaXMgcmVhbGx5LCByZWFsbHkgbG9uZyBhbmQgaXQgbWVudGlvbnMgZWxlcGhhbnQgb25seSBvbmNlLCB0aGUgZG9jdW1lbnQgaXMgcHJvYmFibHkgbm90IGFib3V0IGVsZXBoYW50cy4gU28gd2VcdTIwMTlkIGxpa2UgdG8gcmV3YXJkIG1hdGNoZXMgaW4gc2hvcnQgZG9jdW1lbnRzLCB3aGlsZSBwZW5hbGl6aW5nIG1hdGNoZXMgaW4gbG9uZyBkb2N1bWVudHMuIEhvdyBjYW4gd2UgYWNoaWV2ZSB0aGlzPzxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPkZpcnN0LCB3ZVx1MjAxOXZlIGdvdCB0byBkZWNpZGUgd2hhdCBpdCBtZWFucyBmb3IgYSBkb2N1bWVudCB0byB
iZSBzaG9ydCBvciBsb25nLiBXZSBuZWVkIGEgZnJhbWUgb2YgcmVmZXJlbmNlLCBzbyB3ZVx1MjAxOWxsIHVzZSB0aGUgY29ycHVzIGl0c2VsZiBhcyBvdXIgZnJhbWUgb2YgcmVmZXJlbmNlLiBBIHNob3J0IGRvY3VtZW50IGlzIHNpbXBseSBvbmUgdGhhdCBpcyZuYnNwOzxlbT5zaG9ydGVyIHRoYW4gYXZlcmFnZTxcL2VtPiZuYnNwO2ZvciB0aGUgY29ycHVzLjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPkxldFx1MjAxOXMgZ28gYmFjayB0byBvdXIgVEZcLyhURiArIGspIHRyaWNrLiBPZiBjb3Vyc2UgYXMgayBpbmNyZWFzZXMsIHRoZSB2YWx1ZSBvZiBURlwvKFRGICsgaykgZGVjcmVhc2VzLiBUbyBwZW5hbGl6ZSBsb25nIGRvY3VtZW50cywgd2UgY2FuIGFkanVzdCBrIHVwIGlmIHRoZSBkb2N1bWVudCBpcyBsb25nZXIgdGhhbiBhdmVyYWdlLCBhbmQgYWRqdXN0IGl0IGRvd24gaWYgdGhlIGRvY3VtZW50IGlzIHNob3J0ZXIgdGhhbiBhdmVyYWdlLiBXZVx1MjAxOWxsIGFjaGlldmUgdGhpcyBieSBtdWx0aXBseWluZyBrIGJ5IHRoZSByYXRpbyZuYnNwOzxzdHJvbmc+ZGxcL2FkbDxcL3N0cm9uZz4uIEhlcmUsJm5ic3A7PGVtPmRsPFwvZW0+Jm5ic3A7aXMgdGhlIGRvY3VtZW50XHUyMDE5cyBsZW5ndGgsIGFuZCZuYnNwOzxlbT5hZGw8XC9lbT4mbmJzcDtpcyB0aGUgYXZlcmFnZSBkb2N1bWVudCBsZW5ndGggYWNyb3NzIHRoZSBjb3JwdXMuPFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+V2hlbiBhIGRvY3VtZW50IGlzIG9mIGF2ZXJhZ2UgbGVuZ3RoLCBkbFwvYWRsID0xLCBhbmQgb3VyIG11bHRpcGxpZXIgZG9lc25cdTIwMTl0IGFmZmVjdCBrIGF0IGFsbC4gRm9yIGEgZG9jdW1lbnQgdGhhdFx1MjAxOXMgc2hvcnRlciB0aGFuIGF2ZXJhZ2UsIHdlXHUyMDE5bGwgYmUgbXVsdGlwbHlpbmcgayBieSBhIHZhbHVlIGJldHdlZW4gMCBhbmQgMSwgdGhlcmVieSByZWR1Y2luZyBpdCwgYW5kIGluY3JlYXNpbmcgVEZcLyhURitrKS4gRm9yIGEgZG9jdW1lbnQgdGhhdFx1MjAxOXMgbG9uZ2VyIHRoYW4gYXZlcmFnZSwgd2VcdTIwMTlsbCBiZSBtdWx0aXBseWluZyBrIGJ5IGEgdmFsdWUgZ3JlYXRlciB0aGFuIDEsIHRoZXJlYnkgaW5jcmVhc2luZyBpdCwgYW5kIHJlZHVjaW5nIFRGXC8oVEYraykuIFRoZSBtdWx0aXBsaWVyIGFsc28gcHV0cyB1cyBvbiBhIGRpZmZlcmVudCBURiBzYXR1cmF0aW9uIGN1cnZlLiBTaG9ydGVyIGRvY3VtZW50cyB3aWxsIGFwcHJvYWNoIGEgVEYgc2F0dXJhdGlvbiBwb2ludCBtb3JlIHF1aWNrbHkgd2hpbGUgbG9uZ2VyIGRvY3VtZW50cyB3aWxsIGFwcHJvYWNoIGl0IG1vcmUgZ3JhZHVhbGx5LjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPjxzdHJvbmc+U3RlcCAzOiBQYXJhbWV0ZXJpemluZyBEb2N1bWVudCBMZW5ndGg
8XC9zdHJvbmc+PFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+SW4gdGhlIGxhc3Qgc2VjdGlvbiwgd2UgdXBkYXRlZCBvdXIgcmFua2luZyBmdW5jdGlvbiB0byBhY2NvdW50IGZvciBkb2N1bWVudCBsZW5ndGgsIGJ1dCBpcyB0aGlzIGFsd2F5cyBhIGdvb2QgaWRlYT8gSnVzdCBob3cgbXVjaCBpbXBvcnRhbmNlIHNob3VsZCB3ZSBwbGFjZSBvbiBkb2N1bWVudCBsZW5ndGggaW4gYW55IHBhcnRpY3VsYXIgY29ycHVzPyBNaWdodCB0aGVyZSBiZSBzb21lIGNvbGxlY3Rpb25zIG9mIGRvY3VtZW50cyB3aGVyZSBsZW5ndGggbWF0dGVycyBhIGxvdCBhbmQgc29tZSB3aGVyZSBpdCBkb2Vzblx1MjAxOXQ\\\/IFdlIG1pZ2h0IGxpa2UgdG8gdHJlYXQgdGhlIGltcG9ydGFuY2Ugb2YgZG9jdW1lbnQgbGVuZ3RoIGFzIGEgc2Vjb25kIHBhcmFtZXRlciB0aGF0IHdlIGNhbiBleHBlcmltZW50IHdpdGguPFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+V2VcdTIwMTlyZSBnb2luZyB0byBhY2hpZXZlIHRoaXMgdHVuYWJpbGl0eSB3aXRoIGFub3RoZXIgdHJpY2suIFdlXHUyMDE5bGwgYWRkIGEgbmV3IHBhcmFtZXRlciZuYnNwOzxzdHJvbmc+YjxcL3N0cm9uZz4mbmJzcDtpbnRvIHRoZSBtaXggKGl0IG11c3QgYmUgYmV0d2VlbiAwIGFuZCAxKS4gSW5zdGVhZCBvZiBtdWx0aXBseWluZyBrIGJ5IGRsXC9hZGwgYXMgd2Ugd2VyZSBkb2luZyBiZWZvcmUsIHdlXHUyMDE5bGwgbXVsdGlwbHkgayBieSB0aGUgZm9sbG93aW5nIHZhbHVlIGJhc2VkIG9uIGRsXC9hZGwgYW5kIGI6PFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cHJlZm9ybWF0dGVkIC0tPlxuPHByZSBjbGFzcz1cIndwLWJsb2NrLXByZWZvcm1hdHRlZFwiPjxtYXJrIHN0eWxlPVwiYmFja2dyb3VuZC1jb2xvcjpyZ2JhKDAsIDAsIDAsIDApXCIgY2xhc3M9XCJoYXMtaW5saW5lLWNvbG9yIGhhcy12aXZpZC1jeWFuLWJsdWUtY29sb3JcIj4xIFx1MjAxMyBiICsgYipkbFwvYWRsPFwvbWFyaz48XC9wcmU+XG48IS0tIFwvd3A6cHJlZm9ybWF0dGVkIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPldoYXQgZG9lcyB0aGlzIGRvIGZvciB1cz8gWW91IGNhbiBzZWUgaWYgYiBpcyAxLCB3ZSBnZXQgKDEgXHUyMDEzIDEgKyAxKmRsXC9hZGwpIGFuZCB0aGlzIHJlZHVjZXMgdG8gdGhlIG11bHRpcGxpZXIgd2UgaGFkIGJlZm9yZSwgZGxcL2FkbC4gT24gdGhlIG90aGVyIGhhbmQsIGlmIGIgaXMgMCwgdGhlIHdob2xlIHRoaW5nIGJlY29tZXMgMSBhbmQgZG9jdW1lbnQgbGVuZ3RoIGlzblx1MjAxOXQgY29uc2lkZXJlZCBhdCBhbGwuIEFzIGIgaXMgY3JhbmtlZCB1cCBmcm9tIDAgdG93YXJkcyAxLCB0aGUgbXVsdGlwbGllciByZXNwb25kcyBtb3JlIHF1aWNrbHkgdG8gY2hhbmdlcyBpbiBkbFwvYWRsLiBUaGUgY2hhcnQgYmVsb3cgc2hv
d3MgaG93IG91ciBtdWx0aXBsaWVyIGJlaGF2ZXMgYXMgZGxcL2FkbCBncm93cywgd2hlbiBiPS4yIHZlcnN1cyB3aGVuIGI9LjguPFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6aW1hZ2Uge1wiaWRcIjoyNjQwNCxcInNpemVTbHVnXCI6XCJsYXJnZVwiLFwibGlua0Rlc3RpbmF0aW9uXCI6XCJub25lXCJ9IC0tPlxuPGZpZ3VyZSBjbGFzcz1cIndwLWJsb2NrLWltYWdlIHNpemUtbGFyZ2VcIj48aW1nIHNyYz1cImh0dHBzOlwvXC9rbXdsbGMuY29tXC93cC1jb250ZW50XC91cGxvYWRzXC8yMDIxXC8wNVwvRG9jTGVuZ3RoLTEwMjR4NTQ1LnBuZ1wiIGFsdD1cIlwiIGNsYXNzPVwid3AtaW1hZ2UtMjY0MDRcIlwvPjxcL2ZpZ3VyZT5cbjwhLS0gXC93cDppbWFnZSAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD48c3Ryb25nPlJlY2FwOiBGYW5jeSBURjxcL3N0cm9uZz48XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5UbyByZWNhcCwgd2VcdTIwMTl2ZSBiZWVuIHdvcmtpbmcgbW9kaWZ5aW5nIHRoZSBURiB0ZXJtIGluIFRGICogSURGIHNvIHRoYXQgaXRcdTIwMTlzIHJlc3BvbnNpdmUgdG8gdGVybSBzYXR1cmF0aW9uIGFuZCBkb2N1bWVudCBsZW5ndGguIFRvIGFjY291bnQgZm9yIHRlcm0gc2F0dXJhdGlvbiwgd2UgaW50cm9kdWNlZCB0aGUgVEZcLyhURiArIGspIHRyaWNrLiBUbyBhY2NvdW50IGZvciBkb2N1bWVudCBsZW5ndGgsIHdlIGFkZGVkIHRoZSAoMSBcdTIwMTMgYiArIGIqZGxcL2FkbCkgbXVsdGlwbGllci4gTm93LCBpbnN0ZWFkIG9mIHVzaW5nIHJhdyBURiBpbiBvdXIgcmFua2luZyBmdW5jdGlvbiwgd2VcdTIwMTlyZSB1c2luZyB0aGlzIFx1MjAxY2ZhbmN5XHUyMDFkIHZlcnNpb24gb2YgVEY6PFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cHJlZm9ybWF0dGVkIC0tPlxuPHByZSBjbGFzcz1cIndwLWJsb2NrLXByZWZvcm1hdHRlZFwiPjxtYXJrIHN0eWxlPVwiYmFja2dyb3VuZC1jb2xvcjpyZ2JhKDAsIDAsIDAsIDApXCIgY2xhc3M9XCJoYXMtaW5saW5lLWNvbG9yIGhhcy12aXZpZC1jeWFuLWJsdWUtY29sb3JcIj5URlwvKFRGICsgayooMSAtIGIgKyBiKmRsXC9hZGwpKSA8XC9tYXJrPjxcL3ByZT5cbjwhLS0gXC93cDpwcmVmb3JtYXR0ZWQgLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+UmVjYWxsIHRoYXQgayBpcyB0aGUga25vYiB0aGF0IGNvbnRyb2wgdGhlIHRlcm0gc2F0dXJhdGlvbiBjdXJ2ZSwgYW5kIGIgaXMgdGhlIGtub2IgdGhhdCBjb250cm9scyB0aGUgaW1wb3J0YW5jZSBvZiBkb2N1bWVudCBsZW5ndGguPFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+SW5kZWVkLCB0aGlzIGlzIHRoZSB2ZXJzaW9uIG9mIFRGIHRoYXRcdTIwMTlzIHVzZWQgaW4gQk0yNS4gQW5kIGNvbmdyYXR1bGF0aW9uczogaWYgeW91
XHUyMDE5dmUgZm9sbG93ZWQgdGhpcyBmYXIsIHlvdSBub3cgdW5kZXJzdGFuZCBhbGwgdGhlIHJlYWxseSBpbnRlcmVzdGluZyBzdHVmZiBhYm91dCBCTTI1LjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPjxzdHJvbmc+U3RlcCA0OiBGYW5jeSBvciBOb3QtU28tRmFuY3kgSURGPFwvc3Ryb25nPjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPldlXHUyMDE5cmUgbm90IGRvbmUganVzdCB5ZXQgdGhvdWdoLCB3ZSBoYXZlIHRvIHJldHVybiB0byB0aGUgd2F5IEJNMjUgaGFuZGxlcyBkb2N1bWVudCBmcmVxdWVuY3kuIEVhcmxpZXIsIHdlIGhhZCBkZWZpbmVkIElERiBhcyBsb2coTlwvREYpLCBidXQgQk0yNSBkZWZpbmVzIGl0IGFzOjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnByZWZvcm1hdHRlZCAtLT5cbjxwcmUgY2xhc3M9XCJ3cC1ibG9jay1wcmVmb3JtYXR0ZWRcIj48bWFyayBzdHlsZT1cImJhY2tncm91bmQtY29sb3I6cmdiYSgwLCAwLCAwLCAwKVwiIGNsYXNzPVwiaGFzLWlubGluZS1jb2xvciBoYXMtdml2aWQtY3lhbi1ibHVlLWNvbG9yXCI+bG9nKChOIC0gREYgKyAuNSlcLyhERiArIC41KSkgPFwvbWFyaz48XC9wcmU+XG48IS0tIFwvd3A6cHJlZm9ybWF0dGVkIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPldoeSB0aGUgZGlmZmVyZW5jZT88XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5BcyB5b3UgbWF5IGhhdmUgb2JzZXJ2ZWQsIHdlXHUyMDE5dmUgYmVlbiBkZXZlbG9waW5nIG91ciBzY29yaW5nIGZ1bmN0aW9uIHRocm91Z2ggYSBzZXQgb2YgaGV1cmlzdGljcy4gUmVzZWFyY2hlcnMgaW4gdGhlIGZpZWxkIG9mIEluZm9ybWF0aW9uIFJldHJpZXZhbCBoYXZlIHdhbnRlZCB0byBwdXQgcmFua2luZyBmdW5jdGlvbnMgb24gYSBtb3JlIHJpZ29yb3VzIHRoZW9yZXRpY2FsIGZvb3Rpbmcgc28gdGhleSBjYW4gYWN0dWFsbHkgcHJvdmUgdGhpbmdzIGFib3V0IHRoZWlyIGJlaGF2aW9yIHJhdGhlciB0aGFuIGp1c3QgZXhwZXJpbWVudGluZyBhbmQgaG9waW5nIGZvciB0aGUgYmVzdC4gVG8gZGVyaXZlIGEgdGhlb3JldGljYWxseSBzb3VuZCB2ZXJzaW9uIG9mIElERiwgcmVzZWFyY2hlcnMgdG9vayBzb21ldGhpbmcgY2FsbGVkIHRoZSBSb2JlcnRzb24tU3BcdTAwZTRyY2sgSm9uZXMgd2VpZ2h0LCBtYWRlIGEgc2ltcGxpZnlpbmcgYXNzdW1wdGlvbiwgYW5kIGNhbWUgdXAgd2l0aCBsb2cgKE4tREYrLjUpXC8oREYrLjUpLiBXZVx1MjAxOXJlIG5vdCBnb2luZyB0byBnbyBpbnRvIHRoZSBkZXRhaWxzLCBidXQgd2VcdTIwMTlsbCBqdXN0IGZvY3VzIG9uIHRoZSBwcmFjdGljYWwgc2lnbmlmaWNhbmNlIG9mIHRoaXMgZmxhdm9yIG9mIElERi4gVGhlJm5ic3A7LjVcdTIwMTlzIGRvblx1MjAxOXQgcmVh
bGx5IGRvIG11Y2ggaGVyZSwgc28gbGV0XHUyMDE5cyBqdXN0IGNvbnNpZGVyIGxvZyAoTi1ERilcL0RGLCB3aGljaCBpcyBzb21ldGltZXMgcmVmZXJyZWQgdG8gYXMgXHUyMDFjcHJvYmFiaWxpc3RpYyBJREYuXHUyMDFkIEhlcmUgd2UgY29tcGFyZSBvdXIgdmFuaWxsYSBJREYgd2l0aCBwcm9iYWJpbGlzdGljIElERiB3aGVyZSBOPTEwLjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOmltYWdlIHtcImlkXCI6MjY0MDUsXCJzaXplU2x1Z1wiOlwibGFyZ2VcIixcImxpbmtEZXN0aW5hdGlvblwiOlwibm9uZVwifSAtLT5cbjxmaWd1cmUgY2xhc3M9XCJ3cC1ibG9jay1pbWFnZSBzaXplLWxhcmdlXCI+PGltZyBzcmM9XCJodHRwczpcL1wva213bGxjLmNvbVwvd3AtY29udGVudFwvdXBsb2Fkc1wvMjAyMVwvMDVcL1Byb2JhYmlsaXN0aWNJREYtMTAyNHg0NzcucG5nXCIgYWx0PVwiXCIgY2xhc3M9XCJ3cC1pbWFnZS0yNjQwNVwiXC8+PFwvZmlndXJlPlxuPCEtLSBcL3dwOmltYWdlIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPllvdSBjYW4gc2VlIHRoYXQgcHJvYmFiaWxpc3RpYyBJREYgdGFrZXMgYSBzaGFycCBkcm9wIGZvciB0ZXJtcyB0aGF0IGFyZSBpbiBtb3N0IG9mIHRoZSBkb2N1bWVudHMuIFRoaXMgbWlnaHQgYmUgZGVzaXJhYmxlIGJlY2F1c2UgaWYgYSB0ZXJtIHJlYWxseSBleGlzdHMgaW4gOTglIG9mIHRoZSBkb2N1bWVudHMsIGl0XHUyMDE5cyBwcm9iYWJseSBhIHN0b3B3b3JkIGxpa2UgXHUyMDFjYW5kXHUyMDFkIG9yIFx1MjAxY29yXHUyMDFkIGFuZCBpdCBzaG91bGQgZ2V0IG11Y2gsIG11Y2ggbGVzcyB3ZWlnaHQgdGhhbiBhIHRlcm0gdGhhdFx1MjAxOXMgdmVyeSBjb21tb24sIGxpa2UgaW4gNzAlIG9mIHRoZSBkb2N1bWVudHMsIGJ1dCBzdGlsbCBub3QgdXR0ZXJseSB1YmlxdWl0b3VzLjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPlRoZSBjYXRjaCBpcyB0aGF0IGxvZyAoTi1ERilcL0RGIGlzIG5lZ2F0aXZlIGZvciB0ZXJtcyB0aGF0IGFyZSBpbiBtb3JlIHRoYW4gaGFsZiBvZiB0aGUgY29ycHVzLiAoUmVtZW1iZXIgdGhhdCB0aGUgbG9nIGZ1bmN0aW9uIGdvZXMgbmVnYXRpdmUgb24gdmFsdWVzIGJldHdlZW4gMCBhbmQgMS4pIFdlIGRvblx1MjAxOXQgd2FudCBuZWdhdGl2ZSB2YWx1ZXMgY29taW5nIG91dCBvZiBvdXIgcmFua2luZyBmdW5jdGlvbiBiZWNhdXNlIHRoZSBwcmVzZW5jZSBvZiBhIHF1ZXJ5IHRlcm0gaW4gYSBkb2N1bWVudCBzaG91bGQgbmV2ZXIgY291bnQgYWdhaW5zdCByZXRyaWV2YWxcdTIwMGFcdTIwMTRcdTIwMGFpdCBzaG91bGQgbmV2ZXIgY2F1c2UgYSBsb3dlciBzY29yZSB0aGFuIGlmIHRoZSB0ZXJtIHdhcyBzaW1wbHkgYWJzZW50LiBJbiBvcmRlciB0byBwcmV2ZW50IG5lZ2F0aXZlIHZhbHVlcywgTHVjZW5lXHUyMDE5cyBpbXBsZW1lbnRhdGlvbiBvZiBCTTI1IGFkZHMgYSAx
IGxpa2UgdGhpczo8XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpwcmVmb3JtYXR0ZWQgLS0+XG48cHJlIGNsYXNzPVwid3AtYmxvY2stcHJlZm9ybWF0dGVkXCI+PG1hcmsgc3R5bGU9XCJiYWNrZ3JvdW5kLWNvbG9yOnJnYmEoMCwgMCwgMCwgMClcIiBjbGFzcz1cImhhcy1pbmxpbmUtY29sb3IgaGFzLXZpdmlkLWN5YW4tYmx1ZS1jb2xvclwiPklERiA9IGxvZyAoMSArIChOIC0gREYgKyAuNSlcLyhERiArIC41KSk8XC9tYXJrPjxcL3ByZT5cbjwhLS0gXC93cDpwcmVmb3JtYXR0ZWQgLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+VGhpcyAxIG1pZ2h0IHNlZW0gbGlrZSBhbiBpbm5vY2VudCBtb2RpZmljYXRpb24gYnV0IGl0IHRvdGFsbHkgY2hhbmdlcyB0aGUgYmVoYXZpb3Igb2YgdGhlIGZvcm11bGEhIElmIHdlIGZvcmdldCBhZ2FpbiBhYm91dCB0aG9zZSBwZXNreSZuYnNwOy41XHUyMDE5cywgYW5kIHdlIG5vdGUgdGhhdCBhZGRpbmcgMSBpcyB0aGUgc2FtZSBhcyBhZGRpbmcgREZcL0RGLCB5b3UgY2FuIHNlZSB0aGF0IHRoZSBmb3JtdWxhIHJlZHVjZXMgdG8gdGhlIHZhbmlsbGEgdmVyc2lvbiBvZiBJREYgdGhhdCB3ZSB1c2VkIGJlZm9yZTogbG9nIChOXC9ERikuPFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cHJlZm9ybWF0dGVkIC0tPlxuPHByZSBjbGFzcz1cIndwLWJsb2NrLXByZWZvcm1hdHRlZFwiPjxtYXJrIHN0eWxlPVwiYmFja2dyb3VuZC1jb2xvcjpyZ2JhKDAsIDAsIDAsIDApXCIgY2xhc3M9XCJoYXMtaW5saW5lLWNvbG9yIGhhcy12aXZpZC1jeWFuLWJsdWUtY29sb3JcIj5sb2cgKDEgKyAoTiAtIERGICsgLjUpXC8oREYgKyAuNSkpIFx1MjI0OFxubG9nICgxICsgKE4gLSBERilcL0RGICkgPVxubG9nIChERlwvREYgKyAoTiAtIERGKVwvREYpID0gXG5sb2cgKChERiArIE4gLSBERilcL0RGKSA9IFxubG9nIChOXC9ERik8XC9tYXJrPjxcL3ByZT5cbjwhLS0gXC93cDpwcmVmb3JtYXR0ZWQgLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+U28gYWx0aG91Z2ggaXQgbG9va3MgbGlrZSBCTTI1IGlzIHVzaW5nIGEgZmFuY3kgdmVyc2lvbiBvZiBJREYsIGluIHByYWN0aWNlIChhcyBpbXBsZW1lbnRlZCBpbiBMdWNlbmUpIGl0XHUyMDE5cyBiYXNpY2FsbHkgdXNpbmcgdGhlIHNhbWUgb2xkIHZlcnNpb24gb2YgSURGIHRoYXRcdTIwMTlzIHVzZWQgaW4gdHJhZGl0aW9uYWwgVEZcL0lERiwgd2l0aG91dCB0aGUgYWNjZWxlcmF0ZWQgZGVjbGluZSBmb3IgaGlnaCBERiB2YWx1ZXMuPFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6aGVhZGluZyB7XCJsZXZlbFwiOjR9IC0tPlxuPGg0PkNhc2hpbmcgSW48XC9oND5cbjwhLS0gXC93cDpoZWFkaW5nIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPldlXHUyMDE5cmUgcmVhZHkgdG8gY2FzaCBpbiBvbiBvdXIgbmV3IHVuZGVyc3RhbmRpbmcgYnkgbG9va2lu
ZyBhdCB0aGUgZXhwbGFpbiBvdXRwdXQgZnJvbSBhIEx1Y2VuZSBxdWVyeS4gWW91XHUyMDE5bGwgc2VlIHNvbWV0aGluZyBsaWtlIHRoaXM6PFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cHJlZm9ybWF0dGVkIC0tPlxuPHByZSBjbGFzcz1cIndwLWJsb2NrLXByZWZvcm1hdHRlZFwiPjxtYXJrIHN0eWxlPVwiYmFja2dyb3VuZC1jb2xvcjpyZ2JhKDAsIDAsIDAsIDApXCIgY2xhc3M9XCJoYXMtaW5saW5lLWNvbG9yIGhhcy12aXZpZC1jeWFuLWJsdWUtY29sb3JcIj5cdTIwMWNzY29yZShmcmVxPTMuMCksIHByb2R1Y3Qgb2Y6XHUyMDFkXG5cblx1MjAxY2lkZiwgY29tcHV0ZWQgYXMgbG9nKDEgKyAoTiBcdTIwMTQgbiArIDAuNSkgXC8gKG4gKyAwLjUpKSBmcm9tOlx1MjAxZFxuXG5cdTIwMWN0ZiwgY29tcHV0ZWQgYXMgZnJlcSBcLyAoZnJlcSArIGsxICogKDEgXHUyMDE0IGIgKyBiICogZGwgXC8gYXZnZGwpKSBmcm9tOlx1MjAxZDxcL21hcms+PFwvcHJlPlxuPCEtLSBcL3dwOnByZWZvcm1hdHRlZCAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5XZVx1MjAxOXJlIGZpbmFsbHkgcHJlcGFyZWQgdG8gdW5kZXJzdGFuZCB0aGlzIGdvYmJsZWR5Z29vay4gWW91IGNhbiBzZWUgdGhhdCBMdWNlbmUgaXMgdXNpbmcgYSBURipJREYgcHJvZHVjdCB3aGVyZSBURiBhbmQgSURGIGhhdmUgdGhlaXIgc3BlY2lhbCBCTTI1IGRlZmluaXRpb25zLiBMb3dlcmNhc2UgbiBtZWFucyBERiBoZXJlLiBUaGUgSURGIHRlcm0gaXMgdGhlIHN1cHBvc2VkbHkgZmFuY3kgdmVyc2lvbiB0aGF0IHR1cm5zIG91dCB0byBiZSB0aGUgc2FtZSBhcyB0cmFkaXRpb25hbCBJREYsIE5cL24uPFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+VGhlIFRGIHRlcm0gaXMgYmFzZWQgb24gb3VyIHNhdHVyYXRpb24gdHJpY2s6IGZyZXFcLyhmcmVxICsgaykuIFRoZSB1c2Ugb2YmbmJzcDs8c3Ryb25nPmsxPFwvc3Ryb25nPiZuYnNwO2luc3RlYWQgb2YgayBpbiB0aGUgZXhwbGFpbiBvdXRwdXQgaXQgaGlzdG9yaWNhbFx1MjAwYVx1MjAxNFx1MjAwYWl0IGNvbWVzIGZyb20gYSB0aW1lIHdoZW4gdGhlcmUgd2FzIG1vcmUgdGhhbiBvbmUgayBpbiB0aGUgZm9ybXVsYS4gV2hhdCB3ZVx1MjAxOXZlIGJlZW4gY2FsbGluZyByYXcgVEYgaXMgZGVub3RlZCBhcyZuYnNwOzxzdHJvbmc+ZnJlcTxcL3N0cm9uZz4mbmJzcDtoZXJlLjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPldlIGNhbiBzZWUgdGhhdCBrMSBpcyBtdWx0aXBsaWVkIGJ5IGEgZmFjdG9yIHRoYXQgcGVuYWxpemVzIGFib3ZlLWF2ZXJhZ2UgZG9jdW1lbnQgbGVuZ3RoIHdoaWxlIHJld2FyZGluZyBiZWxvdy1hdmVyYWdlIGRvY3VtZW50IGxlbmd0aDogKDEtYiArIGIgKmRsXC9hdmdkbCkuIFdoYXQgd2VcdTIwMTl2ZSBiZWVuIGNhbGxpbmcgYWRsIGlzIGRl
bm90ZWQgYXMmbmJzcDs8c3Ryb25nPmF2Z2RsPFwvc3Ryb25nPiZuYnNwO2hlcmUuPFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+QW5kIG9mIGNvdXJzZSB3ZSBjYW4gc2VlIHRoYXQgdGhlcmUgYXJlIHBhcmFtZXRlcnMsIHdoaWNoIGFyZSBzZXQgdG8gaz0xLjIgYW5kIGIgPSZuYnNwOy43NSBpbiBMdWNlbmUgYnkgZGVmYXVsdC4gWW91IHByb2JhYmx5IHdvblx1MjAxOXQgbmVlZCB0byB0d2VhayB0aGVzZSwgYnV0IHlvdSBjYW4gaWYgeW91IHdhbnQuPFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6cGFyYWdyYXBoIC0tPlxuPHA+PHN0cm9uZz5JbiBzdW1tYXJ5LCBzaW1wbGUgVEYtSURGIHJld2FyZHMgdGVybSBmcmVxdWVuY3kgYW5kIHBlbmFsaXplcyBkb2N1bWVudCBmcmVxdWVuY3kuIEJNMjUgZ29lcyBiZXlvbmQgdGhpcyB0byBhY2NvdW50IGZvciBkb2N1bWVudCBsZW5ndGggYW5kIHRlcm0gZnJlcXVlbmN5IHNhdHVyYXRpb24uJm5ic3A7PFwvc3Ryb25nPjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPkl0XHUyMDE5cyB3b3J0aCBub3RpbmcgdGhhdCBiZWZvcmUgTHVjZW5lIGludHJvZHVjZWQgQk0yNSBhcyB0aGUgZGVmYXVsdCByYW5raW5nIGZ1bmN0aW9uIGFzIG9mIHZlcnNpb24gNiwgaXQgaW1wbGVtZW50ZWQgVEYtSURGIHRocm91Z2ggc29tZXRoaW5nIGNhbGxlZCB0aGUmbmJzcDs8YSBocmVmPVwiaHR0cHM6XC9cL3d3dy5lbGFzdGljLmNvXC9ndWlkZVwvZW5cL2VsYXN0aWNzZWFyY2hcL2d1aWRlXC8yLnhcL3ByYWN0aWNhbC1zY29yaW5nLWZ1bmN0aW9uLmh0bWxcIj5QcmFjdGljYWwgU2NvcmluZyBGdW5jdGlvbjxcL2E+LCB3aGljaCB3YXMgYSBzZXQgb2YgZW5oYW5jZW1lbnRzIChpbmNsdWRpbmcgXHUyMDFjY29vcmRcdTIwMWQgYW5kIGZpZWxkIGxlbmd0aCBub3JtYWxpemF0aW9uKSB0aGF0IG1hZGUgVEYtSURGIG1vcmUgbGlrZSBCTTI1LiBTbyB0aGUgYmVoYXZpb3IgZGlmZmVyZW5jZSBvbmUgbWlnaHQgaGF2ZSBvYnNlcnZlZCB3aGVuIEx1Y2VuZSBtYWRlIHRoZSBzd2l0Y2ggdG8gQk0yNSB3YXMgcHJvYmFibHkgbGVzcyBkcmFtYXRpYyB0aGFuIGl0IHdvdWxkIGhhdmUgYmVlbiBpZiBMdWNlbmUgaGFkIGJlZW4gdXNpbmcgcHVyZSBURi1JREYgYWxsIGFsb25nLiBJbiBhbnkgY2FzZSwgdGhlIGNvbnNlbnN1cyBpcyB0aGF0IEJNMjUgaXMgYW4gaW1wcm92ZW1lbnQsIGFuZCBub3cgeW91IGNhbiBzZWUgd2h5LjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPklmIHlvdVx1MjAxOXJlIGEgc2VhcmNoIGVuZ2luZWVyLCB0aGUgTHVjZW5lIGV4cGxhaW4gb3V0cHV0IGlzIHRoZSBtb3N0IGxpa2VseSBwbGFjZSB3aGVyZSB5b3VcdTIwMTlsbCBlbmNvdW50ZXIgdGhlIGRldGFpbHMgb2YgdGhlIEJNMjUgZm9y
bXVsYS4gSG93ZXZlciwgaWYgeW91IGRlbHZlIGludG8gdGhlb3JldGljYWwgcGFwZXJzIG9yIGNoZWNrIG91dCB0aGUmbmJzcDs8YSBocmVmPVwiaHR0cHM6XC9cL2VuLndpa2lwZWRpYS5vcmdcL3dpa2lcL09rYXBpX0JNMjVcIj5XaWtpcGVkaWEgYXJ0aWNsZSBvbiBCTTI1PFwvYT4sIHlvdVx1MjAxOWxsIHNlZSBpdCB3cml0dGVuIG91dCBhcyBhbiBlcXVhdGlvbiBsaWtlIHRoaXM6PFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+XG5cbjwhLS0gd3A6aW1hZ2Uge1wiaWRcIjoyNjQwNixcInNpemVTbHVnXCI6XCJsYXJnZVwiLFwibGlua0Rlc3RpbmF0aW9uXCI6XCJub25lXCJ9IC0tPlxuPGZpZ3VyZSBjbGFzcz1cIndwLWJsb2NrLWltYWdlIHNpemUtbGFyZ2VcIj48aW1nIHNyYz1cImh0dHBzOlwvXC9rbXdsbGMuY29tXC93cC1jb250ZW50XC91cGxvYWRzXC8yMDIxXC8wNVwvYm0yNWRlbXlzdGlmaWVkLTEwMjR4NDQxLnBuZ1wiIGFsdD1cIlwiIGNsYXNzPVwid3AtaW1hZ2UtMjY0MDZcIlwvPjxcL2ZpZ3VyZT5cbjwhLS0gXC93cDppbWFnZSAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5Ib3BlZnVsbHkgdGhpcyB0b3VyIGhhcyBtYWRlIHlvdSBtb3JlIGNvbWZvcnRhYmxlIHdpdGggaG93IHRoZSB0d28gbW9zdCBwb3B1bGFyIHNlYXJjaCByYW5raW5nIGZ1bmN0aW9ucyB3b3JrLiBUaGFua3MgZm9yIGZvbGxvd2luZyBhbG9uZyE8XC9wPlxuPCEtLSBcL3dwOnBhcmFncmFwaCAtLT5cblxuPCEtLSB3cDpoZWFkaW5nIHtcImxldmVsXCI6NH0gLS0+XG48aDQ+RnVydGhlciBSZWFkaW5nPFwvaDQ+XG48IS0tIFwvd3A6aGVhZGluZyAtLT5cblxuPCEtLSB3cDpwYXJhZ3JhcGggLS0+XG48cD5UaGlzIGFydGljbGUgZm9sbG93cyBpbiB0aGUgZm9vdHN0ZXBzIG9mIHNvbWUgb3RoZXIgZ3JlYXQgdG91cnMgb2YgQk0yNSB0aGF0IGFyZSBvdXQgdGhlcmUuIFRoZXNlIHR3byBhcmUgaGlnaGx5IHJlY29tbWVuZGVkOjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPjxhIGhyZWY9XCJodHRwczpcL1wvb3BlbnNvdXJjZWNvbm5lY3Rpb25zLmNvbVwvYmxvZ1wvMjAxNVwvMTBcLzE2XC9ibTI1LXRoZS1uZXh0LWdlbmVyYXRpb24tb2YtbHVjZW5lLXJlbGV2YXRpb25cL1wiPjxzdHJvbmc+Qk0yNSBUaGUgTmV4dCBHZW5lcmF0aW9uIG9mIEx1Y2VuZSBSZWxldmFuY2UgYnkgRG91ZyBUdXJuYnVsbDxcL3N0cm9uZz48XC9hPjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPjxhIGhyZWY9XCJodHRwczpcL1wvd3d3LmVsYXN0aWMuY29cL2Jsb2dcL3ByYWN0aWNhbC1ibTI1LXBhcnQtMi10aGUtYm0yNS1hbGdvcml0aG0tYW5kLWl0cy12YXJpYWJsZXNcIj48c3Ryb25nPlByYWN0aWNhbCBCTTI1IFx1MjAxMyBQYXJ0IDI6IFRoZSBCTTI1IEFsZ29yaXRobSBhbmQgaXRzIFZhcmlhYmxlcyBieSBTaGFuZSBDb25uZWxs
eTxcL3N0cm9uZz48XC9hPjxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPlRoZXJlIGFyZSBtYW55IHRoZW9yZXRpY2FsIHRyZWF0bWVudHMgb2YgcmFua2luZyBvdXQgdGhlcmUuIEEgZ29vZCBzdGFydGluZyBwbGFjZSBpcyZuYnNwOzxhIGhyZWY9XCJodHRwOlwvXC93d3cuc3RhZmYuY2l0eS5hYy51a1wvfnNiMzE3XC9wYXBlcnNcL2ZvdW5kYXRpb25zX2JtMjVfcmV2aWV3LnBkZlwiIHRhcmdldD1cIl9ibGFua1wiIHJlbD1cIm5vcmVmZXJyZXIgbm9vcGVuZXJcIj5cdTIwMWNUaGUgUHJvYmFiaWxpc3RpYyBSZWxldmFuY2UgRnJhbWV3b3JrOiBCTTI1IGFuZCBCZXlvbmRcdTIwMWQgYnkgUm9iZXJ0c29uIGFuZCBaYXJhZ29zYTxcL2E+LiZuYnNwOzxcL3A+XG48IS0tIFwvd3A6cGFyYWdyYXBoIC0tPlxuXG48IS0tIHdwOnBhcmFncmFwaCAtLT5cbjxwPlNlZSBhbHNvIHRoZSBwYXBlciZuYnNwOzxhIGhyZWY9XCJodHRwczpcL1wvd3d3Lm1pY3Jvc29mdC5jb21cL2VuLXVzXC9yZXNlYXJjaFwvd3AtY29udGVudFwvdXBsb2Fkc1wvMjAxNlwvMDJcL29rYXBpX3RyZWMzLnBkZlwiIHRhcmdldD1cIl9ibGFua1wiIHJlbD1cIm5vcmVmZXJyZXIgbm9vcGVuZXJcIj5cdTIwMWNPa2FwaSBhdCBUUkVDLTNcdTIwMWQ8XC9hPiZuYnNwO3doZXJlIEJNMjUgd2FzIGZpcnN0IGludHJvZHVjZWQuPFwvcD5cbjwhLS0gXC93cDpwYXJhZ3JhcGggLS0+In0sImVsZW1lbnRzIjpbXSwid2lkZ2V0VHlwZSI6InRleHQtZWRpdG9yIn0=\\\"]\\t\\t\\t<\\\/div>\\n\\t\\t<\\\/div>\\n\\t\\t\\t\\t\\t<\\\/div><\\\/div>\\r\\n\\t\\t<\\\/section>\\r\\n\\t\\t\\t\\t<section class=\\\"elementor-section elementor-top-section elementor-element elementor-element-e0c25bc elementor-section-boxed elementor-section-height-default elementor-section-height-default\\\" data-id=\\\"e0c25bc\\\" data-element_type=\\\"section\\\" data-e-type=\\\"section\\\">\\r\\n\\t\\t\\t\\t\\t\\t<div class=\\\"elementor-container elementor-column-gap-thegem\\\"><div class=\\\"elementor-row\\\">\\r\\n\\t\\t\\t\\t\\t<div class=\\\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-a5e6f8a\\\" data-id=\\\"a5e6f8a\\\" data-element_type=\\\"column\\\" data-e-type=\\\"column\\\">\\n\\t\\t\\t<div class=\\\"elementor-widget-wrap elementor-element-populated\\\">\\n\\t\\t\\t\\t[elementor-element k=\\\"9109a976d8649ee6d2c8fef8daebbb8b\\\" 
data=\\\"eyJpZCI6IjYzN2ZkYTUiLCJlbFR5cGUiOiJ3aWRnZXQiLCJzZXR0aW5ncyI6eyJwcmV2X2xhYmVsIjoiUHJldmlvdXMgUG9zdCIsIm5leHRfbGFiZWwiOiJOZXh0IFBvc3QiLCJzaG93X2JvcmRlcnMiOiIiLCJ0aXRsZV90eXBvZ3JhcGh5X3R5cG9ncmFwaHkiOiJjdXN0b20iLCJ0aXRsZV90eXBvZ3JhcGh5X2ZvbnRfc2l6ZSI6eyJ1bml0IjoicHgiLCJzaXplIjoxNCwic2l6ZXMiOltdfSwidGl0bGVfdHlwb2dyYXBoeV9mb250X3dlaWdodCI6IjcwMCIsIl9fZ2xvYmFsc19fIjp7ImFycm93X2NvbG9yIjoiZ2xvYmFsc1wvY29sb3JzP2lkPXByaW1hcnkiLCJsYWJlbF9jb2xvciI6Imdsb2JhbHNcL2NvbG9ycz9pZD1zZWNvbmRhcnkifSwiYXJyb3ciOiJmYSBmYS1jYXJldC1sZWZ0Iiwic2hvd19hcnJvdyI6IiJ9LCJlbGVtZW50cyI6W10sIndpZGdldFR5cGUiOiJnbG9iYWwiLCJ0ZW1wbGF0ZUlEIjoiMjgwODMifQ==\\\"]\\t\\t\\t<\\\/div>\\n\\t\\t<\\\/div>\\n\\t\\t\\t\\t\\t<\\\/div><\\\/div>\\r\\n\\t\\t<\\\/section>\\r\\n\\t\\t\",\"scripts\":[],\"styles\":[]}}"]},"jetpack_featured_media_url":"https:\/\/kmwllc.com\/wp-content\/uploads\/2020\/03\/blog_understandingTFIDF1200x900.png","menu_order":0,"_links":{"self":[{"href":"https:\/\/kmwllc.com\/index.php\/wp-json\/wp\/v2\/posts\/26186","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kmwllc.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kmwllc.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kmwllc.com\/index.php\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/kmwllc.com\/index.php\/wp-json\/wp\/v2\/comments?post=26186"}],"version-history":[{"count":10,"href":"https:\/\/kmwllc.com\/index.php\/wp-json\/wp\/v2\/posts\/26186\/revisions"}],"predecessor-version":[{"id":29719,"href":"https:\/\/kmwllc.com\/index.php\/wp-json\/wp\/v2\/posts\/26186\/revisions\/29719"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/kmwllc.com\/index.php\/wp-json\/wp\/v2\/media\/29716"}],"wp:attachment":[{"href":"https:\/\/kmwllc.com\/index.php\/wp-json\/wp\/v2\/media?parent=26186"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kmwllc.com\/index.php\/wp-json\/wp\/v2\/categories?post=26186
"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kmwllc.com\/index.php\/wp-json\/wp\/v2\/tags?post=26186"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}