public class GND extends NXYSignificanceHeuristic
| Modifier and Type | Class and Description |
|---|---|
| static class | GND |
| static class | GND |
NXYSignificanceHeuristic.Frequencies, NXYSignificanceHeuristic.NXYBuilder, NXYSignificanceHeuristic.NXYParser

| Modifier and Type | Field and Description |
|---|---|
| protected static ParseField | NAMES_FIELD |
| static SignificanceHeuristicStreams.Stream | STREAM |
BACKGROUND_IS_SUPERSET, backgroundIsSuperset, INCLUDE_NEGATIVES_FIELD, includeNegatives, SCORE_ERROR_MESSAGE

| Constructor and Description |
|---|
| GND(boolean backgroundIsSuperset) |
| Modifier and Type | Method and Description |
|---|---|
| boolean | equals(Object other) |
| double | getScore(long subsetFreq, long subsetSize, long supersetFreq, long supersetSize): Calculates Google Normalized Distance, as described in "The Google Similarity Distance", Cilibrasi and Vitanyi, 2007. Link: http://arxiv.org/pdf/cs/0412098v3.pdf |
| int | hashCode() |
| void | writeTo(StreamOutput out) |
checkFrequencies, computeNxys
checkFrequencyValidity, initialize
protected static final ParseField NAMES_FIELD
public static final SignificanceHeuristicStreams.Stream STREAM
public boolean equals(Object other)
public int hashCode()
public double getScore(long subsetFreq,
long subsetSize,
long supersetFreq,
long supersetSize)
getScore in class SignificanceHeuristic
subsetFreq - The frequency of the term in the selected sample
subsetSize - The size of the selected sample (typically number of docs)
supersetFreq - The frequency of the term in the superset from which the sample was taken
supersetSize - The size of the superset from which the sample was taken (typically number of docs)
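The distance from the cited paper can be sketched in a few lines. For two terms x and y with document frequencies f(x) and f(y), co-occurrence frequency f(x, y), and N documents in total, the paper defines NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))). The class below is a minimal, standalone illustration of that formula, not the code behind getScore: the names NgdSketch, distance, and similarity are hypothetical, the handling of edge cases is a choice made for this sketch, and the exp(-d) conversion to a "higher is better" score is an assumption rather than documented behavior.

```java
// Illustrative sketch only; NgdSketch and its methods are hypothetical names,
// not part of the Elasticsearch API.
final class NgdSketch {

    private NgdSketch() {}

    /**
     * Normalized Google Distance between two terms x and y.
     *
     * @param fx  number of documents containing x
     * @param fy  number of documents containing y
     * @param fxy number of documents containing both x and y
     * @param n   total number of documents
     */
    static double distance(long fx, long fy, long fxy, long n) {
        if (fx <= 0 || fy <= 0 || n <= 0 || fxy < 0) {
            throw new IllegalArgumentException("fx, fy and n must be positive; fxy must not be negative");
        }
        if (fxy > Math.min(fx, fy) || Math.max(fx, fy) > n) {
            throw new IllegalArgumentException("inconsistent frequencies");
        }
        if (fxy == 0) {
            // log(0) is -infinity, so terms that never co-occur are infinitely far apart.
            return Double.POSITIVE_INFINITY;
        }
        if (Math.min(fx, fy) == n) {
            // Both terms occur in every document, so the denominator would be 0.
            // Returning 0 here is a choice made for this sketch.
            return 0.0;
        }
        double logFx = Math.log(fx);
        double logFy = Math.log(fy);
        return (Math.max(logFx, logFy) - Math.log(fxy))
                / (Math.log(n) - Math.min(logFx, logFy));
    }

    /**
     * Maps a distance to a score where larger means "more related".
     * The exp(-d) mapping is an assumption of this sketch.
     */
    static double similarity(double distance) {
        return Math.exp(-distance);
    }
}
```

One plausible way to relate this to the parameters above, assuming the background set contains the sample, is fx = supersetFreq, fy = subsetSize, fxy = subsetFreq, and n = supersetSize. For example, NgdSketch.distance(100, 50, 25, 1000) evaluates log(4) / log(20), roughly 0.46, while terms that always co-occur, as in NgdSketch.distance(100, 100, 100, 1000), have distance 0.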
public void writeTo(StreamOutput out) throws IOException
writeTo in class NXYSignificanceHeuristic
Throws: IOException