org.das2.qds.util.AutoHistogram
Self-configuring histogram that dynamically adjusts its range and bin size as data
is added. It also tries to identify outlier points, which are available
as a Map from each value to the number of times it was observed. For
each bin, a running mean and variance are tracked, which are useful for
identifying continuous bins and for computing total moments. Introduced to support
the automatic cadence algorithm, it should be generally useful in data discovery.
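The per-bin running mean and variance mentioned above can be maintained in a single pass. The following is a minimal sketch using Welford's update; the class name RunningBinStats is hypothetical, and nothing here is taken from the AutoHistogram source, which may accumulate its statistics differently.

```java
/** Hypothetical helper (not part of AutoHistogram): running statistics for one bin. */
final class RunningBinStats {
    private long count = 0;
    private double mean = 0.0;
    private double m2 = 0.0; // sum of squared deviations from the current mean

    /** Fold one value into the running statistics using Welford's update. */
    void add(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean);
    }

    long count() { return count; }

    double mean() { return mean; }

    /** Population variance of the values added so far; 0 when the bin is empty. */
    double variance() { return count > 0 ? m2 / count : 0.0; }
}
```

The update needs only three numbers per bin, so the statistics can follow the data as it arrives without storing the values themselves.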
AutoHistogram( )
USER_PROP_BIN_START
USER_PROP_BIN_WIDTH
USER_PROP_INVALID_COUNT
USER_PROP_OUTLIERS
USER_PROP_MIN_GT_ZERO
USER_PROP_TOTAL
Long, total number of valid points.
addPropertyChangeListener
addPropertyChangeListener( java.beans.PropertyChangeListener listener ) → void
Parameters
listener - a PropertyChangeListener
Returns:
void (returns nothing)
binOf
binOf( QDataSet hist, double d ) → int
Convenience method for getting the bin index of a value from a completed
histogram's metadata. Note that this is inefficient, since it must do HashMap
lookups to get the bin width and bin start, so use it sparingly.
Parameters
hist - a QDataSet
d - a double
Returns:
the index of the bin for the point.
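For uniform bins, the index follows directly from the bin start and bin width. The sketch below assumes the index is floor((d - binStart) / binWidth); the class BinIndexSketch is hypothetical, and the real method may differ in details such as how out-of-range values are handled.

```java
/** Hypothetical sketch of the bin-index arithmetic for uniform bins. */
final class BinIndexSketch {
    /**
     * Index of the bin containing d, assuming bins of equal width that start
     * at binStart. Values left of binStart give negative indices here.
     */
    static int binOf(double binStart, double binWidth, double d) {
        return (int) Math.floor((d - binStart) / binWidth);
    }
}
```

When many values must be located, reading the bin start and bin width once and reusing them in arithmetic like this avoids the repeated map lookups noted above.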
doit
doit( QDataSet ds ) → QDataSet
Parameters
ds - a QDataSet
Returns:
org.das2.qds.QDataSet
doit
doit( QDataSet ds, QDataSet wds ) → QDataSet
Compute a histogram, dynamically shifting the bins and changing the bin size.
Returns a QDataSet with the planes:
each bin holds the number of points that fall in it.
total, the total of the values in each bin.
depend_0 holds the bin labels.
The property QDataSet.USER_PROPERTIES contains a map with the following
keys:
binStart, Double, the left side of the first bin.
binWidth, Double, the bin width.
total, Integer, the number of points in the distribution.
outliers, Map, outliers and the count of each outlier.
Parameters
ds - a QDataSet
wds - WeightsDataSet or null.
Returns:
a QDataSet
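One way to picture "shifting the bins and changing the bin size" is a fixed-length array of counts whose bin width doubles whenever a value falls outside the covered range. The sketch below illustrates that idea only; it is not the algorithm AutoHistogram uses. The class GrowingHistogramSketch and its methods are hypothetical, non-finite values are simply counted as invalid, and there is no guard against overflow for extreme magnitudes.

```java
/** Hypothetical sketch of a histogram that widens its bins as data arrives. */
final class GrowingHistogramSketch {
    private long[] counts;
    private double binStart;
    private double binWidth;
    private boolean empty = true;
    private long invalidCount = 0;

    GrowingHistogramSketch(int nbins, double initialWidth) {
        if (nbins < 2 || nbins % 2 != 0 || !(initialWidth > 0)) {
            throw new IllegalArgumentException("need an even bin count and a positive width");
        }
        counts = new long[nbins];
        binWidth = initialWidth;
    }

    void add(double x) {
        if (!Double.isFinite(x)) { invalidCount++; return; }
        if (empty) {
            // Align the first bin to a multiple of the bin width.
            binStart = Math.floor(x / binWidth) * binWidth;
            empty = false;
        }
        // Widen until x falls inside [binStart, binStart + n * binWidth).
        while (x < binStart || x >= binStart + counts.length * binWidth) {
            grow(x < binStart);
        }
        int i = (int) Math.floor((x - binStart) / binWidth);
        counts[Math.min(i, counts.length - 1)]++; // clamp against rounding at the top edge
    }

    /** Double the bin width by merging adjacent pairs, freeing half the bins. */
    private void grow(boolean extendLeft) {
        int n = counts.length;
        long[] merged = new long[n];
        int offset = extendLeft ? n / 2 : 0; // old data moves to the right half when extending left
        for (int i = 0; i < n / 2; i++) {
            merged[offset + i] = counts[2 * i] + counts[2 * i + 1];
        }
        if (extendLeft) binStart -= n * binWidth;
        binWidth *= 2;
        counts = merged;
    }

    long[] counts() { return counts.clone(); }
    double binStart() { return binStart; }
    double binWidth() { return binWidth; }
    long invalidCount() { return invalidCount; }
}
```

Merging pairs keeps every earlier count in the correct bin after a resize, which is why the bin count is required to be even in this sketch.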
getHistogram
getHistogram( ) → org.das2.qds.DDataSet
get the histogram of the data accumulated thus far.
Returns:
an org.das2.qds.DDataSet
mean
mean( QDataSet hist ) → org.das2.qds.RankZeroDataSet
returns the mean of the dataset that has been histogrammed.
Parameters
hist - a QDataSet
Returns:
an org.das2.qds.RankZeroDataSet
moments
moments( QDataSet hist ) → org.das2.qds.RankZeroDataSet
returns the mean and standard deviation of the dataset that has been histogrammed.
Parameters
hist - a rank 1 dataset in which each bin contains its count. DEPEND_0 holds the labels for
each bin. The property "means" is a rank 1 dataset containing the mean for each bin. The
property "stddevs" contains the standard deviation within each bin.
Returns:
a rank 0 dataset (a Datum) whose value is the mean; property("stddev") contains the standard deviation.
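Given per-bin counts, means, and standard deviations, the moments of the whole population can be recovered without revisiting the data. The sketch below assumes population (not sample) standard deviations and skips empty bins; PooledMomentsSketch is a hypothetical class working on plain arrays, not the library's implementation.

```java
/** Hypothetical sketch: combine per-bin statistics into overall moments. */
final class PooledMomentsSketch {
    /**
     * Returns {mean, stddev} of all points, given each bin's count, mean, and
     * population standard deviation. Returns {NaN, NaN} when there are no points.
     */
    static double[] moments(long[] counts, double[] means, double[] stddevs) {
        long n = 0;
        double sum = 0.0;
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] == 0) continue; // empty bins carry no information
            n += counts[i];
            sum += counts[i] * means[i];
        }
        if (n == 0) return new double[] { Double.NaN, Double.NaN };
        double mean = sum / n;
        double ss = 0.0;
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] == 0) continue;
            double d = means[i] - mean;
            // within-bin spread plus the offset of the bin mean from the overall mean
            ss += counts[i] * (stddevs[i] * stddevs[i] + d * d);
        }
        return new double[] { mean, Math.sqrt(ss / n) };
    }
}
```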
monoExtent
monoExtent( QDataSet dep0 ) → QDataSet
Fast extent calculation, which only works when the data is monotonic.
Returns null if there is no valid data.
Parameters
dep0 - a QDataSet
Returns:
rank 1 bins dataset or null
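The reason monotonic data allows a fast extent is that the minimum and maximum must sit at the first and last valid elements, so no full scan is needed. A minimal sketch on a plain array follows, with NaN standing in for invalid data; that choice, and the class MonoExtentSketch, are assumptions made for illustration.

```java
/** Hypothetical sketch: extent of a monotonic array from its end points. */
final class MonoExtentSketch {
    /** Returns {min, max}, or null when every element is NaN (treated as invalid here). */
    static double[] monoExtent(double[] dep0) {
        int first = 0;
        while (first < dep0.length && Double.isNaN(dep0[first])) first++;
        if (first == dep0.length) return null; // no valid data
        int last = dep0.length - 1;
        while (Double.isNaN(dep0[last])) last--; // stops at or before 'first'
        double a = dep0[first];
        double b = dep0[last];
        // min/max covers both increasing and decreasing sequences
        return new double[] { Math.min(a, b), Math.max(a, b) };
    }
}
```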
peakIds
peakIds( QDataSet hist ) → QDataSet
Return a list of all the peaks in the histogram. A peak is defined as a
local maximum, together with the adjacent bins that are consistent with the
peak population and do not belong to another peak.
Parameters
hist - the output of autohistogram, which has "means" and "stddevs" properties.
Returns:
QDataSet which varies with hist.
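As a rough illustration of labeling bins by peak, the sketch below finds local maxima in the counts and assigns neighboring bins to a peak while the counts keep descending. This downhill rule is a simplification substituted for the "consistent with the peak population" test, which presumably relies on the "means" and "stddevs" properties; the class PeakIdSketch and the use of -1 for unassigned bins are assumptions.

```java
/** Hypothetical sketch: label each bin with the id of a nearby peak. */
final class PeakIdSketch {
    /** Returns, for each bin, a peak id (0, 1, ...) or -1 when the bin belongs to no peak. */
    static int[] peakIds(long[] counts) {
        int n = counts.length;
        int[] ids = new int[n];
        java.util.Arrays.fill(ids, -1);
        int nextId = 0;
        for (int i = 0; i < n; i++) {
            long left = i > 0 ? counts[i - 1] : 0;
            long right = i < n - 1 ? counts[i + 1] : 0;
            // A plateau is credited to its leftmost bin.
            boolean isPeak = counts[i] > 0 && counts[i] > left && counts[i] >= right;
            if (!isPeak) continue;
            int id = nextId++;
            ids[i] = id;
            // Walk downhill to the left, stopping at empty or already claimed bins.
            for (int j = i - 1; j >= 0 && ids[j] == -1 && counts[j] > 0
                    && counts[j] <= counts[j + 1]; j--) {
                ids[j] = id;
            }
            // Walk downhill to the right, stopping at empty bins or where counts rise again.
            for (int j = i + 1; j < n && counts[j] > 0 && counts[j] <= counts[j - 1]; j++) {
                ids[j] = id;
            }
        }
        return ids;
    }
}
```

In this sketch a valley bin shared by two peaks goes to the peak on its left, an arbitrary tie-break.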
peaks
peaks( QDataSet hist ) → QDataSet
Return a list of all the peaks in the histogram. See peakIds for
how peaks are identified. Once the bins of a peak
have been identified, the mean and stddev of each peak are returned.
Note the stddev typically reads low, probably because the tails have been removed.
Parameters
hist - the result of AutoHistogram
Returns:
QDataSet rank 1 dataset with length equal to the number of identified peaks
removePropertyChangeListener
removePropertyChangeListener( java.beans.PropertyChangeListener listener ) → void
Parameters
listener - a PropertyChangeListener
Returns:
void (returns nothing)
reset
reset( ) → void
Returns:
void (returns nothing)
simpleRange
simpleRange( QDataSet hist2 ) → QDataSet
returns the simple range: the min and max that contain the data.
Parameters
hist2 - the result of autoHistogram.
Returns:
rank 1 bins dataset showing the min and max. value(0) is the
min, value(1) is the max.