Module histogram

Basic Histogram.

Copyright © 2011-2014 Zuse Institute Berlin

Version: $Id$

Authors: Thorsten Schuett (schuett@zib.de).

Description

Basic Histogram. Reference: Yael Ben-Haim and Elad Tom-Tov, "A streaming parallel decision tree algorithm", Journal of Machine Learning Research 11 (2010), pp. 849-872.

Data Types

data_item()

data_item() = {value(), pos_integer()}

data_list()

data_list() = [data_item()]

histogram()

abstract datatype: histogram()

value()

value() = number()

Function Index

add/2
add/3
create/1: Creates an empty histogram of size Size.
find_smallest_interval/1: Finds the smallest interval between two consecutive values and returns the position of the first value (in the list's order).
foldl_until/2: Traverses the histogram until TargetCount entries have been found and returns the value at this position.
foldr_until/2: Like foldl_until/2, but traverses the list from the right.
get_data/1
get_num_elements/1
get_num_inserts/1
get_size/1
merge/2: Merges the given two histograms by adding every data point of Hist2 to Hist1.
merge_interval/2: Merges two consecutive values if the first one of them is at PosMinValue.
merge_weighted/3: Merges Hist2 into Hist1 and applies a weight to the Count of Hist2.
normalize_count/2: Normalizes the Count by a normalization constant N.
tester_create_histogram/2
tester_is_valid_histogram/1

Function Details

create/1

create(Size :: non_neg_integer()) -> histogram()

Creates an empty histogram of size Size.

add/2

add(Value :: value(), Histogram :: histogram()) -> histogram()

add/3

add(Value :: value(),
    Count :: pos_integer(),
    Histogram :: histogram()) ->
       histogram()

get_data/1

get_data(Histogram :: histogram()) -> data_list()

get_size/1

get_size(Histogram :: histogram()) -> non_neg_integer()

get_num_elements/1

get_num_elements(Histogram :: histogram()) -> non_neg_integer()

get_num_inserts/1

get_num_inserts(Histogram :: histogram()) -> non_neg_integer()

merge/2

merge(Hist1 :: histogram(), Hist2 :: histogram()) -> histogram()

Merges the given two histograms by adding every data point of Hist2 to Hist1.

merge_weighted/3

merge_weighted(Hist1 :: histogram(),
               Hist2 :: histogram(),
               Weight :: pos_integer()) ->
                  histogram()

Merges Hist2 into Hist1 and applies a weight to the Count of Hist2.
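The following is a minimal, self-contained sketch of how merge/2 and merge_weighted/3 could behave. It is not the module's implementation. It assumes a hypothetical representation {Size, SortedItems}, assumes the weight is multiplied into each Count of Hist2, and assumes that an overfull histogram is shrunk by replacing the two closest values with their count-weighted average, as in the Ben-Haim and Tom-Tov paper. The module name histogram_merge_sketch and all helper names are made up for illustration.

```erlang
-module(histogram_merge_sketch).
-export([new/1, add/3, merge/2, merge_weighted/3]).

%% Hypothetical representation: {MaxSize, Items}, Items sorted by value.
%% The sketch requires Size >= 1 so that shrinking always terminates.
new(Size) when is_integer(Size), Size >= 1 -> {Size, []}.

%% Insert {Value, Count}, then shrink back to at most Size items.
add(Value, Count, {Size, Data}) ->
    {Size, shrink(Size, insert({Value, Count}, Data))}.

%% merge/2 is treated as merge_weighted/3 with weight 1 (an assumption).
merge(Hist1, Hist2) -> merge_weighted(Hist1, Hist2, 1).

%% Add every data point of Hist2 to Hist1, scaling its count by Weight.
merge_weighted(Hist1, {_Size2, Data2}, Weight) ->
    lists:foldl(fun({V, C}, Acc) -> add(V, C * Weight, Acc) end,
                Hist1, Data2).

%% Sorted insert by value.
insert(Item, []) -> [Item];
insert({V, _} = Item, [{V1, _} | _] = Data) when V =< V1 -> [Item | Data];
insert(Item, [Head | Rest]) -> [Head | insert(Item, Rest)].

%% Merge closest neighbours until the list fits into Size items.
shrink(Size, Data) when length(Data) =< Size -> Data;
shrink(Size, Data) -> shrink(Size, merge_closest(Data)).

merge_closest(Data) ->
    Pairs = lists:zip(lists:droplast(Data), tl(Data)),
    MinGap = lists:min([V2 - V1 || {{V1, _}, {V2, _}} <- Pairs]),
    merge_first(MinGap, Data).

%% Replace the first pair with the given gap by its weighted average.
merge_first(Gap, [{V1, C1}, {V2, C2} | Rest]) when V2 - V1 =:= Gap ->
    C = C1 + C2,
    [{(V1 * C1 + V2 * C2) / C, C} | Rest];
merge_first(Gap, [Item | Rest]) -> [Item | merge_first(Gap, Rest)].
```

In this sketch the total count is preserved across a merge: shrinking only combines neighbouring items, it never drops them.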

normalize_count/2

normalize_count(N :: pos_integer(), Histogram :: histogram()) ->
                   histogram()

Normalizes the Count by a normalization constant N.

foldl_until/2

foldl_until(TargetCount :: non_neg_integer(),
            Histogram :: histogram()) ->
               {fail,
                Value :: value() | nil,
                SumSoFar :: non_neg_integer()} |
               {ok,
                Value :: value() | nil,
                Sum :: non_neg_integer()}

Traverses the histogram until TargetCount entries have been found and returns the value at this position. TODO: change this to expect a non-empty histogram.

foldr_until/2

foldr_until(TargetCount :: non_neg_integer(),
            Histogram :: histogram()) ->
               {fail,
                Value :: value() | nil,
                SumSoFar :: non_neg_integer()} |
               {ok,
                Value :: value() | nil,
                Sum :: non_neg_integer()}

Like foldl_until/2, but traverses the list from the right.
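One plausible reading of the two traversals, sketched on a plain data_list() rather than on the abstract histogram() type: counts are summed in traversal order until the running sum reaches TargetCount. The module name fold_until_sketch and the helper walk/4 are hypothetical, and the handling of edge cases (for example TargetCount = 0 or an empty list) is an assumption that may differ from the real module.

```erlang
-module(fold_until_sketch).
-export([foldl_until/2, foldr_until/2]).

%% Traverse from the left (smallest value first).
foldl_until(TargetCount, Data) ->
    walk(TargetCount, Data, nil, 0).

%% Traverse from the right by reversing the list first.
foldr_until(TargetCount, Data) ->
    walk(TargetCount, lists:reverse(Data), nil, 0).

%% Stop with ok as soon as the running sum reaches TargetCount;
%% report fail with the sum so far if the list runs out first.
walk(TargetCount, _Data, Value, Sum) when Sum >= TargetCount ->
    {ok, Value, Sum};
walk(_TargetCount, [], Value, Sum) ->
    {fail, Value, Sum};
walk(TargetCount, [{Value, Count} | Rest], _Previous, Sum) ->
    walk(TargetCount, Rest, Value, Sum + Count).
```

In this sketch the returned value is nil only when no item was visited, which matches the value() | nil alternative in the type signature.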

find_smallest_interval/1

find_smallest_interval(Data :: data_list()) ->
                          MinFirstValue :: value()

Finds the smallest interval between two consecutive values and returns the position of the first value (in the list's order). Returning the position instead of the value ensures that the correct items are merged when the histogram contains duplicate entries.

Precondition: length(Data) >= 2.
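A minimal sketch of the search, assuming Data is sorted by value. Note that the description speaks of a position while the type signature names the result MinFirstValue :: value(); the sketch follows the signature and returns the first value of the closest pair. The module name smallest_interval_sketch is hypothetical, and on ties the sketch keeps the leftmost pair, which is an assumption.

```erlang
-module(smallest_interval_sketch).
-export([find/1]).

%% Data: list of {Value, Count}, sorted by Value, length >= 2.
-spec find([{number(), pos_integer()}]) -> number().
find([{V1, _}, {V2, _} | _] = Data) ->
    find(Data, V1, V2 - V1).

%% Walk over consecutive pairs and remember the smallest gap seen.
find([{V1, _} | [{V2, _} | _] = Rest], BestValue, BestGap) ->
    Gap = V2 - V1,
    case Gap < BestGap of
        true  -> find(Rest, V1, Gap);
        false -> find(Rest, BestValue, BestGap)
    end;
find([_LastItem], BestValue, _BestGap) ->
    BestValue.
```

Calling find/1 with fewer than two items fails with a function_clause error, which mirrors the stated precondition.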

merge_interval/2

merge_interval(MinFirstValue :: value(), Data :: data_list()) ->
                  data_list()

Merges two consecutive values if the first one of them is at PosMinValue. Stops after the first match.

Precondition: length(Data) >= 2, two consecutive values with the given difference.
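A minimal sketch of the merge step, assuming the first argument is the value of the first item of the pair (as the type signature suggests) and assuming the pair is replaced by its count-weighted average with the summed count, as described by Ben-Haim and Tom-Tov. The module name merge_interval_sketch is hypothetical; the real function may identify the pair differently.

```erlang
-module(merge_interval_sketch).
-export([merge/2]).

%% Replace the first pair whose first value equals First by one item
%% holding the weighted average value and the summed count.
merge(First, [{First, C1}, {V2, C2} | Rest]) ->
    Count = C1 + C2,
    [{(First * C1 + V2 * C2) / Count, Count} | Rest];
%% No match at the head: keep the item and continue.
merge(First, [Item | Rest]) ->
    [Item | merge(First, Rest)];
%% No pair found: the list is returned unchanged.
merge(_First, []) ->
    [].
```

The first clause returns Rest untouched, so only one pair is merged per call.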

tester_create_histogram/2

tester_create_histogram(Size :: non_neg_integer(),
                        Data :: data_list()) ->
                           histogram()

tester_is_valid_histogram/1

tester_is_valid_histogram(X :: term()) -> boolean()


Generated by EDoc, Sep 12 2019, 16:35:04.