Iskander Shafikov's Page

    ANN - Approximate Nearest Neighbor Library wrapper
    2009-10-05, 15:16

    This simple tool is a wrapper I wrote in Borland C++ Builder for ANN, the free Approximate Nearest Neighbor library by David M. Mount and Sunil Arya, which solves the k-nearest-neighbor (KNN) search problem common in statistics and numerical algorithms. Some options that were previously not accessible in the library can now be changed.

    Features include:

    • Working with multidimensional data (1 to 100 dimensions) to calculate distances
    • Input of sample data and query data from text files with any extension; columns (one per dimension) must be separated by TABs
    • Calculation of distances in any of four Minkowski metrics: L1 (Manhattan / city-block), L2 (Euclidean), Lp (custom p), and L-infinity (Chebyshev)
    • Ability to specify the maximum number of dimensions and data points
    • Automatic calculation of max neighbors, dimensionality, and data point count
    • Using an error bound (epsilon) for approximate KNN search (default = 0, i.e. exact search)
    • 3 search modes: brute force (full loop), kd-tree, and bd-tree (box-decomposition tree) -- the latter is more robust on highly clustered data
    • 3 search types: standard, priority, and fixed-radius
    • Ability to tweak split and shrink rules for search trees
    • Ability to stop search after reaching a specified visit count (early termination)
    • Displaying comprehensive descriptive statistics for data and query arrays (min, max, mean, median, count, sum, sum of squares, variance, standard deviation, skewness, kurtosis)
    • Displaying quick search statistics (options selected)
    • Displaying a 2D graph of data points, query points, and nearest neighbors (dotted lines) -- available only for 2-dimensional data
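
    To make the metric and brute-force options above concrete, here is a minimal, self-contained C++ sketch of the four Minkowski distances and a full-loop KNN search. It is not the ANN library's code and not the wrapper's source: the names Metric, Point, minkowskiDistance and bruteForceKnn are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical names for illustration; none of them come from the ANN library.
enum class Metric { L1, L2, Lp, LInf };

using Point = std::vector<double>;

// Minkowski distance between two points of equal dimension.
// 'p' is used only when metric == Metric::Lp (assumed p >= 1).
double minkowskiDistance(const Point& a, const Point& b,
                         Metric metric, double p = 2.0) {
    assert(a.size() == b.size());
    double acc = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = std::fabs(a[i] - b[i]);
        switch (metric) {
            case Metric::L1:   acc += d; break;                // sum of |differences|
            case Metric::L2:   acc += d * d; break;            // sum of squares
            case Metric::Lp:   acc += std::pow(d, p); break;   // sum of p-th powers
            case Metric::LInf: acc = std::max(acc, d); break;  // largest |difference|
        }
    }
    if (metric == Metric::L2) return std::sqrt(acc);
    if (metric == Metric::Lp) return std::pow(acc, 1.0 / p);
    return acc;
}

// Brute-force KNN: compute the distance to every data point, keep the k smallest.
// Returns (index, distance) pairs sorted by increasing distance.
std::vector<std::pair<std::size_t, double>>
bruteForceKnn(const std::vector<Point>& data, const Point& query,
              std::size_t k, Metric metric, double p = 2.0) {
    std::vector<std::pair<std::size_t, double>> result;
    result.reserve(data.size());
    for (std::size_t i = 0; i < data.size(); ++i)
        result.emplace_back(i, minkowskiDistance(data[i], query, metric, p));
    k = std::min(k, result.size());
    std::partial_sort(result.begin(), result.begin() + k, result.end(),
                      [](const auto& x, const auto& y) { return x.second < y.second; });
    result.resize(k);
    return result;
}
```

    For example, bruteForceKnn(data, query, 3, Metric::L1) would return the three data points closest to query in the Manhattan metric. The tree-based modes aim to find the same neighbors without visiting every point, and a non-zero epsilon trades exactness for speed.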

    This is still a rather small application, but with more time and effort it could grow into a full-fledged data interpolation / simulation app. That would require implementing:

    • Covariance functions (variograms, covariance properties: sill, nugget, range, angles etc.)
    • Probability calculation (expected values, total / conditional / unconditional probabilities, Bayesian probability)
    • Probability distributions (normal, Poisson, Student's t, Pearson's chi-square, exponential, etc., with the corresponding probability density functions)
    • Some linear algebra and matrix calculus (matrix manipulation, Gaussian elimination, LU / Cholesky / QR and other decompositions)
    • Kriging (simple, ordinary, co-kriging, universal, indicator)
    • Random path generation
    • Advanced data handling (reading and writing various file formats, lists and vectors in lieu of C arrays etc.)
    • Possibly additional libraries (Boost for linear algebra, smart pointers, etc.)
    • Simulation models (sequential Gaussian -- SGSIM, single normal equation -- SNESIM, sequential indicator -- SIS, truncated Gaussian etc.)
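
    As an illustration of the first item in this list, here is a minimal sketch of one common variogram model, the spherical model, parameterized by nugget, sill, and range. None of this exists in the tool yet. The struct name SphericalVariogram is hypothetical, and the sketch assumes that 'sill' means the total sill (nugget included); some texts use the partial sill instead.

```cpp
#include <cassert>

// Hypothetical type for illustration; not part of the wrapper or of ANN.
struct SphericalVariogram {
    double nugget;  // jump of the variogram just beyond lag 0
    double sill;    // plateau value (assumed here to include the nugget)
    double range;   // lag distance at which the sill is reached

    // Semivariance gamma(h) for a lag distance h >= 0.
    double operator()(double h) const {
        assert(h >= 0.0 && range > 0.0 && sill >= nugget);
        if (h == 0.0) return 0.0;       // by definition, gamma(0) = 0
        if (h >= range) return sill;    // beyond the range the curve is flat
        const double r = h / range;
        return nugget + (sill - nugget) * (1.5 * r - 0.5 * r * r * r);
    }
};
```

    With nugget = 1, sill = 5 and range = 10, the model gives a semivariance of 3.75 at lag 5 and stays at 5 for any lag of 10 or more. Kriging weights would then be obtained by solving a linear system built from such values, which is where the linear algebra items in the list come in.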

    Clearly, I am still far from calling this 'a useful tool for statistics and data interpolation'. But it's a start...

    EXE + Data file samples

    Category: My files | Added by: S0mbre