Feature Selection Toolbox
Feature Selection Toolbox | |
---|---|
The screenshot shows the full FST1 user interface: a log window on the left with feature selection results, a result table window at center right, and a graphical projection of the data and mixture model components at bottom right. On top is the dialog for setting the parameters of the optimal subset search methods. | |
Developer(s) | UTIA, Czech Academy of Sciences |
Stable release | 3.0.1 / 2010-11-02 |
Operating system | Cross-platform (v3) |
Type | Machine Learning, Pattern Recognition |
License | Free for non-commercial use |
Website | http://fst.utia.cz |
Feature Selection Toolbox (FST) is machine learning software focused primarily on the feature selection[1] problem. It is written in C++ and developed at the Institute of Information Theory and Automation (UTIA) of the Czech Academy of Sciences.
Feature Selection Toolbox 1[2]
The first generation of the software (FST1) is a Windows application with a user interface that lets users apply several sub-optimal, optimal and mixture-based feature selection methods to data stored in a simple proprietary plain-text flat-file format. FST1 is publicly available and free for non-commercial use.
Feature Selection Toolbox 3[3]
The third generation of the software (Feature Selection Toolbox 3, FST3) is a library without a user interface, written to be more efficient and versatile than the original FST1. FST3 is publicly available and free for non-commercial use.
FST3 supports several standard data mining tasks, specifically data preprocessing and classification, but its main focus is feature selection. Within feature selection it implements several less common techniques, including oscillating search (a form of hill-climbing) suitable for very-high-dimensional problems, feature selection with pre-specified feature weights, criteria ensembles, hybrid methods, detection of all equivalent solutions, and two-criterion optimization. FST3 is more narrowly specialized than popular software such as WEKA or PRTools.
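The idea behind oscillating search can be illustrated with a short, self-contained C++ sketch. This is not FST3's API or its implementation: the names `Subset`, `Criterion`, `greedyStep`, `oscillatingSearch` and `toyCriterion` are hypothetical, and the procedure is a simplified sequential variant that alternates "remove then re-add" and "add then remove" swings of growing depth around a fixed subset size, accepting a swing only if the criterion value improves.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

using Subset = std::vector<bool>;                        // s[i] == true: feature i is selected
using Criterion = std::function<double(const Subset&)>;  // higher value is better

// Greedily add (add == true) or remove (add == false) the single feature
// whose change gives the highest criterion value.
static void greedyStep(Subset& s, const Criterion& J, bool add) {
    double best = -std::numeric_limits<double>::infinity();
    std::size_t bestIdx = s.size();
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == add) continue;  // feature is already in the target state
        s[i] = add;
        const double v = J(s);
        s[i] = !add;
        if (v > best) { best = v; bestIdx = i; }
    }
    if (bestIdx < s.size()) s[bestIdx] = add;
}

// Simplified oscillating search for a subset of exactly d out of n features.
// maxDepth bounds how many features one swing may remove or add.
Subset oscillatingSearch(std::size_t n, std::size_t d, const Criterion& J,
                         std::size_t maxDepth) {
    assert(d >= 1 && d <= n);
    Subset current(n, false);
    for (std::size_t k = 0; k < d; ++k) greedyStep(current, J, true);  // initial subset
    double bestValue = J(current);

    std::size_t depth = 1;
    while (depth <= maxDepth) {
        bool improved = false;
        for (int swing = 0; swing < 2 && !improved; ++swing) {
            const bool removeFirst = (swing == 0);  // down-swing first, then up-swing
            // Skip swings that would empty the subset or need more than n features.
            if (removeFirst ? depth >= d : d + depth > n) continue;
            Subset trial = current;
            for (std::size_t k = 0; k < depth; ++k) greedyStep(trial, J, !removeFirst);
            for (std::size_t k = 0; k < depth; ++k) greedyStep(trial, J, removeFirst);
            const double v = J(trial);
            if (v > bestValue) { current = trial; bestValue = v; improved = true; }
        }
        depth = improved ? 1 : depth + 1;  // restart at depth 1 after any improvement
    }
    return current;
}

// Hypothetical toy criterion: additive feature weights plus a bonus
// when features 1 and 2 are selected together.
double toyCriterion(const Subset& s) {
    static const double w[] = {3.0, 2.0, 2.0, 1.0};
    double v = 0.0;
    for (std::size_t i = 0; i < s.size() && i < 4; ++i)
        if (s[i]) v += w[i];
    if (s.size() > 2 && s[1] && s[2]) v += 4.0;
    return v;
}
```

Calling `oscillatingSearch(4, 2, toyCriterion, 2)` starts from the greedy forward-selection result {0, 1} (criterion value 5) and, through an up-swing, reaches {1, 2} (value 8), a subset that plain forward selection misses because the two features are only valuable together. In practice the criterion would be a classifier accuracy estimate or a probabilistic distance measure rather than a hand-written function.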
By default, FST's techniques assume that the data is available as a single flat file, either in a simple proprietary format or in WEKA's ARFF format, in which each data point is described by a fixed number of numeric attributes. FST3 is provided without a user interface and is intended for users familiar with both machine learning and C++ programming. The older FST1 is better suited to simple experiments or educational use because it does not require writing C++ code.
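As a rough illustration of the kind of input described above, the following C++ sketch reads the data section of an ARFF file into rows of doubles. It is not FST3's loader: `Matrix`, `startsWithNoCase` and `readNumericArff` are hypothetical names, and the sketch assumes that every attribute is numeric and that the file uses the dense ARFF layout (no sparse `{...}` rows and no `?` missing values, for which `std::stod` would throw).

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // one inner vector per data point

// Returns true if `line` begins with `keyword`, ignoring letter case.
static bool startsWithNoCase(const std::string& line, const std::string& keyword) {
    if (line.size() < keyword.size()) return false;
    return std::equal(keyword.begin(), keyword.end(), line.begin(),
                      [](char a, char b) {
                          return std::tolower(static_cast<unsigned char>(a)) ==
                                 std::tolower(static_cast<unsigned char>(b));
                      });
}

// Reads the rows that follow the @data line, assuming all attributes are numeric.
Matrix readNumericArff(std::istream& in) {
    Matrix rows;
    std::string line;
    bool inData = false;
    while (std::getline(in, line)) {
        const std::size_t first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) continue;  // blank line
        line.erase(0, first);
        if (line[0] == '%') continue;              // ARFF comment line
        if (!inData) {
            // Header lines (@relation, @attribute) are skipped in this sketch.
            if (startsWithNoCase(line, "@data")) inData = true;
            continue;
        }
        std::vector<double> row;
        std::istringstream cells(line);
        std::string cell;
        while (std::getline(cells, cell, ','))
            row.push_back(std::stod(cell));        // throws on non-numeric cells
        rows.push_back(row);
    }
    return rows;
}
```

A fuller reader would also parse the `@attribute` declarations to check that each row has the declared number of values and to separate a class label from the numeric features.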
History
- In 1999, development of the first Feature Selection Toolbox version started at UTIA as part of a Ph.D. thesis. It was originally developed in the Optima++ RAD C++ environment (later known as Power++).
- In 2002, development of the first FST generation was suspended, mainly because Sybase ended support for the development environment then in use.
- From 2002 to 2008, the FST kernel was re-coded and used for research experimentation within UTIA only.
- In 2009, a third re-coding of the FST kernel, this time from scratch, began.
- In 2010, FST3 was made publicly available as a C++ library without a GUI. The accompanying web page collects feature selection links, references and documentation, and offers the original FST1 for download.
See also
- Feature Selection
- Pattern Recognition
- Machine Learning
- Data Mining
- WEKA (comprehensive and popular Java open-source software from University of Waikato)
- PRTools of the Delft University of Technology
- RapidMiner (formerly YALE, Yet Another Learning Environment), an open-source machine learning framework implemented in Java that fully integrates WEKA
- List of numerical analysis software
References
- Petr Somol (2010). "Efficient Feature Subset Selection and Subset Size Optimization" (PDF). Pattern Recognition Recent Advances, INTECH, ISBN 978-953-7619-90-9. pp. 75–97.
- Petr Somol (2002). "Feature Selection Toolbox" (PDF). Pattern Recognition vol. 35, no. 12, Elsevier. pp. 2749–2759.
- Petr Somol (2010). "Introduction to Feature Selection Toolbox 3 -- The C++ Library for Subset Search, Data Modeling and Classification" (PDF). UTIA Tech. Report No. 2287. pp. 1–12. Retrieved 2010-11-02.