In the presence of huge high-dimensional datasets, it is important to investigate and visualize the connectivity of patterns in huge, arbitrarily shaped clusters. While density-based or distance-relatedness-based clustering algorithms are used to efficiently discover clusters of arbitrary shapes and densities, classical (yet less efficient) clustering algorithms can be used to analyze the internal cluster...
Several supervised and unsupervised methods have been applied to the field of character recognition. In this research we focus on the unsupervised methods used to group similar characters together. Instead of using traditional clustering algorithms, which are mainly restricted to globular-shaped clusters, we use an efficient distance-based clustering that identifies the natural shapes of clusters...
Selecting the most discriminative genes/miRNAs has emerged as an important task in bioinformatics, both to enhance disease classifiers and to mitigate the curse of dimensionality. Original feature selection methods choose genes/miRNAs based on their individual features regardless of how they perform together. Considering group features instead of individual ones provides a better view for selecting...
A number of attempts to classify cancer samples using miRNA/gene expression profiles are known in the literature. However, semi-supervised learning models have only recently been introduced to exploit the huge unlabeled expression profiles in enhancing sample classification. It is important to combine both miRNA and gene expression sets as that provides more information on the characteristics of cancer...
Ensembles of classifiers have been shown to provide better accuracy than single classifiers. However, besides accuracy, classification robustness is an important performance measure for classifiers and ensembles that should also be considered. Increasing the robustness of classification systems reduces the probability of over-fitting. The robustness, as defined in this study, has not been studied...
While finding natural clusters in high-dimensional data is in itself a challenge, the dynamic nature of data adds another, greater challenge. Many applications, such as data warehouses and the WWW, demand efficient incremental clustering algorithms to handle their dynamic data. So far, numerous useful incremental clustering algorithms have been developed for large datasets such as incremental...
Ensembles of classifiers have recently proved their efficiency in cancer diagnosis based on microarray datasets. The main performance indicators, namely accuracy and diversity, are the main focus of study when designing an ensemble. Another important performance indicator is classification robustness. In an attempt to improve the performance of an ensemble, the proposed algorithm presents a...
Document clustering has become indispensable for applications that aim to extract information from huge corpora. Such applications face two main challenges: the first is the efficient representation of the documents, along with the use of an efficient similarity measure, and the second is dealing with the dynamic nature of the corpus. In this paper, an efficient document clustering model is introduced for incrementally...
The availability of streaming data in different fields and in various forms increases the importance of streaming data analysis. The huge size of continuously flowing data has posed a number of challenges in data stream analysis. Exploring the structure of streamed data has been a major challenge, which has led to the introduction of various clustering algorithms. However, current clustering...
Gene expression arrays provide a rich source of information on the behaviour of thousands of genes for several clinical conditions in a particular tumor/cancer. Such expression sets, when integrated with functional classification of genes, enrich the information provided by both sources. Stemming from the need to score relations between functional groups of genes and multiple clinical types associated...