Clustering is a classic topic in optimization with k-means being one of the most fundamental such problems. In the absence of any restrictions on the input, the best known algorithm for k-means with a provable guarantee is a simple local search heuristic yielding an approximation guarantee of 9+ε, a ratio that is known to be tight with respect to such methods. We overcome this barrier...
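As a rough illustration of the kind of local search heuristic this abstract refers to, here is a minimal single-swap sketch. It assumes that candidate centers are restricted to the input points and that a swap is accepted only when it lowers the cost by a factor of (1 - eps); the function names, the `eps` and `seed` parameters, and the toy data are all hypothetical and are not taken from the paper.

```python
import itertools
import random


def kmeans_cost(points, centers):
    """Sum of squared Euclidean distances from each point to its nearest center."""
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers)
        for point in points
    )


def local_search_kmeans(points, k, eps=1e-3, seed=0):
    """Single-swap local search over centers chosen from the input points.

    Assumption: centers are restricted to input points, and a swap is kept
    only if it reduces the cost by at least a (1 - eps) factor.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # arbitrary starting solution
    cost = kmeans_cost(points, centers)
    improved = True
    while improved:
        improved = False
        # Try replacing each current center with each non-center point.
        for i, candidate in itertools.product(range(k), points):
            if candidate in centers:
                continue
            trial = centers[:i] + [candidate] + centers[i + 1:]
            trial_cost = kmeans_cost(points, trial)
            if trial_cost < (1 - eps) * cost:
                centers, cost = trial, trial_cost
                improved = True
                break  # restart the scan from the improved solution
    return centers, cost


# Toy data (hypothetical): two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, cost = local_search_kmeans(points, k=2)
```

On this toy input the search ends with one center in each pair of points, regardless of the starting solution. The multiplicative acceptance threshold is what keeps the number of swaps bounded, since every accepted swap shrinks the cost by a fixed factor.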
This paper is concerned with event clustering for short text streams, which aims to divide constantly arriving short texts into several dynamic event-based clusters. A widely adopted approach is based on the Vector Space Models (VSMs) such as bag of words. However, these models have limitations in that not only the semantic relationships between words are largely ignored, the term weighting may also...
We propose a new superpixel algorithm based on exploiting the boundary information of an image, as objects in images can generally be described by their boundaries. Our proposed approach initially estimates the boundaries and uses them to place superpixel seeds in the areas in which they are more dense. Afterwards, we minimize an energy function in order to expand the seeds into full superpixels....
Superpixel decomposition methods are generally used as a pre-processing step to speed up image processing tasks. They group the pixels of an image into homogeneous regions while trying to respect existing contours. For all state-of-the-art superpixel decomposition methods, a trade-off is made between 1) computational time, 2) adherence to image contours and 3) regularity and compactness of the decomposition...
Distance-based and density-based clustering algorithms are often used on large spatial data sets of arbitrary shape. However, some well-known clustering algorithms have trouble when the distribution of objects in the dataset varies, and this may lead to a bad clustering result. Such poor performance is even more pronounced on high-dimensional datasets. Recently, Rodriguez and Laio proposed an...
In large-scale environments, robots should have proper internal representation of the surroundings for achieving tasks such as localization, navigation, and exploration. Internal representations could be categorized in two ways: metric (grid-based) map and topological map. In this paper, we aim to generate a topological map representation (collision-free graph) of the large-scale environment from...
Multi-tenant storage management environments typically manage multiple enterprise accounts with heterogeneous storage footprints. Identifying and grouping accounts with similar storage footprints into clusters reduces account management overhead, and provides a framework for data-driven storage recommendation services. This paper describes a method for the clustering of accounts in multi-tenant storage...
Different aspects of the usage of electronic devices vary significantly from person to person, and therefore rigorous usage analysis shows promise for identifying a user as a step toward securing the devices. Different state-of-the-art approaches have investigated different aspects of the usage, such as typing speed and dwelling time, in isolation for identifying a user. However, investigation of multiple...
Online Social Networks (OSNs) heavily rely on community detection algorithms to support many of their core services. Common functions such as friend recommendation and timeline personalization all require the fast discovery of communities over some massive graph(s). For such applications, scalability, flexibility and speed are much more important than marginal improvements in the theoretical quality...
Nowadays, cardiovascular disease (CVD) has become a disease of the majority. As an important instrument for diagnosing CVD, electrocardiography (ECG) is used to extract useful information about the functioning status of the heart. In the domain of ECG analysis, cluster analysis is a commonly applied approach to gain an overview of the data, detect outliers or pre-process before further analysis. In...
A number of biclustering algorithms have been introduced to discover local gene expression patterns in microarray data. Also, high-throughput biological techniques such as ChIP-seq have generated massive genome-wide data and offered ideal opportunities where biclustering can help unveil underlying biological mechanisms. Chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq) has...
The size and interconnectedness of social networks continue to increase. As a result, finding communities or subsets of like nodes within these large networks has become a resource-intensive endeavor. In this paper, we characterize community-finding organized on the basis of network/set properties, and describe an agglomerative algorithm called egocentric community finding. The primary contribution...