ID3 decision tree data mining is a popular and widely studied data analysis technique for a range of applications. In this paper, we focus on the privacy-preserving ID3 decision tree algorithm on horizontally partitioned datasets. In such a scenario, data owners wish to learn the decision tree result from a collective data set but disclose minimal information about their own sensitive data. In this...
The emergence and development of the Internet resulted in the generation of huge amounts of data, which are often distributed among different sites. Many organizations and companies attempted to mine the data with cloud computing. However, given the rise of various privacy issues, sensitive data (e.g., medical records) need to be encrypted before outsourcing to the cloud. To process data mining, such...
Recently, the significance of data mining and machine learning has been highlighted in diverse application scenarios. Various data mining and machine learning techniques are often used to analyze vast amounts of data to create more commercial value in high-end enterprise systems. However, the advancement of technologies has made data mining and machine learning possible on low-end systems,...
Relation extraction is a challenging task in biomedical text mining due to the complexity of sentences in the biomedical literature. In this paper, we address the multi-class relationship extraction problem in the biomedical literature using a Maximum Entropy model with simple word features. The proposed method is applied to extract protein-protein interactions. Experiments show the method achieves an accuracy...
Relation extraction is the detection and classification of interactions between named entities. Recently, many studies have focused on relation extraction from the biomedical literature. Although various approaches have been applied in this area, most methods require either biomedical lexicons or parsing templates, which are not suited to the complexity of the biomedical literature. In...
Identification of long disordered regions in protein sequences is important for understanding protein function. In this work, a class of novel propensities at the profile level is presented, namely, the order profile disorder propensities, which use the evolutionary information of profiles for protein long disorder prediction. These propensities, combined with position-specific scoring matrices, are input...
Many studies focus on semantic analysis in opinion mining and obtain effective results. However, dealing with context-dependent opinions is still a challenge. Existing methods have used linguistic rules to cope with this problem, but when an opinion is unrelated to its adjacent sentences, the linguistic rules do not work. These special opinions, called in this paper context indistinct-dependent...
User information need detection is a fundamental issue in automatic question answering systems. Based on real questions collected from on-line question answering communities, this paper proposes a three-level question type taxonomy to model user information need. The three levels are based on interrogative patterns, hidden user intentions and specific answer expectations. One question can have multiple...
The hidden Markov model (HMM) has been used successfully in speech recognition. However, its strong independence assumption is an unavoidable flaw for sequence labeling. Conditional random fields (CRFs) relax this assumption and can also solve the label bias problem efficiently. In this paper, we investigate CRFs for Chinese syllable recognition in continuous speech due to its...
The standard Q-learning algorithm suffers from slow convergence and from local optima. Truncated TD estimation estimates returns efficiently, and the simulated annealing algorithm increases the chance of exploration. To accelerate convergence and to avoid local optima, this paper combines the Q-learning algorithm, truncated TD estimation and the simulated annealing algorithm. We apply improved...
Classification based on predictive association rules (CPAR) is an associative classification method that combines the advantages of both associative classification and traditional rule-based classification. For rule generation, CPAR is more efficient than traditional rule-based classification because much repeated calculation is avoided and multiple literals can be selected to generate multiple...
Conditional random fields (CRFs) have been used for many sequence labeling tasks with excellent results. Furthermore, this supervised model depends strongly on a large amount of training data. Active learning offers an alternative to relying on a large amount of random sampling. However, random sampling contributes constructively to choosing optimal training examples. Based on different query strategies,...
3D model retrieval has emerged as an important part of multimedia information retrieval. Current research in 3D model retrieval concentrates on shape-based methods. However, their performance is unsatisfactory because of the semantic gap. The paper explores a semantic-based 3D model retrieval method based on a semantic tree and a hybrid method based on content and semantics. First, the semantic tree is...
Predicting the future properties of shelf-aged ethylene propylene rubber (EPR) used as a sealing material is of great importance, since its application in engineering practice is often delayed. Tensile strength data collected from each re-inspection over the past few years are widely scattered, so current methods for predicting the future properties of EPR are not accurate. A new method utilizing reliability evaluation...
The thermally stimulated current (TSC) technique is developed and a new method for direct determination of the trap level distribution in polymer films is proposed. In this method, a new function is defined to weight the contribution of a trap level to the current at any temperature. The demarcation energy is used to study the trap emptying process. Analysis shows that only electrons with trap levels very close to the...
Recently, the Web has become a data repository. Much research has been done on obtaining relevant information from this repository. The typical function of Web news extraction is to locate the useful content text and filter out the noise; these two issues make Web news extraction an open research problem. In this paper, we describe an approach that can cluster the pages which...
In 1998, Blaze, Bleumer, and Strauss proposed a kind of cryptographic primitive called proxy re-encryption. In proxy re-encryption, a proxy can transform a ciphertext computed under Alice's public key into one that can be opened under Bob's decryption key. They predicted that proxy re-encryption and re-signature would play an important role in our lives. In 2007, Matsuo proposed the concept of four...
In this paper, a relatively specific teaching program was designed and put into practice. Its real effects on students were studied using empirical methods. The results show that cooperative English learning in a network environment can really improve students' English listening proficiency and raise their autonomous learning awareness.
We present an ME (Maximum Entropy) model for Semantic Chunk Annotation in a Chinese Question and Answer (Q&A) system. The model was derived from a corpus of real-world questions, which were collected from discussion groups on the Internet. The questions are meant to be answered by other people, so they are very complex. The semantic chunks were introduced. Feature for the model...
Clustering techniques can be adopted to analyze 3D model databases and improve retrieval performance. However, 3D model databases lack valuable prior knowledge. Thus, it becomes difficult for clustering methods to decide appropriate parameter values in advance. Moreover, clustering methods handle outliers poorly, treating them as "noise". The paper introduces a robust hierarchical...