Visualization of micro video big data refers to the intuitive display of data collected on micro videos across the Internet, with the aim of helping users understand the message in the data. This paper describes in detail the implementation of a micro video big data visualization system, which has four steps: determine the visualization objective, choose data based on the objective, display...
Applying technology to crime data has proven to be a valuable tool in forecasting criminal activity. Crime prediction is one of the approaches that help reduce and deter crimes. In this paper, we perform geospatial analysis using kernel density estimation in ArcGIS 10 to identify spatio-temporal hotspots in Manila, the most densely populated city in the Philippines. We also compared the...
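As a rough illustration of the kernel density estimation idea mentioned in the abstract above, the following is a minimal sketch, not the paper's ArcGIS workflow. It assumes an isotropic Gaussian kernel (a swapped-in choice for simplicity; the kernel and bandwidth used in the paper are not stated here), and the function name `gaussian_kde_2d` and the sample coordinates are hypothetical.

```python
import math


def gaussian_kde_2d(points, query, bandwidth):
    """Estimate the density at `query` from 2D `points`.

    Assumption: an isotropic Gaussian kernel with a single bandwidth.
    This is an illustrative stand-in, not the ArcGIS implementation.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if not points:
        raise ValueError("points must be non-empty")
    qx, qy = query
    # Normalizing constant of a 2D Gaussian with standard deviation `bandwidth`.
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for px, py in points:
        squared_distance = (qx - px) ** 2 + (qy - py) ** 2
        total += math.exp(-squared_distance / (2.0 * bandwidth ** 2))
    # Average the kernel contributions over all incident points.
    return norm * total / len(points)


# Hypothetical incident coordinates: three clustered points and one outlier.
incidents = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0), (5.0, 5.0)]
near_cluster = gaussian_kde_2d(incidents, (0.1, 0.0), bandwidth=0.5)
far_from_all = gaussian_kde_2d(incidents, (2.5, 2.5), bandwidth=0.5)
```

Evaluating the estimate on a grid of query locations and keeping the cells with the highest values would give a simple hotspot map; the bandwidth controls how far each incident's influence spreads.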
Mission spacecraft telemetry data formats have complex characteristics. The formats have hierarchical and nested structures that need to be processed across frames. They also have complex parameter dependencies and change frequently. We therefore design the Multi-Channel-Protocol (MCP) model. This model has the advantages of strong expressiveness, good versatility, and scalability. This model is utilized...
Shenyang was a representative heavily polluted industrial city in northeast China. The urban area was divided by arterial traffic routes into two types with significantly different spatial positions: the surrounding urban area and the core urban area. To reduce the impact of disturbing factors such as firecrackers during traditional Chinese festivals, the observation period of the monitoring data was selected from November...
Hadoop Distributed File System (HDFS) is an open source framework that is commonly used in cloud storage systems. Owing to advances in manufacturing technology and cost considerations, today's storage systems usually contain both fast devices (such as SSDs) and slow devices (such as hard disks). To optimize performance, we should store frequently accessed data on fast devices, and...
This paper introduces a map display system with a front end and a back end, based on visualization technology and front-end cache technology. The system is designed to solve the problem of low performance when massive amounts of data are loaded. Combined with the characteristics of power network data, this paper designed a front-end cache category from three aspects of influence vulnerability topology structure...
The emergence of many heterogeneous sensor systems on the Web has produced huge amounts of sensor data. This requires building customized and automated processing methods, which can be formed by connecting services on the Web. Based on OWL-S, the semantic Web services markup language, the algorithms for processing sensor data on the Web are represented as service ontologies and become...
To address heterogeneous data types and the heavy computation involved in decision making for big data, this paper proposes an optimized distributed OLAP system for big data. The system provides data acquisition for different data sources and supports two types of OLAP engines, Impala and Kylin. First, the architecture of the system is proposed, consisting of four modules,...
This paper presents a novel adaptive resampling algorithm, named DP-SMOTE, based on the clustering by fast search and find of density peaks (CFSFDP) algorithm and the synthetic minority oversampling technique (SMOTE). The essential idea of the proposed method is to use the improved CFSFDP algorithm to find the subclasses and remove noisy data automatically, and then to generate the minority samples...
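For readers unfamiliar with the SMOTE step that the abstract above builds on, here is a minimal sketch of plain SMOTE interpolation only. It does not implement DP-SMOTE: the CFSFDP-based subclass discovery and noise removal are omitted, and the function name `smote`, its parameters, and the sample points are hypothetical choices for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate `n_new` synthetic minority samples by plain SMOTE.

    Each synthetic sample lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Assumption: samples are equal-length tuples of floats.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Interpolate at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class points on the corners of the unit square.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority_points, n_new=10, k=2)
```

Because every synthetic point is a convex combination of two existing minority points, the generated samples stay inside the region spanned by the minority class, which is also why noisy minority points can mislead plain SMOTE.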
A big data processing system offers good data storage and processing capabilities, can meet the demand for processing large amounts of data, and can process large data sets efficiently and with high performance to realize the value of the data. To make high-performance data retrieval more efficient, an optimization of the Spark system is proposed which, based on the Spark system, extends the Spark SQL components, absorbing...
To address the poor semantic interpretability and low accuracy of existing topic models, a semi-supervised topic learning and representation method based on association rules and metadata is proposed. First, we use the metadata as prior knowledge to guide the topic learning and obtain the probability distribution of the terms in the document. Then, we obtain the frequent...
With the advance of mobile electronic devices and the development of positioning technology, a large volume of spatio-temporal data is collected in the form of desultory data streams, which contain a lot of potential information. In this study, we focus on discovering the composition relationships between observed moving objects over a long period. Such research can be widely used in military...
Keystroke dynamics is a biometric characteristic that depends on the typing style of users. In the past thirty years, dozens of classifiers have been proposed for distinguishing people using keystroke dynamics; many have obtained excellent results in evaluation. However, a more common case is that only normal instances are available and none of the rare classes are observed. This leads us to use...
Twitter, and social media as a whole, have great potential as a source of disease surveillance data; however, the general messiness of tweets presents several challenges for standard information extraction methods. Current methods for disease surveillance on Twitter rely on inflexible keyword-based approaches that require messages to be pre-filtered on the basis of a disease name which is supplied a priori...
Generative Adversarial Networks (GANs) are efficient frameworks for estimating generative models via an adversarial process. However, GANs are known to suffer from training instability. Wasserstein GAN (WGAN) improves training stability significantly but also introduces an additional Lipschitz requirement for the critic network. To enforce the Lipschitz constraint, instead of a weight clipping strategy,...