This paper introduces one of the largest Romanian speech datasets freely available for both academic and commercial use. The dataset comprises speech data recorded over the last year from 12 speakers, along with 5 other speakers previously recorded in a separate environment. The data was manually segmented at utterance-level and semi-automatically labelled at phone-level. The resulting corpus amounts...
The paper aims to analyze the frequency of posts during earthquakes and the word associations found in such Social Media (SM) posts. Since important posts are shared by users in SM, the purpose was to identify the variation in the number of posts with unique content that occurred over a period of time in Social Media for a particular topic. The present study uses messages generated by...
This paper analyzes tweets and their vocabulary in a specific emergency situation, namely earthquakes. It also examines the correlations between several words from the messages and the linear regressions between word usage and earthquake intensity. We compared the vocabulary used in tweets about Romanian earthquakes with the vocabulary of tweets about other European earthquakes.
The goal of this work is to present some possible intruder detection systems and the influence of impulse-like signals on the overall classification accuracy. Two different scenarios are used: in the first scenario five sound classes are considered (the last class comprises impulsive sounds, namely gunshots), while in the second scenario the impulsive sound class is dropped. More classifiers are used...
This paper presents a rule-based approach for generating a large phonetic database for Romanian. The knowledge base is developed by means of the GRAALAN (Grammar Abstract Language) system. By inspecting dictionaries and corpora, we generate a phonetic database of over 100,000 lemmas. The rule-based method used to generate the phonetic transcriptions ensures a high degree of accuracy.
The field of digital audio forensics has driven a sustained research effort over the last decade. Current digital audio authentication frameworks include the Electric Network Frequency (ENF) criterion as a must. The ENF-based techniques benefit greatly from the availability of reference databases, which are built using extraction mechanisms that continuously analyze the power line signal. To find...
The paper explains the relation between prosodic phrases and Information Structure (IS) by decomposing phrases into hierarchies of embedded contrast/communicative units (CUs). At any level of the hierarchy, CUs contain IS partitions supported by two contrasted functional constituents. The functional categories are defined by using a two-level IS model. Topic-Focus and CU_predicate-CU_argument are the...
Recognizing emotions from natural or spontaneous speech is extremely difficult compared to doing the same for acted or elicited speech. Speech emotion recognition for real conversation, such as spontaneous speech, requires linguistic information about the speech to be included in the speech emotion recognition component to achieve a high recognition rate. However, with the lack of digital speech resources...
This paper describes a study of the evolution of the Romanian language in the 18th and 19th centuries, from the geographical domain, in order to develop automatic recognition and interpretative transcription of printed Romanian historical heritage writings from Cyrillic into Latin script. It is well known that the operation of interpretative transcription of texts written in Cyrillic is extremely...
This paper describes a data-driven approach to handling natural language interaction between humans and devices. This approach enables example-based definition and tuning of interaction scenarios. Actions and parameters can be easily configured, requiring no prior knowledge of natural language processing and no previous experience with this type of system. The platform requires a small amount of...
Automatic Speaker Analysis has largely focused on single aspects of a speaker such as her ID, gender, emotion, personality, or health state. This broadly ignores the interdependency of all the different states and traits impacting on the one single voice production mechanism available to a human speaker. In other words, sometimes we may sound depressed, but we simply have the flu, and hardly find the...
The convolutional deep neural network component applied frequently in current speech recognizers is trained on a context of consecutive spectral feature vectors. Here, we investigate whether we can extend the time span of this input and reduce the number of spectral features at the same time by using a multi-resolution spectrum as input. In the proposed multi-resolution scheme, the network processes...
The need for progress implies the need for time. Daily tasks have been automated to save time, but they still need input from a user. The need to interact with different applications may endanger the user's life. The simplest way for these automations to be “life-saving” is to fully support speech recognition. Although, right now, this is done in an acceptable manner, the main problem...
This paper presents the work done towards developing a Romanian speech corpus for automatic speech recognition in the banking domain. This work is done in the context of the Speech2Process project, which aims to create a system that makes interaction between customers and agents in the contact center much easier. The application that uses the banking corpus will provide automatic response to...
This paper introduces a novel open-access resource, the machine-readable phonetic dictionary for Romanian, MaRePhoR. It contains over 70,000 word entries and their manually produced phonetic transcriptions. The paper describes the dictionary format and statistics, as well as an initial use of the phonetic transcription entries in building a grapheme-to-phoneme converter based on decision trees....
Multimedia files, either video or audio, can greatly influence the final verdict of a trial when accepted as evidence. The abundance of free editing software available nowadays makes forgery a very easy operation. Audio messages, even if authentic, can in some cases be heavily masked by other signals and declared unusable. This paper presents the investigations on the performance of the affine...
This paper presents the architecture and technologies used to develop a voice-controlled system for home automation named Cassandra. We start with the goals of the project and a system description, then focus on the main components and the way they interact with each other. We exemplify with a scenario where we ask the house to turn the lights off, going step-by-step over the communication sequence...
This paper presents the work done in the context of the Speech2Process project on a Speech Dialogue System applied in call centers, specifically in the banking domain. In our proposed solution, the client communicates with the system through natural language sentences, which are automatically recognized and semantically analysed. The paper describes innovative features of the selected approach, which...