Scientific Information Database (SID) - Trusted Source for Research and Academic Resources

Journal Issue Information


Issue Info: Year: 2019 | Volume: 16 | Issue: 1 (39) | Pages: 3-20
Measures: Citations: 0 | Views: 485 | Downloads: 0
Abstract: 

Information extraction (IE) is the process of automatically deriving a structured representation from unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) that has been intensified by the growing volume, heterogeneity, and unstructured form of available information. One of the core IE tasks is relation extraction, which aims to extract semantic relations among entities in natural language text. Traditional relation extraction techniques were relation-specific, producing new instances of relations determined a priori. While effective, this model is not applicable when the relations are not defined in advance or when the number of relations is large. Open Relation Extraction (ORE) methods were developed to elicit instances of arbitrary relations while requiring fewer training examples. Since ORE systems are employed by applications that depend on large-scale relation extraction, high performance and low computational cost are major requirements for ORE methods, particularly at large scales such as the Web. Many open information extraction (OIE) systems have been proposed in recent years. These approaches range from shallow (such as part-of-speech tagging) to deep (such as semantic role labeling), and therefore differ in performance level and computational cost. In this paper, we use state-of-the-art shallow NLP tools to extract instances of relations. A supervised log-linear model for OIE is presented that exploits the advantages of shallow NLP tools, which are fast and incur low computational cost. The extractor, the core of the proposed approach, integrates a high-performance subset of the shallow NLP tools with the strength of deep NLP tools through a supervised log-linear model, yielding a high-performance, scalable method. This efficient use of time reduces computational cost and increases precision.
The proposed approach achieves higher precision and recall than ReVerb, one of the most successful shallow OIE systems.
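A supervised log-linear model over shallow features amounts to a logistic regression that scores candidate (arg1, relation, arg2) extractions. The sketch below illustrates this idea; the binary features and training labels are invented for illustration and are not the paper's actual feature set.

```python
import numpy as np

# Hypothetical shallow features of a candidate (arg1, relation, arg2) triple:
# [relation phrase starts with a verb, relation is contiguous, args look like NPs]
X = np.array([
    [1, 1, 1],
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
], dtype=float)
y = np.array([1, 1, 0, 0, 1, 0], dtype=float)  # 1 = correct extraction

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fit the log-linear (logistic) model by gradient descent on the log-loss.
w = np.zeros(X.shape[1])
b = 0.0
for _ in range(2000):
    p = sigmoid(X @ w + b)
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

def confidence(features):
    """Extraction confidence in [0, 1] for a candidate triple."""
    return float(sigmoid(features @ w + b))

good = confidence(np.array([1.0, 1.0, 1.0]))  # verb-led, contiguous, NP args
bad = confidence(np.array([0.0, 0.0, 0.0]))
```

Such a model keeps extraction cheap: scoring a candidate is one dot product over features that shallow tools (taggers, chunkers) produce quickly.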

Issue Info: Year: 2019 | Volume: 16 | Issue: 1 (39) | Pages: 21-40
Measures: Citations: 0 | Views: 715 | Downloads: 0
Abstract: 

Imbalanced data arise in various areas such as text classification, credit card fraud detection, risk management, web page classification, image classification, medical diagnosis/monitoring, and biological data analysis. Classification algorithms tend to favor the majority class and may even treat minority-class samples as outliers. Text data is one of the areas where imbalance occurs. The amount of textual information is rapidly increasing in the form of books, reports, and papers, and fast, precise processing of this volume of information requires efficient automatic methods; text classification is one of the key processing tools. A further problem in text classification is high-dimensional data, which makes learning algorithms impractical, and the problem grows when the text data are also imbalanced. An imbalanced data distribution reduces the performance of classifiers. The solutions proposed for this problem fall into several categories, among which sampling-based and algorithm-based methods are the most important. Feature selection is also considered one of the solutions to the imbalance problem. In this research, a new one-sided feature selection method is presented for imbalanced data classification. The proposed method calculates the indicator rate of a feature from the feature's distribution: documents are divided into groups based on whether they contain the feature and whether they belong to the positive class, and a feature score is derived from this grouping as follows. If a feature occurs in most positive-class documents, it is a good indicator of the positive class and should receive a high score for this class; this can be expressed as the proportion of positive-class documents that contain the feature. Likewise, if most documents containing the feature belong to the positive class, the feature should receive a high score as a class indicator; this can be expressed as the proportion of documents containing the feature that belong to the positive class. If most documents that do not contain the feature are not in the positive class, the feature should receive a high score as a representative of the class, and similarly if most documents outside the positive class do not contain the feature. Using the proposed method, each feature's score is computed; the features are then sorted in descending order of score, and the required number of features is selected from the top of the list. To evaluate the proposed method, feature selection methods such as Gini, DFS, MI, and FAST were implemented, and the C4.5 decision tree and Naive Bayes were used as classifiers. Results on the Reuters-21578 and WebKB datasets, measured by the Micro-F, Macro-F, and G-mean criteria, show that the proposed method considerably improves classifier performance compared with the other methods.
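The four proportions described above can be combined into a single feature score. The sketch below averages them; this combination rule is an assumption for illustration, and the paper's exact formula may differ.

```python
import numpy as np

def one_sided_score(contains, positive):
    """Illustrative one-sided feature score built from the four proportions
    in the text. contains[d] is True if document d contains the feature;
    positive[d] is True if document d is in the positive class. The averaging
    of the four proportions is an assumed combination rule."""
    contains = np.asarray(contains, bool)
    positive = np.asarray(positive, bool)
    eps = 1e-12  # guard against empty groups
    p1 = (contains & positive).sum() / (positive.sum() + eps)       # P(feature | pos)
    p2 = (contains & positive).sum() / (contains.sum() + eps)       # P(pos | feature)
    p3 = (~contains & ~positive).sum() / ((~contains).sum() + eps)  # P(neg | no feature)
    p4 = (~contains & ~positive).sum() / ((~positive).sum() + eps)  # P(no feature | neg)
    return (p1 + p2 + p3 + p4) / 4.0

# A feature occurring in every positive document and in no negative document
# scores 1; a feature independent of the class scores 0.5.
perfect = one_sided_score([1, 1, 1, 0, 0], [1, 1, 1, 0, 0])
neutral = one_sided_score([1, 0, 1, 0], [1, 1, 0, 0])
```

Ranking features by such a score and keeping the top-k is the selection step the abstract describes.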

Issue Info: Year: 2019 | Volume: 16 | Issue: 1 (39) | Pages: 41-56
Measures: Citations: 0 | Views: 5583 | Downloads: 0
Abstract: 

The aim of short-term load forecasting is to predict the electric power load for unit commitment, system reliability evaluation, economic dispatch, and so on. Short-term load forecasting obviously plays an important role in traditional non-cooperative power systems. Moreover, in a restructured power system, a generation company (GENCO) should predict the system demand and its corresponding price for efficient decision making. The task of a forecasting engine is to find the relation between the inputs and outputs of the system and to predict the outputs for given inputs. The accuracy of forecasting is therefore highly affected by the inputs of the forecasting engine. This effect can be studied from two points of view: first, extracting more informative inputs, and second, reducing the dimension of the input space; both make it possible to learn the forecasting network via simpler models with better generalization. As a result, a reduced, informative input space leads to lower prediction error. In many previous load forecasting methods, the inputs have been selected empirically: the factors most correlated with the load on the forecast day, generally a combination of load history and weather conditions, are chosen as inputs. Several studies have focused on mathematical approaches to input selection, mainly based on principal component analysis (PCA) as well as some intelligent algorithms. In this paper, a manifold learning method, Locally Linear Embedding (LLE), is proposed to extract more informative inputs and to reduce the dimension of the input space for short-term load forecasting. Among manifold learning methods, LLE performs very well in extracting electric load curve features. The aim of this paper is to analyze the features of the load curve in order to estimate this curve in the future.
Extensive computational experiments show that the features extracted by LLE result in less prediction error than two other methods; furthermore, LLE is faster and yields a lower input dimension. The LLE method finds the nonlinear relationships among features by fitting a locally linear manifold in the feature space. Extracting more informative inputs, by finding combinational features through the nonlinear dependences among features, reduces the dimension of the input space. The inputs resulting from feature extraction and dimension reduction are then used for load forecasting. To examine the effect of the proposed feature extraction method on load prediction error, a hybrid prediction system is proposed that combines a radial basis function (RBF) network with a fuzzy system. The RBF network is the core of the prediction engine and takes historical load data as its inputs; the fuzzy inference system is combined with the RBF network to incorporate the impact of temperature on load. Case studies are carried out on real electric power load data from the Mazandaran area in Iran. The efficiency of the proposed forecasting engine is compared with three benchmarks: artificial neural network, time series, and neuro-fuzzy methods. Furthermore, the proposed input selection method (LLE) is compared with principal component analysis (PCA) and empirical input selection. Simulation results with statistical significance analysis show that the LLE method with the proposed forecasting engine is superior to the other input selection methods and forecasting engines in terms of lower input dimension and lower prediction error.
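The LLE pipeline (k nearest neighbors, locally linear reconstruction weights, bottom-eigenvector embedding) can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic 3-D points lying near a 2-D surface, not the paper's forecasting setup.

```python
import numpy as np

def lle(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Minimal Locally Linear Embedding sketch."""
    n = X.shape[0]
    # 1) k nearest neighbors of each point (excluding the point itself).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    nbrs = np.argsort(d2, axis=1)[:, :n_neighbors]
    # 2) reconstruction weights: minimize |x_i - sum_j w_ij x_j|^2, sum_j w_ij = 1.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]                          # centered neighbors
        C = Z @ Z.T
        C += np.eye(n_neighbors) * reg * np.trace(C)   # regularize for stability
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, nbrs[i]] = w / w.sum()
    # 3) embedding: bottom eigenvectors of M = (I - W)^T (I - W),
    #    skipping the constant eigenvector.
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    _, vecs = np.linalg.eigh(M)
    return vecs[:, 1:n_components + 1]

# Synthetic stand-in for load-curve feature vectors: 3-D points on a cylinder,
# i.e. an intrinsically 2-D manifold.
rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, (100, 2))
X = np.c_[np.cos(t[:, 0]), np.sin(t[:, 0]), 0.1 * t[:, 1]]
Y = lle(X, n_neighbors=8, n_components=2)  # reduced inputs for the forecaster
```

The two embedding coordinates per sample would then feed the forecasting engine in place of the original high-dimensional inputs.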

Issue Info: Year: 2019 | Volume: 16 | Issue: 1 (39) | Pages: 57-74
Measures: Citations: 0 | Views: 1022 | Downloads: 0
Abstract: 

Automatic colorization of grayscale images poses a unique challenge. The goal of this field is to colorize images that have lost some color channels (such as the RGB channels, or the AB channels in the LAB color space) while only the brightness channel remains available, which is the case for a vast array of old photos and portraits. The ability to colorize such images opens up a multitude of possibilities, from colorizing old and historic images to providing alternate colorizations for real images or artistic creations. Be that as it may, progress in this field is modest compared to what professionals can do with special-purpose applications such as Photoshop or GIMP. Losing the information stored in the color channels and having access only to the brightness channel makes this problem a unique challenge: the main aim of automatic colorization is not to recover the image's "real" color but to colorize it in a way that makes it "seem real", since the original color information is lost forever and the only option is a plausible estimate. In this research we propose a model to automatically colorize gray human portraits. We start by reviewing the methods used for image colorization and explain why most of them collapse to a situation known as "averaging". To counteract this effect, we design an end-to-end model with two separate deep neural networks forming a Generative Adversarial Network (GAN): one colorizes the images, and the other evaluates the first network's colorization and guides it toward the proper distribution. The results show improvements over other methods in this field, especially for human portraits, along with faster training times.
The method works not only on real human portraits but also on non-human and artistic portraits, and can be leveraged to colorize hand-drawn images, some of which may take minutes to hours to color by hand.
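The "averaging" collapse mentioned above can be shown with a toy calculation: when a gray pixel has several equally plausible colors, the constant prediction minimizing squared error is their mean, a desaturated color that matches none of them. The toy data below are invented for illustration.

```python
import numpy as np

# Toy pixel whose true chroma is either strongly "red" (+1) or strongly
# "blue" (-1) with equal probability.
rng = np.random.default_rng(1)
true_colors = rng.choice([-1.0, 1.0], size=10000)

# The constant prediction minimizing mean squared error is the sample mean,
# i.e. approximately 0: a washed-out color matching neither plausible mode.
l2_optimal = true_colors.mean()
```

An adversarial discriminator penalizes exactly such implausible averaged outputs, pushing the generator toward one of the true color modes instead; this is the motivation for the GAN design described in the abstract.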

Author(s): Amintoosi Mahmood

Issue Info: Year: 2019 | Volume: 16 | Issue: 1 (39) | Pages: 75-89
Measures: Citations: 0 | Views: 661 | Downloads: 0
Abstract: 

The problem of accurately estimating the foreground in images is called image matting. In image matting methods, a map produced from pixels that are definitely foreground, definitely background, or unknown is used as learning data. This three-level pixel map is often referred to as a trimap and is produced manually in alpha-matte datasets. The true class of the unknown pixels is then estimated by minimizing an objective function. Several methods for image matting have been proposed. The learning-based method is one of the pioneering works and the basis of many other approaches in the field. In this method it is assumed that each pixel's alpha value is a linear combination of those of its neighboring pixels. A Laplacian matrix in the objective function encodes the similarity of the pixels. The coefficients of the linear combination are estimated by a local learning process that minimizes a quadratic cost function; the method of Lagrange multipliers and the ridge regression technique are used to estimate the alpha values. In this objective function, the deviation of the predefined training pixels' alpha values from their true values is controlled by a penalty term. Taking this coefficient to infinity forces the matte (alpha) value to be 1 for labeled foreground pixels and 0 for background pixels. The weight of this penalty term was nevertheless taken to be equal for all training samples. In this paper, the performance of the matting method is improved by assigning different weights to different learning pixels. The good performance of the proposed method is demonstrated in two applications: improving the quality of a text extraction method, and enhancing a retinal vessel segmentation system. In the first application, Persian text fused onto a textured background is extracted by a thresholding method, and the segmented output is then enhanced by the proposed matting method.
In the second application, segmentation is done with an existing vessel extraction method, and the edge pixels of detected vessels, which may be classified inaccurately, are reclassified by the proposed image matting method. Subjective and objective comparisons show the better performance of the proposed method.
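The quadratic objective with a per-pixel weighted penalty has a closed-form minimizer, which the sketch below solves on a toy example. The 5-pixel chain Laplacian is a stand-in for the real matting Laplacian; the values are invented for illustration.

```python
import numpy as np

# Objective: minimize a^T L a + (a - s)^T D (a - s), where L is the (matting)
# Laplacian, s holds the trimap labels (1 = foreground, 0 = background), and
# D is diagonal with per-pixel penalty weights: large for labeled pixels,
# zero for unknown ones. The minimizer solves (L + D) a = D s.
L = np.diag([1.0, 2, 2, 2, 1]) \
    - np.diag(np.ones(4), 1) - np.diag(np.ones(4), -1)  # 5-pixel chain Laplacian
s = np.array([1.0, 0, 0, 0, 0])          # pixel 0 labeled foreground, pixel 4 background
w = np.array([1e3, 0, 0, 0, 1e3])        # per-pixel penalty weights (the paper's
D = np.diag(w)                           # contribution: these need not be equal)
alpha = np.linalg.solve(L + D, D @ s)
# alpha interpolates smoothly from about 1 at the foreground end
# to about 0 at the background end
```

Lowering the weight of an unreliable labeled pixel lets the smoothness term override its label, which is the effect the paper exploits.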

Issue Info: Year: 2019 | Volume: 16 | Issue: 1 (39) | Pages: 91-109
Measures: Citations: 0 | Views: 758 | Downloads: 0
Abstract: 

The goal of the named entity recognition (NER) task is to classify the proper nouns of a text into classes such as person, location, and organization. Named entity recognition is an important preprocessing step in many natural language processing tasks such as question answering and summarization. Although many studies have been conducted in this area for English, and state-of-the-art NER systems have reached performances higher than 90 percent in terms of F1, there are very few studies on this task for Persian. One of the main reasons may be the lack of a standard Persian NER dataset for training and testing NER systems. In this research we create a standard tagged Persian NER dataset, which will be distributed freely for research purposes. To construct this dataset, we studied the existing standard English NER datasets and found that almost all of them are built from news data; we therefore collected documents from ten Persian news websites. Next, to provide the annotators with tagging guidelines, we studied the guidelines used to construct the CoNLL and MUC English datasets and created our own, taking Persian linguistic rules into account. Under these guidelines, every word is labeled as person, location, organization, time, date, percent, currency, or other (words in none of these seven classes). Like most existing English NER datasets, we use IOB encoding to annotate named entities: the first token of a named entity is labeled B, subsequent tokens (if any) are labeled I, and words that are not part of any named entity are labeled O. The constructed corpus, named PAYMA, consists of 709 documents and includes 302,530 tokens, of which 41,148 are labeled as named entities and the rest as O.
To determine inter-annotator agreement, 160 documents were labeled by a second annotator; the Kappa statistic, estimated on words labeled as named entities, was 95%. After creating the dataset, we used it to design a hybrid NER system. We trained a statistical system based on the CRF algorithm and used its output as a feature to train a bidirectional LSTM recurrent neural network. Moreover, we applied k-means word clustering and fed each word's cluster number to the LSTM network. This combination of CRF with neural networks, together with the use of cluster numbers, is the novelty of this work. Experimental results show that the final model reaches an F1 score of 87% at the word level and 80% at the phrase level.
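The IOB encoding described above can be implemented directly. The sketch below converts entity spans to token-level tags; the example tokens, the end-exclusive span convention, and the PER/LOC type names are illustrative assumptions.

```python
def to_iob(tokens, entities):
    """Label tokens with IOB tags given (start, end, type) entity spans.
    Spans are token indices with an end-exclusive convention (an assumption)."""
    tags = ["O"] * len(tokens)                # default: outside any entity
    for start, end, etype in entities:
        tags[start] = "B-" + etype            # B marks the first entity token
        for i in range(start + 1, end):
            tags[i] = "I-" + etype            # I marks the following tokens
    return tags

tokens = ["Ali", "Rezaei", "visited", "Tehran", "."]
tags = to_iob(tokens, [(0, 2, "PER"), (3, 4, "LOC")])
# tags == ["B-PER", "I-PER", "O", "B-LOC", "O"]
```

Phrase-level F1, as reported in the abstract, is computed over whole B-I spans rather than individual tags, which is why it is lower than word-level F1.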

Issue Info: Year: 2019 | Volume: 16 | Issue: 1 (39) | Pages: 111-123
Measures: Citations: 0 | Views: 1287 | Downloads: 0
Abstract: 

Leaf images contain distinctive features for recognizing various types of plants, so automatic image recognition algorithms can extract these features to classify plant species. Fast and accurate recognition of plants can have a significant impact on biodiversity management and increase the effectiveness of studies in this area. Such automatic methods build on developments in recognition techniques and digital image processing. Most previous studies on classifying and identifying plant species from leaf images are based on shape, texture, and color features, and different data modeling methods have been used for plant leaf recognition. In this paper, we investigate a novel approach to plant species recognition that uses the GIST texture descriptor to extract global features. In the classification step, the Patternnet feed-forward neural network is applied. The GIST feature was essentially designed for image classification; in this study, GIST feature vectors form the basis of leaf classification. The GIST descriptor of an image is computed by first filtering the image with a bank of Gabor filters and then averaging the filter responses within each block of a non-overlapping grid. To evaluate our approach, we applied the algorithm to the scan and pseudo-scan images of two well-known datasets, ImageCLEF2012 and Leafsnap, which exhibit high variety. The results show that, compared with some widely used algorithms, our approach performs better in both runtime and classification accuracy. Substantial results are achieved when the plant images are aligned with one another and when dealing with pseudo-scan images. An important contribution of this study is the detection of leaves with jagged edges, which has high computational complexity in many previous algorithms.
Using the GIST feature vector, these types of images are processed simply and precisely (above 90%).
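The GIST computation described above (a Gabor filter bank followed by block averaging on a non-overlapping grid) can be sketched as follows. The filter parameters, grid size, and random input are invented for illustration and are not the paper's settings.

```python
import numpy as np

def gabor_kernel(size, theta, freq, sigma=4.0):
    """Real part of a Gabor filter at orientation theta, spatial frequency freq."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * freq * xr)

def gist_descriptor(img, n_orient=4, freqs=(0.1, 0.2), grid=4, ksize=15):
    """GIST-style descriptor sketch: filter with a Gabor bank, then average
    the response magnitude over a grid x grid set of non-overlapping blocks."""
    h, w = img.shape
    feats = []
    for freq in freqs:
        for k in range(n_orient):
            kern = gabor_kernel(ksize, np.pi * k / n_orient, freq)
            # frequency-domain convolution (circular boundary handling)
            resp = np.abs(np.fft.ifft2(np.fft.fft2(img)
                                       * np.fft.fft2(kern, s=img.shape)))
            bh, bw = h // grid, w // grid
            for i in range(grid):
                for j in range(grid):
                    feats.append(resp[i*bh:(i+1)*bh, j*bw:(j+1)*bw].mean())
    return np.array(feats)

img = np.random.default_rng(2).random((64, 64))   # stand-in for a leaf scan
desc = gist_descriptor(img)
# descriptor length = n_filters * grid^2 = (2 * 4) * 16 = 128
```

The fixed-length descriptor (here 128-D) would then be fed to a classifier such as the Patternnet network named in the abstract.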

Issue Info: Year: 2019 | Volume: 16 | Issue: 1 (39) | Pages: 125-142
Measures: Citations: 0 | Views: 505 | Downloads: 0
Abstract: 

Increasing the number of cores to meet the demand for more computing power has raised processor temperatures in multi-core systems. One of the main approaches to reducing temperature is dynamic thermal management. These methods are divided into two classes, reactive and proactive. Proactive methods manage the processor temperature by forecasting it before it reaches the threshold temperature. In this paper, the effect of using proper features for processor thermal management is considered. In this regard, three models are proposed: for temperature prediction, control response estimation, and thermal management, respectively. A multi-layer perceptron neural network is used to predict the temperature and the control response, and an adaptive neuro-fuzzy inference system is used for temperature control. An appropriate dataset, covering a variety of processor temperature variations, was created to train each model. Some features of the dataset are collected by monitoring thermal sensors and performance counters; additional features are derived by the proposed procedures to increase the accuracy of each model. The features of each model are then selected by the proposed method. In evaluation, the error of the proposed models in predicting and controlling the processor temperature over different time horizons is below 0.6 °C.
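A temperature predictor of the kind described above consumes lagged sensor readings as features. The sketch below builds such a lagged dataset and fits a linear least-squares predictor as a simple stand-in for the paper's MLP; the synthetic temperature trace is invented for illustration.

```python
import numpy as np

# Synthetic core-temperature trace (degrees C): slow oscillation plus sensor noise.
rng = np.random.default_rng(3)
t = np.arange(500)
temp = 55 + 10 * np.sin(t / 30) + rng.normal(0, 0.2, t.size)

# Lagged-feature dataset: predict `horizon` steps ahead from the last `lag` samples.
lag, horizon = 4, 1
rows = len(temp) - lag - horizon + 1
X = np.array([temp[i:i + lag] for i in range(rows)])
y = temp[lag + horizon - 1: lag + horizon - 1 + rows]

# Least-squares fit with a bias column (a linear stand-in for the MLP predictor).
Xb = np.c_[X, np.ones(rows)]
train = rows // 2
w, *_ = np.linalg.lstsq(Xb[:train], y[:train], rcond=None)
mae = np.abs(Xb[train:] @ w - y[train:]).mean()  # held-out mean absolute error
```

In a proactive scheme, such a forecast would trigger frequency or voltage scaling before the predicted temperature crosses the threshold.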

Issue Info: Year: 2019 | Volume: 16 | Issue: 1 (39) | Pages: 143-156
Measures: Citations: 0 | Views: 680 | Downloads: 0
Abstract: 

Nowadays, the heavy use of virtual environments and users' connections via social networks such as Facebook, Instagram, and Twitter make discovering shared subjects in these environments more necessary than ever. Several applications would benefit from reliable methods for inferring the age and gender of social media users, across a wide range of fields from personalized advertising to law enforcement and reputation management. Text posts represent a large portion of user-generated content and contain information relevant to discovering undisclosed user attributes or investigating the honesty of self-reported age and gender. Because most information exchange is in text form, identifying author attributes such as age, gender, and political and religious opinions from such content is particularly significant. Gender identification, which can be useful in security and marketing, answers the following question: given a short text document, can we identify whether the author is male or female? This question is motivated by recent events where people faked their gender on the Internet. In this paper, author gender identification on blog data is investigated. Four groups of features are employed: syntactic features, word-based features, character-based features, and function words. In addition, character n-gram features are used to improve classification accuracy. To evaluate the proposed method, 3,212 texts were collected from Technorati.com and Blogger.com. Experimental results demonstrate that these types of features are effective. Furthermore, a new classification method called "Bayesian Random Forest" is introduced, in which each tree is a Bayes tree.
The experimental results show that this method attains noticeable improvements over other classification algorithms such as Naïve Bayes, Naïve Bayes Tree, and Random Forest, increasing the accuracy of gender identification to 89.5%.
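The combination of bagging with Bayesian base learners can be sketched as follows. This is only an illustration of the bagging-plus-Bayes idea: the base learner here is a plain Gaussian naive Bayes on a random feature subset, whereas the paper's ensemble uses Bayes trees, and the synthetic data stand in for real text features.

```python
import numpy as np

class NaiveBayesForest:
    """Toy 'Bayesian random forest' sketch: Gaussian naive Bayes models, each
    trained on a bootstrap sample and a random feature subset, combined by
    majority vote. Assumes binary labels 0/1. Not the paper's Bayes-tree model."""

    def __init__(self, n_estimators=15, seed=0):
        self.n_estimators = n_estimators
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        n, d = X.shape
        k = max(1, int(np.sqrt(d)))                       # features per base model
        self.models = []
        for _ in range(self.n_estimators):
            rows = self.rng.integers(0, n, n)             # bootstrap sample
            cols = self.rng.choice(d, size=k, replace=False)
            Xs, ys = X[rows][:, cols], y[rows]
            stats = {}
            for c in (0, 1):
                Xc = Xs[ys == c]
                if len(Xc):                               # per-class mean, std, prior
                    stats[c] = (Xc.mean(0), Xc.std(0) + 1e-6, len(Xc) / n)
            self.models.append((cols, stats))
        return self

    def predict(self, X):
        votes = np.zeros(len(X))
        for cols, stats in self.models:
            Z = X[:, cols]
            best_ll = np.full(len(X), -np.inf)
            best_c = np.zeros(len(X), int)
            for c, (mu, sd, prior) in stats.items():
                # Gaussian log-likelihood plus log prior, per sample
                ll = (-0.5 * ((Z - mu) / sd) ** 2 - np.log(sd)).sum(1) + np.log(prior)
                better = ll > best_ll
                best_ll[better], best_c[better] = ll[better], c
            votes += best_c
        return (votes > self.n_estimators / 2).astype(int)

# Two well-separated synthetic "document feature" clouds, one per gender class.
rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 1, (100, 6)), rng.normal(4, 1, (100, 6))])
y = np.array([0] * 100 + [1] * 100)
acc = (NaiveBayesForest().fit(X, y).predict(X) == y).mean()
```

Bootstrapping plus random feature subsets decorrelates the base Bayes models, which is what lets the majority vote outperform a single classifier.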

Issue Info: Year: 2019 | Volume: 16 | Issue: 1 (39) | Pages: 157-171
Measures: Citations: 0 | Views: 674 | Downloads: 0
Abstract: 

Dimensionality reduction methods transform or select a low-dimensional feature space that efficiently represents the original high-dimensional feature space of the data. Feature reduction techniques are an important step in many pattern recognition problems, especially in the analysis of high-dimensional data. Hyperspectral images acquired by remote sensors and human face images are two such high-dimensional data types. Because of the limited number of training samples, feature reduction is an important preprocessing step for classifying these types of data. Face recognition is one of the main topics of interest in human-computer interaction applications, and the face is among the most significant biometric characteristics used to identify individuals; before face recognition, feature reduction is an important processing step. In this paper, we apply new feature extraction methods, originally proposed for feature reduction of hyperspectral remote sensing imagery, to face databases for the first time. We compare the performance of seven new feature extraction methods with four state-of-the-art feature extraction methods. The new methods are Nonparametric Supervised Feature Extraction (NSFE), Clustering-Based Feature Extraction (CBFE), Feature Extraction Using Attraction Points (FEUAP), Cluster Space Linear Discriminant Analysis (CSLDA), Feature Space Discriminant Analysis (FSDA), Feature Extraction using Weighted Training samples (FEWT), and Discriminant Analysis-Principal Component 1 (DA-PC1). Experimental results on two face databases, Yale and ORL, show the better recognition accuracy of some of the new feature extraction methods compared to linear discriminant analysis (LDA), non-parametric weighted feature extraction (NWFE), median-mean line discriminant analysis (MMLDA), and supervised locality preserving projection (LPP).
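Classic LDA, one of the baselines in the comparison above, projects data onto directions that maximize between-class scatter relative to within-class scatter. A minimal sketch on synthetic data (the blob data and regularization constant are illustrative assumptions):

```python
import numpy as np

def lda_fit(X, y, n_components):
    """Classic LDA: solve the generalized eigenproblem Sb v = lambda Sw v."""
    classes = np.unique(y)
    mean = X.mean(0)
    d = X.shape[1]
    Sw = np.zeros((d, d))   # within-class scatter
    Sb = np.zeros((d, d))   # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Small ridge on Sw guards against singularity (the small-sample problem
    # mentioned in the abstract).
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(d), Sb))
    order = np.argsort(vals.real)[::-1]
    return vecs.real[:, order[:n_components]]

rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 1, (50, 10)), rng.normal(3, 1, (50, 10))])
y = np.array([0] * 50 + [1] * 50)
W = lda_fit(X, y, 1)   # at most (n_classes - 1) discriminative components
Z = X @ W              # 10-D samples projected to 1-D
```

The (n_classes - 1) rank limit of Sb is one reason the newer extraction methods in the paper can outperform plain LDA on face data.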
