Class-specific metrics for multidimensional data projection applied to CBIR

Jóia Filho, Paulo; Gomez-Nieto, Erick Mauricio; Casaca, Wallace Correa de Oliveira; Botelho, Glenda Michele; Paiva Neto, Afonso; Nonato, Luis Gustavo
Source: Springer-Verlag; Heidelberg Publisher: Springer-Verlag; Heidelberg
Type: Journal article
Portuguese
Search relevance
37.636226%
Content-based image retrieval (CBIR) is still a challenging problem due to the inherent complexity of images and the difficulty of choosing the most discriminant descriptors. Recent developments in the field have introduced multidimensional projections to boost accuracy in the retrieval process, but many issues, such as the introduction of pattern recognition tasks and deeper user intervention to assist in choosing the most discriminant features, remain unaddressed. In this paper, we present a novel framework for CBIR that combines pattern recognition tasks, class-specific metrics, and multidimensional projection to devise an effective and interactive image retrieval system. User interaction plays an essential role in the computation of the final multidimensional projection from which image retrieval is performed. Results show that the proposed approach outperforms existing methods, making it an attractive alternative for managing image data sets.; FAPESP; CAPES-Brazil

Automated interface for retrieving reusable software components

Dolgoff, Scott Joel
Source: Monterey, California. Naval Postgraduate School Publisher: Monterey, California. Naval Postgraduate School
Type: Doctoral thesis
Portuguese
Search relevance
37.60493%
The Computer Aided Prototyping System (CAPS) software base contains software components described by formal specifications written in the Prototype System Description Language (PSDL). One problem addressed by this thesis is to develop a retrieval mechanism for extracting components that match user-provided PSDL specifications. Another problem addressed is the integration of a retrieved component into a software prototype. The approach taken was to match specifications by comparing operations and parameter types to include indirect subtype relations. Integrating a selected software base component required generating mappings to account for different operation and parameter orderings and, for generic components, automatic instantiation. The result was a tool which implements automated assistance for finding reusable components in a large software repository. Methods were developed for parameter and operator mapping, parameter type matching, and ensuring instantiation of a generic was possible. Upon receipt of a PSDL specification query, these methods are employed to automate the retrieval of all matching components and the integration of the selected component into the software prototype. This has been fully implemented for operator components and partially implemented for type components. The retrieval mechanism...

DIMUSE: An integrated framework for distributed multimedia system with database management and security support

Zhao, Na
Source: FIU Digital Commons Publisher: FIU Digital Commons
Type: Journal article
Portuguese
Search relevance
37.636226%
With the recent explosion in the complexity and amount of digital multimedia data, there has been a huge impact on the operations of various organizations in distinct areas, such as government services, education, medical care, business, entertainment, etc. To satisfy the growing demand for multimedia data management systems, an integrated framework called DIMUSE is proposed and deployed for distributed multimedia applications to offer a full scope of multimedia-related tools and provide appealing experiences for the users. This research mainly focuses on video database modeling and retrieval by addressing a set of core challenges. First, a comprehensive multimedia database modeling mechanism called Hierarchical Markov Model Mediator (HMMM) is proposed to model high-dimensional media data including video objects, low-level visual/audio features, as well as historical access patterns and frequencies. The associated retrieval and ranking algorithms are designed to support not only general queries, but also complicated temporal event pattern queries. Second, system training and learning methodologies are incorporated such that user interests are mined efficiently to improve the retrieval performance. Third, video clustering techniques are proposed to continuously increase the searching speed and accuracy by architecting a more efficient multimedia database structure. A distributed video management and retrieval system is designed and implemented to demonstrate the overall performance. The proposed approach is further customized for a mobile-based video retrieval system to solve the perception subjectivity issue by considering each individual user's profile. Moreover...

Adaptive Nonparametric Image Parsing

Nguyen, Tam V.; Lu, Canyi; Sepulveda, Jose; Yan, Shuicheng
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 06/05/2015 Portuguese
Search relevance
37.60493%
In this paper, we present an adaptive nonparametric solution to the image parsing task, namely annotating each image pixel with its corresponding category label. For a given test image, first, a locality-aware retrieval set is extracted from the training data based on super-pixel matching similarities, which are augmented with feature extraction for better differentiation of local super-pixels. Then, the category of each super-pixel is initialized by the majority vote of the $k$-nearest-neighbor super-pixels in the retrieval set. Instead of fixing $k$ as in traditional nonparametric approaches, here we propose a novel adaptive nonparametric approach which determines the sample-specific $k$ for each test image. In particular, $k$ is adaptively set to be the number of the fewest nearest super-pixels which the images in the retrieval set can use to get the best category prediction. Finally, the initial super-pixel labels are further refined by contextual smoothing. Extensive experiments on challenging datasets demonstrate the superiority of the new solution over other state-of-the-art nonparametric solutions.; Comment: 11 pages
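The majority-vote initialization described in this abstract can be sketched as follows. This is only a minimal illustration, assuming super-pixels are represented as plain feature tuples compared by Euclidean distance; the function name `knn_majority_label` and the toy data are hypothetical, and the paper's adaptive choice of $k$ is not implemented here (k is simply passed in).

```python
import math
from collections import Counter


def knn_majority_label(query, samples, labels, k):
    """Return the most common label among the k samples closest to `query`.

    `samples` is a list of feature tuples and `labels` the matching list of
    category labels. Euclidean distance is an assumption of this sketch.
    """
    # Indices of all samples, sorted by distance to the query.
    order = sorted(range(len(samples)), key=lambda i: math.dist(query, samples[i]))
    # Count the labels of the k nearest samples and take the winner.
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy retrieval set: two "sky" samples near the origin, three "grass" samples far away.
samples = [(0, 0), (0, 1), (5, 5), (5, 6), (6, 5)]
labels = ["sky", "sky", "grass", "grass", "grass"]

small_k = knn_majority_label((0.2, 0.2), samples, labels, k=3)
large_k = knn_majority_label((0.2, 0.2), samples, labels, k=5)
```

With this toy data the predicted label changes when k grows from 3 to 5, which is the kind of sensitivity that motivates choosing k per test image rather than fixing it globally.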

Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models

Plummer, Bryan A.; Wang, Liwei; Cervantes, Chris M.; Caicedo, Juan C.; Hockenmaier, Julia; Lazebnik, Svetlana
Source: Cornell University Publisher: Cornell University
Type: Journal article
Portuguese
Search relevance
37.684624%
The Flickr30k dataset has become a standard benchmark for sentence-based image description. This paper presents Flickr30k Entities, which augments the 158k captions from Flickr30k with 244k coreference chains linking mentions of the same entities in images, as well as 276k manually annotated bounding boxes corresponding to each entity. Such annotation is essential for continued progress in automatic image description and grounded language understanding. We present experiments demonstrating the usefulness of our annotations for text-to-image reference resolution, or the task of localizing textual entity mentions in an image, and for bidirectional image-sentence retrieval. These experiments confirm that we can further improve the accuracy of state-of-the-art retrieval methods by training with explicit region-to-phrase correspondence, but at the same time, they show that accurately inferring this correspondence given an image and a caption remains really challenging.

RBIR Based on Signature Graph

Van, Thanh The; Le, Thanh Manh
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 16/07/2015 Portuguese
Search relevance
37.790637%
This paper approaches image retrieval on the basis of the visual features of local regions, i.e., region-based image retrieval (RBIR). First, the paper presents a method for extracting interest points based on Harris-Laplace to create the feature regions of an image. Next, in order to reduce storage space and speed up image queries, the paper builds a binary signature structure to describe the visual content of an image. Based on the images' binary signatures, the paper builds the signature graph (SG) to classify and store them. The paper then builds an image retrieval algorithm on the SG using the earth mover's distance (EMD) as the similarity measure between binary signatures. Finally, the paper presents an RBIR image retrieval model and experimentally assesses the retrieval method on the Corel image database with over 10,000 images.; Comment: 4 pages, 4 figures

Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

Mao, Junhua; Xu, Wei; Yang, Yi; Wang, Jiang; Huang, Zhiheng; Yuille, Alan
Source: Cornell University Publisher: Cornell University
Type: Journal article
Portuguese
Search relevance
37.83061%
In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. It directly models the probability distribution of generating a word given previous words and an image. Image captions are generated by sampling from this distribution. The model consists of two sub-networks: a deep recurrent neural network for sentences and a deep convolutional network for images. These two sub-networks interact with each other in a multimodal layer to form the whole m-RNN model. The effectiveness of our model is validated on four benchmark datasets (IAPR TC-12, Flickr 8K, Flickr 30K and MS COCO). Our model outperforms the state-of-the-art methods. In addition, we apply the m-RNN model to retrieval tasks for retrieving images or sentences, and achieve significant performance improvement over the state-of-the-art methods which directly optimize the ranking objective function for retrieval. The project page of this work is: www.stat.ucla.edu/~junhua.mao/m-RNN.html.; Comment: Add a simple strategy to boost the performance of the image captioning task significantly. More details are shown in Section 8 of the paper. The code and related data are available at https://github.com/mjhucla/mRNN-CR. arXiv admin note: substantial text overlap with arXiv:1410.1090

Fast Supervised Hashing with Decision Trees for High-Dimensional Data

Lin, Guosheng; Shen, Chunhua; Shi, Qinfeng; Hengel, Anton van den; Suter, David
Source: Cornell University Publisher: Cornell University
Type: Journal article
Portuguese
Search relevance
37.751858%
Supervised hashing aims to map the original features to compact binary codes that are able to preserve label based similarity in the Hamming space. Non-linear hash functions have demonstrated the advantage over linear ones due to their powerful generalization capability. In the literature, kernel functions are typically used to achieve non-linearity in hashing, which achieve encouraging retrieval performance at the price of slow evaluation and training time. Here we propose to use boosted decision trees for achieving non-linearity in hashing, which are fast to train and evaluate, hence more suitable for hashing with high dimensional data. In our approach, we first propose sub-modular formulations for the hashing binary code inference problem and an efficient GraphCut based block search method for solving large-scale inference. Then we learn hash functions by training boosted decision trees to fit the binary codes. Experiments demonstrate that our proposed method significantly outperforms most state-of-the-art methods in retrieval precision and training time. Especially for high-dimensional data, our method is orders of magnitude faster than many methods in terms of training time.; Comment: Appearing in Proc. IEEE Conf. Computer Vision and Pattern Recognition...
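To make "retrieval in the Hamming space" concrete, here is a small sketch of ranking database items by Hamming distance to a query code. It assumes binary codes are stored as Python integers and says nothing about how the codes are learned; it is not the boosted-tree method of the paper, and the names `hamming` and `rank_by_hamming` are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    # XOR leaves a 1 exactly where the codes disagree; count those bits.
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code: int, database_codes: list[int]) -> list[int]:
    """Return database indices sorted from nearest to farthest code."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )


# Three 4-bit codes in the database and one query code.
database = [0b0101, 0b1010, 0b1000]
ranking = rank_by_hamming(0b1010, database)
```

Because each comparison is an XOR followed by a bit count, ranking is cheap even for long codes, which is one reason compact binary codes are attractive for large-scale retrieval.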

Reuse of designs: Desperately seeking an interdisciplinary cognitive approach

Visser, Willemien; Trousse, Brigitte
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 30/11/2006 Portuguese
Search relevance
37.684624%
This text analyses the papers accepted for the workshop "Reuse of designs: an interdisciplinary cognitive approach". Several dimensions and questions considered as important (by the authors and/or by us) are addressed: What about the "interdisciplinary cognitive" character of the approaches adopted by the authors? Is design indeed a domain where the use of CBR is particularly suitable? Are there important distinctions between CBR and other approaches? Which types of knowledge, other than cases, are being, or might be, used in CBR systems? With respect to cases: are there different "types" of case and different types of case use? which formats are adopted for their representation? do cases have "components"? how are cases organised in the case memory? Concerning their retrieval: which types of index are used? on which types of relation is retrieval based? how does one retrieve only a selected number of cases, i.e., how does one retrieve only the "best" cases? which processes and strategies are used, by the system and by its user? Finally, some important aspects of CBR system development are briefly discussed: should CBR systems be assistance or autonomous systems? how can case knowledge be "acquired"? what about the empirical evaluation of CBR systems? The conclusion points out some gaps: not much attention is paid to the user...

Permutation Search Methods are Efficient, Yet Faster Search is Possible

Naidan, Bilegsaikhan; Boytsov, Leonid; Nyberg, Eric
Source: Cornell University Publisher: Cornell University
Type: Journal article
Portuguese
Search relevance
37.76338%
We survey permutation-based methods for approximate k-nearest neighbor search. In these methods, every data point is represented by a ranked list of pivots sorted by the distance to this point. Such ranked lists are called permutations. The underpinning assumption is that, for both metric and non-metric spaces, the distance between permutations is a good proxy for the distance between original points. Thus, it should be possible to efficiently retrieve most true nearest neighbors by examining only a tiny subset of data points whose permutations are similar to the permutation of a query. We further test this assumption by carrying out an extensive experimental evaluation where permutation methods are pitted against state-of-the-art benchmarks (the multi-probe LSH, the VP-tree, and proximity-graph based retrieval) on a variety of realistically large data sets from the image and textual domains. The focus is on high-accuracy retrieval methods for generic spaces. Additionally, we assume that both data and indices are stored in main memory. We find permutation methods to be reasonably efficient and describe a setup where these methods are most useful. To ease reproducibility, we make our software and data sets publicly available.
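The filter-and-refine idea in this abstract can be sketched in a few lines. This is a toy illustration under stated assumptions: points are tuples compared by Euclidean distance, permutations are compared with the Spearman footrule (one common choice, not necessarily the one used in the survey), and the names `permutation`, `footrule`, and `permutation_knn` are hypothetical.

```python
import math


def permutation(point, pivots):
    """Rank of each pivot when pivots are sorted by distance to `point`."""
    order = sorted(range(len(pivots)), key=lambda j: math.dist(point, pivots[j]))
    ranks = [0] * len(pivots)
    for rank, j in enumerate(order):
        ranks[j] = rank
    return ranks


def footrule(p, q):
    """Spearman footrule: sum of absolute rank differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def permutation_knn(query, data, pivots, k, n_candidates):
    """Approximate k-NN: filter by permutation distance, refine by true distance."""
    q_perm = permutation(query, pivots)
    # In a real index the data permutations would be precomputed and stored.
    perms = [permutation(x, pivots) for x in data]
    # Filter step: keep the points whose permutations are closest to the query's.
    candidates = sorted(range(len(data)), key=lambda i: footrule(q_perm, perms[i]))
    candidates = candidates[:n_candidates]
    # Refine step: compute the true distance only for the surviving candidates.
    candidates.sort(key=lambda i: math.dist(query, data[i]))
    return candidates[:k]


pivots = [(0, 0), (10, 0), (0, 10)]
data = [(1, 2), (9, 1), (2, 9), (1, 3)]
nearest = permutation_knn((1, 2.2), data, pivots, k=1, n_candidates=2)
```

The true distance is evaluated for only `n_candidates` points, so the quality of the result depends on how well permutation distance tracks the original distance, which is the assumption the survey sets out to test.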

Self-Organized Stigmergic Document Maps: Environment as a Mechanism for Context Learning

Ramos, Vitorino; Merelo, Juan J.
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 17/12/2004 Portuguese
Search relevance
37.755273%
Social insect societies, and more specifically ant colonies, are distributed systems that, in spite of the simplicity of their individuals, present a highly structured social organization. As a result of this organization, ant colonies can accomplish complex tasks that in some cases exceed the individual capabilities of a single ant. The study of ant colony behavior and of their self-organizing capabilities is of interest to knowledge retrieval/management and decision support systems sciences, because it provides models of distributed adaptive organization which are useful for solving difficult optimization, classification, and distributed control problems, among others. In the present work we overview some models derived from the observation of real ants, emphasizing the role played by stigmergy as a distributed communication paradigm, and we present a novel strategy to tackle unsupervised clustering as well as data retrieval problems. The present ant clustering system (ACLUSTER) avoids not only short-term memory based strategies, but also the use of several artificial ant types (using different speeds), present in some recent approaches. Moreover, to our knowledge, this is also the first application of ant systems to textual document clustering. KEYWORDS: Swarm Intelligence...

Feature Extraction Methods for Color Image Similarity

Chary, R. Venkata Ramana; Lakshmi, D. Rajya; Sunitha, K. V. N.
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 11/04/2012 Portuguese
Search relevance
37.71564%
Many user-interactive systems have been proposed, all aiming to be user friendly, and various approaches have been put forward, but most of these systems have not met user specifications such as user-friendliness and attention to user interest. The proposed methods implement basic techniques, and although some improved methods have also been proposed, they too fall short of user specifications. In this paper we concentrate on image retrieval systems. In the early days, many user-interactive systems were built on basic concepts, but such systems did not meet user specifications and did not attract users. As a result, there has been a lot of research interest in recent years with new specifications: users expect friendly, interactive methods, and many efforts concentrate on improving existing methods. In the proposed system we focus on the retrieval of images within a large image collection based on color projections, and different mathematical approaches are introduced and applied for image retrieval. Before applying the proposed methods, images are sub-grouped using threshold values. In this paper, R, G, B color combinations are considered for image retrieval. The proposed methods are implemented and results are included; the results indicate that the proposed methods are efficient compared with previous and existing ones.; Comment: 11 pages...

STIMONT: A core ontology for multimedia stimuli description

Horvat, Marko; Bogunović, Nikola; Ćosić, Krešimir
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 10/01/2014 Portuguese
Search relevance
37.755273%
Affective multimedia documents such as images, sounds or videos elicit emotional responses in exposed human subjects. These stimuli are stored in affective multimedia databases and successfully used for a wide variety of research in psychology and neuroscience in areas related to attention and emotion processing. Although important, all affective multimedia databases have numerous deficiencies which impair their applicability. These problems, which are brought forward in the paper, result in low recall and precision of multimedia stimuli retrieval, which makes creating emotion elicitation procedures difficult and labor-intensive. To address these issues, a new core ontology, STIMONT, is introduced. STIMONT is written in the OWL-DL formalism and extends the W3C EmotionML format with an expressive and formal representation of affective concepts, high-level semantics, stimuli document metadata and the elicited physiology. The advantages of an ontology in the description of affective multimedia stimuli are demonstrated in a document retrieval experiment and compared against contemporary keyword-based querying methods. Also, a software tool, Intelligent Stimulus Generator, for retrieval of affective multimedia and construction of stimuli sequences is presented.; Comment: 27 pages...

Discriminative Functional Connectivity Measures for Brain Decoding

Firat, Orhan; Ozay, Mete; Oztekin, Ilke; Vural, Fatos T. Yarman
Source: Cornell University Publisher: Cornell University
Type: Journal article
Portuguese
Search relevance
37.613997%
We propose a statistical learning model for classifying cognitive processes based on distributed patterns of neural activation in the brain, acquired via functional magnetic resonance imaging (fMRI). In the proposed learning method, local meshes are formed around each voxel. The distance between voxels in the mesh is determined by using a functional neighbourhood concept. In order to define the functional neighbourhood, the similarities between the time series recorded for voxels are measured and functional connectivity matrices are constructed. Then, the local mesh for each voxel is formed by including the functionally closest neighbouring voxels in the mesh. The relationship between the voxels within a mesh is estimated by using a linear regression model. These relationship vectors, called Functional Connectivity aware Local Relational Features (FC-LRF) are then used to train a statistical learning machine. The proposed method was tested on a recognition memory experiment, including data pertaining to encoding and retrieval of words belonging to ten different semantic categories. Two popular classifiers, namely k-nearest neighbour (k-nn) and Support Vector Machine (SVM), are trained in order to predict the semantic category of the item being retrieved...

Efficient Region-Based Image Querying

Sadek, S.; Al-Hamadi, A.; Michaelis, B.; Sayed, U.
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 23/06/2010 Portuguese
Search relevance
37.71564%
Retrieving images from large and varied repositories using visual contents has been one of the major research topics, and a challenging task, in the image management community. In this paper we present an efficient approach for region-based image classification and retrieval using a fast multi-level neural network model. The advantages of this neural model in the image classification and retrieval domain will be highlighted. The proposed approach accomplishes its goal in three main steps. First, with the help of a mean-shift based segmentation algorithm, significant regions of the image are isolated. Second, color and texture features of each region are extracted by using color moments and a 2D wavelet decomposition technique. Third, the multi-level neural classifier is trained in order to classify each region in a given image into one of five predefined categories, i.e., "Sky", "Building", "SandnRock", "Grass" and "Water". Simulation results show that the proposed method is promising in terms of classification and retrieval accuracy. These results compare favorably with the best published results obtained by other state-of-the-art image retrieval techniques.; Comment: IEEE Publication Format, https://sites.google.com/site/journalofcomputing/
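As an illustration of the color-moment features mentioned in the second step, the sketch below computes the first three moments per RGB channel for a region given as a list of pixels. It uses one common definition (mean, standard deviation, and the signed cube root of the third central moment); whether the paper uses exactly this definition is an assumption, and the function name `color_moments` is hypothetical.

```python
import math


def color_moments(pixels):
    """Return [mean, std, skew] for R, then G, then B (9 values in total).

    `pixels` is a non-empty list of (r, g, b) tuples belonging to one region.
    """
    n = len(pixels)
    features = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        mean = sum(values) / n
        # Second and third central moments of the channel values.
        second = sum((v - mean) ** 2 for v in values) / n
        third = sum((v - mean) ** 3 for v in values) / n
        std = math.sqrt(second)
        # Signed cube root, so that negative skew stays negative.
        skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
        features.extend([mean, std, skew])
    return features


# A two-pixel toy region: only the red channel varies.
region = [(0, 0, 0), (2, 0, 0)]
moments = color_moments(region)
```

Each region is thus summarized by a fixed-length vector of nine numbers, which is convenient as input to a classifier regardless of the region's size or shape.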

Attribute2Image: Conditional Image Generation from Visual Attributes

Yan, Xinchen; Yang, Jimei; Sohn, Kihyuk; Lee, Honglak
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 01/12/2015 Portuguese
Search relevance
37.83061%
This paper investigates a problem of generating images from visual attributes. Given the prevalent research for image recognition, the conditional image generation problem is relatively under-explored due to the challenges of learning a good generative model and handling rendering uncertainties in images. To address this, we propose a variety of attribute-conditioned deep variational auto-encoders that enjoy both effective representation learning and Bayesian modeling, from which images can be generated from specified attributes and sampled latent factors. We experiment with natural face images and demonstrate that the proposed models are capable of generating realistic faces with diverse appearance. We further evaluate the proposed models by performing attribute-conditioned image progression, transfer and retrieval. In particular, our generation method achieves superior performance in the retrieval experiment against traditional nearest-neighbor-based methods both qualitatively and quantitatively.; Comment: 10 pages (main) and 1 page (supplementary material)

Convolutional Neural Associative Memories: Massive Capacity with Noise Tolerance

Karbasi, Amin; Salavati, Amir Hesam; Shokrollahi, Amin
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 24/07/2014 Portuguese
Search relevance
37.755273%
The task of a neural associative memory is to retrieve a set of previously memorized patterns from their noisy versions using a network of neurons. An ideal network should have the ability to 1) learn a set of patterns as they arrive, 2) retrieve the correct patterns from noisy queries, and 3) maximize the pattern retrieval capacity while maintaining the reliability in responding to queries. The majority of work on neural associative memories has focused on designing networks capable of memorizing any set of randomly chosen patterns at the expense of limiting the retrieval capacity. In this paper, we show that if we target memorizing only those patterns that have inherent redundancy (i.e., belong to a subspace), we can obtain all the aforementioned properties. This is in sharp contrast with the previous work that could only improve one or two aspects at the expense of the third. More specifically, we propose a framework based on a convolutional neural network along with an iterative algorithm that learns the redundancy among the patterns. The resulting network has a retrieval capacity that is exponential in the size of the network. Moreover, the asymptotic error correction performance of our network is linear in the size of the patterns. We then extend our approach to deal with patterns that lie approximately in a subspace. This extension allows us to memorize datasets containing natural patterns (e.g....

A Massively Parallel Associative Memory Based on Sparse Neural Networks

Yao, Zhe; Gripon, Vincent; Rabbat, Michael G.
Source: Cornell University Publisher: Cornell University
Type: Journal article
Portuguese
Search relevance
37.76338%
Associative memories store content in such a way that the content can be later retrieved by presenting the memory with a small portion of the content, rather than presenting the memory with an address as in more traditional memories. Associative memories are used as building blocks for algorithms within database engines, anomaly detection systems, compression algorithms, and face recognition systems. A classical example of an associative memory is the Hopfield neural network. Recently, Gripon and Berrou have introduced an alternative construction which builds on ideas from the theory of error correcting codes and which greatly outperforms the Hopfield network in capacity, diversity, and efficiency. In this paper we implement a variation of the Gripon-Berrou associative memory on a general purpose graphics processing unit (GPU). The work of Gripon and Berrou proposes two retrieval rules, sum-of-sum and sum-of-max. The sum-of-sum rule uses only matrix-vector multiplication and is easily implemented on the GPU. The sum-of-max rule is much less straightforward to implement because it involves non-linear operations. However, the sum-of-max rule gives significantly better retrieval error rates. We propose a hybrid rule tailored for implementation on a GPU which achieves an 880-fold speedup without sacrificing any accuracy.

Seeing the Big Picture: Deep Embedding with Contextual Evidences

Zheng, Liang; Wang, Shengjin; He, Fei; Tian, Qi
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 01/06/2014 Portuguese
Search relevance
37.824893%
In the Bag-of-Words (BoW) model based image retrieval task, the precision of visual matching plays a critical role in improving retrieval performance. Conventionally, local cues of a keypoint are employed. However, such a strategy does not consider the contextual evidence of a keypoint, a problem which leads to the prevalence of false matches. To address this problem, this paper defines a "true match" as a pair of keypoints which are similar on three levels, i.e., local, regional, and global. Then, a principled probabilistic framework is established, which is capable of implicitly integrating discriminative cues from all these feature levels. Specifically, the Convolutional Neural Network (CNN) is employed to extract features from regional and global patches, leading to the so-called "Deep Embedding" framework. CNN has been shown to produce excellent performance on a dozen computer vision tasks such as image classification and detection, but little work has been done on BoW based image retrieval. In this paper, we first show that proper pre-processing techniques are necessary for effective usage of CNN features. Then, in the attempt to fit it into our model, a novel indexing structure called "Deep Indexing" is introduced, which dramatically reduces memory usage. Extensive experiments on three benchmark datasets demonstrate that...

RBIR using Interest Regions and Binary Signatures

Van, Thanh The; Le, Thanh Manh
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 01/06/2015 Portuguese
Search relevance
37.71564%
In this paper, we introduce an approach to overcome the low accuracy of Content-Based Image Retrieval (CBIR) when using global features. To increase the accuracy, we use the Harris-Laplace detector to identify the interest regions of an image. Then, we build a Region-Based Image Retrieval (RBIR) system. For efficient image storage and retrieval, we encode images into binary signatures. The binary signature of an image is created from its interest regions. Furthermore, this paper also provides an algorithm for image retrieval on an S-tree by comparing the images' signatures with a metric similar to EMD (earth mover's distance). Finally, we evaluate the created models on COREL's images.; Comment: 14 pages, 8 figures