Findings on ranking evaluation functions for feature weighting in image retrieval
Journal of the Brazilian Computer Society volume 20, Article number: 7 (2014)
Abstract
Background
There are substantial benefits to be gained from ranking optimization in several information retrieval and recommendation systems. However, the analysis of ranking evaluation functions (REFs), which play a major role in many ranking optimization models, needs to be further investigated. An analysis of previous studies that investigated REFs was performed, and evidence was found which indicated that the choice of a proper REF is context sensitive.
Methods
In this study, we analyze a broad set of REFs for feature weighting aimed at increasing image retrieval effectiveness. We analyze ten REFs, including the most successful and representative ones from the literature. The REFs were embedded into a genetic algorithm (GA)-based relevance feedback (RF) model, called WLSPC ±, aimed at improving image retrieval results by learning weights for image descriptors and image regions.
Results
Analyses of precision-recall curves on five real-world image data sets showed that one non-parameterized REF named F5, not analyzed in previous studies, outperformed the recommended ones, which require parameter adjustment. We also provide a computational analysis of the GA-based RF model investigated, showing that its cost is linear in the cardinality of the image data set.
Conclusions
We conclude that REF F5 should be investigated in other contexts and problem scenarios centered on ranking optimization, as ranking optimization techniques rely heavily on the ranking quality measure.
Background
Ranking optimization research studies have fostered widespread developments in information retrieval and recommendation systems [1–6]. Ranking optimization techniques can be grouped into three main classes: rank learning [2, 4, 5], rank aggregation (also known as data fusion) [7–10] and ranking (or list) diversification [1, 11, 12]. Rank learning relies on supervised queries, relevance feedback or context information to achieve an adequate model to rank items like web pages, images, etc. Normally, rank aggregation is an unsupervised method that relies on multicriteria ranks and tries to combine them to produce a consensus rank. On the other hand, ranking or list diversification aims at balancing ‘precision’ and ‘diversity’ to reflect a broad spectrum of user interests concerning items.
Rank learning tasks are generally stated as optimization problems: to find the best model (or the best adjustment of a given model) according to some representation to rank items. Given this general formulation, solutions to rank learning normally apply a search method guided by some ranking evaluation function. Ranking evaluation functions (REFs) are normally computed on the basis of supervised queries or user relevance feedback (RF). These REFs evaluate models or adjustments according to the effectiveness of the ranking produced. With regard to search methods, most research studies have employed evolutionary algorithms (EAs). The flexibility of EAs enables the modeling of rank learning in many ways, such as through ranking function discovery [5, 13, 14] and weight and parameter learning [15–19], among others. Regardless of the model representation, a proper evaluation function is very important for the effectiveness and efficiency of EAs.
Although REFs were shown to play a major role in rank learning almost a decade ago [13, 15–17], recent studies have paid little attention to the design and selection of more appropriate ones. Researchers have chosen popular REFs and applied them to new contexts and models without any theoretical or empirical evidence of their suitability. Moreover, few studies have focused on rank learning for image retrieval tasks, and the existing ones are not deep enough and do not cover the whole spectrum of models employed in this area.
López-Pujalte et al. [15–17] studied the problem of adapting document descriptions through learning terms, weights and parameters in matching functions applied to information retrieval. These studies mainly investigated the use of different REFs as fitness functions for genetic algorithms (GAs) in relevance feedback. By analyzing the mean precision at three levels of recall, they showed that retrieval effectiveness varied widely depending on which REF was used. They also found that utility theory-based ranking evaluation functions (UTBREFs) are the most adequate kind of REF for rank learning applications. Moreover, the REF named F4 in the present study was recommended by López-Pujalte et al. in [17] as a promising one.
Fan et al. [13] compared seven UTBREFs on ranking function discovery for Web search using genetic programming (GP). Their experiments on a large Web corpus revealed that some UTBREFs, named F9, F7, F8 and F3 in the present study, were more effective in guiding the GP search than the others analyzed. In a subsequent investigation, Fan et al. [20] used the UTBREF named F10 in this paper, with the aim of increasing the precision of information retrieval in two steps: first, by discovering new ranking functions using genetic programming; second, by combining document retrieval scores of different ranking functions using genetic algorithms. The use of UTBREF F10 was justified since it is a standard performance measure used in information retrieval studies.
Torres et al. [5] used GP to discover functions to combine different descriptors for content-based image retrieval (CBIR) tasks. Their method relies on a training set containing query images together with the images relevant to each query image and a REF that guides the GP search towards a proper combination function. In this context, the authors tested seven UTBREFs as fitness functions in the GP, the same UTBREFs used by Fan et al. in [13]. The UTBREFs that produced the best results are named F6, F7 and F4 in this paper. Ferreira et al. [14] proposed a method similar to that of [5], using RF instead of a training set of queries. That study does not compare REFs and uses the UTBREF F4, owing to its promising results in [5].
Stejić et al. [19] used a GA-based RF model to improve image retrieval results by learning weights for image descriptors and image regions (the WLSPC ± model). This study presented promising approaches such as the concept of local similarity patterns (LSP) and the use of continuous positive and negative weights modeling the relevance and undesirability of visual features. In spite of the promising features of the model, the authors did not provide an effective mechanism for learning a proper set of weights. The use of the R-precision measure without any other REF analysis is the most critical aspect of the research by Stejić et al., as other studies had shown that UTBREFs are more appropriate for such ranking modeling.
Silva et al. [18] extended the WLSPC ± model of Stejić et al. [19] by proposing a new UTBREF to replace the R-precision measure used as the objective (fitness) function of the GA. Their results showed a significant improvement in image retrieval precision and in efficiency, as the proposed UTBREF sped up the GA search towards optimal solutions.
As we can observe from the studies reported, there is no consensus about which is the best REF for many of the applications, and many studies have overlooked the REF analysis. Even for the same task, there is no consensus about the best REF, as we can see from the REF analyses performed by Fan et al. [13] and Torres et al. [5], which employed the same set of REFs. In this paper, we show that there is room for development on this issue and that new studies should consider the analysis of broad sets of REFs, since a proper choice is context-sensitive.
In this paper, we used the WLSPC ± model proposed by Stejić et al. [19] and used in [18] to investigate a broad set of REFs for feature weighting aimed at improving image retrieval performance. The choice of the WLSPC ± model was motivated by its promising results. The REFs were applied as fitness functions in a specialized GA for learning weights. Analyses of precision-recall curves on five real-world image data sets showed that the REF design plays a key role in the effectiveness and efficiency of the WLSPC ± model. Also, we found that the non-parameterized REF proposed in [18], named F5 in the present paper, outperformed the recommended ones, which require parameter adjustment. This result indicates that the REF F5 should be investigated in other contexts and problem scenarios centered on ranking optimization, mainly for image retrieval, as ranking optimization techniques rely heavily on the ranking quality measure.
The remainder of this paper is organized as follows. The ‘Methods’ Section describes the methodology employed for the analysis of REFs on the WLSPC ± model. The ‘Results and discussion’ Section compares a broad set of REFs for feature weighting aimed at improving image retrieval and provides a computational complexity analysis of the model. The ‘Conclusions’ Section concludes the paper highlighting the main findings and implications of the present research.
Methods
In this study, we used the WLSPC ± model [19] to investigate a broad set of ranking evaluation functions. The weights of the WLSPC ± model were optimized using the GA-based RF mechanism reported in [18]. This methodology is illustrated in Figure 1. All the images considered for a given image search task are stored in a database. The image database is linked to the feature extraction module. The output of the feature extraction module is a structure containing the identification code and the feature vectors of color, shape and texture for each image in the database. These data (identification code/features) are stored in the feature database.
When the user carries out a search, feature vectors of color, shape and texture are extracted from the query image by the feature extraction module and compared, through similarity measures, with the feature vectors of the images stored in the database. The similarity measure module returns a similarity value S_{ I }(q,i) for each image in the database, in relation to the query image. Then, the images are sorted in decreasing order of similarity (ranking) and the first samples are shown to the user. If not satisfied with the result of the search, the user can provide feedback, indicating to the system the relevant images according to his/her point of view. Based on the user’s feedback, the GA-based relevance feedback mechanism adjusts the similarity measure according to the user’s criteria through image feature vector weighting (ω_{ F }) and region weighting (ω_{ R }). n_{ g } corresponds to the number of generations of the genetic algorithm.
The retrieval process is based on the local similarity pattern, where the image areas are uniformly partitioned into regions, and the similarity between images is measured by the similarities of corresponding regions. Similarity between regions, and therefore between images, is computed through three feature vectors (F) encoding properties of color, shape and texture, represented by color moments, edge direction histogram and texture neighborhood, respectively. The distance between pairs of color feature vectors is computed by the Euclidean distance, while distances between pairs of shape and texture feature vectors are computed by the city-block distance.
For clarity, the next subsections present a detailed description of the WLSPC ± model and the GA-based RF mechanism. We then describe the ranking evaluation functions analyzed and the image data sets employed.
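The two distance measures named above can be sketched as follows. This is only an illustration: the dictionary keys `color`, `shape` and `texture` and the function names are hypothetical, and the way distances are turned into the similarities S_{ F } is not specified here, so it is left out.

```python
import math


def euclidean(u, v):
    """Euclidean (L2) distance, used in the text for color feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def city_block(u, v):
    """City-block (L1) distance, used in the text for shape and texture vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def region_distances(query_region, image_region):
    """Per-feature distances for one pair of corresponding regions.

    Each region is a dict with the (hypothetical) keys 'color', 'shape'
    and 'texture', each holding a feature vector of equal length in both
    regions.
    """
    return {
        "color": euclidean(query_region["color"], image_region["color"]),
        "shape": city_block(query_region["shape"], image_region["shape"]),
        "texture": city_block(query_region["texture"], image_region["texture"]),
    }
```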
WLSPC ± model
Let q be the query image, I be the image data set, i be an image belonging to I, r be an image region belonging to R such that R = {r_{1},r_{2},…,r_{ m }} is given by a rectangular tiled partition of i, and f be an image feature vector. The image similarity measure is given by Equation 1, where S_{ F }(q,i,r,f) represents the similarity between the images q and i with respect to the feature vector f in the region r; ω_{ F }(r,f) weighs, with real values in the range [−1,1], the importance of f in the region r and is responsible for the S_{ F } normalization; ω_{ R }(r) weighs, with real values in the range [−1,1], the importance of the image region r; and finally, S_{ I }(q,i) gives the overall image similarity between q and i.
The WLSPC ± model is optimized by fitting the weights ω_{ R }(r) and ω_{ F }(r,f), so that the retrieval accuracy with respect to the query image and the set of relevant images chosen by the user is maximized. As in [19] and [18], we solve this optimization problem using a real-coded GA that infers weights in the range [−1,1]. Continuous negative and positive weights allow the mapping of the user’s concepts of relevance, irrelevance and undesirability of image visual properties, producing better results than positive weights alone, as shown in [19]. Since we found the best results with the WLSPC ± model, we did not analyze in this study the other models proposed by Stejić et al. in [19].
The GA-based RF mechanism
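Equation 1 is not reproduced in this text. As a minimal sketch, assuming it takes the doubly weighted sum form suggested by the definitions above (region weights applied to feature-weighted region similarities), the overall similarity could be computed as below. The function and argument names are hypothetical, and the exact form of Equation 1 should be checked against [19].

```python
def image_similarity(region_feature_sims, w_region, w_feature):
    """Overall similarity S_I(q, i) as a doubly weighted sum (assumed form).

    region_feature_sims[r][f] : S_F(q, i, r, f) for region r and feature f
    w_region[r]               : omega_R(r), a real value in [-1, 1]
    w_feature[r][f]           : omega_F(r, f), a real value in [-1, 1]
    """
    total = 0.0
    for r, sims in enumerate(region_feature_sims):
        # Combine the feature similarities of region r with their weights.
        region_score = sum(w_feature[r][f] * s for f, s in enumerate(sims))
        # Weigh the region as a whole; negative weights penalize it.
        total += w_region[r] * region_score
    return total
```

With negative weights, a high similarity on an undesired feature or region lowers the overall score, which is the behaviour the continuous range [−1,1] is meant to allow.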
Our RF mechanism relies on a GA designed and adjusted for learning weights in the paper [18]. Algorithm 1 describes the main steps of the GA. The chromosome coding is similar to the coding employed in [19]. As each image was partitioned into m regions, each chromosome (C) contains m genes (G_{1},G_{2},G_{3},…,G_{ m }). Moreover, each gene (G_{ i }) contains a vector of four weights, with the first quantifying the region importance and the other ones quantifying the importance of the color, shape and texture descriptors, respectively. We have tested m = 4, m = 9, m = 16 and m = 25. The best result obtained from these empirical tests was m = 16, which was defined as default.
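The chromosome layout described above (m genes, each holding one region weight followed by the color, shape and texture weights) can be sketched as follows. The uniform random initialization and the helper names are assumptions made for illustration; the steps of Algorithm 1 are not reproduced here.

```python
import random

M_REGIONS = 16        # default number of regions reported in the text
WEIGHTS_PER_GENE = 4  # region weight + color, shape and texture weights


def random_chromosome(m=M_REGIONS, rng=random):
    """Create a chromosome: m genes, each a list of four weights in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(WEIGHTS_PER_GENE)]
            for _ in range(m)]


def decode(chromosome):
    """Split a chromosome into region weights and per-region feature weights."""
    w_region = [gene[0] for gene in chromosome]    # omega_R(r)
    w_feature = [gene[1:] for gene in chromosome]  # omega_F(r, f), f = color, shape, texture
    return w_region, w_feature
```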
Algorithm 1 GA-based RF algorithm
Ranking evaluation functions
We compared ten REFs: two not based on utility theory (nUTBREFs) and eight based on utility theory (UTBREFs). Utility theory-based fitness functions (UTBREFs) are based on the utility concept, where the score value of a relevant element in the ranking is usually inversely proportional to its position. That is, the higher the rank of a relevant element, the higher its utility. Non-utility theory-based fitness functions (nUTBREFs) are REFs that do not strictly follow the utility concept.
A REF plays the role of the GA fitness function, and it is applied as described in Algorithm 2. First, the image similarities (Equation 1) between the query image and each image in the data set are computed by employing the weights coded by the individual $\mathcal{C}$. Then, the images are sorted according to the similarity values, which makes up a ranking. Finally, a ranking evaluation function is applied to the ranking to obtain the fitness value. In the following, we describe the ranking evaluation functions analyzed, grouping them into two categories: nUTBREF and UTBREF. $\text{Fitness}(q,\mathcal{C})$ denotes the fitness value of the individual $\mathcal{C}$ for the query q, I represents the image data set, |I| denotes the cardinality of I, D represents the set of images known to be relevant to a query q, |D| denotes the cardinality of D and pos(i) returns the position (rank) of the image i in the ranking.
Algorithm 2 Fitness function employment
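The three steps described for Algorithm 2 (similarities, ranking, REF) can be sketched as follows. This is not the authors' listing: `similarity_fn` stands in for Equation 1, `ref` stands in for any of the REFs defined below, and all names are hypothetical.

```python
def rank_images(similarities):
    """Return image ids sorted by decreasing similarity (best match first)."""
    return sorted(similarities, key=similarities.get, reverse=True)


def positions(ranking):
    """Map each image id to its 1-based position pos(i) in the ranking."""
    return {image_id: k for k, image_id in enumerate(ranking, start=1)}


def fitness(similarity_fn, individual, query, dataset, relevant, ref):
    """Score one individual: similarities -> ranking -> REF value.

    similarity_fn(query, image, individual) is a placeholder for Equation 1;
    ref(pos, relevant, n_images) is a placeholder for a REF.
    """
    # Step 1: similarity between the query and every image in the data set.
    sims = {image_id: similarity_fn(query, image, individual)
            for image_id, image in dataset.items()}
    # Step 2: sort by decreasing similarity to obtain the ranking.
    pos = positions(rank_images(sims))
    # Step 3: evaluate the ranking with the chosen REF.
    return ref(pos, relevant, len(dataset))
```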
Non-utility theory-based fitness functions
The non-utility theory-based fitness functions are as follows:

Fitness function F1. This fitness function is given by the R-precision measure, which is a well-known REF used to evaluate information retrieval effectiveness:
$$\mathrm{F1}(q,\mathcal{C})=\text{R-precision}(q,\mathcal{C})=\frac{\text{Number of relevant images retrieved}}{{n}_{R}},$$(2) 
where n_{ R } is the number of elements considered in the query answer.

Fitness function F2. This function is based on an analysis of the numbers of true positives (Rr: relevant and retrieved items), false positives (Rn: retrieved but non-relevant items) and false negatives (Nr: non-retrieved relevant items):
$$\mathrm{F2}(q,\mathcal{C})=\left(2\left|D\right|\right)+\text{Rr}-\text{Rn}-\text{Nr}.$$(3)
The fitness function F1 was employed in the models of Stejić et al. [19], and F2 was proposed in [18].
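A minimal sketch of F1 and F2 follows, taking Equation 3 as 2|D| + Rr − Rn − Nr and treating the first n_R ranked images as the retrieved set. The argument `pos` is a hypothetical dictionary mapping each image to its 1-based rank; the function names are illustrative only.

```python
def f1_r_precision(pos, relevant, n_r):
    """F1: fraction of the first n_r ranked images that are relevant."""
    retrieved_relevant = sum(1 for i in relevant if pos[i] <= n_r)
    return retrieved_relevant / n_r


def f2_counts(pos, relevant, n_r):
    """F2 = 2|D| + Rr - Rn - Nr over the first n_r ranked images."""
    rr = sum(1 for i in relevant if pos[i] <= n_r)  # relevant and retrieved
    rn = n_r - rr                                   # retrieved but non-relevant
    nr = len(relevant) - rr                         # relevant but not retrieved
    return 2 * len(relevant) + rr - rn - nr
```

Both functions only count how many relevant images fall inside the cut-off n_R; swapping two images within the first n_R positions leaves the score unchanged, which is what distinguishes them from the utility-based functions.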
Utility theory-based fitness functions
Utility theory-based fitness functions (UTBFFs) are fitness functions based on UTBREFs. We analyzed eight UTBFFs (F3 to F10) defined as follows:

Fitness function F3
$$\mathrm{F3}(q,\mathcal{C})=\frac{1}{\left|D\right|}\sum _{\forall i\in D}\left(\sum _{j=\text{pos}\left(i\right)}^{\left|I\right|}\frac{1}{j}\right)$$(4) 
Fitness function F4
$$\mathrm{F4}(q,\mathcal{C})=\sum _{\forall i\in D}\left(\frac{1}{A}{\left(\frac{A-1}{A}\right)}^{\text{pos}\left(i\right)-1}\right),$$(5) 
where A is a user-defined parameter with values larger than or equal to 2.

Fitness function F5
$$\mathrm{F5}(q,\mathcal{C})=\frac{\text{Accuracy value}(q,\mathcal{C})}{\sum _{j=1}^{\left|D\right|}\frac{1}{j}},$$(6) 
where
$$\text{Accuracy value}(q,\mathcal{C})=\sum _{\forall i\in D}\frac{1}{\text{pos}\left(i\right)}$$(7) 
Fitness function F6
$$\mathrm{F6}(q,\mathcal{C})=\sum _{\forall i\in D}{k}_{1}\,{\ln}^{-1}\left(\text{pos}\left(i\right)+{k}_{2}\right),$$(8) 
where k_{1} and k_{2} are user-defined parameters.

Fitness function F7
$$\mathrm{F7}(q,\mathcal{C})=\sum _{\forall i\in D}{k}_{3}\,{\log}_{10}\left(\left|I\right|/\text{pos}\left(i\right)\right),$$(9) 
where k_{3} is a user-defined parameter.

Fitness function F8
$$\mathrm{F8}(q,\mathcal{C})=\sum _{\forall i\in D}{k}_{4}^{-1}\left({e}^{-{k}_{5}\ln \left(\text{pos}\left(i\right)\right)+{k}_{6}}-{k}_{7}\right),$$(10) 
where k_{4}, k_{5}, k_{6} and k_{7} are user-defined parameters.

Fitness function F9
$$\mathrm{F9}(q,\mathcal{C})=\sum _{\forall i\in D}{k}_{8}{{k}_{9}}^{\text{pos}\left(i\right)},$$(11) 
where k_{8} and k_{9} are user-defined parameters.

Fitness function F10
$$\mathrm{F10}(q,\mathcal{C})=\frac{\sum _{\forall i\in D}\left(\frac{\sum _{j=1}^{\text{pos}\left(i\right)}r\left(\arg \,ii:\text{pos}\left(ii\right)==j\right)}{\text{pos}\left(i\right)}\right)}{\left|D\right|},$$(12) 
where r(arg ii : pos(ii) == j) returns 1 if the image ii in the jth position of the ranking is relevant; otherwise, it returns 0.
Fitness functions F3 and F4 were used in [17] for the learning of weights, which were structured according to the vector space model, in the context of textual information retrieval. The fitness function F5 was proposed in [18], and the functions F6 to F10 are used in [13] and [5] for GP-based ranking function discovery to improve textual information retrieval and CBIR tasks, respectively.
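As an illustration, F3, F4, F5 and F10 can be sketched directly from Equations 4 to 7 and 12. The argument `pos` is a hypothetical dictionary mapping each image to its 1-based rank, `relevant` is the set D, and the function names are illustrative. The parameterized functions F6 to F9 follow the same pattern, a sum of a position-dependent utility over the relevant images, and are omitted for brevity.

```python
def f3(pos, relevant, n_images):
    """F3: mean over relevant images of sum_{j=pos(i)}^{|I|} 1/j."""
    return sum(sum(1.0 / j for j in range(pos[i], n_images + 1))
               for i in relevant) / len(relevant)


def f4(pos, relevant, a):
    """F4: geometric decay with the user-defined parameter A >= 2."""
    return sum((1.0 / a) * ((a - 1.0) / a) ** (pos[i] - 1) for i in relevant)


def f5(pos, relevant):
    """F5: sum of reciprocal ranks, normalized by its best possible value."""
    best = sum(1.0 / j for j in range(1, len(relevant) + 1))
    return sum(1.0 / pos[i] for i in relevant) / best


def f10(pos, relevant):
    """F10: average precision over the positions of the relevant images."""
    ranks = sorted(pos[i] for i in relevant)
    # The k-th relevant image (1-based) found at rank p has precision k / p.
    return sum(k / p for k, p in enumerate(ranks, start=1)) / len(relevant)
```

Because the denominator of F5 is the score of an ideal ranking, in which the |D| relevant images occupy the first |D| positions, F5 lies in (0, 1] and needs no user-defined parameter.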
Data sets
We evaluated the REFs for the weighting of features in image retrieval on five public domain image data sets, varying from hundreds to ten thousand images. The image data sets employed are summarized in Table 1.
Results and discussion
Previous studies on rank learning methods [5, 13, 17, 20] show that, in general, UTBREFs lead to more precise information retrieval results than nUTBREFs. Moreover, these studies show that the UTBREFs’ design by itself significantly affects the information retrieval results. In our study, we performed a systematic investigation of REFs for descriptor/region weighting in image retrieval using the successful model WLSPC ± (Equation 1). Considering the comparison of REFs, although our results were in line with those reported in the literature, we found better results with the UTBREF F5, which has not been investigated in other research studies.
As can be seen in Figure 2, the UTBREF F5 was on average more precise than the other REFs when considering low recall rates. For all data sets, the images belonging to the same category as the query image were considered relevant, while the remaining images were considered irrelevant. The result shown in Figure 2 is highly significant, since users largely focus their analysis on the best-ranked items. Therefore, the closer to the top of the ranking the relevant items appear, the better the result. As REFs play a key role in ranking optimization, and given the importance of high precision in top-k ranking for several applications, we believe that the UTBREF F5 could be effectively applied in other research focused on ranking optimization. Moreover, the application of F5 is straightforward since it requires no parameter adjustment. Table 2 shows the area under the precision-recall curves referred to in Figure 2, bounded at 25%, 50% and 75% of recall. One observes that F5 is outperformed by other functions only on the BD10000 data set at 75% of recall, which supports the superiority of F5.
By analyzing the behaviour of the REFs, we find that the superiority of F5 is due to the higher relative importance that it attaches to the top positions of the ranking. In the authors’ view, this corresponds to a near-optimal utility function, because when performing a query the user wants relevant documents in the first positions of the ranking. As an example, consider a hypothetical situation with two rankings of n retrieved images: in the first ranking, there is a relevant image in the first position and another relevant image in the last position, with the other positions occupied by non-relevant images; in the second ranking, there are two relevant images in the second and third positions, with the other retrieved images being non-relevant. In general, from a user’s point of view, having a relevant image in the first position is more important than having the first position occupied by a non-relevant element followed by two relevant images. F5 is in accordance with this behaviour for all values of n. Moreover, F5 is the only function among the REFs analysed that is in accordance with this behaviour for n > 30. Table 3 shows the scores assigned to the hypothetical rankings for n = 31.
Also, in reference to Figure 2, we found that the P&R graphs obtained using UTBREFs (F3–F10) are noticeably different from those obtained using nUTBREFs (F1 and F2). In general, the UTBREFs produced substantially higher precision values than the nUTBREFs (F1 and F2) when considering low recall rates. This is a very important aspect that has not been discussed by other researchers. Utility theory-based evaluation functions enable this sort of result because they allow for the appropriate modeling of the user requirements with regard to ranking quality.
Another important issue observed in the analyses performed is that the global computational time spent when using a proper UTBREF is significantly lower than when using a well-known nUTBREF, such as the R-precision measure. Since all the UTBREFs investigated follow a similar computational procedure, one can choose any one of them when analyzing computational time without loss of generality. We chose the UTBREF F5 and compared it against the nUTBREF F1. We evaluated the number of generations and the computational time spent by the GA during the RF process. As the maximum feasible fitness value is sometimes not achieved by the GA, individuals were allowed to evolve for up to 350 generations. For assessment, we carried out 100 queries in the DB10000 data set by random selection of 10% of the images for each category coming from the Corel1000 data set, and we report the average values obtained. The system was fed back with the first ten relevant images of the initial ranking for both methods. One can see in Table 4 that the RF process was on average 2.8 times faster when using the UTBREF F5 than when using the nUTBREF F1. Also, when using the UTBREF F5, the GA needed on average 3.4 times fewer generations than when using F1. In summary, Table 4 shows that, in spite of the UTBREF being a little more expensive computationally, the GA-based RF process needed a significantly smaller number of generations to obtain better results than when using the nUTBREF F1. All experiments were executed on a Windows 7 64-bit OS using an Intel Core 2 Duo 2.2 GHz processor with 4 GB RAM. The prototype was implemented in ANSI C.
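The hypothetical example can be checked numerically. The sketch below scores both rankings for n = 31 with F5 (Equations 6 and 7) and, for contrast, with F10 (Equation 12); a ranking is described only by the positions of its relevant images, and the helper names are illustrative. It does not reproduce the values of Table 3.

```python
def f5_score(relevant_positions):
    """F5 for a ranking described only by the ranks of its relevant images."""
    best = sum(1.0 / j for j in range(1, len(relevant_positions) + 1))
    return sum(1.0 / p for p in relevant_positions) / best


def average_precision(relevant_positions):
    """F10 (average precision) for the same description of a ranking."""
    ranks = sorted(relevant_positions)
    return sum(k / p for k, p in enumerate(ranks, start=1)) / len(ranks)


n = 31
first_ranking = [1, n]   # relevant images in the first and last positions
second_ranking = [2, 3]  # relevant images in the second and third positions

# F5 prefers the first ranking: (1 + 1/31) / 1.5 versus (1/2 + 1/3) / 1.5.
print(f5_score(first_ranking) > f5_score(second_ranking))

# F10 prefers the second ranking: (1 + 2/31) / 2 versus (1/2 + 2/3) / 2.
print(average_precision(first_ranking) > average_precision(second_ranking))
```

For F5 the comparison reduces to 1 + 1/n > 5/6, which holds for every n, whereas for F10 the first ranking wins only while 2/n > 1/6, that is, for n < 12.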
We also found, for all data sets, that the GA-based RF technique produced P&R graph results far superior to those of a similar RF technique employing multistart (MS) search instead of GA search. This result is shown in Figure 3. The number of random solutions of the MS search was set to the same number of fitness evaluations performed by the GA in all the comparative experiments carried out, i.e. S_{ p }(1+p_{ c }(n_{ G } − 1)), where S_{ p } is the population size, p_{ c } is the crossover rate and n_{ G } is the number of generations of the GA search. For both search techniques, GA and MS search, the fitness function F5 was employed as the evaluation criterion. MS search may be naturally compared with GA search, since both employ random mechanisms. This result shows the strength of GA for this sort of optimization.
Finally, we studied the computational complexity of the RF technique and found that it is linear in the number of images in the data set. We analyzed the number of similarity operations (Equation 1) computed by the fitness function during the evolutionary process, as the similarity computation is the most expensive operation in the RF process.
In Algorithm 1, step 1 has complexity O(1), as it does not depend on the number of images in the data set. In step 2, the fitness score for each individual is computed employing Algorithm 2. Analyzing Algorithm 2, it is easy to see that the image similarity operation (step 2) takes time O(n), where n is the number of images in the data set. Step 3 is O(n log n), the time for sorting the similarity values of n images. However, the image similarity operation takes significantly more computational time than the value comparisons and exchanges of sorting algorithms, even for image data sets considered extremely large today (containing several million or more elements). Thus, we consider as the main operation of Algorithm 2, i.e. the time unit, the number of operations performed by the similarity query process, which grows as O(n).
Returning to Algorithm 1, each of the steps 3 to 7 has complexity O(1) for the same reason as step 1. In summary, as the fitness function is applied a constant number of times, depending on the population size, number of generations and crossover rate, the GA-based RF algorithm is O(1)·O(n) = O(n), i.e., linear. It is important to remember that the constant term O(1) can be significantly high, depending on the GA parameters. However, the fitness operations can be performed in parallel in each GA generation.
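The evaluation budget and the MS baseline can be sketched as follows, reading the count above as S_p(1 + p_c(n_G − 1)). The GA settings in the example are hypothetical, chosen only to illustrate the formula, and drawing each weight uniformly from [−1, 1] is an assumption about how the random solutions are generated.

```python
import random


def ga_fitness_evaluations(pop_size, crossover_rate, generations):
    """Number of fitness evaluations S_p * (1 + p_c * (n_G - 1))."""
    return pop_size * (1 + crossover_rate * (generations - 1))


def multistart_search(fitness, n_weights, budget, rng):
    """Evaluate `budget` random weight vectors in [-1, 1] and keep the best."""
    best_weights, best_score = None, float("-inf")
    for _ in range(budget):
        weights = [rng.uniform(-1.0, 1.0) for _ in range(n_weights)]
        score = fitness(weights)
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score


# Hypothetical GA settings (not taken from the experiments in the text).
budget = round(ga_fitness_evaluations(pop_size=50, crossover_rate=0.8,
                                      generations=101))
```

Giving both searches the same number of fitness evaluations makes the comparison depend on how each method chooses candidate solutions rather than on how many it is allowed to test.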
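The argument above can be condensed into a single count. As a sketch, let T(n) denote the number of similarity operations (a symbol introduced here only for illustration), and assume one pass over the n images per fitness evaluation and the evaluation count S_{ p }(1+p_{ c }(n_{ G } − 1)) quoted for the GA search:

$$T(n)=\underbrace{{S}_{p}\left(1+{p}_{c}\left({n}_{G}-1\right)\right)}_{\text{independent of }n}\cdot n=O(n).$$

The first factor depends only on the GA parameters, which is why it can be large in practice without changing the linear growth in n.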
Conclusions
As is known from many research studies, the objective function plays a crucial role in ranking optimization. In this study, we present an up-to-date investigation of ranking evaluation functions (REFs), a special class of objective function employed in rank learning methods aimed at providing precise information retrieval. Using a GA-based RF method as a rank learning mechanism for image retrieval, we analyzed ten REFs, which include the most successful REFs from previous comparative studies together with some functions not previously investigated in such comparisons.
We performed an analysis of precision-recall curves on five real-world image data sets. Although our results were in line with those reported in the literature, showing that the REF design has a decisive role in rank learning, we found that the UTBREF named here F5, which is not included in previous studies that compared REFs, provided better results than the recommended REFs. Additionally, the computation of F5 does not require any parameter, unlike previously recommended REFs. Also, we found that UTBREF is the most appropriate class of REF for top-ranking optimization. Another important issue noticed is that the time spent in the ranking optimization process when using a proper UTBREF, such as F5, is significantly lower than when using a well-known nUTBREF, such as the R-precision measure. To show the strength of GA search for the optimization task, we compared it with multistart (MS) search and found that GA significantly outperformed MS. This result shows that GA search is effective for learning weights through RF aiming at optimizing image retrieval results.
Our results add to those from the literature, providing a categorization and a systematic analysis of REFs and confirming that the REF design plays a key role in rank learning. To the best of our knowledge, this is the first study carried out to investigate the importance of REFs in feature weighting for CBIR tasks.
As REFs play a key role in many ranking optimization tasks, our results indicate that REF F5 could be effectively applied in other contexts and applications focused on ranking optimization, such as recommender systems: the idea here is to provide recommendations sorted according to their expected utility, such as user rating and/or similarity according to the user’s interests. Also, we put together and compared a broad set of REFs that can be used for future research in the ranking optimization field.
References
 1.
Adomavicius G: Improving aggregate recommendation diversity using ranking-based techniques. IEEE Trans Knowl Data Eng 2012, 24(5):896–911.
 2.
Liu TY: Learning to rank for information retrieval. Foundations Trends Inf Retrieval 2009, 3(3):225–231.
 3.
Pedronette D, Torres R: Exploiting contextual information for image re-ranking and rank aggregation. Int J Multimedia Inf Retrieval 2012, 1: 1–14. 10.1007/s1373501200091
 4.
Qin T, Liu TY, Xu J, Li H: LETOR: a benchmark collection for research on learning to rank for information retrieval. Inf Retrieval 2010, 13(4):346–374. 10.1007/s107910099123y
 5.
Torres RS, Falcão AX, Gonçalves MA, Papa JP, Zang B, Fan W, Fox EA: A genetic programming framework for content-based image retrieval. Pattern Recognit 2009, 42(2):283–292. 10.1016/j.patcog.2008.04.010
 6.
Vargas S, Castells P: Rank and relevance in novelty and diversity metrics for recommender systems. In Proceedings of the fifth ACM conference on recommender systems. Chicago, 23–27 October 2011; 2011:109–116.
 7.
Ah-Pine J: On data fusion in information retrieval using different aggregation operators. Web Intell Agent Syst 2011, 9: 43–55.
 8.
Ailon N: Aggregation of partial rankings, p-ratings and top-m lists. Algorithmica 2008, 57(2):284–300.
 9.
Lin S: Rank aggregation methods. Wiley Interdiscip Rev: Comput Stat 2010, 2(5):555–570. 10.1002/wics.111
 10.
Nuray R, Can F: Automatic ranking of information retrieval systems using data fusion. Inf Process Manag 2006, 42(3):595–614. 10.1016/j.ipm.2005.03.023
 11.
Drosou M, Pitoura E: Search result diversification. ACM SIGMOD Rec 2010, 39(1):41–47. 10.1145/1860702.1860709
 12.
Santos R, Macdonald C, Ounis I: Exploiting query reformulations for web search result diversification. Proceedings of the 19th international conference on World Wide Web, WWW ’10, Raleigh, 26–30 April 2010 2010, 881–890.
 13.
Fan W, Fox EA, Pathak P, Wu H: The effects of fitness functions on genetic programming-based ranking discovery for web search. J Am Soc Inf Sci Technol 2004, 55(7):628–636. 10.1002/asi.20009
 14.
Ferreira C, Santos J, Torres RS, Gonçalves M, Rezende R, Fan W: Relevance feedback based on genetic programming for image retrieval. Pattern Recognit Lett 2011, 32(1):27–37. 10.1016/j.patrec.2010.05.015
 15.
López-Pujalte C, Guerrero-Bote VP, De Moya-Anegón F: Genetic algorithms in relevance feedback: a second test and new contributions. Inf Process Manag 2003, 39(5):669–687. 10.1016/S03064573(02)000444
 16.
López-Pujalte C, Guerrero Bote VP, Moya-Anegón F: A test of genetic algorithms in relevance feedback. Inf Process Manag 2002, 38(6):793–805. 10.1016/S03064573(01)000619
 17.
López-Pujalte C, Guerrero-Bote VP, Moya-Anegón F: Order-based fitness functions for genetic algorithms applied to relevance feedback. J Am Soc Inf Sci 2003, 54(2):152–160. 10.1002/asi.10179
 18.
Silva SF, Barcelos CAZ, Batista MA: Adaptive image retrieval through the use of a genetic algorithm. Proceedings of IEEE international conference on tools with artificial intelligence (ICTAI), Patras, 29–31 October 2007 2007, 557–564.
 19.
Stejić Z, Takama Y, Hirota K: Genetic algorithms for a family of image similarity models incorporated in the relevance feedback mechanism. Appl Soft Comput 2003, 2: 306–327. 10.1016/S15684946(02)000704
 20.
Fan W, Pathak P, Zhou M: Geneticbased approaches in ranking function discovery and optimization in information retrieval—a framework. Decis Support Syst 2009, 47: 398–407. 10.1016/j.dss.2009.04.005
 21.
Massachusetts Institute of Technology Media Laboratory: Vistex database. 2005. http://vismod.media.mit.edu/pub/VisTex/ . Last accessed on 06 Feb 2014
 22.
James Z. Wang’s Research Group. Corel database: Corel Corporation, Corel Gallery 3.0. 2004. http://wang.ist.psu.edu/~jwang/test1.tar . Last accessed on 06 Feb 2014
 23.
Vision Lab. in Computer Science Department: 13 scene categories database. 2004. http://vision.stanford.edu/Datasets/SceneClass13.rar . Last accessed on 06 Feb 2014
 24.
Fei-Fei L, Fergus R, Perona P: Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories. IEEE CVPR 2004 workshop on generative-model based vision (IEEE, Piscataway, 2004) 2004.
Acknowledgements
We thank CNPq, CAPES and FAPESP for the financial support.
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
All authors have contributed to the different methodological and experimental aspects of the research. All authors read and approved the final manuscript.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
da Silva, S.F., Avalhais, L.P., Batista, M.A. et al. Findings on ranking evaluation functions for feature weighting in image retrieval. J Braz Comput Soc 20, 7 (2014). https://doi.org/10.1186/16784804207
Keywords
 Rank learning
 Ranking evaluation functions
 Content-based image retrieval
 Genetic algorithms