HƯỚNG DẪN DOWNLOAD TÀI LIỆU

Bước 1:Tại trang tài liệu slideshare.vn bạn muốn tải, click vào nút Download màu xanh lá cây ở phía trên.

Bước 2: Tại liên kết tải về, bạn chọn liên kết để tải File về máy tính. Tại đây sẽ có lựa chọn tải File được lưu trên slideshare.vn

Bước 3: Một thông báo xuất hiện ở phía cuối trình duyệt, hỏi bạn muốn lưu . - Nếu click vào Save, file sẽ được lưu về máy (Quá trình tải file nhanh hay chậm phụ thuộc vào đường truyền internet, dung lượng file bạn muốn tải)

Có nhiều phần mềm hỗ trợ việc download file về máy tính với tốc độ tải file nhanh như: Internet Download Manager (IDM), Free Download Manager, ... Tùy vào sở thích của từng người mà người dùng chọn lựa phần mềm hỗ trợ download cho máy tính của mình

VNU Journal of Science: Comp. Science & Com. Eng. Vol. 30, No. 4 (2014) 29–43
Toward an Interactive Method for DMEA-II and Application to the Spam-Email Detection System
Long Nguyen1, Lam Thu Bui1, Anh Quang Tran2
1 Le Quy Don Technical University, Vietnam 2 Hanoi University, Vietnam
Abstract
Multi-Objective Evolutionary Algorithms (MOEAs) have shown a great potential in dealing with many real-world optimizationproblems. Therehasbeenapopulartrendingettingsuitablesolutionsandincreasingtheconvergence ofMOEAsbyconsiderationofDecisionMakers(DMs)duringtheoptimizationprocess(inotherwordsinteracting with DM). Activities of DM includes checking, analyzing the results and giving the preference. In this paper, we propose an interactive method for DMEA-II and apply it to a spam-email detection system. In DMEA-II, an explicit niching operator is used with a set of rays which divides the space evenly for the selection of non-dominated solutions to ﬁll the solution archive and the population of the next generation. We found that, with DMEA-II solutions will eectively converge to Pareto optimal sets under the guidance of the ray system. By this reason, we propose an interactive method using three Ray based approaches: 1) Rays Replacement: The furthest rays from DM’s preferred region are replaced by new rays that generated from set of reference points. 2) Rays Redistribution: Which redistribute the system of rays to be in DM’s preferred region. 3) Value Added Niching: Based on the distances from non-dominated solutions in archive to DM’s preferred region, the niching values for the solutions is increased to be priority selected. By those approaches for the proposal interactive method, the next generation will be guided toward the DM’s preferred region. We carried out a case study on several popular test problems and it obtained good results. We apply the proposed method for a real application in a spam-email detection system. With this system, a set of feasible trade-o solutions will be oered for choosing scores and thresholds of the ﬁlter rules.
2014 Published by VNU Journal of Science.
Manuscript communication: received 01 April 2014, accepted 08 April 2014 Corresponding author: Long Nguyen, longit76@gmail.com
Keywords: Interactive, DMEA-II, Improvement Direction, Spread Direction, Convergence Direction.
1. Introduction
Methods for multi-objective optimization can be classiﬁed into several classes including the Interactive method. With the interactive method, DM iteratively directs the search process by indicating his/her preference information over the set of solutions until DM satisﬁes or prefers to stop the process [1]. An interesting
preference. To date, many interactive techniques have been proposed for solving MOPs [2, 3, 4, 5, 6, 7, 8, 9, 10]. It is worthwhile to note that the aim of interactive methods is to ﬁnd the most suitablesolutionsinseveralconﬂictingobjectives regarding the DM’s preference. It requires a mechanism to support DM in formulating his/her preferences and identifying preferred solutions in the set of Pareto optimal solutions.
feature of interactive methods is that during the In this paper, we introduce an interactive optimization process DM is able to learn about method for DMEA-II [11], a direction-based the underlying problem as well as his/her own multi-objective evolutionary algorithm. With
30 L. Nguyen et al. / VNU Journal of Science: Comp. Science & Com. Eng. Vol. 30, No. 4 (2014) 29–43
this proposal, we allow DM to specify a set of referencepointsrepresentingtheareaofinterests. Based on those reference points we propose three approaches to be used in the proposal
2. Reference-point interactive approaches
2.1. Concepts
In this section we summarize the reference
interactive method. The ﬁrst approach, the point interactive method, which is the most rays are generated from the reference points and popular one in the literature. It is suggested in
paralleled with the central line which starts from the ideal point to the centre of the hyperquadrant
[12]; and this method is known as a classical reference point approach. The idea is to control
containing POFs. In the second approach, the the search by reference points using achievement system of rays is redistributed to be in DM’s functions. Here the achievement function is preferred region. At the third approach, based constructed in such a way that if the reference
on the distances from non-dominated solutions in archive to DM’s preferred region, the niching values for the solutions is increased to be priority selected. Bytheproposalinteractivemethod,DM has more ﬂexibility to express his/her preference and the population will converge to preferred region. This is implemented via the niching mechanism in DMEA-II. If DM is not satisﬁed, he/she can specify other reference points. In our experiments, several test cases on well-known benchmark sets were carried out to demonstrate the method.
In applying the proposed method for a real application, we implemented it in a spam-email detection system (we call it as an interactive anti-
spam system). With this system, a set of feasible
point is dominated, the optimization will advance past the reference point to a non-dominated solution. A reference point z is given for an M-objective optimization problem of minimizing (f1(x);:::; fk(x))with x 2 S. Thensolveasingle-objective optimization problem as follows:
minmaxi=1[wi(fi(x) z)]
subjectto x 2 S. Thecommonstep-wisestructure
trade-o solutions are oered for choosing scores and thresholds. The two objectives for
considerationaretheSpamDetectionRate(SDR) and False Alarm Rate (FAR). For this multi-objective problem, DM has interaction with the optimization process in order to control the population converging toward his/her preferred areas.
In the remainder of the paper, section II brieﬂy describes the concepts and related works about multi-objective optimization interactive method using reference points. In section III we have a short description for DMEA-II. Section IV we propose our methodology for an interactive with DMEA-II. Section V presents simulation results
on several well-known test problems. The results
Fig. 1. Altering the reference point, Here ZA, ZB are reference points,w is chosen weight vector used for scalarizing the objectives.
of the interactive method as follows:
Step 1: Present information to the DM. Set h=1.
Step 2: Ask the DM to specify a reference point z.
Step 3: Minimize achievement function. Present zh to the DM.
for applying the proposed method for Spam Step 4: Calculate k other solutions with
EmailDetectionSystemareshownonsectionVI. Finally, the conclusion of this paper is outlined in section VII.
reference points.
z(i) = zh + dhei where dh = jjzh zhjj and ei is the ith unit vector.
L. Nguyen et al. / VNU Journal of Science: Comp. Science & Com. Eng. Vol. 30, No. 4 (2014) 29–43 31
Step 5: If the DM can select the ﬁnal preference-based evolutionary approach that
solution,stop. Otherwise,askDMtospecify zh+1. Set h = h+1 and go to Step 3.
Here h is the number that DM speciﬁes a reference point during process. By the way of using the series of reference points, DM actually tries to evaluate the region of Pareto Optimality, instead of one particular Pareto-optimal point. However DM usually deals with two situations:
1. The reference point is feasible and not a Pareto-optimal solution, DM is interested in knowing solutions which are Pareto-optimal ones and near the reference point.
2. DM ﬁnds Pareto-optimal solutions which is near the supplied reference point.
can be used as an integral part of an interactive algorithm. At each iteration, the DM is asked to give preference information in terms of his/her reference point consisting of desirable aspiration levels for objective functions. The information is used in an evolutionary algorithm to generate a new population by combining the ﬁtness function and an achievement scalarizing function. In multi-objective optimization, achievement scalarizing functions are widely used to project a given reference point into the Pareto optimal set. In the proposal method, the next population is thus more concentrated in the area where more preferred alternatives are assumed to lie and the whole Pareto optimal set does not have to be generated with equal accuracy.
2.2. Related interactive MOEAs In papers [9] and [10], two reference
In this section, we summarize several typical works on this area. In [4], authors proposed an interactive MOEA using a concept of the reference point and ﬁnding a set of preferred Pareto optimal solutions near the regions of interest to a DM. The authors suggest two approaches: The ﬁrst is to modify a well-known MOEA called NSGA-II, for eectively solving 10-objective. The other is to use hybrid-MOEA methodology in allowing DM to solve multi-objective optimization problems better and with more conﬁdence.
The authors proposed in [7], a trade-o analysis tool that was used to oer the DM a way to analyze solution candidates. The ideas proposed here are directed to users of both classiﬁcation and reference point based methods. The motivation here is that DM in certain cases miss additional local trade-o information so that they could get to know how values of objectives are changing, in other words, in which directions to direct the solution process so that they could avoid trial-and-error, that is, specifysomepreferenceinformationsothatmore preferred solutions will be generated.
In [1], the idea of incorporating preference information into evolutionary multi-objective optimization is discussed and proposed a
point interactive methods are proposed to use single or multi reference points with multi-objective optimization based on decomposition-based MOEA (MOEA/D). In this method, a single point or a set of reference points are used in objective space to represent for DM’s preferred region. The aggregated point from set of reference points (in case of multi-point) or the reference point is used in optimal process by two ways: replace or combine the current ideal point at the loop.
In paper [13], authors present a multiple reference point approach for multi-objective optimization problems of discrete and combinatorial nature. The reference points can be uniformly distributed within a region that coversthePareto OptimalFront. An evolutionary algorithm is based on an achievement scalarizing function that does not impose any restrictions with respect to the location of the reference points in the objective space. Authors dealt with the design of a parallelization strategy to eciently approximate the Pareto Optimal Front. Multiple reference points were used to uniformly divide the objective space into dierent areas. For each reference point, a set of approximate ecient solutions was found independently, so that the computation was performed in parallel.
32 L. Nguyen et al. / VNU Journal of Science: Comp. Science & Com. Eng. Vol. 30, No. 4 (2014) 29–43
3. DMEA-II
In this section, we summarize DMEA-II with the main ideas [11]. In DMEA-II, osprings are produced by using directions of improvement to perturb randomly-selected parental solutions. Two types of directional information are used to perturb the parental solutions prior to ospring production: convergence and spread (see Fig. 2):
Convergence direction (CD). In general deﬁned as the direction from a solution to a better one, CD in MOP is a normalized vector that points from a dominated solution to non-dominated one.
Spread direction (SD). Generally deﬁned as the direction between two equivalent
emit into a “hyperquadrant” of objective space, i.e. the sub space that is bounded by the k hyperplanes fi = fi;min, i 2 f1;2;:::;kg and described by fi fi;min8i 2 f1;2;:::;kg where fi;min minallA1;A2;::: fi with A1;A2;::: being the solutions stored in the current archive. By theirconstruction, thehyperquadrantcontainsthe estimated POF. A niching operator is used to the main population. From the second generation onward, the population is divided into two equal parts: one part for convergence, and one part for diversity. The ﬁrst part is ﬁlled by non-dominated solutions up to a maximum of n=2 solutions from the combined population, where n is the population size. This ﬁlling task is based
on niching information in the decision space.
solutions, SD in MOP is an unnormalized vector that points from one non-dominated solution to another.
3.2. General structure of algorithm
The step-wise structure of the DMEA-II algorithm [11] as follows:
Step1. Initializethemainpopulation Pwith size n.
Step 2. Evaluate the population P.
Step 3. Copy non-dominated solutions to the archive A.
Step 4. Generate an interim mixed population (M) of the same size n as P
Fig.2. Illustrationofconvergence(blackarrowsinobjective space - top left ﬁgure) and spread (hollow arrows - top right graph in decision variable space). Two types of ray distribution: parallel and non-parallel (bottom right and left graphs).
3.1. Niching information
A characteristic of solution quality in MOP is the even spread of non-dominated solutions across the POF [14]. In DMEA a bundle of rays are used to emit randomly from the estimated ideal point into the part of objective space that contains the POF estimate, (Fig. 2). The number of rays equals the number of non-
– Calculate nCD and nSD – Loop f
Select a random parent Par If (the number of CD < nCD)
Generate a CD and then generate a solution S1 by perturbing Par with CD
Add Add S1 and to M. End if
If (the number of SD < nSD)
Generate a SD and then generate a solution S2 by perturbing Par with SD.
Add Ss and to M.
dominated solutions wanted by the user. Rays End if
L. Nguyen et al. / VNU Journal of Science: Comp. Science & Com. Eng. Vol. 30, No. 4 (2014) 29–43 33
g Until (the mixed population is full).
Step 5. Perform the polynomial mutation operator [14] on the mixed population M with a small rate.
Step 6. Evaluate the mixed population M.
Step 7. Identify the estimated ideal point of the non-dominated solutions in M and determine a system of n rays R (starting from the ideal point and emitting uniformly intothehyperquadrantthatcontainsthenon-dominated solutions of M)
Step 8. Combine the interim mixed population M with the current archive A to formacombinedpopulationC (i.e. M+A ! C).
Step 9: Create new members of the archive A by copying non-dominated solutions from the combined population C
– Set counter i=0 – Loopf
Select a ray R(i).
In C, ﬁnd the non-dominated solution whose distance to R(i) is minimum.
Select this solution and copy it to the archive.
i = i+1
g Until (all rays are scanned)
Step 10: Determine the new population P for the next generation.
– Determine the number m of non-dominated solutions in C.
If m < n=2, select all non-dominated solutions from C and copy to P.
Else,
Sort non-dominated solutions in C according to niching values.
Copy the n=2 solutions with highest niching value to P.
– Repeatedly scan all rays copy maxfn
m;n=2g solutions to P.
Step 11: Go to Step 4 if stopping criterion is not satisﬁed.
In DMEA-II, the selection of non-dominated solutions to ﬁll the archive and the next population is assisted by a ray based technique of explicit niching in the objective space by using a system of straight lines or rays starting from the current estimation of the ideal point and dividing thespaceevenly. Eachrayisinchargeoflocating a non-dominated solution, for that reason, a ray has an important role in the optimization process. Bythisreason,weproposeaninteractive method using three Ray based approaches: Rays Replacement, Rays Redistribution and Value Added Niching approach. The details for the approaches will be described in next section. The proposed interactive MOEA bases on the system of ray is called the Ray based interactive method usingDMEA-II.Inourexperiments,theraysstart from generated points and paralleled with the central line of the top right hypequadrant.
4. Methodology
Due to the conﬂicts among the objectives in MOPs, the total number of Pareto optimal solutions might be very large or even inﬁnite. However, the DM may be only interested in preferred solutions instead of all Pareto optimal solutions. To ﬁnd the preferred solutions, the preference information is needed to guide the search towards the region of the PF of interest to the DM. Based on the role of the DM in the solution process, In an interactive method, the intermediate search results are presented to the DM to investigate; then the DM can understand the problem better and provide more preference
Determine density-based information for guiding the search. In this niching value for all non- paper proposed two guiding techniques used in dominated solutions in C. interactive method with MOEAs.
34 L. Nguyen et al. / VNU Journal of Science: Comp. Science & Com. Eng. Vol. 30, No. 4 (2014) 29–43
4.1. A ray-based interactive method
Thissection,aninteractivemethodforDMEA-
Step 5: Apply a niching to control external population(thearchive)andnextgeneration.
II [11] is introduced. With this proposal, DMs
are allowed to specify a set of reference points. With each reference point, a ray is generated by the similar way to building the system of rays in the original DMEA-II : the rays are generated from control points (which might be the reference points) and paralleled with the central line which starts from the ideal point to centre of the hyperquadrant containing POFs). In this way, DM has more ﬂexibility to express his preference. Among several methods for taking set information, we propose to deﬁne reference points by using three ray-based approaches: 1) Generate new rays and use them to replace some existing rays; 2) Redistribute the system of rays towards DM’s preferred region and 3) Increasing the niching values for non-dominated solutions based on their distance to DM’s preferred region. Those techniques are used to control the population to be convergeed to the DM’s preferred region. We hypothesise that by those techniques we have a good way to express DM’s preferences. After DM has speciﬁed a set of reference points, those techniques are applied and the Pareto optimal solutions are found that best corresponds to preferred region in objective space. If DM is not satisﬁed, he/she can specify
Fig. 3. Illustration of proposed ray based interactive method forDMEAina2-dimMOP.Threereferencepointsaregiven by DM: p1, p2, p3. pc is the central point of DM’s preferred region, there are three new rays (added rays) replace three ones (removed rays).
When DM interactive into the optimal process, we replace Step 7 in DMEA-II (see Section 3) with an interactive function is shown in Algorithm 1.
4.1.2. Rays Redistribution
This approach, the system of rays is oset by new DM’s referred region (see Fig: 4). The approach for interactive method as following
steps:
other reference points.
4.1.1. Rays Replacement
The approach for interactive method are
Step1: AskDMtoinputnp referencepoints whicharetheirpreferredregionsinobjective space.
described as following steps: Step 2: Calculate the boundary of DM’s
Step1: AskDMtoinputnp referencepoints whicharetheirpreferredregionsinobjective space.
Step 2: Generate np rays from reference points which paralleled with the central line.
Step 3: Calculate the central point of DM’s preferred region Pc.
Step 4: Find np rays which are farthest from Pc by np new ones are generated from Step 2.
preferred region DMbd.
Step 3: Oset the control points (that generate the system of rays) by DMbd.
Step 4: Generate a new system of rays by new list of control points.
Step 4: Apply a niching to control external population(thearchive)andnextgeneration.
When DM interactive into the optimal process, the Step 7 in DMEA-II (see Section 3) with an interactive function is shown in Algorithm 2.
L. Nguyen et al. / VNU Journal of Science: Comp. Science & Com. Eng. Vol. 30, No. 4 (2014) 29–43 35
Algorithm 1: Rays Replacement Function.
Input: Number of reference points np Output: New system of rays
for i 0 to n do
(1) Generate a ray ri from reference point pi ( ri through pi and paralleled with the
central line (see Fig. 2). Fig. 4. Illustration of proposed ray based interactive method forDMEAina2-dimMOP.Threereferencepointsaregiven
(2) Make a boundary of reference points by DM: p1, p2, p3. bThe system of rays is oset by DM’s (DM’s preferred region) and ﬁnd the central
point pc.
for j 0 to n (The number of rays) do
(3) Calculate the Euclid distance from ray(j) to pc.
(4) Sort the index of rays in decrease of Euclid distance values in (3) (Using the QuickSort).
Algorithm 2: Rays Redistribution Function.
Input: Number of reference points np Output: New system of rays
(1) Make a boundary of reference points (DM’s preferred region) DMbd.
(2) Calculate the ratio between DMbd and current boundary of the hyperquadrant which contains the POF r.
(5) Replace top np rays in the Sorted ray indexes with np ray from (1).
for j 0 to n (The number of control points) do
return n rays.;
4.1.3. Value Added Niching
In DMEA-II, the archive is used to store non-dominated solutions during evolutionary process, those solutions are calculated the distance to DM’s preferred region. These values are kept and add to niching values after calculation of niching values at Step 10 (see Section 3). The approach for interactive method as following steps:
Step1: AskDMtoinputnp referencepoints whicharetheirpreferredregionsinobjective space.
(3) Oset current control point with ratio r.
(4) Generate a new system of rays by the new list of control points.
return n rays.;
Step 4: Normalize the values of lv to be in [0,0.5].
Step 5: Adding values in lv after calculate the niching values in Step 10.
Step 6: Apply a niching (with additional values) to control external population (the archive) and next generation.
Step 2: Calculate the central point of DM’s When DM interactive into the optimal preferred region Pc. process, we replace Step 7 in DMEA-II (see Section 3) with an interactive function is shown
Step 3: Calculate the distance of each in Algorithm 3. Then the list is created above is
solution in archive to Pc and store these values to a list lv.
used to add to niching values in Step 10 during generations.
36 L. Nguyen et al. / VNU Journal of Science: Comp. Science & Com. Eng. Vol. 30, No. 4 (2014) 29–43
Algorithm 3: Value Added Niching Function.
ZDT3: It has a Pareto-optimal front disconnected and convex:
Input: Number of reference points np Output: A list of values in [0,0.5]
(1) Make a boundary of reference points (DM’s preferred region) and ﬁnd the central point pc.
for j 0 to popsize (The archive’s size) do
(2) Calculate the Euclid distance from solution(j) to pc.
(3) Normalize the distances