
Ongoing research

A Membership Probability Based Undersampling Algorithm for Imbalanced Data

Classifiers trained on highly imbalanced datasets tend to be biased toward the majority class; as a result, minority class samples are often misclassified as majority class. An undersampling technique that removes some majority samples is one way to overcome this. We propose a simple and efficient undersampling method for imbalanced datasets and show, through several illustrative experiments, that it outperforms other methods on four performance measures, especially on highly imbalanced datasets.
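
To make the idea of undersampling concrete, here is a minimal sketch of plain random undersampling. This is a common baseline, not the membership-probability method summarized above, whose selection rule is not given on this page. The function name `random_undersample` and its parameters are assumptions chosen for illustration.

```python
import random


def random_undersample(samples, labels, majority_label, n_keep, seed=0):
    """Keep every non-majority sample and a random subset of majority samples.

    Baseline illustration only: the majority samples to drop are chosen
    uniformly at random, not by any membership probability.
    """
    rng = random.Random(seed)
    majority_idx = [i for i, lab in enumerate(labels) if lab == majority_label]
    minority_idx = [i for i, lab in enumerate(labels) if lab != majority_label]

    # Never ask for more majority samples than exist.
    n_keep = min(n_keep, len(majority_idx))
    kept = set(rng.sample(majority_idx, n_keep)) | set(minority_idx)

    # Preserve the original ordering of the retained samples.
    keep = sorted(kept)
    return [samples[i] for i in keep], [labels[i] for i in keep]


# Toy data: 95 majority samples (label 0) and 5 minority samples (label 1).
samples = list(range(100))
labels = [0] * 95 + [1] * 5
balanced_samples, balanced_labels = random_undersample(
    samples, labels, majority_label=0, n_keep=5
)
```

An informed method would replace the uniform `rng.sample` call with a rule that decides which majority samples are safest to discard.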

Feature Selection Algorithm to Find Potential Feature Subset for Binary Text Classification

Most feature selection algorithms compose a feature subset by selecting individually good features. They therefore ignore a feature that is not individually good, even though it can effectively improve classification performance when combined with other features. In this paper, such a feature is called a potential feature, and a subset that leads to good classification performance even though some of its elements are not individually good is called a potential feature subset. We propose a method for calculating the class-relevance of two or more binary features in order to find potential features, and develop a potential feature selection algorithm (PFSA) for binary text classification. We show experimentally that the proposed algorithm outperforms classical and state-of-the-art feature selection algorithms in terms of average accuracy.
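
The following toy example illustrates why scoring features one at a time can miss a potential feature. It uses empirical mutual information as a stand-in relevance score; the class-relevance measure actually used by PFSA is not specified on this page, so the function `mutual_information` and the XOR data are assumptions for illustration only.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


# Toy corpus: the class is the XOR of two binary term-presence features,
# so neither feature is informative on its own.
f1 = [0, 0, 1, 1] * 5
f2 = [0, 1, 0, 1] * 5
cls = [a ^ b for a, b in zip(f1, f2)]

mi_f1 = mutual_information(f1, cls)                   # relevance of f1 alone
mi_f2 = mutual_information(f2, cls)                   # relevance of f2 alone
mi_pair = mutual_information(list(zip(f1, f2)), cls)  # relevance of the pair
```

On this data each feature alone scores 0 bits, while the pair scores 1 bit and determines the class exactly. A selector that ranks features individually would discard both.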

Task Allocation Algorithm Considering Collaboration Potential in Cloud Manufacturing

Cloud manufacturing (CM) provides a manufacturing platform in which enterprises share their resources and cooperate with each other to handle customer requests. Allocating each task to a proper group of enterprises is a challenging problem that several studies have addressed. However, no previous research has considered the collaboration potential among enterprises in allocation or scheduling problems in CM, even though collaboration is the main characteristic of CM. This paper investigates the scheduling problem considering collaboration among enterprises that hold different manufacturing resources such as design, production, and logistics. A mathematical model for the problem is then established. Its objective is to maximize the sum of expected customer ratings for tasks, subject to two constraints: no enterprise can use its manufacturing resource beyond its capacity (capacity constraint), and each task must be completed before its due date (due date constraint). Finally, a heuristic algorithm for solving the problem, called the SCP algorithm, is developed and its application is demonstrated.
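
The objective and the two constraints described above can be sketched on a tiny instance. The code below is not the SCP algorithm, whose steps are not given on this page. It is an exhaustive search over a hypothetical data layout, in which each task has a `demand` and a `due` date, and each candidate option lists the `enterprises` involved, a `finish` time, and an expected `rating`. All names and numbers are assumptions for illustration.

```python
from itertools import product


def total_rating(assignment, tasks, capacity):
    """Sum of expected ratings, or None if a constraint is violated."""
    used = {}
    total = 0.0
    for task_id, option in assignment.items():
        task = tasks[task_id]
        # Due date constraint: the task must finish by its due date.
        if option["finish"] > task["due"]:
            return None
        # Capacity constraint: each enterprise's total load stays within capacity.
        for enterprise in option["enterprises"]:
            used[enterprise] = used.get(enterprise, 0) + task["demand"]
            if used[enterprise] > capacity[enterprise]:
                return None
        total += option["rating"]
    return total


def best_assignment(tasks, options, capacity):
    """Try every combination of options (feasible only for tiny instances)."""
    task_ids = list(tasks)
    best, best_value = None, None
    for combo in product(*(options[t] for t in task_ids)):
        assignment = dict(zip(task_ids, combo))
        value = total_rating(assignment, tasks, capacity)
        if value is not None and (best_value is None or value > best_value):
            best, best_value = assignment, value
    return best, best_value


# Hypothetical instance: two tasks, three enterprises.
tasks = {"t1": {"demand": 2, "due": 10}, "t2": {"demand": 2, "due": 8}}
capacity = {"design_A": 2, "prod_B": 4, "prod_C": 2}
options = {
    "t1": [
        {"enterprises": ("design_A", "prod_B"), "finish": 9, "rating": 4.5},
        {"enterprises": ("prod_C",), "finish": 7, "rating": 3.0},
    ],
    "t2": [
        {"enterprises": ("design_A", "prod_B"), "finish": 6, "rating": 4.0},
        {"enterprises": ("prod_B",), "finish": 9, "rating": 5.0},
        {"enterprises": ("prod_B", "prod_C"), "finish": 8, "rating": 3.5},
    ],
}
best, best_value = best_assignment(tasks, options, capacity)
```

The number of combinations grows exponentially with the number of tasks, which is the usual motivation for a heuristic on larger instances.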
