Issue No. 05 - May (2013 vol. 35)
Jianhui Chen, GE Global Res., San Ramon, CA, USA
Lei Tang, Walmart Labs., San Bruno, CA, USA
Jun Liu, Siemens Corp. Res., Princeton, NJ, USA
Jieping Ye, Dept. of Comput. Sci. & Eng., Arizona State Univ., Tempe, AZ, USA
In this paper, we consider the problem of learning from multiple related tasks for improved generalization performance by extracting their shared structures. The alternating structure optimization (ASO) algorithm, which couples all tasks using a shared feature representation, has been successfully applied in various multitask learning problems. However, ASO is nonconvex and the alternating algorithm only finds a local solution. We first present an improved ASO formulation (iASO) for multitask learning based on a new regularizer. We then convert iASO, a nonconvex formulation, into a relaxed convex one (rASO). Interestingly, our theoretical analysis reveals that rASO finds a globally optimal solution to its nonconvex counterpart iASO under certain conditions. rASO can be equivalently reformulated as a semidefinite program (SDP), which is, however, not scalable to large datasets. We propose to employ the block coordinate descent (BCD) method and the accelerated projected gradient (APG) algorithm separately to find the globally optimal solution to rASO; we also develop efficient algorithms for solving the key subproblems involved in BCD and APG. The experiments on the Yahoo webpages datasets and the Drosophila gene expression pattern images datasets demonstrate the effectiveness and efficiency of the proposed algorithms and confirm our theoretical analysis.
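As a rough illustration of the accelerated projected gradient (APG) idea named in the abstract, the following minimal sketch applies a standard Nesterov-style scheme to a toy box-constrained least-squares problem. It is not the authors' rASO solver: the objective, the box constraint, the step size, and the helper names `project_box`, `grad_least_squares`, and `apg` are all assumptions made for illustration, and the projection onto the rASO feasible set (one of the key subproblems the paper addresses) is not reproduced here.

```python
def project_box(x, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n (illustrative stand-in constraint)."""
    return [min(max(v, lo), hi) for v in x]


def grad_least_squares(A, b, x):
    """Gradient of 0.5 * ||A x - b||^2, i.e. A^T (A x - b)."""
    residual = [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]
    return [sum(A[i][j] * residual[i] for i in range(len(A))) for j in range(len(x))]


def apg(grad, project, x0, step, iters=200):
    """Accelerated projected gradient with the usual t_k momentum sequence.

    grad    -- callable returning the gradient at a point
    project -- callable projecting a point onto the feasible set
    step    -- fixed step size, assumed to be at most 1/L for an L-Lipschitz gradient
    """
    x = list(x0)   # current iterate
    y = list(x0)   # extrapolated search point
    t = 1.0
    for _ in range(iters):
        g = grad(y)
        # Gradient step from the search point, then projection onto the feasible set.
        x_new = project([yi - step * gi for yi, gi in zip(y, g)])
        # Momentum update.
        t_new = (1.0 + (1.0 + 4.0 * t * t) ** 0.5) / 2.0
        beta = (t - 1.0) / t_new
        y = [xn + beta * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x


# Toy problem (hypothetical data): minimize 0.5 * ||A x - b||^2 over [0, 1]^2.
A = [[2.0, 0.0], [0.0, 1.0]]
b = [1.0, 3.0]
# The largest eigenvalue of A^T A is 4, so step = 1/4 satisfies the step-size assumption.
solution = apg(lambda v: grad_least_squares(A, b, v), project_box, [0.0, 0.0], step=0.25)
```

In this toy problem the second coordinate of the unconstrained minimizer lies outside the box, so the projection step is what keeps the iterates feasible. Any other convex feasible set could be substituted by passing a different `project` function.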
Optimization, Algorithm design and analysis, Vectors, Fasteners, Complexity theory, Prediction algorithms, Acceleration, accelerated projected gradient, Multitask learning, shared predictive structure, alternating structure optimization
Jun Liu, Jieping Ye, Jianhui Chen and Lei Tang, "A Convex Formulation for Learning a Shared Predictive Structure from Multiple Tasks," in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 5, pp. 1025-1038, 2013.