


Run Gaussian process model - wrapper for gpml toolbox
FORMAT output = prt_machine_gpml(d,args)
Inputs:
d - structure with data information, with mandatory fields:
.train - training data (cell array of matrices of row vectors,
each [Ntr x D]). Each matrix contains one representation
of the data. This is useful for approaches such as
multiple kernel learning.
.test - testing data (cell array of matrices of row vectors, each
[Nte x D])
.testcov - testing covariance (cell array of matrices of row vectors,
each [Nte x Nte])
.tr_targets - training labels (for classification) or values (for
regression) (column vector, [Ntr x 1])
.use_kernel - flag, is data in form of kernel matrices (true) or in
form of features (false)
args - argument string, where
-h - optimise hyperparameters (otherwise don't)
-f iter - max # iterations for optimiser (ignored if -h not set)
-l likfun - likelihood function:
'likErf' - erf/probit likelihood (binary only)
-c covfun - covariance function:
'covLINkcell' - simple dot product
'covLINglm' - construct a GLM
-m meanfun - mean function:
'meanConstcell' - suitable for dot product
'meanConstglm' - suitable for GLM
-i inffun - inference function:
'prt_infEP' - Expectation Propagation
experimental args (use at your own risk):
-p - use priors for the hyperparameters. If specified, this
indicates that a maximum a posteriori (MAP) approach
will be used to set covariance function
hyperparameters. The priors are obtained by calling
prt_gp_priors('covFuncName')
N.B.: for the arguments specifying functions, pass in a string, not
a function handle. This script will generate the function handle.
Output:
output - output of machine (struct).
* Mandatory fields:
.predictions - predictions of classification or regression [Nte x D]
* Optional fields:
.type - which type of machine this is (here, 'classifier')
.func_val - predictive probabilities
.mu - test latent means
.s2 - test latent variances
.loghyper - log hyperparameters
.nlml - negative log marginal likelihood
.alpha - GP weighting coefficients
.sW - likelihood matrix (see Rasmussen & Williams, 2006)
.L - Cholesky factor
__________________________________________________________________________
Copyright (C) 2011 Machine Learning & Neuroimaging Laboratory
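Putting the argument documentation above together, a call might look as follows. This is a hedged usage sketch only: `Ktr`, `Kte`, `Ktete` and `tr_labels` are hypothetical variables standing in for precomputed kernel matrices and labels, and the argument string simply combines the documented flags.

```matlab
% Hypothetical usage sketch (variable contents are illustrative only)
d.train      = {Ktr};            % training kernel, [Ntr x Ntr]
d.test       = {Kte};            % test-vs-train kernel, [Nte x Ntr]
d.testcov    = {Ktete};          % test covariance, [Nte x Nte]
d.tr_targets = tr_labels;        % [Ntr x 1], coded 1/2 for classification
d.use_kernel = true;             % this machine requires kernel data
d.pred_type  = 'classification';

% optimise hyperparameters (-h) for up to 100 iterations (-f),
% with the documented likelihood, covariance and inference functions
args   = '-h -f 100 -l likErf -c covLINkcell -i prt_infEP';
output = prt_machine_gpml(d, args);
```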

function output = prt_machine_gpml(d,args)
% Run Gaussian process model - wrapper for gpml toolbox
% FORMAT output = prt_machine_gpml(d,args)
% Inputs:
%   d - structure with data information, with mandatory fields:
%     .train      - training data (cell array of matrices of row vectors,
%                   each [Ntr x D]). Each matrix contains one representation
%                   of the data. This is useful for approaches such as
%                   multiple kernel learning.
%     .test       - testing data (cell array of matrices of row vectors,
%                   each [Nte x D])
%     .testcov    - testing covariance (cell array of matrices of row
%                   vectors, each [Nte x Nte])
%     .tr_targets - training labels (for classification) or values (for
%                   regression) (column vector, [Ntr x 1])
%     .use_kernel - flag, is data in form of kernel matrices (true) or in
%                   form of features (false)
%   args - argument string, where:
%     -h         - optimise hyperparameters (otherwise don't)
%     -f iter    - max # iterations for optimiser (ignored if -h not set)
%     -l likfun  - likelihood function:
%                  'likErf' - erf/probit likelihood (binary only)
%     -c covfun  - covariance function:
%                  'covLINkcell' - simple dot product
%                  'covLINglm'   - construct a GLM
%     -m meanfun - mean function:
%                  'meanConstcell' - suitable for dot product
%                  'meanConstglm'  - suitable for GLM
%     -i inffun  - inference function:
%                  'prt_infEP' - Expectation Propagation
%   experimental args (use at your own risk):
%     -p - use priors for the hyperparameters. If specified, this
%          indicates that a maximum a posteriori (MAP) approach will be
%          used to set covariance function hyperparameters. The priors
%          are obtained by calling prt_gp_priors('covFuncName')
%
% N.B.: for the arguments specifying functions, pass in a string, not
% a function handle. This script will generate the function handle.
%
% Output:
%   output - output of machine (struct).
%   * Mandatory fields:
%     .predictions - predictions of classification or regression [Nte x D]
%   * Optional fields:
%     .type     - which type of machine this is (here, 'classifier')
%     .func_val - predictive probabilities
%     .mu       - test latent means
%     .s2       - test latent variances
%     .loghyper - log hyperparameters
%     .nlml     - negative log marginal likelihood
%     .alpha    - GP weighting coefficients
%     .sW       - likelihood matrix (see Rasmussen & Williams, 2006)
%     .L        - Cholesky factor
%__________________________________________________________________________
% Copyright (C) 2011 Machine Learning & Neuroimaging Laboratory

% Written by A Marquand
% $Id: prt_machine_gpml.m 530 2012-05-17 13:50:42Z amarquan $

% configure default parameters for GP optimisation
meanfunc = @meanConstcell;
covfunc  = @covLINkcell;
maxeval  = -20;
if strcmp(d.pred_type,'classification')
    mode    = 'classifier'; % it's good to be consistent!
    likfunc = @likErf;
    inffunc = @prt_infEP;
else
    mode    = 'regression';
    likfunc = @likGauss;
    inffunc = @prt_infExact;
end

% Error checks
% -------------------------------------------------------------------------
SANITYCHECK = true; % can turn off for "speed". Expert only.

if SANITYCHECK == true
    % args should be a string (empty or otherwise)
    if ~ischar(args)
        error('prt_machine_gpml:argsNotString',['Error: gpml'...
            ' args should be a string. ' ...
            ' SOLUTION: Please pass the arguments as a string.']);
    end

    % check we can reach the binary library
    if ~exist('prt_gp','file')
        error('prt_machine_gpml:libNotFound',['Error:'...
            ' ''prt_gp'' function could not be found !' ...
            ' SOLUTION: Please check your path.']);
    end
    % check whether it is a two-class classification problem
    uTL = unique(d.tr_targets(:));
    k   = numel(uTL); % number of classes
    if strcmp(mode,'classifier') && k > 2
        warning('prt_machine_gpml:classificationWithMoreThanTwoClasses',...
            ['Classification specified with > 2 classes. ',...
            'Defaulting to multiclass Laplace approximation.']);
        output = prt_machine_gpclap(d,args);
        return;
    end
    % are we using a kernel ?
    if ~d.use_kernel
        error('prt_machine_gpml:useKernelIsFalse',['Error:'...
            ' This machine is currently only implemented for kernel data. ' ...
            'SOLUTION: Please set use_kernel to true']);
    end
end

% parse input arguments
% -------------------------------------------------------------------------
% hyperparameters
if ~isempty(regexp(args,'-h','once'))
    optimise_theta = true;
    eargs = regexp(args,'-f\s+[0-9]*','match');
    if ~isempty(eargs)
        eargs   = regexp(cell2mat(eargs),'-f\s+','split');
        maxeval = str2num(['-',cell2mat(eargs(2))]);
    end
else
    optimise_theta = false;
end
% likelihood function
largs = regexp(args,'-l\s+[a-zA-Z0-9_]*','match');
if ~isempty(largs)
    largs   = regexp(cell2mat(largs),'-l\s+','split');
    likfunc = str2func(cell2mat(largs(2)));
    if strcmpi(cell2mat(largs(2)),'Erf')
        likfunc = @likErf;
    end
end
% covariance function
cargs = regexp(args,'-c\s+[a-zA-Z0-9_]*','match');
if ~isempty(cargs)
    cargs   = regexp(cell2mat(cargs),'-c\s+','split');
    covfunc = str2func(cell2mat(cargs(2)));
end
% mean function
margs = regexp(args,'-m\s+[a-zA-Z0-9_]*','match');
if ~isempty(margs)
    margs    = regexp(cell2mat(margs),'-m\s+','split');
    meanfunc = str2func(cell2mat(margs(2)));
end
% inference function
iargs = regexp(args,'-i\s+[a-zA-Z0-9_]*','match');
if ~isempty(iargs)
    iargs   = regexp(cell2mat(iargs),'-i\s+','split');
    inffunc = str2func(cell2mat(iargs(2)));
end
% priors
if ~isempty(regexp(args,'-p','once'))
    disp('Empirical priors specified. Using MAP for hyperparameters')
    priors = prt_gp_priors(func2str(covfunc));
    map = true;
else
    map = false;
end

% Set default hyperparameters
% -------------------------------------------------------------------------
nhyp = str2num([feval(covfunc); feval(likfunc); feval(meanfunc)]);
if nhyp(1) > 0
    hyp.cov = zeros(nhyp(1),1);
end
if nhyp(2) > 0
    hyp.lik = zeros(nhyp(2),1);
end
if nhyp(3) > 0
    hyp.mean = zeros(nhyp(3),1);
end

% Assemble data matrices
% -------------------------------------------------------------------------
% handle the glm as a special case (for now)
if strcmpi(func2str(covfunc),'covLINglm') || strcmpi(func2str(covfunc),'covLINglm_2class')
    % configure covariances
    K   = [d.train(:)' {d.tr_param}];
    Ks  = [d.test(:)' {d.te_param}];
    Kss = [d.testcov(:)' {d.te_param}];

    % get default hyperparameter values
    hyp.cov = log(prt_glm_design);

    [tmp1, tmp2, tmp3, tr_lbs] = prt_glm_design(hyp.cov, d.tr_param);
    [tmp1, tmp2, tmp3, te_lbs] = prt_glm_design(hyp.cov, d.te_param);
else
    % configure covariances
    K   = d.train;
    Ks  = d.test;
    Kss = d.testcov;

    tr_lbs = d.tr_targets;
    te_lbs = d.te_targets;
end

% configure targets
if strcmp(mode,'classifier')
    % convert targets (coded 1/2) to +1/-1
    y = -1*(2 * tr_lbs - 3);
else
    y = tr_lbs;
end

% Train and test GP model
% -------------------------------------------------------------------------
% train
if optimise_theta
    if map
        [hyp, nlmls] = minimize(hyp, @prt_gp_map, maxeval, ...
            inffunc, meanfunc, covfunc, likfunc, K, y, priors);
    else
        [hyp, nlmls] = minimize(hyp, @prt_gp, maxeval, ...
            inffunc, meanfunc, covfunc, likfunc, K, y);
    end
else
    nlmls = prt_gp(hyp, inffunc, meanfunc, covfunc, likfunc, K, y);
end

% make predictions
[ymu, ys2, fmu, fs2, lp, post] = prt_gp(hyp, inffunc, meanfunc, covfunc, ...
    likfunc, K, y, Ks, zeros(size(Ks{1},1),1), Kss);

% Outputs
% -------------------------------------------------------------------------
if strcmp(mode,'classifier')
    p = exp(lp);
    output.predictions = (1 - real(p > 0.5)) + 1;
    output.func_val    = p;
else % regression
    output.predictions = ymu;
    output.func_val    = output.predictions;
end
output.type       = mode;
output.loghyper   = hyp;
output.mu         = ymu;
output.s2         = ys2;
output.nlml       = min(nlmls);
output.tr_targets = tr_lbs;
output.te_targets = te_lbs;
output.alpha      = post.alpha;
output.sW         = post.sW;
output.L          = post.L;

end
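The classifier branch above maps targets coded 1/2 to the +1/-1 encoding GPML expects, and reverses the mapping when decoding predictive probabilities. The arithmetic can be checked in isolation; the vectors below are illustrative values, not output from the machine (the real code wraps the comparison in `real()` to guard against complex probabilities, which is omitted here):

```matlab
% target encoding: y = -(2*t - 3) maps class 1 -> +1, class 2 -> -1
t = [1; 2; 1];              % labels as stored in d.tr_targets
y = -1*(2*t - 3);           % gives [1; -1; 1]

% prediction decoding: threshold p(class 1) at 0.5
p    = [0.9; 0.2; 0.6];     % illustrative predictive probabilities exp(lp)
pred = (1 - (p > 0.5)) + 1; % gives [1; 2; 1]
```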