Be the only one, not the best one :: matlab / python | feature fusion image retrieval / CNN features (5)

matlab / python | feature fusion image retrieval / CNN features (5) - includes a test of applying the models

Deep Learning 2016. 9. 8. 20:44

http://www.noneface.com/2016/03/08/img_retrieval.html

MatConvNet

MatConvNet is an open-source, MATLAB-based convolutional neural network toolbox that ships with a variety of pre-trained models. For more details, search for it online.

Configuring MatConvNet

Please refer to the official website.

Choosing pre-trained models

1. imagenet-googlenet-dag.mat

2. imagenet-vgg-m.mat

3. imagenet-vgg-verydeep-16.mat

Feature Extraction

Because the two fusion methods from the earlier posts will be applied to these features afterwards, the features of all images are saved to a single file so that they are easy to reuse.
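The file layout used by the scripts in this post is one image per row, with the feature values separated by spaces. As a rough sketch of that layout on the Python side, the snippet below writes and re-reads a small matrix. The helper names and the toy 3 x 4 matrix are made up for illustration; the real files hold one row per image in LULC.

```python
import io
import numpy as np

def save_features(feat, target):
    """Write one image per row, space-separated, six decimals (like MATLAB's '%f ')."""
    np.savetxt(target, feat, fmt="%f", delimiter=" ")

def read_features(source):
    """Read the matrix back; ndmin=2 keeps a one-image file two-dimensional."""
    return np.loadtxt(source, ndmin=2)

# Toy stand-in for a feature matrix: 3 images, 4 dimensions each
feat = np.array([[0.5, 1.25, 0.0, 2.0],
                 [1.0, 0.0, 3.5, 0.25],
                 [0.0, 0.75, 0.5, 1.5]])

buffer = io.StringIO()          # in-memory stand-in for a file such as vgg_m.txt
save_features(feat, buffer)
buffer.seek(0)
restored = read_features(buffer)
print(restored.shape)
```

np.loadtxt splits on any whitespace by default, so the trailing space that the MATLAB loops write before each newline should not cause trouble.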

imagenet-vgg-m.mat

run ./matconvnet-1.0-beta17/matlab/vl_setupnn
net = load('imagenet-vgg-m.mat');
addpath('LULC');   % lets imread find the images by file name alone
imgFiles = dir('LULC');
imgNamList = {imgFiles(~[imgFiles.isdir]).name};
imgNamList = imgNamList';
numImg = length(imgNamList);
feat = [];
for i = 1:numImg
   img = imread(imgNamList{i, 1});

   % Replicate grayscale images to three channels
   if size(img, 3) ~= 3
       img = repmat(img, [1 1 3]);
   end
   im_ = single(img) ; % note: 255 range
   im_ = imresize(im_, net.meta.normalization.imageSize(1:2)) ;
   im_ = im_ - net.meta.normalization.averageImage ;
   res = vl_simplenn(net, im_) ;
   % version: matconvnet-1.0-beta17
   featVec = res(17).x;
   featVec = featVec(:);
   feat = [feat; featVec'];
end

% Write one image per row, feature values separated by spaces
resultName = 'D:\img\image-Retrieval\cnn\vgg_m.txt';
fid = fopen(resultName,'w');
[r,c] = size(feat);
for k = 1:r
    for j = 1:c
        fprintf(fid,'%f ',feat(k,j));
    end
    fprintf(fid,'\n');
end
fclose(fid);

imagenet-googlenet-dag.mat

run ./matconvnet-1.0-beta17/matlab/vl_setupnn
modelPath = 'imagenet-googlenet-dag.mat' ;
net = dagnn.DagNN.loadobj(load(modelPath)) ;
addpath('LULC');   % lets imread find the images by file name alone
imgFiles = dir('LULC');
imgNamList = {imgFiles(~[imgFiles.isdir]).name};
imgNamList = imgNamList';
numImg = length(imgNamList);
feat = [];
for i = 1:numImg
   img = imread(imgNamList{i, 1});

   % Replicate grayscale images to three channels
   if size(img, 3) ~= 3
       img = repmat(img, [1 1 3]);
   end
   im_ = single(img) ; % note: 255 range
   im_ = imresize(im_, net.meta.normalization.imageSize(1:2)) ;
   im_ = im_ - net.meta.normalization.averageImage ;
   net.eval({'data', im_}) ;
   % version: matconvnet-1.0-beta17
   featVec = net.vars(152).value;
   featVec = featVec(:);
   feat = [feat; featVec'];
end

% Write one image per row, feature values separated by spaces
resultName = 'D:\img\image-Retrieval\cnn\googlenet.txt';
fid = fopen(resultName,'w');
[r,c] = size(feat);
for k = 1:r
    for j = 1:c
        fprintf(fid,'%f ',feat(k,j));
    end
    fprintf(fid,'\n');
end
fclose(fid);

imagenet-vgg-verydeep-16.mat

run ./matconvnet-1.0-beta17/matlab/vl_setupnn
net = load('imagenet-vgg-verydeep-16.mat');
addpath('LULC');   % lets imread find the images by file name alone
imgFiles = dir('LULC');
imgNamList = {imgFiles(~[imgFiles.isdir]).name};
imgNamList = imgNamList';
numImg = length(imgNamList);
feat = [];
for i = 1:numImg
   img = imread(imgNamList{i, 1});
   fprintf('Extracting CNN features from %s\n', imgNamList{i,1});

   % Replicate grayscale images to three channels
   if size(img, 3) ~= 3
       img = repmat(img, [1 1 3]);
   end
   im_ = single(img) ; % note: 255 range
   im_ = imresize(im_, net.meta.normalization.imageSize(1:2)) ;
   im_ = bsxfun(@minus,im_,net.meta.normalization.averageImage) ;
   res = vl_simplenn(net, im_) ;
   % version: matconvnet-1.0-beta17
   featVec = res(33).x;
   featVec = featVec(:);
   feat = [feat; featVec'];
end

% Write one image per row, feature values separated by spaces
resultName = 'D:\img\image-Retrieval\cnn\vgg_vd.txt';
fid = fopen(resultName,'w');
[r,c] = size(feat);
for k = 1:r
    for j = 1:c
        fprintf(fid,'%f ',feat(k,j));
    end
    fprintf(fid,'\n');
end
fclose(fid);
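All three scripts preprocess an image the same way: replicate grayscale to three channels, convert to single precision, resize, and subtract the model's average image. The very deep model's script uses bsxfun for the subtraction, presumably because its averageImage is a per-channel mean rather than a full image. The NumPy sketch below illustrates the channel replication and the broadcast subtraction only (resizing is left out). The function names and the mean values are hypothetical.

```python
import numpy as np

def to_three_channels(img):
    """Replicate an H x W grayscale image into H x W x 3; leave colour images untouched."""
    if img.ndim == 2:
        img = np.repeat(img[:, :, np.newaxis], 3, axis=2)
    return img

def subtract_mean(img, average_image):
    """Subtract a full H x W x 3 mean image or a 1 x 1 x 3 per-channel mean.

    NumPy broadcasting plays the role that bsxfun(@minus, ...) plays in MATLAB.
    """
    return img.astype(np.float32) - np.asarray(average_image, dtype=np.float32)

gray = np.full((224, 224), 100, dtype=np.uint8)                  # toy grayscale image
rgb = to_three_channels(gray)
channel_mean = np.array([120.0, 115.0, 105.0]).reshape(1, 1, 3)  # hypothetical values
centered = subtract_mean(rgb, channel_mean)
print(centered.shape, centered[0, 0])
```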

Retrieval

#coding:utf8
import os
import re
import numpy as np
from sklearn import preprocessing

def load_features():
	# One image per line, feature values separated by spaces
	im_features = []
	with open('vgg_vd.txt') as fobj:
		for line in fobj:
			im_features.append([float(v) for v in line.split()])

	im_features = np.array(im_features)
	# L2-normalise each row so the dot product below is the cosine similarity
	im_features = preprocessing.normalize(im_features, norm='l2')
	return im_features

def match_all(query_feature, im_features):
	# Rank all images by descending similarity to the query
	score = np.dot(query_feature, im_features.T)
	rank_ID = np.argsort(-score)
	return rank_ID

def get_img_id():
	filename = "AllimgName.txt" # txt file listing the file names of all images
	with open(filename) as fobj:
		AllimgName = [line.rstrip() for line in fobj]
	return AllimgName

if __name__ == '__main__':
	path = 'result'
	AllimgName = get_img_id()
	im_features = load_features()
	for a, im in enumerate(im_features):
		rank_ID = match_all(im, im_features)
		name = AllimgName[a]
		# Escape the dot and anchor at the end so only the extension is replaced
		real_name = re.sub(r'\.tif$', '.txt', name)
		id_name = re.sub(r'\.tif$', '_id.txt', name)

		# result/vgg_vd/name and result/vgg_vd/id must already exist
		real_name = os.path.join(path, 'vgg_vd', 'name', real_name)
		id_name = os.path.join(path, 'vgg_vd', 'id', id_name)
		with open(real_name, "w") as fobj1, open(id_name, "w") as fobj2:
			for i in rank_ID:
				fobj1.write(AllimgName[i] + '\n')
				fobj2.write(str(i) + ' ')

Graph fusion is applied later and needs the image IDs, so the ranked IDs are saved as well. To run retrieval for the other models, just change the feature file that is loaded and the path where the results are saved.
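The retrieval step relies on a simple fact: once every feature vector has unit L2 norm, the dot product of two vectors is their cosine similarity, so sorting by descending dot product ranks the database by similarity to the query. A toy sketch with four made-up 2-D "features" (the function names are illustrative only):

```python
import numpy as np

def l2_normalize(features):
    """Scale every row to unit length (guarding against all-zero rows)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.maximum(norms, 1e-12)

def rank_by_similarity(query, features):
    """Return database indices sorted from most to least similar to the query."""
    return np.argsort(-features.dot(query))

features = l2_normalize(np.array([[1.0, 0.0],
                                  [0.9, 0.1],
                                  [0.0, 2.0],
                                  [1.0, 1.0]]))
ranking = rank_by_similarity(features[0], features)
print(ranking)
```

Image 0 is the query, so it ranks itself first with similarity 1. The nearly parallel vector comes next and the orthogonal one last.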

result

Fusion

Using the fusion code from the earlier posts, apply graph fusion and adaptive fusion to these results.
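The graph and adaptive fusion code is not reproduced in this post. To show what "fusing" two retrieval results means at the simplest level, here is a deliberately different and much simpler technique: min-max normalise each model's similarity scores, then average them. This is a plain score-averaging baseline, not the graph or adaptive method, and every name and number in it is made up.

```python
import numpy as np

def min_max(scores):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = scores.min(), scores.max()
    if hi == lo:
        return np.zeros_like(scores, dtype=float)
    return (scores - lo) / (hi - lo)

def fuse_scores(score_lists, weights=None):
    """Weighted average of min-max normalised score lists, plus the fused ranking."""
    scores = np.array([min_max(np.asarray(s, dtype=float)) for s in score_lists])
    if weights is None:
        weights = np.ones(len(scores)) / len(scores)   # equal weights by default
    fused = np.average(scores, axis=0, weights=weights)
    return fused, np.argsort(-fused)

# Similarity of one query to four database images under two hypothetical models
scores_a = [0.9, 0.2, 0.6, 0.1]
scores_b = [0.8, 0.4, 0.3, 0.0]
fused, ranking = fuse_scores([scores_a, scores_b])
print(fused, ranking)
```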

Fine tuning

1. Build your own imdb.mat

MatConvNet specifies a file format for its image database. However, I could not work out from the MatConvNet examples how the imdb.mat file is generated. After searching Google I found a related paper, and from it a way to generate the file.

function imdb = setup_data(averageImage)
% Adapted from code for Computer Vision, Georgia Tech, by James Hays.

% This path is assumed to contain 'train' and 'test', each of which contains
% one subdirectory per category (21 categories here).
SceneJPGsPath = 'data/LULC/';

num_train_per_category = 80;
num_test_per_category  = 20;
total_images = 21*num_train_per_category + 21 * num_test_per_category;

image_size = [224 224]; %downsampling data for speed and because it hurts
% accuracy surprisingly little

% Keep three channels: the pre-trained VGG networks expect RGB input
imdb.images.data   = zeros(image_size(1), image_size(2), 3, total_images, 'single');
imdb.images.labels = zeros(1, total_images, 'single');
imdb.images.set    = zeros(1, total_images, 'uint8');
image_counter = 1;

categories = {'agricultural', 'airplane', 'baseballdiamond', 'beach', ...
              'buildings', 'chaparral', 'denseresidential', ...
              'forest', 'freeway', 'golfcourse', 'harbor', ...
              'intersection', 'mediumresidential', 'mobilehomepark', 'overpass',...
              'parkinglot','river','runway','sparseresidential','storagetanks','tenniscourt'};

sets = {'train', 'test'};

fprintf('Loading %d train and %d test images from each category\n', ...
          num_train_per_category, num_test_per_category)
fprintf('Each image will be resized to %d by %d\n', image_size(1),image_size(2));

%Read each image and resize it to 224x224
for s = 1:length(sets)   % 's' rather than 'set', which shadows a built-in
    for category = 1:length(categories)
        cur_path = fullfile( SceneJPGsPath, sets{s}, categories{category});
        cur_images = dir( fullfile( cur_path,  '*.tif') );

        if(s == 1)
            num_to_take = num_train_per_category;
        else
            num_to_take = num_test_per_category;
        end
        fprintf('Taking %d out of %d images in %s\n', num_to_take, length(cur_images), cur_path);
        cur_images = cur_images(1:num_to_take);

        for i = 1:length(cur_images)

            cur_image = imread(fullfile(cur_path, cur_images(i).name));
            if(size(cur_image,3) == 1)
                % Replicate grayscale images to three channels
                cur_image = repmat(cur_image, [1 1 3]);
            end
            cur_image = single(cur_image);
            cur_image = imresize(cur_image, image_size);

            % bsxfun handles both a full mean image and a 1x1x3 channel mean
            cur_image = bsxfun(@minus,cur_image,averageImage) ;

            % Stack images into a large 224 x 224 x 3 x total_images matrix
            % images.data
            imdb.images.data(:,:,:,image_counter) = cur_image;

            imdb.images.labels(  1,image_counter) = category;
            imdb.images.set(     1,image_counter) = s; %1 for train, 2 for test (val?)

            image_counter = image_counter + 1;
        end
    end
end
end

Save the imdb struct that the function returns as imdb.mat and you are done.
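Besides the pixel data, the imdb built by setup_data carries two index vectors: labels (category 1 to 21) and set (1 for train, 2 for test). The NumPy sketch below reproduces only that bookkeeping, in the same loop order (all train categories first, then all test categories), to make the resulting layout concrete. The function name is hypothetical.

```python
import numpy as np

def build_index(num_categories, num_train, num_test):
    """Return (labels, sets) laid out in the same order as the MATLAB loops."""
    labels, sets = [], []
    for set_id, count in ((1, num_train), (2, num_test)):   # 1 = train, 2 = test
        for category in range(1, num_categories + 1):
            labels.extend([category] * count)
            sets.extend([set_id] * count)
    return np.array(labels, dtype=np.float32), np.array(sets, dtype=np.uint8)

labels, sets = build_index(num_categories=21, num_train=80, num_test=20)
print(len(labels), int((sets == 1).sum()), int((sets == 2).sum()))
```

With 80 training and 20 test images for each of the 21 categories, this gives 1680 training entries followed by 420 test entries, 2100 in total.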

Warning: the fine-tuning code below has problems!!!

2. Fine-tune

Of the pre-trained models, only imagenet-vgg-verydeep-16.mat and imagenet-vgg-m.mat have been fine-tuned so far. Fine-tuning the GoogLeNet model still runs into problems.

function [net,info] = fine_tune()
run ./matconvnet-1.0-beta17/matlab/vl_setupnn;

imdb = load('imdb.mat');
net = load('imagenet-vgg-verydeep-16.mat');
opts.train.expDir = fullfile('data','vd0.005') ;

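% Drop the 1000-class ImageNet classifier and its softmax, then append a new
% classification layer for the 21 categories.
net.layers = net.layers(1:end-2);
% NOTE: the initial weights expression is missing from the 'weights' line.
% In the simplenn format it is a cell array {filters, biases}; for 21 classes
% on top of the 4096-d fc7 output the filters are 1 x 1 x 4096 x 21.
net.layers{end+1} = struct('type', 'conv', ...
'weights', , ...
'learningRate', [0.005,0.002], ...
'stride', [1 1], ...
'pad', [0 0 0 0]) ;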

opts.train.batchSize = 20 ;
opts.train.learningRate = logspace(-4, -5.5, 300) ;
opts.train.numEpochs = numel(opts.train.learningRate) ;
opts.train.continue = true ;

net.layers{end+1} = struct('type', 'softmaxloss') ;
    
[net, info] = cnn_train(net, imdb, @getBatch, ...
    opts.train,...
    'val', find(imdb.images.set == 2)) ;
save('fine_tune.mat','-struct', 'net')
end

function [im, labels] = getBatch(imdb, batch)
%getBatch is called by cnn_train.
im = imdb.images.data(:,:,:,batch) ;

labels = imdb.images.labels(1,batch) ;
end
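The training options above set one learning rate per epoch with logspace(-4, -5.5, 300), so the number of epochs equals the length of that vector. The NumPy sketch below reproduces the schedule and works out how many mini-batches that implies, assuming the 1680 training images (21 categories x 80) from setup_data and the batch size of 20. The variable names are illustrative.

```python
import numpy as np

# Same schedule as MATLAB's logspace(-4, -5.5, 300): 300 log-spaced values
learning_rates = np.logspace(-4, -5.5, 300)
num_epochs = learning_rates.size

num_train_images = 21 * 80      # training images in the imdb built earlier
batch_size = 20
batches_per_epoch = num_train_images // batch_size
total_batches = batches_per_epoch * num_epochs

print(num_epochs, learning_rates[0], learning_rates[-1])
print(batches_per_epoch, total_batches)
```

The rate decays smoothly from 1e-4 to about 3.2e-6 over the 300 epochs.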

More fine tune information please refer to

result

Relatively speaking, fine-tuning improved the results less than adaptive fusion did.

Summary

About the two fusion methods

My understanding of the two fusion methods is that adaptive fusion is better suited to evaluation with the standard mAP measure, and the final results point the same way.

The most serious weakness of graph fusion is that, during fusion, a common subgraph has to be computed from the two or more results being fused, and the ranking obtained this way does not necessarily include all 2100 images. Because mAP depends on where the relevant images fall across the whole ranked list, this limits the mAP that graph fusion can reach to some extent.
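To make the point about incomplete rankings concrete, here is a small sketch of average precision (AP), the per-query quantity that mAP averages. It uses the common definition (mean of the precision values at each relevant hit, divided by the total number of relevant images), which may differ in detail from the evaluation script used for this series. The image IDs are made up.

```python
import numpy as np

def average_precision(ranked_ids, relevant_ids):
    """AP of one ranked list; relevant images missing from the list count as misses."""
    relevant = set(relevant_ids)
    hits = 0
    precision_sum = 0.0
    for position, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
            precision_sum += float(hits) / position
    return precision_sum / len(relevant) if relevant else 0.0

def mean_average_precision(rankings, relevant_sets):
    """mAP: the mean of the per-query AP values."""
    return float(np.mean([average_precision(r, rel)
                          for r, rel in zip(rankings, relevant_sets)]))

relevant = [0, 1, 2, 3]
full_ranking = [0, 5, 1, 2, 6, 3, 4]   # every image appears somewhere
truncated = full_ranking[:4]           # the ranking stops before image 3 is reached

ap_full = average_precision(full_ranking, relevant)
ap_truncated = average_precision(truncated, relevant)
print(ap_full, ap_truncated)
```

In the truncated list the fourth relevant image never appears, so it contributes nothing to the sum while still counting in the denominator. That is why a ranking that leaves out part of the 2100 images can only lose AP.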

What next

That is about it for this image retrieval work. Fine-tuning GoogLeNet is still unfinished; I have opened an issue on GitHub but have not received a reply yet.

Now that this is done, what can I use it for?

When I have some spare time I would like to do a small project: you submit a photo, and it tells you what is in the image.

What it would take:

I expect I would first need a reasonable number of labelled images, for example by crawling enough data from Wikipedia or Baidu Baike, and then extract image features from them to build my own database.

When I have time I should also review and rewrite the code from this series. My code is still not very readable; sometimes I cannot even follow what I wrote myself.

EOF


Posted by uniqueone
