matlab / python | feature fusion image retrieval / CNN features (5) - includes a test of applying the models
Deep Learning 2016. 9. 8. 20:44
MatConvNet
MatConvNet is an open-source, MATLAB-based convolutional neural network toolbox that offers a variety of pre-trained models. For more details, search online.
About MatConvNet configuration
Choosing pre-trained models
1. imagenet-googlenet-dag.mat
2. imagenet-vgg-m.mat
3. imagenet-vgg-verydeep-16.mat
Feature Extraction
Because the fusion code written in the earlier posts is reused to combine two features, the features of all images are saved to a single file per model, which makes them easy to load later.
imagenet-vgg-m.mat
run ./matconvnet-1.0-beta17/matlab/vl_setupnn
net = load('imagenet-vgg-m.mat');
addpath('LULC'); % so that imread can find the images by file name
imgFiles = dir('LULC');
imgNamList = {imgFiles(~[imgFiles.isdir]).name};
imgNamList = imgNamList';
numImg = length(imgNamList);
feat = [];
for i = 1:numImg
    img = imread(imgNamList{i, 1});
    if size(img, 3) ~= 3
        img = repmat(img, [1 1 3]); % replicate a grayscale image to 3 channels
    end
    im_ = single(img); % note: 255 range
    im_ = imresize(im_, net.meta.normalization.imageSize(1:2));
    im_ = im_ - net.meta.normalization.averageImage;
    res = vl_simplenn(net, im_);
    % version: matconvnet-1.0-beta17
    featVec = res(17).x;
    featVec = featVec(:);
    feat = [feat; featVec']; % one row per image
end
% write one feature vector per line, values separated by spaces
resultName = 'D:\img\image-Retrieval\cnn\vgg_m.txt';
fid = fopen(resultName, 'w');
[r, c] = size(feat);
for k = 1:r
    for j = 1:c
        fprintf(fid, '%f ', feat(k, j));
    end
    fprintf(fid, '\n');
end
fclose(fid);
imagenet-googlenet-dag.mat
run ./matconvnet-1.0-beta17/matlab/vl_setupnn
modelPath = 'imagenet-googlenet-dag.mat';
net = dagnn.DagNN.loadobj(load(modelPath));
addpath('LULC'); % so that imread can find the images by file name
imgFiles = dir('LULC');
imgNamList = {imgFiles(~[imgFiles.isdir]).name};
imgNamList = imgNamList';
numImg = length(imgNamList);
feat = [];
for i = 1:numImg
    img = imread(imgNamList{i, 1});
    if size(img, 3) ~= 3
        img = repmat(img, [1 1 3]); % replicate a grayscale image to 3 channels
    end
    im_ = single(img); % note: 255 range
    im_ = imresize(im_, net.meta.normalization.imageSize(1:2));
    im_ = im_ - net.meta.normalization.averageImage;
    net.eval({'data', im_});
    % version: matconvnet-1.0-beta17
    featVec = net.vars(152).value;
    featVec = featVec(:);
    feat = [feat; featVec']; % one row per image
end
% write one feature vector per line, values separated by spaces
resultName = 'D:\img\image-Retrieval\cnn\googlenet.txt';
fid = fopen(resultName, 'w');
[r, c] = size(feat);
for k = 1:r
    for j = 1:c
        fprintf(fid, '%f ', feat(k, j));
    end
    fprintf(fid, '\n');
end
fclose(fid);
imagenet-vgg-verydeep-16.mat
run ./matconvnet-1.0-beta17/matlab/vl_setupnn
net = load('imagenet-vgg-verydeep-16.mat');
addpath('LULC'); % so that imread can find the images by file name
imgFiles = dir('LULC');
imgNamList = {imgFiles(~[imgFiles.isdir]).name};
imgNamList = imgNamList';
numImg = length(imgNamList);
feat = [];
for i = 1:numImg
    img = imread(imgNamList{i, 1});
    fprintf('Extracting CNN features from %s.\n', imgNamList{i, 1});
    if size(img, 3) ~= 3
        img = repmat(img, [1 1 3]); % replicate a grayscale image to 3 channels
    end
    im_ = single(img); % note: 255 range
    im_ = imresize(im_, net.meta.normalization.imageSize(1:2));
    im_ = bsxfun(@minus, im_, net.meta.normalization.averageImage);
    res = vl_simplenn(net, im_);
    % version: matconvnet-1.0-beta17
    featVec = res(33).x;
    featVec = featVec(:);
    feat = [feat; featVec']; % one row per image
end
% write one feature vector per line, values separated by spaces
resultName = 'D:\img\image-Retrieval\cnn\vgg_vd.txt';
fid = fopen(resultName, 'w');
[r, c] = size(feat);
for k = 1:r
    for j = 1:c
        fprintf(fid, '%f ', feat(k, j));
    end
    fprintf(fid, '\n');
end
fclose(fid);
Retrieval
# coding: utf8
import os
import re

import numpy as np
from sklearn import preprocessing


def load_features():
    """Load one feature vector per line and L2-normalize each row."""
    im_features = []
    with open('vgg_vd.txt') as fobj:
        for line in fobj:
            im_features.append([float(v) for v in line.split()])
    im_features = np.array(im_features)
    return preprocessing.normalize(im_features, norm='l2')


def match_all(query_feature, im_features):
    """Rank all images by descending dot product with the query."""
    score = np.dot(query_feature, im_features.T)
    rank_ID = np.argsort(-score)
    return rank_ID


def get_img_id():
    filename = "AllimgName.txt"  # txt file listing all image file names
    with open(filename) as fobj:
        return [line.rstrip() for line in fobj]


if __name__ == '__main__':
    path = 'result'
    AllimgName = get_img_id()
    im_features = load_features()
    for a, im in enumerate(im_features):
        rank_ID = match_all(im, im_features)
        name = AllimgName[a]
        # escape the dot and anchor at the end so only the extension is replaced
        real_name = re.sub(r'\.tif$', '.txt', name)
        id_name = re.sub(r'\.tif$', '_id.txt', name)
        # the folders result/vgg_vd/name and result/vgg_vd/id must already exist
        real_name = os.path.join(path, 'vgg_vd', 'name', real_name)
        id_name = os.path.join(path, 'vgg_vd', 'id', id_name)
        with open(real_name, "w") as fobj1, open(id_name, "w") as fobj2:
            for i in rank_ID:
                fobj1.write(AllimgName[i] + '\n')
                fobj2.write(str(i) + ' ')
Graph fusion is used later on and it works on image ids, so the ranked ids are saved alongside the ranked file names. To run retrieval for the other models, simply change the feature file that is loaded and the path where the final results are saved.
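The ranking step can be illustrated on a toy example. This is a minimal sketch in pure Python, not the script used for the experiments: the three 2-D vectors and the file names are made-up stand-ins for real CNN features, and `l2_normalize` and `rank_by_cosine` are hypothetical helper names. It shows why both outputs are kept: the id list is what a later fusion step would consume, and the name list is the human-readable version of the same ranking.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def rank_by_cosine(query, database):
    """Return database indices sorted by descending dot product with the query.

    For unit-length vectors the dot product equals the cosine similarity.
    """
    scores = [sum(q * d for q, d in zip(query, row)) for row in database]
    return sorted(range(len(database)), key=lambda i: -scores[i])


# Toy stand-ins for CNN feature rows (hypothetical values).
features = [l2_normalize(v) for v in ([3.0, 4.0], [4.0, 3.0], [-3.0, 4.0])]
names = ["a.tif", "b.tif", "c.tif"]

# Use the third image as the query, as the script does for every image in turn.
rank_ids = rank_by_cosine(features[2], features)
ranked_names = [names[i] for i in rank_ids]

print(rank_ids)       # ids, as written to the *_id.txt files
print(ranked_names)   # names, as written to the *.txt files
```

The query image always ranks itself first because its similarity to itself is 1, which is worth remembering when the ranked lists are evaluated.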
result
Merge
Using the fusion code from the earlier posts, perform Graph fusion and adaptive fusion on the retrieval results.
Fine tuning
1. Build your own imdb.mat
MatConvNet specifies an official format for this data file. However, I could not work out from the MatConvNet examples how the imdb.mat file is generated. After some searching, I found a way to generate it in the material accompanying a related paper.
function imdb = setup_data(averageImage)
%code for Computer Vision, Georgia Tech by James Hays
%This path is assumed to contain 'test' and 'train', which each contain 21
%subdirectories (one per LULC category). 80 training and 20 test images
%are taken from each category.
SceneJPGsPath = 'data/LULC/';
num_train_per_category = 80;
num_test_per_category = 20; %can be up to 110
total_images = 21*num_train_per_category + 21 * num_test_per_category;
image_size = [224 224]; %downsampling data for speed and because it hurts
% accuracy surprisingly little
imdb.images.data = zeros(image_size(1), image_size(2), 1, total_images, 'single');
imdb.images.labels = zeros(1, total_images, 'single');
imdb.images.set = zeros(1, total_images, 'uint8');
image_counter = 1;
categories = {'agricultural', 'airplane', 'baseballdiamond', 'beach', ...
    'buildings', 'chaparral', 'denseresidential', ...
    'forest', 'freeway', 'golfcourse', 'harbor', ...
    'intersection', 'mediumresidential', 'mobilehomepark', 'overpass',...
    'parkinglot','river','runway','sparseresidential','storagetanks','tenniscourt'};
sets = {'train', 'test'};
fprintf('Loading %d train and %d test images from each category\n', ...
    num_train_per_category, num_test_per_category)
fprintf('Each image will be resized to %d by %d\n', image_size(1),image_size(2));
%Read each image and resize it to 224x224
for set = 1:length(sets)
    for category = 1:length(categories)
        cur_path = fullfile( SceneJPGsPath, sets{set}, categories{category});
        cur_images = dir( fullfile( cur_path, '*.tif') );
        if(set == 1)
            fprintf('Taking %d out of %d images in %s\n', num_train_per_category, length(cur_images), cur_path);
            cur_images = cur_images(1:num_train_per_category);
        elseif(set == 2)
            fprintf('Taking %d out of %d images in %s\n', num_test_per_category, length(cur_images), cur_path);
            cur_images = cur_images(1:num_test_per_category);
        end
        for i = 1:length(cur_images)
            cur_image = imread(fullfile(cur_path, cur_images(i).name));
            cur_image = single(cur_image);
            cur_image = imresize(cur_image, image_size);
            %cur_image = bsxfun(@minus,cur_image,averageImage) ;
            cur_image = cur_image - averageImage;
            if(size(cur_image,3) > 1)
                fprintf('color image found %s\n', fullfile(cur_path, cur_images(i).name));
                cur_image = rgb2gray(cur_image);
            end
            % Stack images into a large 224 x 224 x 1 x total_images matrix
            % images.data
            imdb.images.data(:,:,1,image_counter) = cur_image;
            imdb.images.labels( 1,image_counter) = category;
            imdb.images.set( 1,image_counter) = set; %1 for train, 2 for test (val?)
            image_counter = image_counter + 1;
        end
    end
end
Save the imdb struct returned by this function to imdb.mat.
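To make the layout of the label and set vectors concrete, here is a small Python sketch that reproduces only the bookkeeping of the loop order in setup_data (set first, then category, then image), without reading any images. The variable names mirror the MATLAB code, and the counts assume every category folder really contains at least 80 training and 20 test images.

```python
num_categories = 21
num_train_per_category = 80
num_test_per_category = 20

labels = []  # 1-based category index per image, like imdb.images.labels
sets = []    # 1 = train, 2 = test, like imdb.images.set

# Same nesting as the MATLAB loops: all training images first, then all test images.
for set_id, per_category in ((1, num_train_per_category), (2, num_test_per_category)):
    for category in range(1, num_categories + 1):
        labels.extend([category] * per_category)
        sets.extend([set_id] * per_category)

total_images = len(labels)
num_train = sets.count(1)
num_val = sets.count(2)

print(total_images, num_train, num_val)
```

With these settings the first 1680 entries are training images grouped by category, and the remaining 420 are the images selected later by find(imdb.images.set == 2).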
Warning: the fine-tuning code below has problems!!!
2. Fine-tune
So far I have only fine-tuned two of the pre-trained models, imagenet-vgg-verydeep-16.mat and imagenet-vgg-m.mat. Fine-tuning the googlenet model still runs into problems.
function [net,info] = fine_tune()
run ./matconvnet-1.0-beta17/matlab/vl_setupnn;
imdb = load('imdb.mat');
net = load('imagenet-vgg-verydeep-16.mat');
opts.train.expDir = fullfile('data','vd0.005') ;
% drop the last two layers and add a new classifier layer
net.layers = net.layers(1:end-2);
net.layers{end+1} = struct('type', 'conv', ...
    'weights', , ...
    'learningRate', [0.005,0.002], ...
    'stride', [1 1], ...
    'pad', [0 0 0 0]) ;
opts.train.batchSize = 20 ;
opts.train.learningRate = logspace(-4, -5.5, 300) ;
opts.train.numEpochs = numel(opts.train.learningRate) ;
opts.train.continue = true ;
net.layers{end+1} = struct('type', 'softmaxloss') ;
[net, info] = cnn_train(net, imdb, @getBatch, ...
    opts.train,...
    'val', find(imdb.images.set == 2)) ;
save('fine_tune.mat','-struct', 'net')
end
function [im, labels] = getBatch(imdb, batch)
%getBatch is called by cnn_train.
im = imdb.images.data(:,:,:,batch) ;
labels = imdb.images.labels(1,batch) ;
end
For more information on fine-tuning, please refer to
result
Relatively speaking, the improvement obtained from fine-tuning was not as good as the improvement obtained from adaptive fusion.
Summary
About the two fusion methods
My understanding of the two fusion methods is that adaptive fusion is better suited to evaluation by the standard mAP measure, and the final results point the same way.
The most serious weakness of Graph fusion is that the fusion step computes a common sub-graph over the two or more results being fused, and the outcome does not necessarily contain a ranking of all the images (2100 images). Because mAP depends on where the relevant images are spread across the whole ranked result (for details on mAP, refer to ), this limits the mAP of Graph fusion to some extent.
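The effect of an incomplete ranking on the score can be seen in a short sketch. It uses one common definition of average precision (the mean of the precision values at each relevant hit, divided by the total number of relevant images); the evaluation script actually used for these experiments is not shown in this post, so the function names and the toy rankings below are assumptions made for illustration.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list against a set of relevant ids."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
            precision_sum += hits / position  # precision at this hit
    # Dividing by all relevant images penalizes the ones never returned.
    return precision_sum / len(relevant)


def mean_average_precision(rankings, relevants):
    """Mean of the per-query average precision values."""
    aps = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(aps) / len(aps)


relevant = [0, 1, 2, 3]                  # hypothetical ground truth for one query
full_ranking = [0, 5, 1, 6, 2, 7, 3]     # every relevant image appears
truncated_ranking = [0, 5, 1, 6]         # ranking cut off after four results

ap_full = average_precision(full_ranking, relevant)
ap_truncated = average_precision(truncated_ranking, relevant)
print(ap_full, ap_truncated)
```

Both lists agree on the first four positions, yet the truncated one scores lower because two relevant images never appear in it. A fusion result that ranks only part of the 2100 images is capped in the same way.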
What's next
That is about it for this image retrieval work. Fine-tuning the googlenet model is still unfinished; I have submitted an issue on GitHub, but it has not been answered yet.
Once this is done, what can I use it for?
When I have spare time, I would like to do a small project: you submit a photo and it tells you what is in the image.
Related knowledge:
I would probably first need a sufficient number of well-labeled images, for example by crawling enough data from Wikipedia or Baidu Encyclopedia, and then extract image features from that data to build my own database.
When I have time, I also want to review and rewrite the code above. The code I write at the moment is still hard to read, and sometimes I cannot even follow my own code.
EOF