Computer Vision: the assignment is composed of 2 exercises. 1- Compute the VLAD and Fisher Vector (FV) aggregations of images from the given VLAD and FV models by implementing the following functions. Note that the n x d feature matrix f needs to be projected to the desired lower dimension via f0=f*A(:,1:kd), to match the VLAD model dimension, before calling these functions.
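As a rough illustration of exercise 1, here is a minimal NumPy sketch of the projection step followed by VLAD and FV aggregation. Several things are assumptions, not part of the assignment: the function names `project_features`, `vlad` and `fisher_vector` are hypothetical, the VLAD model is taken to be an nc x kd array of cluster centers, the FV model is taken to be a diagonal-covariance GMM (weights, means, standard deviations), and the signed square-root plus L2 normalization is a common post-processing choice rather than something the exercise requires. The random inputs only stand in for real features and models.

```python
import numpy as np

def project_features(f, A, kd):
    """Project n x d features onto the first kd columns of A (f0 = f*A(:,1:kd))."""
    return f @ A[:, :kd]

def _power_l2_normalize(v):
    # Signed square root, then L2 normalization (assumed post-processing).
    v = np.sign(v) * np.sqrt(np.abs(v))
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def vlad(f0, centers):
    """Aggregate n x kd features into a VLAD vector of length nc*kd.

    centers: nc x kd cluster centers (assumed form of the VLAD model).
    """
    nc, kd = centers.shape
    # Squared Euclidean distance from every feature to every center (n x nc).
    d2 = ((f0[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    v = np.zeros((nc, kd))
    for k in range(nc):
        members = f0[nearest == k]
        if len(members):
            # Sum of residuals of the features assigned to center k.
            v[k] = (members - centers[k]).sum(axis=0)
    return _power_l2_normalize(v.ravel())

def fisher_vector(f0, weights, means, sigmas):
    """Aggregate n x kd features into an FV of length 2*nc*kd.

    Assumed diagonal GMM: weights (nc,), means (nc, kd), sigmas (nc, kd) std devs.
    """
    n = f0.shape[0]
    z = (f0[:, None, :] - means[None, :, :]) / sigmas[None, :, :]  # n x nc x kd
    # Log of weight * Gaussian density, dropping the constant shared by all components.
    log_p = -0.5 * (z ** 2).sum(axis=2) - np.log(sigmas).sum(axis=1) + np.log(weights)
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)  # posteriors, n x nc
    # Gradients with respect to the means and the standard deviations.
    g_mu = (gamma[:, :, None] * z).sum(axis=0) / (n * np.sqrt(weights)[:, None])
    g_sigma = (gamma[:, :, None] * (z ** 2 - 1)).sum(axis=0) / (n * np.sqrt(2 * weights)[:, None])
    return _power_l2_normalize(np.concatenate([g_mu.ravel(), g_sigma.ravel()]))

# Toy usage with random stand-ins for the features, projection and models.
rng = np.random.default_rng(0)
f = rng.normal(size=(200, 32))                   # n x d local features
A = np.linalg.qr(rng.normal(size=(32, 32)))[0]   # stand-in for the projection matrix
kd, nc = 8, 4
f0 = project_features(f, A, kd)
centers = rng.normal(size=(nc, kd))
v = vlad(f0, centers)
fv = fisher_vector(f0, np.full(nc, 1.0 / nc), centers, np.ones((nc, kd)))
print(v.shape, fv.shape)
```

With a real model, `centers`, `weights`, `means` and `sigmas` would come from the provided VLAD and FV model files, whose exact layout is not shown here.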
2- Now benchmark the TPR-FPR performance of the various feature and aggregation schemes against the mini CDVS data set: https://app.box.com/s/oea1ng52b3ghac813qgry6v6xds3rpzy. For the HoG feature, let kd=[8, 16] and the number of clusters nc=[32, 64] for the VLAD and FV models; for the SIFT and DenseSIFT features, let kd=[24, 48] and nc=[32, 64, 96]. So for each image, we will have 4 + 6 = 10 different feature + aggregation representations (2 x 2 kd/nc settings for HoG, 2 x 3 for SIFT and DenseSIFT). For the total of N images in the mini CDVS dataset, we have M=N*(N-1)/2 total image pairs. The matching-pair ground truth is given; we only care about the first 100 matching pairs and the first 100 non-matching pairs in fid.mat, which has two variables, mp and nmp. Each row of these holds the two image filenames of one pair, e.g., mp(1,:): mp_2.jpg and mp1_2.jpg are a matching pair.
Language: Python
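One possible way to get the TPR-FPR curve for a single representation is sketched below. It assumes that each image's aggregated vector is stored in a dict keyed by filename, that pairs are compared by Euclidean distance, and that a pair is declared a match when its distance falls at or below a threshold; the helper names `pair_distances` and `tpr_fpr_curve` and the toy distances are hypothetical.

```python
import numpy as np

def pair_distances(reps, pairs):
    """Euclidean distance between the representations of each (name_a, name_b) pair."""
    return np.array([np.linalg.norm(reps[a] - reps[b]) for a, b in pairs])

def tpr_fpr_curve(d_mp, d_nmp):
    """Sweep a distance threshold over all observed distances.

    d_mp:  distances of the matching pairs (positives).
    d_nmp: distances of the non-matching pairs (negatives).
    A pair is predicted as matching when its distance is <= threshold.
    """
    thresholds = np.sort(np.concatenate([d_mp, d_nmp]))
    tpr = np.array([(d_mp <= t).mean() for t in thresholds])
    fpr = np.array([(d_nmp <= t).mean() for t in thresholds])
    return thresholds, tpr, fpr

# Toy distances standing in for the 100 matching and 100 non-matching pairs.
d_mp = np.array([0.2, 0.4, 0.5, 0.9])
d_nmp = np.array([0.6, 0.8, 1.1, 1.3])
thresholds, tpr, fpr = tpr_fpr_curve(d_mp, d_nmp)
for t, tp, fp in zip(thresholds, tpr, fpr):
    print(f"threshold={t:.1f}  TPR={tp:.2f}  FPR={fp:.2f}")
```

In the actual benchmark, `d_mp` and `d_nmp` would come from `pair_distances` applied to the first 100 rows of mp and nmp, repeated once per (feature, kd, nc, aggregation) setting. fid.mat can probably be read with `scipy.io.loadmat`, though how mp and nmp unpack into filename pairs depends on how the file was saved.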